Pulls data from the NASA DONKI API and stages it in S3 for loading into Redshift using boto3.
Builds an ETL pipeline that pulls album information for The Beatles from the Spotify API and loads it into PostgreSQL.
Migrates hundreds of Excel files with the same format to SQL using Python.
Pulls season and episode information for the longest-running TV series and stores it in a NoSQL database.
Streams coordinates of the International Space Station (ISS) using Kafka.
Places Johns Hopkins University COVID-19 data in an S3 bucket and processes it with PySpark in Databricks to create a [dashboard](https://covid-19-jacobs.herokuapp.com/).
Reads a CSV of federal government consumer complaints and aggregates summary statistics.
Scrapes data on *Animal Crossing* villager popularity, joins it to a Kaggle table of villager traits, and appends the result to a MySQL table using a cron job.
Conducts sentiment analysis on the Harry Potter book series using a naive Bayes classifier and natural language processing.
Anonymizes confidential code in multiple text-based files with a simple find-and-replace script, without having to open each file manually.
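
As a rough illustration of the find-and-replace anonymizer in the last item, here is a minimal sketch. It is not the project's actual script. The `REPLACEMENTS` mapping, the function names `anonymize_text` and `anonymize_files`, and the default file suffixes are all hypothetical, chosen only for this example.

```python
from pathlib import Path

# Hypothetical mapping of confidential terms to neutral placeholders.
REPLACEMENTS = {
    "AcmeCorp": "COMPANY_A",
    "secret_db": "DATABASE_1",
}


def anonymize_text(text, replacements):
    """Return `text` with every key of `replacements` swapped for its value."""
    # Replace longer terms first so a short term never clobbers part of a longer one.
    for old in sorted(replacements, key=len, reverse=True):
        text = text.replace(old, replacements[old])
    return text


def anonymize_files(root, replacements, suffixes=(".txt", ".sql", ".py")):
    """Rewrite matching files under `root` in place and return the paths that changed."""
    changed = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and any file type not listed in `suffixes`.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        original = path.read_text(encoding="utf-8")
        updated = anonymize_text(original, replacements)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling `anonymize_files("path/to/project", REPLACEMENTS)` would rewrite the matching files in place, so it is safer to run it on a copy of the directory. The sketch assumes the files are UTF-8 encoded and that no placeholder contains another confidential term.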