Streaming Twitter Sentiment Analysis with AWS & PySpark

About this project

The goal of this project was to familiarize myself with AWS tools and processes (especially EC2, S3, Athena, and Firehose), with large, unwieldy, dirty datasets, and with distributed computing software like Databricks, by building a streaming pipeline and dashboard. When operational, it provides a live feed of tweets along with some general sentiment information about a word of my choosing (for this example, I was interested in the sentiment around the word "racism").


