Twitter Analytics

Performed ETL (extract, transform, load) on 1 TB of Twitter data using PySpark on AWS.
Processed 200 million tweets into an indexed, sharded MySQL schema.
Developed a scalable Vert.x microservice to retrieve data from the backend.
Implemented an AWS Elastic Load Balancer and Auto Scaling rules to meet a target throughput of 100,000 requests per second.
Tech Stack: AWS, PySpark, MySQL, Vert.x, Java
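
The sharding mentioned above can be illustrated with a minimal sketch of hash-based shard routing. The project's actual shard key, shard count, and naming are not stated here, so `NUM_SHARDS`, `shard_for`, and the `tweets_shard_NN` names are hypothetical, and hashing the tweet ID is an assumption rather than the method actually used.

```python
import hashlib

# Hypothetical shard count; the real deployment's value is not stated.
NUM_SHARDS = 8


def shard_for(tweet_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a tweet ID to a shard index in [0, num_shards).

    A stable hash (MD5 here, used only for distribution, not security)
    keeps the mapping identical across processes and restarts, unlike
    Python's built-in hash(), which is randomized for strings.
    """
    digest = hashlib.md5(str(tweet_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_name(tweet_id: int, num_shards: int = NUM_SHARDS) -> str:
    """Return the hypothetical MySQL database name holding this tweet."""
    return f"tweets_shard_{shard_for(tweet_id, num_shards):02d}"


if __name__ == "__main__":
    for tid in (1001, 1002, 1003):
        print(tid, "->", shard_name(tid))
```

Both the loader and the query service would call the same routing function, so a lookup by tweet ID touches a single shard. Plain modulo routing is simple but remaps most keys when the shard count changes; consistent hashing is the usual alternative if shards are expected to be added later.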