4 posts tagged with "Scala"

NYC Taxi Data Analysis (May 3, 2020)

This project analyzes NYC taxi data: approximately 1.4 billion taxi rides between 2009 and 2016 (about 400 GB as uncompressed CSV, or about 35 GB as Snappy-compressed Parquet). The analysis covers the busiest pickup and drop-off zones, peak hours for taxis, trip distribution, peak hours for trips, the top 3 pickups and drop-offs, how people pay, how payment types evolved over time, and ride-sharing opportunities.

Spark
Scala
SBT
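As an illustration of one of the analyses listed above (peak hours), here is a minimal sketch using plain Scala collections rather than Spark. The `Trip` record and its field names are assumptions made for this example; they are not the project's actual schema, and the project's own Spark code is not shown here.

```scala
import java.time.LocalDateTime

object PeakHours {
  // Hypothetical, simplified trip record; the real dataset has many more columns.
  final case class Trip(pickupTime: LocalDateTime, paymentType: String)

  // Count trips per pickup hour (0-23), busiest hour first; ties are ordered by hour.
  def tripsPerHour(trips: Seq[Trip]): Seq[(Int, Int)] =
    trips
      .groupBy(_.pickupTime.getHour)
      .map { case (hour, ts) => hour -> ts.size }
      .toSeq
      .sortBy { case (hour, count) => (-count, hour) }
}
```

The same group-and-count shape would apply to payment types by grouping on `paymentType` instead of the pickup hour.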
Click Stream Analysis (Mar 28, 2020)

This project demonstrates aggregating data over a rolling window of events (not necessarily a time-based window). Click streams are captured from a web page hosted on an Akka HTTP web server. The click events are sent to Kafka and read by a Spark Structured Streaming application, which performs the aggregations.

Spark Structured Streaming
Scala
Kafka
Akka HTTP
SBT
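The idea of a rolling window defined by event count rather than by time, as described for the click stream project, can be sketched with plain Scala collections. This is only an illustration: the `Click` type, its fields, and the per-page count are assumptions for the example, and the project itself does this inside Spark Structured Streaming, which is not reproduced here.

```scala
object RollingWindow {
  // Hypothetical click event; the fields are illustrative only.
  final case class Click(userId: String, page: String)

  // For each window of `size` consecutive events, advancing by `step` events,
  // count the clicks per page. The final window may hold fewer than `size` events.
  def clicksPerPage(events: Seq[Click], size: Int, step: Int): Seq[Map[String, Int]] =
    events
      .sliding(size, step)
      .map(window => window.groupBy(_.page).map { case (page, cs) => page -> cs.size })
      .toSeq
}
```

With `size = 2` and `step = 1`, each new click produces a window covering that click and the one before it, regardless of how much time passed between them.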
Tweets Analysis (Mar 22, 2020)

This project analyzes tweets to draw various insights, namely average tweet length, the most popular hashtag, sentiment analysis of tweets, and sentiment analysis of COVID-related tweets.

Spark Streaming
Scala
SBT
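Two of the tweet insights mentioned above, average tweet length and most popular hashtag, can be sketched over an in-memory list of tweet texts. This is a simplified, assumed version for illustration: it uses plain Scala collections instead of Spark Streaming, treats a hashtag as `#` followed by ASCII word characters, and does not attempt sentiment analysis.

```scala
object TweetStats {
  // Hashtag pattern assumed for this sketch: '#' followed by ASCII word characters.
  private val Hashtag = "#\\w+".r

  // Average tweet length in characters; 0.0 when there are no tweets.
  def averageLength(tweets: Seq[String]): Double =
    if (tweets.isEmpty) 0.0 else tweets.map(_.length).sum.toDouble / tweets.size

  // Most frequent hashtag, compared case-insensitively; None when no hashtag occurs.
  // If several hashtags tie for first place, which one is returned is unspecified.
  def mostPopularHashtag(tweets: Seq[String]): Option[String] = {
    val counts = tweets
      .flatMap(t => Hashtag.findAllIn(t).toList)
      .map(_.toLowerCase)
      .groupBy(identity)
      .map { case (tag, occurrences) => tag -> occurrences.size }
    if (counts.isEmpty) None else Some(counts.maxBy(_._2)._1)
  }
}
```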
File System (May 1, 2019)

This project emulates an immutable file system in Scala. It supports the mkdir, ls, pwd, touch, cd, rm, and echo commands.

Scala
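To make "immutable file system" concrete, here is a minimal sketch of the general idea: every command returns a new directory value instead of modifying the existing one. All names here (`Entry`, `File`, `Directory`, and the helper functions) are assumptions for illustration and are not taken from the project. The sketch covers only a single directory level, so path handling, cd, pwd, and echo are left out.

```scala
object ImmutableFs {
  sealed trait Entry { def name: String }
  final case class File(name: String, contents: String = "") extends Entry
  final case class Directory(name: String, children: List[Entry] = Nil) extends Entry {
    // Returns a new Directory with the entry appended; `this` is never modified.
    // If an entry with the same name already exists, the directory is returned unchanged.
    def add(entry: Entry): Directory =
      if (children.exists(_.name == entry.name)) this
      else copy(children = children :+ entry)

    // Returns a new Directory without the named entry.
    def remove(entryName: String): Directory =
      copy(children = children.filterNot(_.name == entryName))

    // Names of the direct children, in insertion order.
    def ls: List[String] = children.map(_.name)
  }

  def mkdir(dir: Directory, name: String): Directory = dir.add(Directory(name))
  def touch(dir: Directory, name: String): Directory = dir.add(File(name))
  def rm(dir: Directory, name: String): Directory = dir.remove(name)
}
```

Because each command yields a new value, an earlier state such as the empty root remains valid and unchanged after later commands run.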