CricTwee - Tweet Analysis for the Game of Cricket

The project focused on analyzing filtered tweets to carry out the following NLP tasks -

  • Sentiment analysis of tweets into five classes (unpleasant, sad, neutral, happy, ecstatic) and evaluation of the results - Naive Bayes classification
  • Named Entity Recognition for players, teams, venues and locations - gazetteers
  • Clustering the tweets based on cosine similarity - TF-IDF, cosine similarity, k-means
  • Summarizing tweets based on higher relevance scores for events and peak moments in the game - k-means++ and gazetteers
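
As a rough illustration of the clustering task above, the following is a minimal sketch in plain Python rather than the project's actual code. It builds L2-normalized TF-IDF vectors, uses their dot product as cosine similarity, and runs a simple k-means that assigns each tweet to its most similar centroid. The function names, the sample tweets, the smoothed IDF formula and the random seeding (used here in place of k-means++) are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    # Returns one sparse, L2-normalized TF-IDF vector (a dict) per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption; other TF-IDF variants exist).
        vec = {t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    # Both inputs are unit length, so the dot product is the cosine similarity.
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans(vectors, k, iterations=20, seed=0):
    # k-means on unit vectors: assign by highest cosine, then re-average.
    rng = random.Random(seed)
    centroids = [dict(v) for v in rng.sample(vectors, k)]  # random seeding, not k-means++
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                  for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            total = {}
            for v in members:
                for t, w in v.items():
                    total[t] = total.get(t, 0.0) + w
            norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
            centroids[j] = {t: w / norm for t, w in total.items()}
    return labels


# Hypothetical sample tweets, for illustration only.
tweets = [
    "What a six by Kohli!",
    "Kohli smashes another six",
    "Rain stops play at the stadium",
    "Play stopped due to rain",
]
print(kmeans(tfidf_vectors(tweets), k=2))
```

With only a handful of tweets the grouping depends heavily on the initial centroids, which is one reason k-means++ seeding is a common choice.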

Dataset - Used Tweepy to capture the live feed, manually annotated around 1000 tweets, and defined gazetteers for NER and events
Technologies/Platforms - Python, HTML, CSS, D3.js, JavaScript
Tools - Tweepy, NLTK, k-means
Results (clockwise) - Sentiment Analysis, Named Entity Recognition, Clustering, Summarization
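
For the gazetteer-based NER mentioned under Dataset, a lookup of the following kind is one plausible approach. It is a minimal sketch, not the project's code: the gazetteer entries, the labels and the function names are hypothetical, and the rule of preferring the longer match when two entries overlap is an assumption.

```python
import re

# Hypothetical gazetteers; the project's actual lists are not shown here.
GAZETTEERS = {
    "PLAYER": ["Virat Kohli", "MS Dhoni"],
    "TEAM": ["Mumbai Indians", "Chennai Super Kings"],
    "VENUE": ["Wankhede Stadium", "Eden Gardens"],
    "LOCATION": ["Mumbai", "Kolkata"],
}


def build_patterns(gazetteers):
    # One case-insensitive, whole-word regex per entity label.
    patterns = {}
    for label, names in gazetteers.items():
        # Longest names first so longer entries are tried before shorter ones.
        ordered = sorted(names, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in ordered)
        patterns[label] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return patterns


def tag_entities(text, patterns):
    # Collect every gazetteer hit, then drop hits that overlap an earlier, longer one.
    candidates = []
    for label, pattern in patterns.items():
        for match in pattern.finditer(text):
            candidates.append((match.start(), match.end(), match.group(0), label))
    candidates.sort(key=lambda c: (c[0], -(c[1] - c[0])))
    entities, last_end = [], 0
    for start, end, surface, label in candidates:
        if start >= last_end:
            entities.append((surface, label))
            last_end = end
    return entities


patterns = build_patterns(GAZETTEERS)
print(tag_entities("Virat Kohli smashes Mumbai Indians at Wankhede Stadium in Mumbai", patterns))
```

Resolving overlaps this way keeps "Mumbai Indians" as a team instead of also tagging the "Mumbai" inside it as a location.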

Duration - Apr 2015 - May 2015
Team Members - 2
Links -