stemming-porters

Here are 27 public repositories matching this topic...

Data consists of tweets scraped using the Twitter API. The objective is sentiment labelling using a lexicon approach, with text pre-processing (language detection, tokenisation, normalisation, vectorisation), pipelines for text classification models for sentiment analysis, and explainability of the final classifier.

  • Updated Apr 3, 2022
  • Jupyter Notebook
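
The lexicon approach named in the description above can be sketched roughly as follows. This is a dependency-free illustration, not the repository's code: the `LEXICON` dictionary, its scores, and the `tokenize` and `label` helpers are all hypothetical, and a real project would load a published sentiment lexicon instead.

```python
import re

# Hypothetical toy lexicon (assumption): word -> sentiment score.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}


def tokenize(text):
    """Normalise to lowercase and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def label(text):
    """Sum lexicon scores over the tokens and map the total to a label."""
    score = sum(LEXICON.get(tok, 0) for tok in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["I LOVE this, great phone!", "awful service, I hate it"]:
        print(tweet, "->", label(tweet))
```

Labels produced this way could then serve as training targets for a classifier, which seems to be the workflow the description outlines.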

In this project I preprocess the movie dataset into a more convenient format, transform it into feature vectors with CountVectorizer, and then assess word relevancy via term frequency-inverse document frequency (TF-IDF). I also tried a hashing vectorizer for memory efficiency on large data, together with an SGD classifier.

  • Updated Apr 24, 2026
  • Jupyter Notebook

Implemented a content-based recommender using CountVectorizer on 4.8K+ TMDB movies, leveraging 5K-dimensional sparse vectors and selecting cosine similarity over Euclidean distance for NLP relevance.

  • Updated Jan 17, 2026
  • Jupyter Notebook
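
One plausible reason for preferring cosine similarity over Euclidean distance on count vectors is that cosine ignores document length. The sketch below uses `collections.Counter` as a stand-in for sparse count vectors; the helper names and sample texts are hypothetical and not taken from the repository.

```python
import math
from collections import Counter


def count_vector(text):
    """Whitespace-tokenised word counts, a stand-in for a sparse count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean(a, b):
    """Euclidean distance over the union of terms; missing terms count as 0."""
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in a.keys() | b.keys()))


if __name__ == "__main__":
    short = count_vector("space pirate adventure")
    long = count_vector("space pirate adventure " * 3)
    print(cosine(short, long), euclidean(short, long))
```

Here the longer text repeats the same words three times: the two vectors point in the same direction, so cosine similarity is essentially 1, while the Euclidean distance grows with the length difference.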
