An information retrieval system using ranked retrieval, coded from scratch in Python
Updated May 22, 2020 - Jupyter Notebook
Hybrid RecSys, CF-based RecSys, Model-based RecSys, Content-based RecSys, Finding similar items using Jaccard similarity
A web-based, domain-specific search engine in Python
TF-IDF scores and visualizations for documents produced over time
First story detection using shingling, LSH and graphical methods
This was an HTML web scraping project using Python libraries. The objective was to extract user comments from the "mac power user" forum, cleanse the data, tokenize the comments, and classify and store the words in a dataframe.
Predicted the geolocation of 80,000 tweets based solely on their content by finding Location Indicative Words, achieving 74% accuracy
Calculate the TF-IDF score using parallel algorithms
This is an NLP-based project, completed during Fall 2020 for CSE 4022 - Natural Language Processing. Nepali Text Summarizer is built around the ideas of TF-IDF and cosine similarity.
TF-IDF (Term Frequency-Inverse Document Frequency) is a way to score the importance of words (or 'terms') based on how frequently they appear
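As a rough illustration of the scoring idea described above, here is a minimal sketch in plain Python. It is not taken from any of the listed repositories. It assumes one common variant: term frequency is the count divided by the document length, and inverse document frequency is `log(N / df)` with no smoothing. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed variant (many others exist):
      tf  = count of term in document / document length
      idf = log(number of documents / documents containing the term)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical sample corpus, already tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
scores = tf_idf(docs)
```

With this variant, a term that occurs in every document (here "the") scores zero, while a term unique to one document (here "mat") scores highest within that document. Production libraries usually add smoothing and normalization, so their numbers will differ from this sketch.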
A complete search engine experience built on top of a 75 GB Wikipedia corpus with subsecond search latency. Results contain wiki pages ordered by TF-IDF relevance to the given search words. From optimized code to the k-way merge sort algorithm, this project addresses latency, indexing, and big data challenges.
A crowdsourced search engine that returns details about professors for various types of search queries.