MuhammadKhosravi/Crawler

Crawler

This repository contains my solution to HW3 of Modern Information Retrieval, Fall 2022.

scraper.py implements a scraper that collects news from the Hamshahri website within a specified interval. The results are stored in dataset.csv.
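
As a rough illustration of the filtering and storage step only (not the actual code in scraper.py), the sketch below keeps records whose date falls inside an interval and writes them as CSV. It assumes the interval is a date range; the column names `date`, `title`, `body`, the helper names, and the sample records are all hypothetical, and no pages are fetched.

```python
import csv
import io
from datetime import date

# Hypothetical column names; the real dataset.csv schema is not documented here.
FIELDS = ["date", "title", "body"]


def in_interval(day, start, end):
    """Return True if `day` falls inside the inclusive interval [start, end]."""
    return start <= day <= end


def write_dataset(articles, start, end, out):
    """Write articles dated within [start, end] to `out` as CSV; return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for article in articles:
        if in_interval(date.fromisoformat(article["date"]), start, end):
            writer.writerow({name: article[name] for name in FIELDS})
            count += 1
    return count


# Made-up sample records standing in for scraped pages.
SAMPLE = [
    {"date": "2022-10-01", "title": "A", "body": "first"},
    {"date": "2022-11-15", "title": "B", "body": "second"},
    {"date": "2023-01-02", "title": "C", "body": "third"},
]

if __name__ == "__main__":
    buffer = io.StringIO()
    kept = write_dataset(SAMPLE, date(2022, 10, 1), date(2022, 12, 31), buffer)
    print(kept)
    print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, one would pass a handle opened with `open("dataset.csv", "w", newline="", encoding="utf-8")`.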

Indexing.ipynb is a Jupyter notebook that stores the data from dataset.csv in an Elasticsearch index.
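
One possible way to prepare CSV rows for indexing is sketched below; it is not taken from the notebook. It only builds action dictionaries in the shape accepted by the bulk helper of the official Python client, so it runs without a cluster. The index name `hamshahri-news`, the column names, and the use of the row number as document id are assumptions.

```python
import csv
import io

INDEX_NAME = "hamshahri-news"  # hypothetical index name


def bulk_actions(csv_file, index_name=INDEX_NAME):
    """Yield one bulk-index action per CSV row, using the row number as document id."""
    for row_number, row in enumerate(csv.DictReader(csv_file)):
        yield {"_index": index_name, "_id": row_number, "_source": dict(row)}


# Made-up sample data with assumed column names.
SAMPLE_CSV = "date,title,body\n2022-10-01,A,first\n2022-11-15,B,second\n"

if __name__ == "__main__":
    for action in bulk_actions(io.StringIO(SAMPLE_CSV)):
        print(action)
    # With a running cluster and the elasticsearch package installed,
    # the same actions could be sent like this (URL is an assumption):
    #   from elasticsearch import Elasticsearch, helpers
    #   client = Elasticsearch("http://localhost:9200")
    #   with open("dataset.csv", newline="", encoding="utf-8") as f:
    #       helpers.bulk(client, bulk_actions(f))
```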

Query.ipynb is a Jupyter notebook in which I implemented several retrieval methods, including boolean, tf-idf, and fastText, as well as storing in Elasticsearch.
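
To make the boolean and tf-idf methods concrete, here is a minimal pure-Python sketch on a toy corpus. It is an illustration under simplifying assumptions (whitespace tokenization, raw term frequency, `log(N / df)` as idf, AND-only boolean queries) and is not necessarily how the notebook implements them; the documents and function names are made up.

```python
import math
from collections import Counter, defaultdict

# Toy corpus; the ids and texts are invented for illustration.
DOCS = {
    0: "oil prices rise in tehran",
    1: "tehran football club wins",
    2: "oil exports and oil prices",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def boolean_and(index, query):
    """Return ids of documents that contain every query term."""
    postings = [index.get(term, set()) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()


def tfidf_rank(docs, index, query):
    """Rank documents by the sum of tf * idf over the query terms."""
    n_docs = len(docs)
    scores = Counter()
    for term in tokenize(query):
        matching = index.get(term, set())
        if not matching:
            continue  # a term absent from the corpus contributes nothing
        idf = math.log(n_docs / len(matching))
        for doc_id in matching:
            tf = tokenize(docs[doc_id]).count(term)
            scores[doc_id] += tf * idf
    return [doc_id for doc_id, _ in scores.most_common()]


if __name__ == "__main__":
    index = build_index(DOCS)
    print(boolean_and(index, "oil tehran"))
    print(tfidf_rank(DOCS, index, "oil prices"))
```

Boolean retrieval returns an unordered set of matches, while the tf-idf variant orders documents so that one mentioning a query term more often ranks higher.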
