This project implements a simple search engine with three stages: a web crawler, an indexer that builds the inverted index, and a query processor that serves results ranked by TF-IDF cosine similarity. All three stages support multi-threading.
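As a rough illustration of the ranking idea, here is a minimal single-threaded sketch in Python. It is not the project's actual code (the indexer is written in Java and the query processor in Node.js). The function names `build_index` and `rank`, the whitespace tokenizer, and the weighting scheme (raw term frequency times `log(N / df)`) are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}.

    `docs` is a hypothetical {doc_id: text} mapping; tokenization is a
    naive lowercase whitespace split (an assumption for this sketch).
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def rank(query, index, n_docs):
    """Return (doc_id, score) pairs sorted by TF-IDF cosine similarity."""
    # Inverse document frequency; log(N / df) is one common variant.
    idf = {term: math.log(n_docs / len(postings))
           for term, postings in index.items()}

    # Squared Euclidean norm of every document's TF-IDF vector.
    sq_norms = defaultdict(float)
    for term, postings in index.items():
        for doc_id, tf in postings.items():
            sq_norms[doc_id] += (tf * idf[term]) ** 2

    # TF-IDF weights of the query, ignoring terms missing from the index.
    q_weights = {term: tf * idf[term]
                 for term, tf in Counter(query.lower().split()).items()
                 if term in index}
    q_norm = math.sqrt(sum(w * w for w in q_weights.values()))

    # Accumulate dot products by walking only the relevant posting lists.
    dots = defaultdict(float)
    for term, q_w in q_weights.items():
        for doc_id, tf in index[term].items():
            dots[doc_id] += q_w * tf * idf[term]

    scores = {doc_id: dot / (math.sqrt(sq_norms[doc_id]) * q_norm)
              for doc_id, dot in dots.items()
              if q_norm > 0 and sq_norms[doc_id] > 0}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "a": "cats chase mice",
        "b": "dogs chase cats",
        "c": "mice eat cheese",
    }
    index = build_index(docs)
    for doc_id, score in rank("cheese mice", index, len(docs)):
        print(doc_id, round(score, 3))
```

Because scoring walks only the posting lists of the query terms, documents that share no term with the query are never touched and do not appear in the results.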
- Web crawler: Python 3.9 and Anaconda
- Indexer: Java 8 and Gradle v6.7
- Query processor: Node.js v15.8
- Database: MongoDB v4.4
Alternatively, you can run every component with Docker and Docker Compose:
$ docker-compose build
$ docker-compose up
- Konstantinos Papakostas (papakosk@csd.auth.gr)
- Christina Kreza (chriskreza@csd.auth.gr)