Wikipedia Search Engine

A search engine for searching Wikipedia XML dumps.

Requirements

Python 3 and the NLTK library are required to run the search engine.

Creation of Inverted Index

To create the inverted index, run the following command.

./index.sh <path_to_wiki_dump_file> <path_to_index_folder>

Both arguments must be absolute paths: the first to the Wikipedia XML dump file, the second to the folder where the inverted index will be created and stored.
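For readers unfamiliar with the term, an inverted index maps each term to the documents that contain it. The following is a minimal illustrative sketch in Python, not the code that `index.sh` runs: the `tokenize` and `build_inverted_index` helpers, the regex-based tokenization, and the toy documents are all assumptions made for the example. A real indexer over a Wikipedia dump would also need XML parsing and on-disk storage, which are omitted here.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to a dict of {doc_id: term_frequency} (hypothetical helper)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return dict(index)


# Toy in-memory documents standing in for Wikipedia pages.
docs = {
    1: "Python is a programming language",
    2: "The python is a large snake",
    3: "Wikipedia is a free encyclopedia",
}
index = build_inverted_index(docs)
print(index["python"])
```

Looking up a term in `index` returns the IDs of the documents containing it along with how often it occurs in each, which is the information a ranking step needs at query time.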

Querying

To search, run the following command.

./search.sh <path_to_index_folder>

Enter queries one at a time. Each query returns the top 10 ranked results.
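This README does not describe how results are ranked. As a rough illustration of how a top-10 lookup over an inverted index can work, the sketch below scores documents with a simple TF-IDF sum and keeps the best `k`. The `search` function, the scoring formula, and the toy index are assumptions for the example and may differ from what `search.sh` does.

```python
import heapq
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(index, num_docs, query, k=10):
    """Return up to k (doc_id, score) pairs, highest score first.

    Scoring is an assumed TF-IDF sum: tf * log(num_docs / document_frequency).
    """
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # Terms absent from the index contribute nothing.
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Toy index in the form {term: {doc_id: term_frequency}}.
index = {
    "python": {1: 2, 2: 1},
    "snake": {2: 1},
    "language": {1: 1},
}
results = search(index, num_docs=3, query="python snake")
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

In this toy data, document 2 ranks above document 1 for the query "python snake" because it matches the rarer term "snake" in addition to "python".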