by 3.14rates
Google Challenge: Develop a local search engine that quickly searches a dataset of around 150,000 news articles for a set of keywords. Example keywords are "work desk", "presidential election", "Olympic closing ceremony", and "documentary". For each query, the search engine should output a ranked list of the 5 articles that best match the given keywords, along with the time the query took to execute.
Our solution mainly consists of preprocessing the input data (tokenization, lemmatization) and applying ranking functions such as TF-IDF and BM25. We were awarded an honorable mention (TOP-3) at the hackathon.
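To make the approach concrete, here is a minimal sketch of BM25 ranking over an inverted index, written with only the Python standard library. It is an illustration under stated assumptions, not our hackathon code: the names `tokenize` and `BM25Index` are hypothetical, the parameters `k1=1.5` and `b=0.75` are commonly used defaults rather than tuned values, and lemmatization is left out (the tokenizer only lowercases and splits on non-alphanumeric characters).

```python
import math
import re
import time
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters.

    Assumption: a real pipeline would also lemmatize tokens here.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Index:
    """Hypothetical in-memory BM25 index (illustrative sketch only)."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1 = k1  # term-frequency saturation (assumed default)
        self.b = b    # document-length normalization (assumed default)
        # Per-document term frequencies and lengths, computed once up front.
        self.doc_tfs = [Counter(tokenize(d)) for d in docs]
        self.doc_lens = [sum(tf.values()) for tf in self.doc_tfs]
        self.avg_len = sum(self.doc_lens) / len(docs) if docs else 0.0
        # Inverted index: term -> list of ids of documents containing it.
        self.postings = {}
        for doc_id, tf in enumerate(self.doc_tfs):
            for term in tf:
                self.postings.setdefault(term, []).append(doc_id)
        # Smoothed IDF that stays positive even for very common terms.
        n = len(docs)
        self.idf = {
            term: math.log(1 + (n - len(ids) + 0.5) / (len(ids) + 0.5))
            for term, ids in self.postings.items()
        }

    def search(self, query, top_k=5):
        """Return (list of (doc_id, score) pairs, elapsed seconds)."""
        start = time.perf_counter()
        scores = Counter()
        for term in set(tokenize(query)):
            if term not in self.postings:
                continue  # term never occurs in the corpus
            for doc_id in self.postings[term]:
                f = self.doc_tfs[doc_id][term]
                norm = 1 - self.b + self.b * self.doc_lens[doc_id] / self.avg_len
                scores[doc_id] += self.idf[term] * f * (self.k1 + 1) / (f + self.k1 * norm)
        ranked = scores.most_common(top_k)
        return ranked, time.perf_counter() - start
```

Calling `BM25Index(articles).search("presidential election")` returns up to five `(doc_id, score)` pairs, best match first, together with the query time in seconds. Because only documents in the postings lists of the query terms are scored, query time depends on how common the keywords are rather than on the full corpus size.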