Web Search Engine developed in Java
Features Used
- Web Crawler
- Pattern Matching
- HTML to Text
- Word Search
- Word Suggestion
- Page Ranking
Concepts Used (so far):
- KMP algorithm -- for efficient word searching
- Edit Distance -- for suggesting alternative words
- HTML to Text -- to convert HTML files to text files
- jsoup -- Java library to fetch URLs and extract data
- Merge Sort -- for ranking web pages by number of word occurrences
- Hashtable -- for indexing the files
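
Example: KMP word search (sketch)

The snippet below is a minimal sketch of how KMP can find every occurrence of a word in page text without re-scanning characters. The class and method names (`KmpSearch`, `buildLps`, `findAll`) are illustrative assumptions and are not taken from this project's source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names are assumptions for illustration only.
class KmpSearch {

    // lps[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] buildLps(String pattern) {
        int[] lps = new int[pattern.length()];
        int len = 0;
        int i = 1;
        while (i < pattern.length()) {
            if (pattern.charAt(i) == pattern.charAt(len)) {
                lps[i++] = ++len;
            } else if (len > 0) {
                len = lps[len - 1]; // fall back to a shorter border
            } else {
                lps[i++] = 0;
            }
        }
        return lps;
    }

    // Returns the start index of every (possibly overlapping) match.
    static List<Integer> findAll(String text, String pattern) {
        List<Integer> hits = new ArrayList<>();
        if (pattern.isEmpty()) {
            return hits;
        }
        int[] lps = buildLps(pattern);
        int j = 0; // number of pattern characters matched so far
        for (int i = 0; i < text.length(); i++) {
            while (j > 0 && text.charAt(i) != pattern.charAt(j)) {
                j = lps[j - 1];
            }
            if (text.charAt(i) == pattern.charAt(j)) {
                j++;
            }
            if (j == pattern.length()) {
                hits.add(i - j + 1);
                j = lps[j - 1]; // keep going to catch overlapping matches
            }
        }
        return hits;
    }
}
```

The size of the list returned by `findAll` gives the occurrence count of a word on a page, which is the kind of number a ranking step can use.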
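
Example: Edit Distance word suggestion (sketch)

To illustrate how Edit Distance can drive word suggestion, here is a small sketch that computes the Levenshtein distance between two words and picks the closest entry from a vocabulary. The names `WordSuggester`, `distance`, and `suggest`, as well as the idea of passing the vocabulary as a list, are assumptions made for this example.

```java
import java.util.List;

// Hypothetical helper class; names are assumptions for illustration only.
class WordSuggester {

    // Levenshtein distance: minimum number of single-character
    // insertions, deletions, or substitutions to turn a into b.
    static int distance(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            dp[i][0] = i; // delete all i characters
        }
        for (int j = 0; j <= b.length(); j++) {
            dp[0][j] = j; // insert all j characters
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                dp[i][j] = Math.min(
                        Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1),
                        dp[i - 1][j - 1] + cost);
            }
        }
        return dp[a.length()][b.length()];
    }

    // Returns the vocabulary word closest to the query,
    // or null if the vocabulary is empty.
    static String suggest(String query, List<String> vocabulary) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String word : vocabulary) {
            int d = distance(query, word);
            if (d < bestDistance) {
                bestDistance = d;
                best = word;
            }
        }
        return best;
    }
}
```

For instance, `WordSuggester.suggest("serch", Arrays.asList("server", "search", "march"))` would return `"search"`, since it is one insertion away from the query.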
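
Example: ranking pages with a Hashtable and Merge Sort (sketch)

The following sketch shows one way the Hashtable and Merge Sort items could fit together: a `Hashtable` maps each page name to the number of times the search word occurs in it, and a hand-written merge sort orders the pages from most to fewest occurrences. The data layout and the names `OccurrenceRanker`, `mergeSort`, and `rank` are assumptions for illustration, not the project's actual design.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names and data layout are assumptions.
class OccurrenceRanker {

    // Classic top-down merge sort, ordering entries by count, highest first.
    static List<Map.Entry<String, Integer>> mergeSort(List<Map.Entry<String, Integer>> items) {
        if (items.size() <= 1) {
            return items;
        }
        int mid = items.size() / 2;
        List<Map.Entry<String, Integer>> left =
                mergeSort(new ArrayList<Map.Entry<String, Integer>>(items.subList(0, mid)));
        List<Map.Entry<String, Integer>> right =
                mergeSort(new ArrayList<Map.Entry<String, Integer>>(items.subList(mid, items.size())));

        List<Map.Entry<String, Integer>> merged = new ArrayList<>(items.size());
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            // Take the entry with the larger occurrence count first.
            if (left.get(i).getValue() >= right.get(j).getValue()) {
                merged.add(left.get(i++));
            } else {
                merged.add(right.get(j++));
            }
        }
        while (i < left.size()) {
            merged.add(left.get(i++));
        }
        while (j < right.size()) {
            merged.add(right.get(j++));
        }
        return merged;
    }

    // Returns page names ordered by descending occurrence count.
    static List<String> rank(Hashtable<String, Integer> occurrences) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<Map.Entry<String, Integer>>(occurrences.entrySet());
        List<String> pages = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : mergeSort(entries)) {
            pages.add(entry.getKey());
        }
        return pages;
    }
}
```

Pages with equal counts come out in whatever order the `Hashtable` iterates them, so a real ranking would likely need a tie-breaking rule.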