This Node.js project crawls a given web URL, following only links within the same domain.
- npm
- Node.js
- Clone the project and go to the source directory
- npm install
- node index.js (without promises)
- node app_promises.js (with promises library)
- The script visits a URL and collects all the links on the page, then visits each collected link and collects the links found there.
- A throttle limit controls how many URLs the script visits at a time.
- Two arrays track the crawl: one holds the URLs still to visit and the other holds the URLs already visited.
- Once the program has visited all the links in the domain, or is stopped by the user (Ctrl+C), the visited links are written to 'visitedLinks.csv'.
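
The crawl loop described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the names `crawl`, `resolveInDomain`, `fetchLinks`, and `throttleLimit` are all hypothetical, and `fetchLinks` is assumed to be an async function that returns the links found on a page. Here it is replaced by an in-memory stand-in, so the sketch runs without network access.

```javascript
// Hypothetical sketch; names are illustrative and not taken from the project.
const { URL } = require('url');

// Resolve a link against the page it was found on. Return the absolute URL
// if it is on the same host as the start URL, otherwise null.
function resolveInDomain(link, pageUrl, startUrl) {
  try {
    const resolved = new URL(link, pageUrl);
    resolved.hash = ''; // treat /page and /page#top as the same URL
    return resolved.hostname === new URL(startUrl).hostname ? resolved.href : null;
  } catch (err) {
    return null; // skip links that cannot be parsed
  }
}

// fetchLinks(url) is an assumed async function returning an array of links.
async function crawl(startUrl, fetchLinks, throttleLimit = 5) {
  const toVisit = [new URL(startUrl).href]; // URLs waiting to be visited
  const visited = [];                       // URLs already visited

  while (toVisit.length > 0) {
    // Visit at most throttleLimit URLs at a time.
    const batch = toVisit.splice(0, throttleLimit);
    visited.push(...batch);

    // A page that fails to load simply contributes no links.
    const pages = await Promise.all(
      batch.map((url) => Promise.resolve().then(() => fetchLinks(url)).catch(() => []))
    );

    pages.forEach((links, i) => {
      for (const link of links) {
        const resolved = resolveInDomain(link, batch[i], startUrl);
        if (resolved && !visited.includes(resolved) && !toVisit.includes(resolved)) {
          toVisit.push(resolved);
        }
      }
    });
  }
  return visited;
}

// Usage with an in-memory "site" standing in for real HTTP requests.
const site = {
  'http://example.com/': ['/a', '/b', 'http://other.org/x'],
  'http://example.com/a': ['/b', '/c'],
  'http://example.com/b': ['/'],
  'http://example.com/c': [],
};
const fakeFetchLinks = async (url) => site[url] || [];

crawl('http://example.com/', fakeFetchLinks, 2).then((visited) => {
  console.log(visited.join('\n'));
});
```

Plain arrays mirror the two-array description above. For large crawls a `Set` would make the membership checks cheaper, but that is a design choice outside what this README describes.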