GitHub - EstebanMqz/Web-scraper: HTML retrieval over HTTP GET requests for scraping dynamic, client-side web pages, as an alternative to traditional static caching methods.

Formatted / indexed web scraper

A web-scraping tool that extracts page metadata using HTTP GET requests.
It retrieves the complete markup so that element attributes and structure can be inspected at the granularity of the code itself.
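As a rough illustration of metadata extraction over an HTTP GET, the sketch below fetches a page and lists its `<meta>` tags. It is not taken from html-extractor.sh: the function names `fetch_html` and `list_meta_tags` are hypothetical, and it assumes `curl` and `grep` are installed.

```shell
#!/usr/bin/env bash
# Sketch only: fetch_html and list_meta_tags are hypothetical helper names,
# not functions taken from html-extractor.sh.

# Fetch a page with a plain HTTP GET (assumes curl is installed).
# -f fails on HTTP errors, -s silences progress, -S still shows errors,
# -L follows redirects.
fetch_html() {
  curl -fsSL "$1"
}

# Print every <meta ...> tag found on standard input, one per line.
# The match is case-insensitive, so <META ...> is listed too.
list_meta_tags() {
  grep -oi '<meta[^>]*>'
}
```

A possible use would be `fetch_html https://estebanmqz.github.io/EstebanMqz/html/Resume.html | list_meta_tags`. A pattern match like this is only a quick inspection aid; it does not parse HTML and will miss tags that span several lines.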

The technique differs from those provided by the browser's web-development tools:

View-source (Ctrl+U) → HTML under the view-source: prefix.
Raw-code inspection for search engines.
Inspect Element (Ctrl+Shift+I) → Attribute inspection.

The traditional web-development methods (View-source and Save as: Complete HTML, Single HTML, HTML only) generally provide unreliable or incomplete information from websites, particularly those that rely on dynamic, client-side scripts.

Usage:

.sh

Terminal

$ ./html-extractor.sh
Enter a URL: https://estebanmqz.github.io/EstebanMqz/html/Resume.html
Do you want to extract the raw code to a temporary file? (Y/N): Y
Enter a filename to save the raw code: Resume
Raw code extracted to Resume.html opening..
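The prompt flow in the session above could be reproduced along the following lines. This is a sketch under stated assumptions, not the actual contents of html-extractor.sh: every function name is hypothetical, `curl` is assumed for the GET request, `xdg-open` or `open` is assumed for the final "opening.." step, and the behavior on an N answer is a guess.

```shell
#!/usr/bin/env bash
# Sketch of the prompt flow shown in the session above. Not the actual
# contents of html-extractor.sh; all function names are hypothetical.

# Return success when the answer is Y or y.
is_yes() {
  case "$1" in
    [Yy]) return 0 ;;
    *)    return 1 ;;
  esac
}

# Append .html unless the name already ends with it (Resume -> Resume.html).
output_name() {
  case "$1" in
    *.html) printf '%s\n' "$1" ;;
    *)      printf '%s\n' "$1.html" ;;
  esac
}

# Open a file with the desktop's default handler, if one is available
# (assumption: xdg-open on Linux, open on macOS).
open_file() {
  if command -v xdg-open >/dev/null 2>&1; then
    xdg-open "$1"
  elif command -v open >/dev/null 2>&1; then
    open "$1"
  fi
}

main() {
  local url answer name target
  printf 'Enter a URL: '
  read -r url
  printf 'Do you want to extract the raw code to a temporary file? (Y/N): '
  read -r answer
  if is_yes "$answer"; then
    printf 'Enter a filename to save the raw code: '
    read -r name
    target=$(output_name "$name")
    # Plain HTTP GET; -f fails on HTTP errors, -L follows redirects.
    curl -fsSL "$url" -o "$target" || return 1
    echo "Raw code extracted to $target opening.."
    open_file "$target"
  else
    # Assumption: with N the markup is simply printed to the terminal.
    curl -fsSL "$url"
  fi
}
```

The block only defines functions; adding a final `main` line would start the interactive session. Keeping `is_yes` and `output_name` separate from the prompts makes the filename and answer handling easy to check without any network access.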

Repository files (4 commits):

.gitignore
LICENSE
README.md
Resume.html
html-extractor.sh
