This project is a web scraper for extracting school test results from the NECTA website. The scraper collects data from nested links on a specified page and consolidates the results into an Excel spreadsheet.
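As a rough illustration of what "collecting data from nested links" can look like, the sketch below pulls every link out of an index page and resolves it against the page's address. The `extract_links` helper, the sample HTML, and the `example.org` address are assumptions made for this example; they are not taken from scraper.py, and the real NECTA markup may differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a href> in the HTML, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        absolute = urljoin(base_url, anchor["href"])
        if absolute not in links:  # skip duplicates, keep the first occurrence
            links.append(absolute)
    return links


# Hypothetical index page standing in for the real results listing.
sample_html = """
<html><body>
  <a href="results/s0101.htm">School A</a>
  <a href="results/s0102.htm">School B</a>
  <a href="results/s0101.htm">School A (duplicate)</a>
</body></html>
"""

school_links = extract_links(sample_html, "https://example.org/2023/index.htm")
print(school_links)
```

In a live run the HTML would typically come from something like `requests.get(url).text`, and each returned link would then be fetched and parsed in turn.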
Ensure you have Python installed. You will also need to install the following Python libraries:
- requests
- beautifulsoup4
- pandas
- openpyxl
You can install these libraries using pip:
pip install requests beautifulsoup4 pandas openpyxl
- Clone the Repository
  git clone https://github.com/yourusername/school-test-results-scraper.git
  cd school-test-results-scraper
- Edit the Script (if necessary)
  Update the url variable in scraper.py if you want to scrape a different page.
- Run the Script
  python scraper.py
- Check the Output
  The results will be saved in an Excel file named school_test_results.xlsx in the same directory.
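One way to check the output step is to read the workbook back with pandas and inspect its shape and column names. The sketch below is only an illustration: the rows and column names are made up, and it writes to an in-memory buffer so it can run without touching the disk. The script itself would use the school_test_results.xlsx path instead.

```python
import io

import pandas as pd

# Hypothetical rows standing in for scraped results; the real columns may differ.
rows = [
    {"school": "School A", "candidates": 120, "passed": 110},
    {"school": "School B", "candidates": 95, "passed": 90},
]
results = pd.DataFrame(rows)

# Write the workbook to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
results.to_excel(buffer, index=False, engine="openpyxl")

# Read the workbook back and run a few sanity checks on it.
buffer.seek(0)
loaded = pd.read_excel(buffer, engine="openpyxl")
print(loaded.shape)
print(list(loaded.columns))
```

To inspect the real file, replacing the buffer with `pd.read_excel("school_test_results.xlsx")` should be enough, assuming the script has already been run in the current directory.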
- Ensure the structure of the scraped data matches the defined schema. Adjust the columns list if necessary.
- This script currently assumes that the second table on each page contains the desired data. Modify the script as needed for different structures.
This project is licensed under the MIT License.