From fb09589b4c3f911b9f3585868a663d7dd6847da8 Mon Sep 17 00:00:00 2001
From: michaelglenister
Date: Tue, 30 Jan 2024 15:25:19 +0200
Subject: [PATCH] Update readme

---
 .../south_africa/data/members-interests/NEW_README.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/pombola/south_africa/data/members-interests/NEW_README.md b/pombola/south_africa/data/members-interests/NEW_README.md
index 9ef89bbfa..05b7dfe98 100644
--- a/pombola/south_africa/data/members-interests/NEW_README.md
+++ b/pombola/south_africa/data/members-interests/NEW_README.md
@@ -11,8 +11,17 @@ To prepare the file:
 2. Open the files in Google Docs and download each in `.docx` format
 3. Store the these files in `./docx_files/`
 
+Create an environment and install dependencies using
+```
+virtualenv venv
+source venv/bin/activate
+pip install -r requirements.txt
+```
+
 Run the script with the necessary arguments, e.g.
 
-`python scrape_interests_docx.py --input ./docx_files/ --output ../2021.json --year 2021 --source https://static.pmg.org.za/Register_of_Members_Interests_2021.pdf`
+```
+python scrape_interests_docx.py --input ./docx_files/ --output ../2021.json --year 2021 --source https://static.pmg.org.za/Register_of_Members_Interests_2021.pdf
+```
 
 This will combine documents into a single HTML file `main_html_file.html`