Update readme

michaelglenister · michaelglenister · commit 07a4be331287 · 2024-01-25T17:34:36.000+02:00
diff --git a/pombola/south_africa/data/members-interests/NEW_README.md b/pombola/south_africa/data/members-interests/NEW_README.md
@@ -7,7 +7,7 @@ There are several files in this directory:
 The scraper currently scrapes `.docx` files.
 To prepare the file:
 
-1. Split the `PDF` into seperate files small enough to open in Google Docs. PDF Arranger works well https://github.com/pdfarranger/pdfarranger 
+1. Split the `PDF` into separate files small enough to open in Google Docs. [PDF Arranger](https://github.com/pdfarranger/pdfarranger) works well 
 2. Open the files in Google Docs and download each in `.docx` format
 3. Store these files in `./docx_files/`
 
@@ -20,23 +20,6 @@ Run the script `html_to_json.py` to scrape the HTML and compile into an easy to
 
 The output should be `register.json`
 
-## Raw data
-
-    2010.json
-    2011.json
-    2012.json
-    2013.json
-    2014.json
-    2015.json
-    2016.json
-    2017.json
-    2018.json
-
-These are the JSON files provided to us by Geoff. They are unchanged and are (I
-believe) generated by scraping code that he has from the PDFs mentioned in
-them. For me these PDF urls 404ed so I was not able to look at the original
-source material.
-
 
 ## Conversion script