diff --git a/README.md b/README.md
index 4c52fadd53..15926691e7 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,30 @@ The United Nations Code for Trade and Transport Locations is a code list mantain
 
 Data comes from the [UNECE page](http://www.unece.org/cefact/locode/welcome.html), released at least once a year.
 
+## Preparation
+
+As the original release files have encoding problems, we need to process both the mdb and the csv release.
+To build the dataset we use the csv version of the current edition.
+
+Tools needed: [MDBTools](http://mdbtools.sourceforge.net/) and [CSVKit](https://github.com/onyxfish/csvkit).
+Download the current edition from [UNECE](https://www.unece.org/cefact/codesfortrade/codes_index.html) and put it into the root directory.
+Then execute ```bash scripts/prepare_edition_mdb.sh loc{ed}mdb.zip```, where {ed} identify the release.
+
+To integrate the data from the csv then run the python file
+
+Prerequisites:
+
+```
+pip install pandas titlecase
+```
+
+Run:
+```
+python scripts/integrate.py loc232csv.zip
+```
+
+The provided ```prepare.py``` file would work alone when the original csv file will be fixed upstream.
+
 ## License
 
 All data is licensed under the [ODC Public Domain Dedication and Licence (PDDL)](http://opendatacommons.org/licenses/pddl/1-0/).
diff --git a/scripts/README.md b/scripts/README.md
deleted file mode 100644
index ebeb0c7bc1..0000000000
--- a/scripts/README.md
+++ /dev/null
@@ -1,22 +0,0 @@
-#Build release
-As the original release files have encoding problems, we need to process both the mdb and the csv release.
-To build the dataset we use the csv version of the current edition.
-
-Tools needed: [MDBTools](http://mdbtools.sourceforge.net/) and [CSVKit](https://github.com/onyxfish/csvkit).
-Download the current edition from [UNECE](https://www.unece.org/cefact/codesfortrade/codes_index.html) and put it into the root directory.
-Then execute ```bash scripts/prepare_edition_mdb.sh loc{ed}mdb.zip```, where {ed} identify the release.
-
-To integrate the data from the csv then run the python file
-
-Prerequisites:
-
-```
-pip install pandas titlecase
-```
-
-Run:
-```
-python scripts/integrate.py loc232csv.zip
-```
-
-The provided ```prepare.py``` file would work alone when the original csv file will be fixed upstream.
\ No newline at end of file
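
Note on the moved text: the Preparation steps can be sketched as a dry run that only prints the commands for a given edition. This is a minimal illustration, not part of the change. The helper name `build_commands` is hypothetical, and the csv archive pattern `loc{ed}csv.zip` is an assumption inferred from the `loc232csv.zip` example in the README.

```python
import shlex


def build_commands(edition: str) -> list[list[str]]:
    """Return the README's preparation commands for one edition (dry run).

    `build_commands` is a hypothetical helper. The csv archive name is an
    assumption that mirrors the `loc232csv.zip` example in the README.
    """
    mdb_zip = f"loc{edition}mdb.zip"  # pattern given in the README
    csv_zip = f"loc{edition}csv.zip"  # assumed pattern, see docstring
    return [
        # Step 1: process the mdb release
        ["bash", "scripts/prepare_edition_mdb.sh", mdb_zip],
        # Step 2: install the Python prerequisites
        ["pip", "install", "pandas", "titlecase"],
        # Step 3: integrate the data from the csv release
        ["python", "scripts/integrate.py", csv_zip],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is executed.
    for cmd in build_commands("232"):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run without MDBTools, CSVKit, or the release archives present.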