A predictive engine for the Alan Turing Institute project DemoLand
Either clone and pip install or pip install from git.
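The two install routes can be sketched as the shell functions below. The repository URL is an assumption inferred from the organisation name used for the container image, and the function names are purely illustrative, so adjust both to your setup.

```shell
# Assumed repository URL (not stated in this README); adjust if needed.
REPO_URL="https://github.com/urban-analytics-technology-platform/demoland-engine.git"

# Option 1: clone the repository, then install from the local checkout.
install_from_clone() {
    git clone "$REPO_URL" demoland-engine &&
        pip install ./demoland-engine
}

# Option 2: install straight from git without keeping a checkout.
install_from_git() {
    pip install "git+${REPO_URL}"
}
```

Call either install_from_clone or install_from_git, ideally inside a fresh virtual environment.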
See the notebooks in the docs folder.
Top level overview:
- Generate the required files using the DemoLand pipeline Docker container
- Commit engine-related files to the demoland-engine repository
- Commit app-related files to the demoland-web repository
The rest should be automatic.
Important
The pipeline requires a decent amount of RAM (about 12 GB at peak, though we have seen up to 22 GB when running on Apple Silicon under emulation) and a fast internet connection (it needs to download 1.3 GB of OSM data). Depending on the size of the area and the machine, it can take anywhere from 15 minutes to a few hours.
The process of generating data for a new area is provided in the form of a Docker container. You will need to provide four pieces of information:
- AREA_NAME: Name that will be visible in the app to the user.
- NAME: Name that is used as a key within demoland_engine.
- AOI_FILE_PATH: Path to a file relative to the current working directory (avoid ../) containing polygon geometries defining the extent of the area of interest. Can be any file readable by geopandas.read_file.
- GTFS_FILE_PATH: Zip file with GTFS data covering the region of interest. Go to https://data.bus-data.dft.gov.uk/downloads/, register, and download timetable data for your region. Pass the file without any changes.
The best option is to create a folder with the two required files, navigate to the folder and run the container.
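Since a run can take hours, it may be worth checking the two input files before starting the container. The helper below is a minimal sketch: check_inputs is a hypothetical name, and applying the same relative-path rule to the GTFS file is an assumption based on only the current folder being mounted into the container.

```shell
# Hypothetical pre-flight check; run it from the folder you will mount.
# Usage: check_inputs <AOI_FILE_PATH> <GTFS_FILE_PATH>
check_inputs() {
    aoi="$1"
    gtfs="$2"
    for f in "$aoi" "$gtfs"; do
        # Reject absolute paths and anything that climbs out of the folder.
        case "$f" in
            /*|../*|*/../*)
                echo "error: $f must be a relative path without ../" >&2
                return 1 ;;
        esac
        if [ ! -f "$f" ]; then
            echo "error: $f not found in $PWD" >&2
            return 1
        fi
    done
    # The GTFS data is expected as an unmodified zip archive.
    case "$gtfs" in
        *.zip) ;;
        *) echo "error: $gtfs is not a .zip file" >&2; return 1 ;;
    esac
    echo "inputs OK"
}
```

For example, check_inputs geography.geojson itm_north_east_gtfs.zip prints "inputs OK" when both files sit in the current folder, and returns a non-zero status otherwise.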
The container is not public, so you need to be logged in to ghcr.io within Docker. Follow the GitHub docs to do that. Note that to pull the container, you need at least read access to the demoland-engine repository.
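As a rough sketch of the login step, the function below pipes a token to docker login so it never appears on the command line. GHCR_USER and GHCR_TOKEN are placeholder variable names (assumptions, not something the pipeline defines): your GitHub username and a personal access token with the read:packages scope.

```shell
# Hypothetical helper; expects GHCR_USER and GHCR_TOKEN to be set beforehand.
ghcr_login() {
    # --password-stdin reads the token from standard input instead of argv.
    printf '%s' "$GHCR_TOKEN" |
        docker login ghcr.io -u "$GHCR_USER" --password-stdin
}
```

After setting both variables, call ghcr_login once; Docker stores the credentials for later pulls.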
Example:
docker pull ghcr.io/urban-analytics-technology-platform/demoland_pipeline:latest
docker run \
--rm \
-ti \
-e AREA_NAME="Tyne and Wear" \
-e NAME="tyne_and_wear_v1" \
-e AOI_FILE_PATH="geography.geojson" \
-e GTFS_FILE_PATH="itm_north_east_gtfs.zip" \
-v "${PWD}:/app/data" \
--user=$UID \
ghcr.io/urban-analytics-technology-platform/demoland_pipeline
The container generates two ZIP files. One is used in demoland_engine, and the other is used to deploy the app.
The file engine.zip contains files to be added to the data folder of the demoland-engine repository. Use the information in hashes.json to update data.py in the demoland_engine code. The result should look like PR #7.
The file app.zip contains all the files needed to generate the web app. Note that the new version of demoland_engine, with all the files from engine.zip and the correct hashes, needs to be deployed before the app. See the dedicated documentation on app deployment. There is no need to pay close attention to the contents of each file, as all of them are autogenerated by the pipeline.
To successfully build the container, navigate to the root of the repository and copy the necessary data files there as well. The required files:
- air_quality_model.joblib
- grid_adjacency_binary.parquet
- grid_complete.parquet
- house_price_model.joblib
All four need to be present in the repository at the time of building as they are copied to the container. You can then build the container and upload it to GHCR as:
docker build -t ghcr.io/urban-analytics-technology-platform/demoland_pipeline -f Dockerfile.pipe .
docker push ghcr.io/urban-analytics-technology-platform/demoland_pipeline:latest
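Because the build copies the four data files into the image, a quick check before building can save a failed build. The function below is a minimal sketch (check_build_files is a hypothetical name) meant to be run from the root of the repository.

```shell
# Hypothetical pre-build check: reports every required file that is missing
# from the current directory and returns non-zero if any are absent.
check_build_files() {
    missing=0
    for f in air_quality_model.joblib grid_adjacency_binary.parquet \
             grid_complete.parquet house_price_model.joblib; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

One way to use it is to chain it with the build, for example check_build_files && docker build ..., so the build only starts when all four files are present.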
The repository includes a GitHub Action which automatically deploys the main branch to Azure Functions.
For manual deployment steps, see the Developer Notes section in the DemoLand book.