All the Places API

All the Places periodically runs all of its spiders and stores the results. This API gives access to metadata about each periodic run and to the data the spiders produced.

Run Metadata

We store metadata about each run. It describes the run itself (start and end time, output URL, and size of the output file) as well as the data produced by the spiders (the number of spiders that ran and the total number of lines in the output GeoJSON file).

| Field name | Description |
| --- | --- |
| run_id | The unique identifier for this run. |
| output_url | The URL to download the output of this run. In most cases this is a zip file with the GeoJSON output of the spiders. |
| pmtiles_url | The URL to the pmtiles of this run. This is a pmtiles file that can be used to visualize the data. It is generated by Tippecanoe from the GeoJSON output. |
| stats_url | The URL to the statistics of this run. This is a JSON file with statistics about the run generated by scrapy. |
| insights_url | The URL to the insights of this run. This is a JSON file with insights about the run generated by the All the Places pipeline. |
| start_time | The time the run started. This is a timestamp in ISO 8601 format. |
| end_time | The time the run ended. This is a timestamp in ISO 8601 format. |
| size_bytes | The size of the output zip file in bytes. |
| spiders | The number of spiders that ran in this run. |
| total_lines | The total number of lines in the output GeoJSON file. |
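The fields above could be modelled on the client side roughly as follows. This is a minimal sketch, not part of the API: the `RunMetadata` class and `parse_iso8601` helper are hypothetical names, the Python types chosen for each field are assumptions based on the descriptions, and every field except `run_id` is treated as optional because older runs may not carry all metadata.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Mapping, Optional


def parse_iso8601(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp, or return None if it is missing."""
    if not value:
        return None
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


@dataclass
class RunMetadata:
    """Hypothetical client-side view of one run's metadata."""

    run_id: str
    output_url: Optional[str] = None
    pmtiles_url: Optional[str] = None
    stats_url: Optional[str] = None
    insights_url: Optional[str] = None
    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None
    size_bytes: Optional[int] = None
    spiders: Optional[int] = None
    total_lines: Optional[int] = None

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "RunMetadata":
        """Build a RunMetadata from a decoded JSON object, tolerating gaps."""
        return cls(
            run_id=raw["run_id"],
            output_url=raw.get("output_url"),
            pmtiles_url=raw.get("pmtiles_url"),
            stats_url=raw.get("stats_url"),
            insights_url=raw.get("insights_url"),
            start_time=parse_iso8601(raw.get("start_time")),
            end_time=parse_iso8601(raw.get("end_time")),
            size_bytes=raw.get("size_bytes"),
            spiders=raw.get("spiders"),
            total_lines=raw.get("total_lines"),
        )
```

Passing a decoded JSON object to `RunMetadata.from_dict` yields typed timestamps and leaves absent fields as `None`, so callers can check for missing metadata explicitly.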

Endpoints

The API root is https://data.alltheplaces.xyz.

GET /runs/history.json

Returns a list of all the runs that have been stored. Each run has a run_id that can be used to access further data for that run.

Historical data goes back to December 2017. Older runs may lack some metadata fields, because support for those fields was added later.

Each run uses the metadata fields described under Run Metadata.
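As an illustration, the history could be fetched and filtered with the Python standard library along these lines. This is a sketch under assumptions: it presumes the response body is a top-level JSON array of run objects, and `fetch_history` and `runs_with_fields` are hypothetical helper names rather than anything the API provides.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

API_ROOT = "https://data.alltheplaces.xyz"


def fetch_history(timeout: float = 30.0) -> List[Dict[str, Any]]:
    """Download /runs/history.json and return the decoded JSON.

    Assumes the body is a JSON array with one object per run.
    """
    with urlopen(f"{API_ROOT}/runs/history.json", timeout=timeout) as response:
        return json.load(response)


def runs_with_fields(runs: List[Dict[str, Any]], *fields: str) -> List[Dict[str, Any]]:
    """Keep only runs in which every requested field is present and not None.

    Useful because older runs may be missing some metadata fields.
    """
    return [
        run
        for run in runs
        if all(run.get(field) is not None for field in fields)
    ]
```

For example, `runs_with_fields(fetch_history(), "pmtiles_url")` would select the runs that have a pmtiles file, skipping older runs that predate that field.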

GET /runs/latest.json

Returns the metadata for the latest run.

Use the output_url field to download the output of the latest run. In most cases this is a zip file containing the GeoJSON output of the spiders.
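One possible way to download the latest output with the Python standard library is sketched below. It assumes the response is a single JSON object containing an `output_url` field; `fetch_latest`, `output_filename`, and `download_output` are hypothetical helper names, and the `output.zip` fallback file name is an arbitrary choice for illustration.

```python
import json
import shutil
from pathlib import Path, PurePosixPath
from typing import Any, Dict
from urllib.parse import urlparse
from urllib.request import urlopen

LATEST_URL = "https://data.alltheplaces.xyz/runs/latest.json"


def fetch_latest(timeout: float = 30.0) -> Dict[str, Any]:
    """Download /runs/latest.json and return the decoded metadata object."""
    with urlopen(LATEST_URL, timeout=timeout) as response:
        return json.load(response)


def output_filename(output_url: str) -> str:
    """Derive a local file name from the last path segment of the URL."""
    name = PurePosixPath(urlparse(output_url).path).name
    # Fall back to a generic name if the URL has no usable path segment.
    return name or "output.zip"


def download_output(run: Dict[str, Any], dest_dir: str = ".", timeout: float = 300.0) -> Path:
    """Stream the file at run["output_url"] into dest_dir and return its path."""
    url = run["output_url"]
    target = Path(dest_dir) / output_filename(url)
    with urlopen(url, timeout=timeout) as response, open(target, "wb") as out:
        # Copy in chunks so the whole archive is never held in memory.
        shutil.copyfileobj(response, out)
    return target
```

Calling `download_output(fetch_latest())` would save the archive into the current directory. The output can be large, so comparing the downloaded file's size against `size_bytes` is a cheap sanity check when that field is present.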