All the Places runs all the spiders periodically and stores the results. We offer an API to access information about the different periodic runs and the data produced by the spiders.
We store metadata about each run. The metadata includes information about the run itself, such as the start and end time, the output URL, and the size of the output file. The metadata also includes information about the data produced by the spiders, such as the number of spiders that ran and the total number of lines in the output GeoJSON file.
Field name | Description |
---|---|
run_id | The unique identifier for this run. |
output_url | The URL to download the output of this run. In most cases this is a zip file with the GeoJSON output of the spiders. |
pmtiles_url | The URL to the pmtiles of this run. This is a pmtiles file that can be used to visualize the data. It is generated by Tippecanoe from the GeoJSON output. |
stats_url | The URL to the statistics of this run. This is a JSON file with statistics about the run generated by scrapy. |
insights_url | The URL to the insights of this run. This is a JSON file with insights about the run generated by the All the Places pipeline. |
start_time | The time the run started. This is a timestamp in ISO 8601 format. |
end_time | The time the run ended. This is a timestamp in ISO 8601 format. |
size_bytes | The size of the output zip file in bytes. |
spiders | The number of spiders that ran in this run. |
total_lines | The total number of lines in the output GeoJSON file. |
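As a rough illustration of how these fields might be consumed, the sketch below parses the two timestamps and computes how long a run took. The record is hypothetical: every value in `sample_run` is invented, and the exact timestamp layout (a trailing `Z` for UTC) is an assumption rather than something the API is documented to guarantee.

```python
from datetime import datetime, timedelta

# Hypothetical run record. Field names follow the table above;
# all values are made up for illustration.
sample_run = {
    "run_id": "2024-01-06-13-32-01",
    "start_time": "2024-01-06T13:32:01Z",
    "end_time": "2024-01-07T02:10:45Z",
    "size_bytes": 1500000000,
    "spiders": 2500,
    "total_lines": 3000000,
}


def parse_iso8601(value: str) -> datetime:
    """Parse an ISO 8601 timestamp into a datetime.

    Before Python 3.11, datetime.fromisoformat rejects a trailing "Z",
    so it is rewritten as an explicit UTC offset first.
    """
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def run_duration(run: dict) -> timedelta:
    """Return end_time minus start_time for one run record."""
    return parse_iso8601(run["end_time"]) - parse_iso8601(run["start_time"])


print(run_duration(sample_run))  # prints 12:38:44
```

Keeping the parsing in one small helper makes it easy to adjust if the real timestamps turn out to use a different ISO 8601 variant.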
The API root is https://data.alltheplaces.xyz.
Returns a list of all the runs that have been stored. Each run has a run_id that can be used to access further data for that run.
We have historical data back to December 2017. Not all runs have all metadata, as some of the metadata support was added later.
See above for the metadata fields.
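Because older runs may lack some fields, client code should not assume every key is present. The sketch below shows one way to handle that, assuming the run list has already been decoded into a Python list of dictionaries keyed by the field names above. The `history` data, the example URLs, and the helper names `runs_with` and `lines_per_spider` are all hypothetical.

```python
# Hypothetical, already-decoded run list. Values are invented;
# the oldest entry deliberately lacks the later-added metadata.
history = [
    {"run_id": "2017-12-01-00-00-00",
     "output_url": "https://example.invalid/old.zip"},
    {"run_id": "2023-06-03-13-32-00",
     "output_url": "https://example.invalid/mid.zip",
     "spiders": 2000, "total_lines": 2200000},
    {"run_id": "2024-01-06-13-32-01",
     "output_url": "https://example.invalid/new.zip",
     "spiders": 2500, "total_lines": 3000000},
]


def runs_with(runs: list, *fields: str) -> list:
    """Keep only the runs where every requested field is present and not None."""
    return [run for run in runs
            if all(run.get(field) is not None for field in fields)]


def lines_per_spider(runs: list) -> dict:
    """Map run_id to average output lines per spider, skipping incomplete runs."""
    return {
        run["run_id"]: run["total_lines"] / run["spiders"]
        for run in runs_with(runs, "spiders", "total_lines")
        if run["spiders"]  # avoid dividing by zero
    }


print(lines_per_spider(history))
```

Filtering first, rather than catching `KeyError` later, keeps the handling of incomplete historical records in one place.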
Returns the metadata for the latest run.
You can use the output_url field to download the GeoJSON output of the latest run.
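A download step could look roughly like the sketch below. It assumes the latest-run metadata has already been fetched and decoded into a dictionary, and that size_bytes, when present, can be compared directly with the size of the downloaded file. The function name `download_output` and the fallback file name are hypothetical choices, not part of the API.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_output(run: dict, dest_dir: str = ".") -> Path:
    """Download the file at run["output_url"] into dest_dir and return its path.

    If the run record carries size_bytes, the downloaded size is checked
    against it (assumption: size_bytes describes this exact file).
    """
    url = run["output_url"]
    # Name the local file after the last path segment of the URL.
    filename = Path(urlparse(url).path).name or "output.zip"
    target = Path(dest_dir) / filename

    # Stream the response to disk instead of holding it all in memory.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)

    expected = run.get("size_bytes")
    if expected is not None and target.stat().st_size != int(expected):
        raise ValueError(
            f"expected {expected} bytes, got {target.stat().st_size}"
        )
    return target
```

Streaming with `shutil.copyfileobj` matters here because the output archive can be large. The size check is optional and is skipped for runs that have no size_bytes value.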