graphgeo is a specialized library for analyzing geospatial graphs, designed to streamline the loading, processing, and analysis of geospatial data. With an edge-centric approach, it excels in modeling river networks and similar structures.
Built on GeoPandas and its dependencies, graphgeo integrates seamlessly with geospatial processing tools, enabling efficient spatial analysis and graph traversal.
Currently, only the Swiss Coordinate Reference System LV95 (EPSG:2056) is supported. Support for additional CRS will be added in future updates.
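If your data is stored in a different CRS, it can be reprojected with GeoPandas before it is handed to graphgeo. The following is a minimal sketch under that assumption: the helper name `ensure_lv95` is hypothetical and not part of graphgeo, while the `crs` attribute and `to_crs` are standard GeoPandas.

```python
import geopandas as gpd
from shapely.geometry import LineString

LV95_EPSG = 2056  # Swiss coordinate reference system LV95


def ensure_lv95(gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """Return gdf in EPSG:2056, reprojecting only when necessary.

    Hypothetical helper for illustration; not part of graphgeo.
    """
    if gdf.crs is None:
        # Without a declared CRS there is nothing to reproject from.
        raise ValueError("GeoDataFrame has no CRS; set one before converting.")
    if gdf.crs.to_epsg() == LV95_EPSG:
        return gdf
    return gdf.to_crs(epsg=LV95_EPSG)


# Example: one line in WGS 84 (longitude, latitude) near Bern.
wgs84 = gpd.GeoDataFrame(
    geometry=[LineString([(7.43, 46.95), (7.45, 46.96)])],
    crs="EPSG:4326",
)
lv95 = ensure_lv95(wgs84)
```

Lengths computed after the conversion (for example with `gdf.geometry.length`) are then expressed in metres, the unit of LV95.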
Run the following commands to set up your environment and install the required dependencies.

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

The first step is to load geospatial data (LineString or MultiLineString geometries) as a GeoDataFrame using GeoPandas. Then convert the GeoDataFrame into a graph using graphgeo.

import geopandas as gpd
import graphgeo
# Read and prepare data
gdf = gpd.read_file(filename="path/to/your/datafile.gpkg")
gdf["length"] = gdf.geometry.length
# Convert data
graph, gdf = graphgeo.gdf_to_graph(gdf, attributes=["length"])

This function returns a directed multi-graph along with an enriched copy of the GeoDataFrame. The enriched GeoDataFrame includes additional properties: id, start_node, and end_node. These properties allow tracking modifications applied to the edges within the graph and facilitate generating the updated GeoDataFrame after computations.
The graph represents each LineString as an edge. Because it is a multi-graph, several edges may connect the same pair of nodes, and each edge keeps its own attributes. The gdf_to_graph function copies the columns listed in attributes (here, length) onto each edge.
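To illustrate the edge-centric model, the following library-independent sketch derives nodes from the endpoints of each line and keeps parallel edges apart by their id. It assumes that nodes correspond to the first and last coordinate of a line; how graphgeo actually builds its nodes is not specified here, and the names `EdgeRecord` and `build_edges` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field

Coordinate = tuple[float, float]


@dataclass
class EdgeRecord:
    """One line of the dataset, stored as a directed edge (hypothetical type)."""
    id: int
    start_node: Coordinate
    end_node: Coordinate
    attributes: dict = field(default_factory=dict)


def build_edges(lines: list[list[Coordinate]], lengths: list[float]) -> list[EdgeRecord]:
    """Turn coordinate sequences into edges; assumes nodes are line endpoints."""
    return [
        EdgeRecord(id=i, start_node=coords[0], end_node=coords[-1],
                   attributes={"length": length})
        for i, (coords, length) in enumerate(zip(lines, lengths))
    ]


# Two distinct channels connect the same pair of nodes, so the graph
# needs two parallel edges, each with its own attributes.
lines = [
    [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)],   # straight channel
    [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)],   # side channel, same endpoints
    [(10.0, 0.0), (20.0, 0.0)],              # downstream reach
]
edges = build_edges(lines, lengths=[10.0, 14.14, 10.0])

# Group edge ids by (start_node, end_node) to expose the parallel edges.
parallel = defaultdict(list)
for edge in edges:
    parallel[(edge.start_node, edge.end_node)].append(edge.id)
```

A simple graph would have to merge edges 0 and 1 into one connection; keeping them separate is what lets each channel carry its own length.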
Once the graph is constructed, you can traverse it and execute various calculations. By using traversal functions, you can compute values and store the results within each edge’s attribute dictionary.
The traversal is guaranteed to be topologically sorted, ensuring that all parent edges (or child edges, if traversing in reverse) have already been processed before their dependents.
Since edge attributes are stored in a dictionary, they can be accessed directly using square brackets.
from graphgeo import Edge, SearchStrategy


def process_length(edge: Edge):
    # p.attributes["length"] == p["length"]
    preceding_length = sum(p["summed_length"] for p in edge.parents)
    edge["summed_length"] = edge["length"] + preceding_length


graph.traverse(
    traverse_edge=process_length,
    search_strategy=SearchStrategy.DFS,  # DFS or BFS
    reverse=False
)

In this example, process_length computes a cumulative length for each edge: its own length plus the summed_length of all its parent edges. Because parent edges are always processed first, their summed_length is already available when an edge is visited. The computed values are stored in the edge attributes under "summed_length".
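The ordering guarantee can be pictured with a library-independent sketch. It assumes that the parents of an edge are the edges ending at its start node, and it uses Kahn's algorithm applied to edges; this is one way to obtain such an order, not necessarily how graphgeo implements traverse. All names in the block are hypothetical.

```python
from collections import defaultdict, deque

# Each edge: id -> (start_node, end_node, attributes). Two tributaries
# (edges 0 and 1) meet at node "c" and continue as edge 2, then edge 3.
edges = {
    0: ("a", "c", {"length": 4.0}),
    1: ("b", "c", {"length": 6.0}),
    2: ("c", "d", {"length": 5.0}),
    3: ("d", "e", {"length": 1.0}),
}


def traverse_topologically(edges, visit):
    """Call visit(edge_id, parent_ids) so that parents always come first."""
    ends_at = defaultdict(list)    # node -> ids of edges ending there
    starts_at = defaultdict(list)  # node -> ids of edges starting there
    for edge_id, (start, end, _) in edges.items():
        starts_at[start].append(edge_id)
        ends_at[end].append(edge_id)

    # Number of parents that still have to be processed for each edge.
    pending = {eid: len(ends_at[start]) for eid, (start, _, _) in edges.items()}
    queue = deque(eid for eid, count in pending.items() if count == 0)
    order = []
    while queue:
        edge_id = queue.popleft()
        start, end, _ = edges[edge_id]
        visit(edge_id, ends_at[start])
        order.append(edge_id)
        for child in starts_at[end]:
            pending[child] -= 1
            if pending[child] == 0:
                queue.append(child)
    return order


def process_length(edge_id, parent_ids):
    # Own length plus the cumulative lengths of all parent edges.
    attrs = edges[edge_id][2]
    upstream = sum(edges[p][2]["summed_length"] for p in parent_ids)
    attrs["summed_length"] = attrs["length"] + upstream


order = traverse_topologically(edges, process_length)
```

Edge 2 is visited only after both tributaries, so its parents already carry a "summed_length" when it is processed. In a graph with cycles, the edges on a cycle would never reach a pending count of zero, so a sketch like this one visits only the acyclic part.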
Once the analysis is complete, you can export the results to a GeoDataFrame. The graph_to_gdf function creates a new GeoDataFrame containing the updated attributes, which can then be saved to a file.
This process maps the modified attributes back to the original edges in the GeoDataFrame using the id added by gdf_to_graph, ensuring that all computations are properly recorded.
result = graphgeo.graph_to_gdf(
    gdf,
    graph,
    [
        # Attribute name, type, fallback
        ("summed_length", "float64", 0.00),
    ],
)
result.to_file(filename="path/to/your/outputfile.gpkg")

In this example, the "summed_length" attribute is read from the graph and stored in the output GeoDataFrame. The data type (float64) and the fallback value (0.00), used where an edge has no value, keep the column complete and consistently typed in the final dataset.
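Conceptually, the export is a join on the edge id with a default for edges that carry no value. The sketch below illustrates that idea with plain dictionaries standing in for GeoDataFrame rows. It reflects an assumption about the behavior described above, not graphgeo's implementation, and `write_back` is a hypothetical name.

```python
def write_back(rows, edge_attributes, columns):
    """Copy selected edge attributes onto table rows, matched by "id".

    rows: list of dicts, each with an "id" key (stand-in for GeoDataFrame rows).
    edge_attributes: dict mapping edge id -> attribute dict.
    columns: list of (attribute name, converter, fallback) tuples.
    """
    result = []
    for row in rows:
        attrs = edge_attributes.get(row["id"], {})
        new_row = dict(row)  # leave the input rows untouched
        for name, convert, fallback in columns:
            new_row[name] = convert(attrs.get(name, fallback))
        result.append(new_row)
    return result


rows = [{"id": 0, "length": 4.0}, {"id": 1, "length": 6.0}, {"id": 2, "length": 5.0}]
# Edge 2 has no "summed_length", so it should receive the fallback.
edge_attributes = {0: {"summed_length": 4}, 1: {"summed_length": 10.0}}

result = write_back(rows, edge_attributes, [("summed_length", float, 0.0)])
```

The converter plays the role of the type entry in the tuple: the integer 4 on edge 0 ends up as a float, and edge 2 is filled with the fallback instead of being left empty.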
Further details and options can be found in the Python documentation of each class and function.