Pairwise

Open-source parallel geospatial processing library

Reverse-engineered from ESRI's pairwise functionality for open use.

Overview

Pairwise is a Python library that provides parallel processing capabilities for geospatial operations. Inspired by ESRI ArcGIS Pro's pairwise tools, this library enables efficient processing of large geospatial datasets by distributing work across multiple CPU cores.

Key Features

- Parallel Processing: Automatically divides work across available CPU cores
- Flexible Configuration: Control the number of processes used (all cores, a specific count, or a percentage of cores)
- Multiple Operations: Six operations modeled on ESRI's pairwise tools
  - Buffer: Create buffer zones around features
  - Clip: Extract features within a boundary
  - Dissolve: Aggregate features by attributes
  - Erase: Remove overlapping portions
  - Intersect: Compute geometric intersections
  - Integrate: Snap vertices within tolerance
- Multiple Input Types: Works with GeoDataFrames, Shapely geometries, and lists of geometries
- Extensible Architecture: The core processing engine can run custom geospatial operations
- ESRI-Compatible API: Interface similar to ESRI's pairwise tools for easier migration

Installation

pip install -r requirements.txt

Quick Start

Basic Operations

Buffer Operation

from pairwise import pairwise_buffer
import geopandas as gpd
from shapely.geometry import Point

# Create some sample data
gdf = gpd.GeoDataFrame(
    geometry=[Point(0, 0), Point(1, 1), Point(2, 2)]
)

# Buffer with automatic parallel processing
buffered = pairwise_buffer(gdf, distance=1.0)

Clip Operation

from pairwise import pairwise_clip
from shapely.geometry import box

# Clip features to a boundary
clip_boundary = box(0, 0, 2, 2)
clipped = pairwise_clip(gdf, clip_boundary)

Dissolve Operation

import geopandas as gpd
from shapely.geometry import box

from pairwise import pairwise_dissolve

# Create features with categories
gdf = gpd.GeoDataFrame({
    'category': ['A', 'A', 'B'],
    'geometry': [box(0,0,1,1), box(1,0,2,1), box(0,1,1,2)]
})

# Dissolve by category
dissolved = pairwise_dissolve(gdf, by='category')

Intersect Operation

import geopandas as gpd
from shapely.geometry import box

from pairwise import pairwise_intersect

# Find intersections between two feature sets
gdf1 = gpd.GeoDataFrame(geometry=[box(0,0,2,2)])
gdf2 = gpd.GeoDataFrame(geometry=[box(1,1,3,3)])
intersections = pairwise_intersect(gdf1, gdf2)

Erase Operation

from shapely.geometry import box

from pairwise import pairwise_erase

# Remove portions that overlap with erase features
erase_area = box(0.5, 0.5, 1.5, 1.5)
erased = pairwise_erase(gdf, erase_area)

Integrate Operation

from pairwise import pairwise_integrate
from shapely.geometry import LineString

# Snap vertices within tolerance
lines = gpd.GeoDataFrame(geometry=[
    LineString([(0, 0), (1, 0)]),
    LineString([(1.001, 0), (2, 0)])
])
integrated = pairwise_integrate(lines, tolerance=0.01)

Configure Parallel Processing

from pairwise import pairwise_buffer, ParallelConfig

# Use specific number of processes
config = ParallelConfig(factor=4)
buffered = pairwise_buffer(gdf, distance=1.0, config=config)

# Use 50% of available cores
config = ParallelConfig(factor=0.5)
buffered = pairwise_buffer(gdf, distance=1.0, config=config)

# Use percentage as string
config = ParallelConfig(factor="75%")
buffered = pairwise_buffer(gdf, distance=1.0, config=config)

Advanced Usage with Core Processor

from pairwise import PairwiseProcessor, ParallelConfig
import geopandas as gpd
import pandas as pd  # needed by the merge function below

# Initialize processor
config = ParallelConfig(factor=4)
processor = PairwiseProcessor(config)

# Define custom operation
def custom_operation(batch):
    # Your custom processing logic here
    return batch.buffer(1.0)

# Process in parallel
results = processor.process_features(
    features=gdf,
    operation=custom_operation,
    merge_function=lambda results: pd.concat(results)
)

How It Works

The library implements ESRI's pairwise parallel processing pattern:

Batch Division: Input features are divided into batches
Parallel Processing: Each batch is processed on a separate CPU core
Result Merging: Results from all batches are combined

This approach provides significant performance improvements for large datasets, especially on multi-core systems.
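The three steps above can be sketched with the standard library alone. This is an illustrative sketch of the general batch, process, and merge pattern, not the library's actual implementation: `split_into_batches`, `run_in_batches`, `square_batch`, and the `use_processes` flag are hypothetical names chosen for this example, and plain Python lists stand in for geospatial features.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_batches(items: Sequence, n_batches: int) -> List[list]:
    """Step 1: divide items into contiguous, near-equal batches."""
    if len(items) == 0:
        return []
    n_batches = max(1, min(n_batches, len(items)))
    size, extra = divmod(len(items), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        end = start + size + (1 if i < extra else 0)  # spread the remainder
        batches.append(list(items[start:end]))
        start = end
    return batches


def run_in_batches(
    items: Sequence,
    operation: Callable[[list], list],
    n_workers: int = 2,
    use_processes: bool = True,
) -> list:
    """Steps 2 and 3: run `operation` on each batch, then merge the results."""
    batches = split_into_batches(items, n_workers)
    if not batches:
        return []
    # Processes give true multi-core execution; threads are handy for debugging.
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=n_workers) as pool:
        # map() returns results in batch order, so the merge keeps input order.
        partial_results = list(pool.map(operation, batches))
    return [value for part in partial_results for value in part]


def square_batch(batch: list) -> list:
    """Stand-in for a geometric operation applied to one batch."""
    return [x * x for x in batch]
```

When `use_processes=True`, the operation must be a picklable top-level function, and on platforms that spawn worker processes the call to `run_in_batches` should sit under an `if __name__ == "__main__":` guard.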

Comparison to ESRI Pairwise Tools

| Feature | ESRI Pairwise | This Library |
| --- | --- | --- |
| Parallel Processing | ✓ | ✓ |
| Buffer Operations | ✓ | ✓ |
| Clip Operations | ✓ | ✓ |
| Dissolve Operations | ✓ | ✓ |
| Erase Operations | ✓ | ✓ |
| Intersect Operations | ✓ | ✓ |
| Integrate Operations | ✓ | ✓ |
| Configurable CPU Usage | ✓ | ✓ |
| Open Source | ✗ | ✓ |
| Python API | Limited | Full |
| Works with GeoPandas | Via Conversion | Native |

Performance

Performance improvements depend on:

Dataset size (larger datasets benefit more)
Number of CPU cores available
Complexity of geometric operations
System memory

Typical performance improvements: 2-8x faster on 4-8 core systems with large datasets (10,000+ features).
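Because the speedup depends on the factors listed above, it is worth measuring on your own data. The following is a minimal timing sketch using only the standard library; `best_time` and `speedup` are hypothetical helper names and are not part of this library.

```python
import time
from typing import Callable


def best_time(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the fastest wall-clock time, in seconds, over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Ratio of serial to parallel run time; values above 1.0 mean faster."""
    return serial_seconds / parallel_seconds


# Assumed usage with the `gdf` from the Quick Start (requires geopandas):
# serial = best_time(lambda: gdf.buffer(1.0))
# parallel = best_time(lambda: pairwise_buffer(gdf, distance=1.0))
# print(f"speedup: {speedup(serial, parallel):.1f}x")
```

Taking the minimum over several runs reduces noise from other processes. For small inputs, the cost of starting worker processes can outweigh the gain, so a ratio below 1.0 is possible.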

Requirements

Python 3.7+
numpy
(Optional) geopandas - for GeoDataFrame support
(Optional) shapely - for geometry operations

Examples

See the examples/ directory for more detailed examples:

basic_buffer.py - Simple buffer operations
advanced_usage.py - Custom operations with core processor
performance_comparison.py - Performance benchmarks

API Reference

Core Operations

All pairwise operations support parallel processing with configurable CPU usage.

`pairwise_buffer(features, distance, config=None, dissolve=False, **kwargs)`

Create buffer polygons using parallel processing.

Parameters:

features: Input features (GeoDataFrame, list of geometries, etc.)
distance: Buffer distance in coordinate system units
config: ParallelConfig object for controlling parallelism
dissolve: Whether to dissolve overlapping buffers
**kwargs: Additional arguments passed to buffer operation

Returns: Buffered features (same type as input)

`pairwise_clip(input_features, clip_features, config=None)`

Extract features that fall within clip boundary using parallel processing.

Parameters:

input_features: Features to clip
clip_features: Clip boundary (GeoDataFrame, geometry, or list)
config: ParallelConfig object

Returns: Clipped features

`pairwise_dissolve(features, by=None, aggfunc='first', config=None)`

Aggregate features based on attributes using parallel processing.

Parameters:

features: Features to dissolve
by: Field name(s) to dissolve by (None = dissolve all)
aggfunc: Aggregation function for attributes
config: ParallelConfig object

Returns: Dissolved features

`pairwise_erase(input_features, erase_features, config=None)`

Remove portions that overlap with erase features using parallel processing.

Parameters:

input_features: Features to erase from
erase_features: Features defining areas to erase
config: ParallelConfig object

Returns: Erased features

`pairwise_intersect(input_features, intersect_features=None, config=None)`

Compute geometric intersections using parallel processing.

Parameters:

input_features: First feature set
intersect_features: Second feature set (None = self-intersect)
config: ParallelConfig object

Returns: Intersection results

`pairwise_integrate(features, tolerance, config=None)`

Adjust vertices within tolerance for alignment using parallel processing.

Parameters:

features: Features to integrate
tolerance: Distance tolerance for snapping vertices
config: ParallelConfig object

Returns: Integrated features with adjusted vertices

`ParallelConfig(factor=None)`

Configuration for parallel processing.

Parameters:

factor: Controls number of processes
- None or "auto": Use all available cores
- int: Use exactly this many processes
- float (0.0-1.0): Use this fraction of the available cores (0.5 = 50%)
- str: A percentage such as "50%", or a number given as a string
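To make the rules above concrete, here is one way a `factor` value could be turned into a process count. This is a sketch under stated assumptions, not the library's code: `resolve_process_count` and its `total_cores` parameter are hypothetical, fractional results are rounded down with a minimum of one process, and numeric strings are treated like the matching int or float.

```python
import os
from typing import Optional, Union


def resolve_process_count(
    factor: Union[None, int, float, str] = None,
    total_cores: Optional[int] = None,
) -> int:
    """Map a factor (None, "auto", int, float, or str) to a process count."""
    cores = total_cores or os.cpu_count() or 1

    if factor is None or factor == "auto":
        return cores  # use every available core

    if isinstance(factor, str):
        text = factor.strip()
        if text.endswith("%"):
            fraction = float(text[:-1]) / 100.0
            return max(1, min(cores, int(cores * fraction)))
        # Assumption: a plain numeric string follows the int/float rules.
        factor = float(text) if "." in text else int(text)

    if isinstance(factor, bool):
        raise TypeError("factor must not be a bool")
    if isinstance(factor, int):
        if factor < 1:
            raise ValueError("an integer factor must be at least 1")
        return factor  # exactly this many processes
    if isinstance(factor, float):
        if not 0.0 < factor <= 1.0:
            raise ValueError("a float factor must be in (0.0, 1.0]")
        # Assumption: round down, but never below one process.
        return max(1, min(cores, int(cores * factor)))

    raise TypeError(f"unsupported factor type: {type(factor).__name__}")
```

Under these assumptions, `1` means one process while `1.0` means all cores, so the type of the value matters as much as its magnitude.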

`PairwiseProcessor(config=None)`

Core parallel processor for custom operations.

Methods:

`process_features(features, operation, batch_size=None, merge_function=None)`: Process features in parallel batches
`process_pairwise(features1, features2, operation, merge_function=None)`: Process pairwise operations between two feature sets

Contributing

Contributions are welcome! This is an open-source reverse engineering project aimed at providing free alternatives to proprietary GIS tools.

License

MIT License - See LICENSE file for details

Acknowledgments

This library is inspired by ESRI's ArcGIS Pro pairwise tools, reverse-engineered for open-source use. It is not affiliated with or endorsed by ESRI.
