Monitoring Land-centric Business Processes using Remote Sensing and Satellite Data
Process mining has been used intensively for business processes that are extensively supported by information systems. The tight integration of information processing and process execution, as leveraged in the service sector, is however often absent in land-centric processes such as farming. Land-centric processes exhibit some challenging characteristics that make them difficult to monitor in real time: they unfold continuously over time, yet pass through clearly identifiable states. In this paper, we address the challenge of monitoring land-centric processes. We introduce a framework to generate event logs of land-centric processes by utilizing remote sensing systems such as satellites. We demonstrate the feasibility of our approach using publicly available data on agricultural processes in the United States.
Framework
In this study, we developed a framework for monitoring agricultural business processes using satellite data.
We have implemented our framework to investigate 15 years of agricultural activity from 2008 to 2022 on farm patches in Idaho, North Dakota, and Colorado, United States.
This repository contains the implementation of "Monitoring Cultivation Business Processes using Remote Sensing & Satellite Data".
- Python 3.11+
For required packages, please see requirements.txt.
To install all required packages:
pip install -r requirements.txt
This directory contains the main results.
Idaho:
- `log_Idaho_151024'_ALL.xes`: Seed-to-harvest event log saved in XES format
- `log_Idaho_151024'_ALL_df.h5`: Seed-to-harvest event log saved in HDF5 (.h5) format

North Dakota:
- `log_NorthDakota_151024'_ALL.xes`: Seed-to-harvest event log saved in XES format
- `log_NorthDakota_151024'_ALL_df.h5`: Seed-to-harvest event log saved in HDF5 (.h5) format

Colorado:
- `log_Colorado_151024'_ALL.xes`: Seed-to-harvest event log saved in XES format
- `log_Colorado_151024'_ALL_df.h5`: Seed-to-harvest event log saved in HDF5 (.h5) format
This directory contains the code for this implementation.
- `GEE_download.ipynb`: Download time series data from Google Earth Engine (GEE).
  - A GEE account is required to download data from GEE (sign up for GEE first).
- `Eventlog_generation.py -site -[smoother='ALL']`: Event log generation script.
  - `site`: Case name used to select the site. The folder should share the same name.
  - `smoother`: Smoothing method. One of 'ALL', 'BZP', 'SG', 'WE', 'None'. Default value is 'ALL'.
- `Performance_spectrum_evaluation.py -site -[smoother='ALL']`: Create the performance spectrum.
  - `site`: Case name used to select the site. The folder should share the same name.
  - `smoother`: Smoothing method. One of 'ALL', 'BZP', 'SG', 'WE', 'None'. Default value is 'ALL'.
- `Smoothing_evaluation.py`: Smoothing assessment.
- `Usual_dates.py -[smoother='ALL']`: Validation through usual dates.
  - `smoother`: Smoothing method. One of 'ALL', 'BZP', 'SG', 'WE', 'None'. Default value is 'ALL'.
- `Monitoring.py -site -[year=2022] -[crop=None] -[filtering=None] -[width=1.5]`: Simulate monitoring.
  - `site`: Case name used to select the site. The folder should share the same name.
  - `year`: Year to select (int).
  - `crop`: Subset to a specific crop (str).
  - `filtering`: If true, remove multiple-crop cases and filter temporal outliers.
  - `width`: If `filtering` is true, keep values within the IQR +/- `width`*IQR (float).
- `dfg.py -site`: Directly-Follows Graph (DFG) by seasonal timing.
  - `site`: Case name used to select the site. If not provided, Idaho, NorthDakota, and Colorado are loaded and combined to generate DFGs. The folder should share the same name.
- `seed_to_harvest.py`: MACD activity recognition and event log enrichment.
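To make the `width` option of `Monitoring.py` more concrete, here is a minimal sketch of interquartile-range (IQR) outlier filtering. It assumes the filter keeps values between Q1 - `width`*IQR and Q3 + `width`*IQR, which is one common reading of "IQR +/- width*IQR". The function names and the day-of-year values are hypothetical, and the script itself may compute quartiles differently.

```python
from statistics import quantiles


def iqr_bounds(values: list[float], width: float = 1.5) -> tuple[float, float]:
    """Return (Q1 - width*IQR, Q3 + width*IQR) for the given values."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - width * iqr, q3 + width * iqr


def filter_temporal_outliers(days: list[int], width: float = 1.5) -> list[int]:
    """Keep only the values that fall inside the IQR-based bounds."""
    lower, upper = iqr_bounds(days, width)
    return [d for d in days if lower <= d <= upper]


# Hypothetical day-of-year values for one activity across several cases.
harvest_days = [250, 255, 258, 260, 262, 265, 340]
print(iqr_bounds(harvest_days))                # prints: (246.0, 274.0)
print(filter_temporal_outliers(harvest_days))  # prints: [250, 255, 258, 260, 262, 265]
```

A larger `width` widens the accepted interval and removes fewer cases. The default of 1.5 matches the conventional box-plot whisker length.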
The generated event log has the following attributes:
| Attribute | Description | Type |
|---|---|---|
| Activity | Activity recognized | str |
| Timestamp | Timestamp filtered based on VI likelihood | pandas datetime object |
| Time_uncertainty | All valid recognition timestamps | List[pandas datetime object] |
| CaseID | ID given to the case, structured as xxxx_yyyy. The first 4 digits represent the ID given to the site and the last 4 digits represent the year of the case | str |
| Crop | Cultivated crop | str |
| SiteID | ID given to the farm patch | int |
| WGS84_lon_lat | Center coordinate of the farm patch (WGS84) | list |
| County | County in which the farm patch is located determined by WGS84 coordinate | str |
| State | State/province in which the farm patch is located determined by WGS84 coordinate | str |
| Country | Country in which the farm patch is located determined by WGS84 coordinate | str |
| NDVI_range | Max/min range of valid recognition NDVI | list |
| num_valid_est | Number of valid recognition(s) | int |
| Multiple_crop | Binary indicator of whether multiple crop types were detected on the field. 0: only one crop type was found. 1: more than one crop type was found. | int |
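The attributes above are enough to rebuild simple process views by hand. The sketch below splits a `CaseID` into its site and year parts and counts directly-follows pairs per case, ordered by `Timestamp`. It is an illustration under assumptions, not the logic of `dfg.py`: the function names, the activity names ("Seed", "Harvest"), and the sample rows are all hypothetical, and plain dictionaries stand in for the real event log.

```python
from collections import Counter, defaultdict
from datetime import datetime


def split_case_id(case_id: str) -> tuple[str, int]:
    """Split 'xxxx_yyyy' into (site ID as a string, year as an int)."""
    site, year = case_id.split("_")
    return site, int(year)


def directly_follows(events: list[dict]) -> Counter:
    """Count (activity, next activity) pairs within each case, ordered by Timestamp."""
    by_case = defaultdict(list)
    for event in events:
        by_case[event["CaseID"]].append(event)
    pairs = Counter()
    for case_events in by_case.values():
        ordered = sorted(case_events, key=lambda e: e["Timestamp"])
        for current, following in zip(ordered, ordered[1:]):
            pairs[(current["Activity"], following["Activity"])] += 1
    return pairs


# Hypothetical rows; activity names and dates are made up for illustration.
events = [
    {"CaseID": "0001_2021", "Activity": "Seed", "Timestamp": datetime(2021, 5, 3)},
    {"CaseID": "0001_2021", "Activity": "Harvest", "Timestamp": datetime(2021, 9, 20)},
    {"CaseID": "0001_2022", "Activity": "Harvest", "Timestamp": datetime(2022, 9, 12)},
    {"CaseID": "0001_2022", "Activity": "Seed", "Timestamp": datetime(2022, 4, 28)},
]

print(split_case_id("0001_2022"))  # prints: ('0001', 2022)
print(directly_follows(events))    # prints: Counter({('Seed', 'Harvest'): 2})
```

The site part is kept as a string so that any leading zeros in the four-digit prefix survive. Converting it to `int` would match the `SiteID` type if that is preferred.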