Exploring the use of machine learning to convert a Digital Surface Model (e.g. SRTM) to a Digital Terrain Model

anaprietonem/DSM-to-DTM

 
 


Using machine learning to improve free topography data for flood modelling

This research project, undertaken as part of the requirements for the Master of Disaster Risk & Resilience programme at the University of Canterbury, explored whether machine learning models can make free Digital Surface Models (such as the widely-used SRTM) more suitable for flood modelling. The idea is to strip away the vertical biases caused by vegetation & built-up areas, leaving a "bare earth" Digital Terrain Model.

The image below visualises the performance of one of these models (a fully-convolutional neural network) in one of the three test zones considered (i.e. data unseen during model training & validation, used to assess the model's ability to generalise to new locations). A more detailed description is provided in the associated open-access journal article: Meadows & Wilson 2021.

[Image: graphical abstract]

Python scripts

All Python code fragments used during this research are shared here (covering preparing input data, building & training three different ML models, and visualising the results), in the hope that they'll be useful for others doing related work or extending/improving this approach. Please note this code includes lots of exploratory steps & some dead ends, and is not a refined step-by-step template for applying this approach in a new location.

Scripts are stored in folders relating to the virtual environments within which they were run, along with a text file summarising all packages loaded in each environment:

  • geo: geospatial processing & mapping
  • sklearn: development of Random Forest model
  • tf2: development of neural network models
  • osm: downloading OpenStreetMap data

Brief summary of datasets used

The data processed for use in this project comprised the feature data (free, global datasets relevant to the vertical bias in DSMs, to be used as inputs to the machine learning models), target data (the reference "bare earth" DTM from which the models learn to predict vertical bias), and some supplementary datasets (not essential to the modelling but used to explore/understand the results).
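
To make the relationship between the DSM, the reference DTM and the quantity being predicted more concrete, here is a minimal sketch using toy arrays. It assumes the vertical bias is defined as DSM minus DTM, and it uses a made-up `predicted_bias` array as a stand-in for the output of a trained model. All names and values are illustrative and are not taken from the project scripts.

```python
import numpy as np

# Toy 3x3 elevation grids in metres (illustrative values, not project data)
dsm = np.array([[12.0, 15.5, 11.0],
                [10.5, 18.0, 10.0],
                [10.0, 10.5,  9.5]])  # surface model: includes canopy & buildings
dtm = np.array([[10.0, 10.5, 10.0],
                [10.0, 10.0, 10.0],
                [10.0, 10.5,  9.5]])  # reference "bare earth" model (e.g. from LiDAR)

# Assumed target definition: vertical bias = DSM - DTM
bias = dsm - dtm

# Hypothetical stand-in for a model prediction (here, 90% of the true bias)
predicted_bias = 0.9 * bias

# Corrected terrain estimate: subtract the predicted bias from the DSM
corrected = dsm - predicted_bias


def rmse(a, b):
    """Root mean square error between two equally-shaped grids."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


rmse_before = rmse(dsm, dtm)        # error of the raw DSM against the reference
rmse_after = rmse(corrected, dtm)   # error after applying the predicted correction
print(f"RMSE before: {rmse_before:.2f} m, after: {rmse_after:.2f} m")
```

Framing the target as a bias (rather than the DTM elevation itself) means the model only has to learn the offset introduced by vegetation & buildings, which is then subtracted from the DSM.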

Feature data

A guiding principle for the project was that all feature (input) data should be available for free and with global (or near-global) coverage, so as to maximise applicability in low-income countries/contexts. While these datasets were too big to store here, all can be downloaded for free and relatively easily (some require signing up to the provider platform) based on the notes below.

Digital Surface Models (DSMs)

Multi-spectral imagery

  • Landsat-7: Downloaded from EarthExplorer under Landsat > Landsat Collection 1 Level-2 (On-Demand) > Landsat 7 ETM+ C1 Level-2 (surface reflectance bands) and Landsat > Landsat Collection 1 Level-1 > Landsat 7 ETM+ C1 Level-1 (thermal & panchromatic bands), limited to the Tier 1 collection only and a 6-month period centred around the SRTM data collection period (11-22 Feb 2000)
  • Landsat-8: Downloaded from EarthExplorer under Landsat > Landsat Collection 1 Level-2 (On-Demand) > Landsat 8 OLI/TIRS C1 Level-2 (surface reflectance bands) and Landsat > Landsat Collection 1 Level-1 > Landsat 8 OLI/TIRS C1 Level-1 (thermal & panchromatic bands), limited to the Tier 1 collection only and 6-month periods centred around each of the LiDAR survey dates (in 2016, 2017 & 2018)
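
As a rough illustration of how a 6-month search window centred on a collection period could be derived, the sketch below takes the midpoint of the SRTM acquisition dates and extends it by 91 days either side. The 91-day approximation of "three months" and the helper name `centred_window` are assumptions made here; the exact dates used for the project downloads are not stated above.

```python
from datetime import date, timedelta


def centred_window(start, end, half_width_days=91):
    """Return (window_start, window_end) centred on the midpoint of [start, end].

    half_width_days=91 approximates three months (an assumption).
    """
    midpoint = start + timedelta(days=(end - start).days // 2)
    half = timedelta(days=half_width_days)
    return midpoint - half, midpoint + half


# SRTM data collection period: 11-22 Feb 2000
srtm_window = centred_window(date(2000, 2, 11), date(2000, 2, 22))
print(srtm_window[0].isoformat(), "to", srtm_window[1].isoformat())
```

The same helper could be called once per LiDAR survey period to get the corresponding Landsat-8 search windows.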

Night-time light

Others

Target data

In order to learn how to predict (and then correct) the vertical biases present in DSMs, the models need reference data - "bare earth" DTMs assumed to be the "ground truth" that we're aiming for. For this project, we used three of the high-resolution LiDAR-derived DTMs published online by the New Zealand Government, accessible to all via the Land Information New Zealand (LINZ) Data Service. The specific LiDAR surveys used are summarised below, from the Marlborough & Tasman Districts (in the north of Aotearoa New Zealand's South Island):

  • Marlborough (May-Sep 2018): DTM and corresponding index tiles
  • Tasman - Golden Bay (Nov-Dec 2017): DTM and corresponding index tiles
  • Tasman - Abel Tasman & Golden Bay (Dec 2016): DTM and corresponding index tiles

To find similar target/reference DTM data in other parts of the world, the OpenTopography initiative maintains a catalogue of freely available sources.

Supplementary data

A few other datasets are referred to in the code, not as inputs to the machine learning models but just as references to better understand the results.


Brief summary of approach taken

The broad approach taken is summarised below as succinctly as possible, with further details provided as comments in the relevant scripts.

  1. For each available LiDAR survey zone, process the DSMs and DTM in tandem: clipping each DSM (SRTM, ASTER and AW3D30) to the extent covered by the LiDAR survey, and resampling the DTM to the same resolution & grid alignment as each DSM. Various DSM derivatives (such as slope, aspect & topographical index products) are also prepared here.

  2. Based on a comparison of differences between each DSM and the DTM (resampled to match that particular DSM), the SRTM DSM was selected as the "base" for all further processing (script).

  3. Process all other input datasets - resampling to match the SRTM resolution & grid alignment, masking out clouds for the multi-spectral imagery, applying bounds where appropriate (e.g. for percentage variables):

    • Landsat-7 multi-spectral imagery (script)
    • Landsat-8 multi-spectral imagery (script)
    • ASTER DEM (script)
    • AW3D30 DEM (script)
    • Night-time light (script)
    • Global forest canopy height (script)
    • Global forest cover (script)
    • Global surface water (script)
    • OpenStreetMap layers (script)
  4. Divide all available data into training (90%), validation (5%) and testing (5%) subsets, and prepare for input to the pixel-based approaches (random forest & standard neural network) and patch-based approach (convolutional neural network) (script).

  5. Use sequential floating forward selection (SFFS) (with a random forest estimator) to select relevant features based on the training & validation datasets (script)

  6. Train the random forest model, tuning hyperparameters with reference to the validation data subset (script)

  7. Train the densely-connected neural network model, tuning hyperparameters with reference to the validation data subset (script)

  8. Train the fully-convolutional neural network model, tuning hyperparameters with reference to the validation data subset (script)

  9. Visualise results for the three zones of the testing data subset (unseen during model development) (script)
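
The difference between the pixel-based and patch-based inputs mentioned in step 4 can be sketched as follows. This is a toy example under stated assumptions, not the project's actual preparation code: the grid size, the 4x4 patch size, the non-overlapping tiling and the helper `to_patches` are all hypothetical, and random numbers stand in for the real feature & target rasters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inputs: 4 feature rasters on a shared 8x8 grid, plus one target raster
n_features, height, width = 4, 8, 8
features = rng.normal(size=(n_features, height, width))
target = rng.normal(size=(height, width))

# Pretend one pixel is invalid in the first feature (e.g. masked out as cloud)
features[0, 2, 3] = np.nan

# Pixel-based input: one row per pixel, one column per feature
X_pixels = features.reshape(n_features, -1).T
y_pixels = target.reshape(-1)

# Drop any pixel with a missing feature value
valid = ~np.isnan(X_pixels).any(axis=1)
X_pixels, y_pixels = X_pixels[valid], y_pixels[valid]


def to_patches(stack, size):
    """Cut a (channels, H, W) stack into non-overlapping (size, size, channels) tiles."""
    c, h, w = stack.shape
    rows, cols = h // size, w // size
    trimmed = stack[:, :rows * size, :cols * size]   # discard any ragged edge
    tiles = trimmed.reshape(c, rows, size, cols, size)
    # Reorder to (tile_row, tile_col, y, x, channel), then merge the two tile axes
    return tiles.transpose(1, 3, 2, 4, 0).reshape(rows * cols, size, size, c)


# Patch-based input: square tiles that keep each pixel's spatial neighbourhood
X_patches = to_patches(features, size=4)
y_patches = to_patches(target[np.newaxis], size=4)

print(X_pixels.shape, y_pixels.shape)    # table of individual pixels
print(X_patches.shape, y_patches.shape)  # stack of channels-last image tiles
```

The pixel table suits models that treat each grid cell independently (random forest, densely-connected network), while the channels-last tiles preserve the spatial context that a convolutional network relies on.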
