Exploring human-nature interactions in national parks using social media photographs and computer vision
This repository contains Python scripts for the article Exploring human-nature interactions in national parks using social media photographs and computer vision, published in Conservation Biology. The scripts use off-the-shelf computer vision models to detect objects, extract features, and classify scenes in social media photographs taken in Finnish national parks.
Follow the steps below to reproduce the results from the article or to adapt them to your own data.
- A geopackage containing Flickr image URLs, e.g. exported from a database of data collected from the Flickr API with your own API credentials
- A computer with an NVIDIA GPU or access to one (you can run the scripts on a CPU, but it's considerably slower)
- Python 3: we recommend creating a virtual environment for installing the libraries required to run the scripts
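
To check whether the scripts would run on a GPU or fall back to the CPU, something like the following sketch may help. It assumes PyTorch is the deep learning framework in use, which this README does not state; `pick_device` is a hypothetical helper and not part of the repository.

```python
import importlib.util


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    # Assumption: PyTorch is the framework in use (not confirmed by this README).
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is nothing to query
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Running on: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the GPU driver or the CUDA-enabled build of the framework is probably missing from the virtual environment.
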
Directory | Description |
---|---|
preprocessing | Scripts for preprocessing of Flickr image data |
cv | Computer vision scripts |
processing | Processing scripts for the data |
plots | Scripts for plotting feature extraction results |
stats | Scripts for statistical analyses |
Step | Script | Description | Input | Output |
---|---|---|---|---|
0 | combine_data.py | Combines two existing datasets (not necessary if only one exists) | 2 geopackages | Combined geopackage without duplicates |
1 | image_download.py | Downloads images and removes posts with unavailable images | Geopackage with URLs | Image directory & geopackage without download errors |
2 | photoid_match.py | Verifies that the geopackage matches the downloaded images | Geopackage | Geopackage |
3 | resize_images.py | Resizes and center-crops images for computer vision | Image directory | Resized image directory |
4 | extract_features.py | Extracts high-dimensional semantic feature vectors from images with ResNeXt101 pre-trained on ImageNet | Resized image directory | Pickled dataframe |
5 | join_userdata.py | Joins manually annotated user data to photo id matched geopackage | User data CSV & output geopackage from step 2 | Pickled dataframe |
6 | reduce_dimensions.py | Performs semantic clustering of the data based on feature vectors | Output pickled dataframe from step 5 | Pickled dataframe and a sanity check plot |
7 | predict_places365.py | Classifies each image into a scene category using VGG16-Places365 | Output pickled dataframe from step 6 | Pickled dataframe |
8 | detect_objects.py | Detects objects in images with Mask R-CNN pre-trained on MS COCO | Output pickled dataframe from step 7 | Pickled dataframe |
9 | plot scripts | Plot scripts for the plots used in the article | Output pickled dataframe from step 8 | Image files |
10 | stats scripts | Statistical tests used in the article | Output pickled dataframe from step 8 | CSV results files |
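
As an illustration of step 3, the geometry of a resize followed by a center crop can be sketched as below. This is a minimal sketch, not the code in `resize_images.py`. It assumes the shorter side is scaled to the target size before a square crop, and that the target is 224 pixels (a common input size for ImageNet-trained models, not confirmed here). The name `center_crop_box` is hypothetical.

```python
def center_crop_box(width, height, target=224):
    """Compute the resized size and the crop box for a square center crop.

    The shorter side is scaled to `target` pixels (an assumed default),
    then a `target` x `target` window is taken from the middle.
    """
    scale = target / min(width, height)
    # Guard against rounding below the target on the shorter side.
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    # Box follows the (left, upper, right, lower) convention.
    return (new_w, new_h), (left, top, left + target, top + target)


if __name__ == "__main__":
    size, box = center_crop_box(1024, 768)
    print(size, box)  # (299, 224) (37, 0, 261, 224)
```

With Pillow, the two tuples can be passed to `Image.resize(size)` and `Image.crop(box)` respectively.
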