Srikumar Sastry*, Subash Khanal, Aayush Dhakal, Adeel Ahmad, Nathan Jacobs (*Corresponding Author)
WACV 2025
This repository is the official implementation of TaxaBind. TaxaBind is a suite of multimodal models useful for downstream ecological tasks covering six modalities: ground-level image, geographic location, satellite image, text, audio, and environmental features.
Our framework outperforms state-of-the-art models in both the unimodal (BioCLIP, ArborCLIP) and multimodal (ImageBind) settings.
- We release TaxaBench-8k, a truly multimodal dataset containing six paired modalities for evaluating large ecological models.
- We release iSatNat, containing 2.7M pairs of satellite images and ground-level species images.
- We release iSoundNat, containing 88,130 pairs of audio and ground-level species images.
Our pretrained models are made available through the `rshf` and `transformers` packages for easy inference.
Load and initialize the TaxaBind config:

```python
from transformers import PretrainedConfig
from rshf.taxabind import TaxaBind

config = PretrainedConfig.from_pretrained("MVRL/taxabind-config")
taxabind = TaxaBind(config)
```
📎 Loading ground-level image and text encoders:

```python
# Loads an open_clip-style model
model = taxabind.get_image_text_encoder()
tokenizer = taxabind.get_tokenizer()
processor = taxabind.get_image_processor()
```
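With the image and text encoders loaded, zero-shot species classification follows the usual open_clip pattern: embed an image and a set of candidate taxon prompts, then softmax over the cosine similarities. The sketch below illustrates only the scoring step, with random numpy arrays standing in for the outputs of `model.encode_image` / `model.encode_text` (the 512-dim embedding size and the temperature are illustrative assumptions, not values from this repository):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for embeddings from the image and text encoders
# (hypothetical 512-dim; the real size depends on the pretrained model).
image_emb = rng.normal(size=(1, 512))   # one query image
text_embs = rng.normal(size=(3, 512))   # three candidate taxon prompts

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def zero_shot_scores(image_emb, text_embs, temperature=100.0):
    """Softmax over cosine similarities between image and text embeddings."""
    sims = l2_normalize(image_emb) @ l2_normalize(text_embs).T
    logits = temperature * sims
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

probs = zero_shot_scores(image_emb, text_embs)
print(probs.shape)  # one probability per candidate taxon
```

In practice you would replace the random arrays with encoder outputs for a processed image and tokenized prompts.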
🛰️ Loading satellite image encoder:

```python
sat_encoder = taxabind.get_sat_encoder()
sat_processor = taxabind.get_sat_processor()
```
📍 Loading location encoder:

```python
location_encoder = taxabind.get_location_encoder()
```
🔈 Loading audio encoder:

```python
audio_encoder = taxabind.get_audio_encoder()
audio_processor = taxabind.get_audio_processor()
```
🌦️ Loading environmental encoder:

```python
env_encoder = taxabind.get_env_encoder()
env_processor = taxabind.get_env_processor()
```
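Because all six encoders project into one shared embedding space, cross-modal retrieval (e.g., finding the ground-level images that best match an audio clip) reduces to nearest-neighbor search over L2-normalized embeddings. A minimal sketch with random vectors standing in for encoder outputs (the 512-dim size is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Stand-ins for one query embedding (e.g., audio) and a gallery of
# ground-level image embeddings, all in the shared space.
query = l2_normalize(rng.normal(size=(512,)))
gallery = l2_normalize(rng.normal(size=(100, 512)))

def retrieve(query, gallery, k=5):
    """Indices of the k gallery embeddings most similar to the query."""
    sims = gallery @ query           # cosine similarity for unit vectors
    return np.argsort(-sims)[:k]

top_k = retrieve(query, gallery)
print(top_k)  # indices of the 5 closest gallery items
```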
📑 Citation

```bibtex
@inproceedings{sastry2025taxabind,
    title={TaxaBind: A Unified Embedding Space for Ecological Applications},
    author={Sastry, Srikumar and Khanal, Subash and Dhakal, Aayush and Ahmad, Adeel and Jacobs, Nathan},
    booktitle={Winter Conference on Applications of Computer Vision},
    year={2025},
    organization={IEEE/CVF}
}
```
Check out our lab website for other interesting works on geospatial understanding and mapping: