Geo-MAG: Knowledge Graph-Enhanced Multimodal Retrieval-Augmented Generation Framework for Geological Map Understanding
Geo-MAG is a framework for geological map understanding that integrates knowledge graphs with multimodal large language models. The system processes geological reports to build knowledge graphs, extracts metadata from geological maps using vision-language models, and generates semantic interpretations using multimodal large models.
- Python 3.10+
- PyTorch 2.0+
- CUDA 11.8+ (for GPU acceleration)
Install the required packages using pip:
pip install -r requirements.txtCreate a .env file in the project root directory with the following environment variables:
# Vision-Language Model API (e.g., DashScope)
VL_API_KEY=your_vision_language_model_api_key
VL_API_BASE=https://api.openai.com/v1 # or OpenAI, or your provider's endpoint
# Multimodal Large Language Model API
MLLM_API_KEY=your_multimodal_llm_api_key
MLLM_API_BASE=https://api.openai.com/v1 # or your provider's endpoint
# Vector Database (if applicable)
VECTOR_DB_API_KEY=your_vector_database_api_key
VECTOR_DB_ENDPOINT=your_vector_database_endpoint
# Optional: Logging and Monitoring
LOG_LEVEL=INFO-
Geological Reports (for knowledge graph construction)
- Format: PDF or TXT files
- Location:
data/reports/ - Each report should contain geological descriptions, stratigraphy, lithology, and structural information
-
Geological Maps (for interpretation)
- Format: PNG or JPG images
- Location:
data/maps/ - Images should be high-resolution and include legends
data/
├── reports/ # Geological report files
│ ├── report_001.pdf
│ ├── report_002.pdf
│ └── ...
└── maps/ # Geological map images
├── map_001.png
├── map_002.jpg
└── ...
Process geological reports to construct the knowledge graph:
python scripts/build_kg.py --reports data/reports/ --output kg/geological_kg.jsonOutput: kg/geological_kg.json - Structured knowledge graph with entities and relationships
Use the vision-language model to extract metadata from geological maps:
python scripts/extract_metadata.py --maps data/maps/ --output metadata/map_metadata.jsonOutput: metadata/map_metadata.json - Extracted metadata including legends, strata, and structures
Retrieve relevant subgraphs from the knowledge graph based on map metadata:
python scripts/retrieve_kg.py --metadata metadata/map_metadata.json --kg kg/geological_kg.json --output subgraphs/relevant_subgraph.jsonOutput: subgraphs/relevant_subgraph.json - Relevant knowledge subgraphs
Generate geological interpretation using the multimodal large model:
python scripts/interpret.py --metadata metadata/map_metadata.json --subgraph subgraphs/relevant_subgraph.json --output results/interpretation.txtOutput: results/interpretation.txt - Generated geological interpretation report
Run the complete pipeline:
bash run_pipeline.sh --reports data/reports/ --maps data/maps/ --output results/- Geological map data is not included in this repository due to copyright and sensitivity concerns
- Public geological datasets can be used for testing
- API keys must be obtained from respective service providers
- GPU acceleration is recommended for large-scale processing