- Introduction
- Setup
- Application Structure
- Indexing
- Prompt Tuning
- Data Management
- Configuration
- API Integration
- Troubleshooting
The GraphRAG Indexer Application is a Gradio-based user interface for managing the indexing and prompt tuning processes of the GraphRAG (Graph Retrieval-Augmented Generation) system. This application provides an intuitive way to configure, run, and monitor indexing and prompt tuning tasks, as well as manage related data files.
- Ensure you have Python 3.7+ installed.
- Install required dependencies:
pip install gradio requests pydantic python-dotenv pyyaml pandas lancedb
- Set up environment variables in indexing/.env:
API_BASE_URL=http://localhost:8012
LLM_API_BASE=http://localhost:11434
EMBEDDINGS_API_BASE=http://localhost:11434
ROOT_DIR=indexing
- Run the application:
python index_app.py
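As a rough illustration of how the four variables above might be resolved at startup, here is a minimal standard-library sketch. The helper names (`parse_env_file`, `load_settings`) and the precedence order (defaults, then the .env file, then the process environment) are assumptions made for this example, not a description of how index_app.py actually loads its settings.

```python
import os

# Defaults mirror the values listed for indexing/.env in the setup steps.
DEFAULTS = {
    "API_BASE_URL": "http://localhost:8012",
    "LLM_API_BASE": "http://localhost:11434",
    "EMBEDDINGS_API_BASE": "http://localhost:11434",
    "ROOT_DIR": "indexing",
}


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(env_text="", environ=None):
    """Hypothetical helper: defaults < .env contents < process environment."""
    environ = os.environ if environ is None else environ
    settings = dict(DEFAULTS)
    settings.update(parse_env_file(env_text))
    # Variables already set in the process environment win (assumed order).
    settings.update({key: environ[key] for key in DEFAULTS if key in environ})
    return settings
```

Calling `load_settings(open("indexing/.env").read())` would then return a plain dictionary with all four keys filled in, which makes it easy to print the effective configuration when something is misconfigured.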
The application is divided into three main tabs:
- Indexing
- Prompt Tuning
- Data Management
Each tab provides specific functionality related to its purpose.
The Indexing tab allows users to configure and run the GraphRAG indexing process.
- Select LLM and Embedding models
- Set root directory for indexing
- Configure verbose and cache options
- Advanced options for resuming, reporting, and output formats
- Run indexing and check status
- Select the desired LLM and Embedding models from the dropdowns.
- Set the root directory for indexing.
- Configure additional options as needed.
- Click "Run Indexing" to start the process.
- Use "Check Indexing Status" to monitor progress.
The Prompt Tuning tab enables users to configure and run prompt tuning for GraphRAG.
- Set root directory and domain
- Choose tuning method (random, top, all)
- Configure limit, language, max tokens, and chunk size
- Option to exclude entity types
- Run prompt tuning and check status
- Set the root directory and optional domain.
- Choose the tuning method and configure parameters.
- Click "Run Prompt Tuning" to start the process.
- Use "Check Prompt Tuning Status" to monitor progress.
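The prompt tuning options listed above can be pictured as a small validated record. The sketch below is only illustrative: the class name, field names, and validation rules are hypothetical, and the values in the usage note are placeholders rather than the application's defaults. Only the three tuning methods (random, top, all) come from this document.

```python
from dataclasses import asdict, dataclass

# The three tuning methods named in the Prompt Tuning tab.
ALLOWED_METHODS = ("random", "top", "all")


@dataclass
class PromptTuneRequest:
    """Hypothetical container for the Prompt Tuning tab's inputs."""

    root: str
    method: str
    limit: int
    language: str
    max_tokens: int
    chunk_size: int
    domain: str = ""  # the domain is optional in the UI
    no_entity_types: bool = False  # the "exclude entity types" option

    def __post_init__(self):
        if self.method not in ALLOWED_METHODS:
            raise ValueError(f"method must be one of {ALLOWED_METHODS}")
        for name in ("limit", "max_tokens", "chunk_size"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be a positive integer")

    def to_payload(self):
        """Return the fields as a plain dict, e.g. for a JSON request body."""
        return asdict(self)
```

For example, `PromptTuneRequest(root="indexing", method="random", limit=10, language="English", max_tokens=1000, chunk_size=100)` passes validation, while an unknown method such as `"best"` raises `ValueError` before any request would be sent.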
The Data Management tab provides tools for managing input files and viewing output folders.
- File upload functionality
- File list management (view, refresh, delete)
- Output folder exploration
- File content viewing and editing
- Use the File Upload section to add new input files.
- Manage existing files in the File Management section.
- Explore output folders and their contents in the Output Folders section.
The application uses a combination of environment variables and a config.yaml file for configuration. Key settings include:
- LLM and Embedding models
- API endpoints
- Community level for GraphRAG
- Token limits
- API keys and types
To modify these settings, edit the .env file or create a config.yaml file in the root directory.
The application integrates with a backend API for executing indexing and prompt tuning tasks. Key API endpoints used:
- /v1/index: Start indexing process
- /v1/index_status: Check indexing status
- /v1/prompt_tune: Start prompt tuning process
- /v1/prompt_tune_status: Check prompt tuning status
These endpoints are called using the requests library, with appropriate error handling and logging.
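To make the call pattern concrete, here is a minimal client sketch for the four endpoints. It deliberately uses the standard library's urllib instead of requests so that it runs without extra dependencies. The helper names, the choice of POST when a payload is given and GET otherwise, and the JSON response format are all assumptions for illustration; the actual HTTP methods and payload fields are defined by the backend API.

```python
import json
import urllib.request

# Endpoint paths as listed in this section.
ENDPOINTS = {
    "index": "/v1/index",
    "index_status": "/v1/index_status",
    "prompt_tune": "/v1/prompt_tune",
    "prompt_tune_status": "/v1/prompt_tune_status",
}


def endpoint_url(base_url, name):
    """Join the API base URL with one of the known endpoint paths."""
    return base_url.rstrip("/") + ENDPOINTS[name]


def call_api(base_url, name, payload=None, timeout=10):
    """Hypothetical helper: POST JSON if a payload is given, else GET.

    Returns (True, parsed_json) on success or (False, error_message)
    on failure, so the caller can surface the error in the UI or logs.
    """
    url = endpoint_url(base_url, name)
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data is not None else {}
    try:
        request = urllib.request.Request(url, data=data, headers=headers)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return True, json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError) as exc:
        # OSError covers URLError, HTTPError, and timeouts;
        # ValueError covers malformed URLs and invalid JSON bodies.
        return False, str(exc)
```

With the backend running, `call_api("http://localhost:8012", "index_status")` would return either the parsed status or an error string. Returning a tuple instead of raising keeps a failed request from crashing a UI callback.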
Common issues and solutions:
- Model loading fails: Ensure the LLM_API_BASE is correctly set and the API is accessible.
- Indexing or Prompt Tuning doesn't start: Check API connectivity and verify that all required fields are filled.
- File management issues: Ensure proper read/write permissions in the ROOT_DIR.
For any persistent issues, check the application logs (visible in the console) for detailed error messages.
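When the first two issues above come up, a quick reachability check against each configured base URL can narrow down whether the problem is connectivity or configuration. The following is a small standard-library sketch; the function names are made up for this example, and a successful TCP connection only shows that something is listening on the port, not that the API behind it is healthy.

```python
import socket
from urllib.parse import urlparse


def host_port(url):
    """Extract (host, port) from a URL, falling back to the scheme default."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False
```

For instance, `is_reachable("http://localhost:11434")` checks the address used for LLM_API_BASE in the setup section, and `is_reachable("http://localhost:8012")` checks the backend API.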