Knowledge Agents

Description

A text analysis framework implementing a three-stage pipeline architecture for processing and analyzing temporal data. The system combines multiple AI models (OpenAI, Grok, Venice.AI) to perform advanced sampling, prompt engineering, and inference capabilities to generate insights, with a focus on temporal analysis, event mapping and forecasting. This repository serves as a workbench for Chanscope Knowledge Agents application.

Key Capabilities

Temporal Intelligence:
- Precise datetime handling across timezones
- Time-aware context generation
- Historical pattern analysis
Distributed Processing:
- Multi-provider model orchestration
- Concurrent chunk processing
- Batched operations with progress tracking
Adaptive Analysis:
- Dynamic provider selection
- Automatic fallback mechanisms
- Environment-aware execution (notebook/terminal)
Performance Monitoring (Optional):
- Literal AI integration for comprehensive monitoring
- Thread-level execution tracking
- Provider performance metrics
- Error pattern analysis

The framework is designed for robust handling of large-scale text analysis tasks, with built-in support for data validation, error recovery, and detailed operational logging. It provides a flexible foundation for building knowledge processing applications with temporal awareness.

Supported Models

The project supports multiple AI model providers:

OpenAI: Default provider for both completions and embeddings
- Requires: OPENAI_API_KEY, OPENAI_MODEL, OPENAI_EMBEDDING_MODEL
Grok (X.AI): Alternative provider with its own embedding model
- Optional: GROK_API_KEY, GROK_MODEL, GROK_EMBEDDING_MODEL
Venice.AI: Additional model provider for completions
- Optional: VENICE_API_KEY, VENICE_MODEL

Configure your preferred provider in config.ini. The system features automatic fallback to OpenAI if the primary provider fails, ensuring robust operation.

Technical Features

Model Operations

Pipeline Architecture:
- Embedding Generation (OpenAI/Grok)
- Chunk Analysis (OpenAI/Grok/Venice)
- Summary Generation with temporal context
Provider Integration:
- Dynamic model selection and fallback
- Standardized cross-provider responses
- Concurrent batch processing

Performance Monitoring

Literal AI Integration:
- Thread-level execution tracking
- Step-by-step performance metrics
- Provider usage patterns
- Error rate monitoring
Monitoring Features:
- Automatic OpenAI instrumentation
- Custom step tracking
- Error pattern analysis
- Resource utilization metrics

Data Processing

Time-Aware Analysis:
- Historical pattern recognition
- Temporal context preservation
Content Management:
- Semantic chunking with quality thresholds
- Duplicate detection and filtering
- Multi-format data handling (CSV/Parquet/Excel)

Runtime Features

Adaptive Execution:
- Environment-aware (Notebook/Terminal)
- Async processing with progress tracking
- Configurable worker pools
Error Recovery:
- Exponential backoff retries
- Provider fallback chains
- Comprehensive logging system

Analysis Capabilities

Signal Processing:
- Semantic search and retrieval
- Pattern detection and analysis
- Multi-source data integration
Contextual Analysis:
- Thread activity monitoring
- Impact assessment metrics
- Narrative evolution tracking

Running in Jupyter Notebook (knowledge_workbench)

Clone the repository:

git clone https://github.com/your-username/knowledge-agents.git
cd knowledge-agents

Install dependencies:
```
pip install -r requirements.txt
```
Configure your model providers:
- Copy config_template.ini to config.ini
- Add your API keys and model preferences

Set up monitoring (optional):

Get a Literal AI API key

Set the environment variable:

os.environ["LITERAL_API_KEY"] = "your-literal-api-key"

Or pass it directly to the run function:

chunks, summary = await run_knowledge_agents(
    query=query,
    process_new=True,
    providers=providers,
    monitor_api_key="your-literal-api-key"
)

Launch Jupyter Notebook:
```
jupyter notebook
```
Navigate to and open knowledge_workbench.ipynb

Running from Terminal

Clone the repository:

git clone https://github.com/your-username/knowledge-agents.git
cd knowledge-agents

Install dependencies:
```
pip install -r requirements.txt
```
Configure your model providers:
- Update config_template.ini
- Add your API keys and model preferences

Set up monitoring (optional):

export LITERAL_API_KEY="your-literal-api-key"

Run the main script:
```
python model_ops.py
```

Monitoring Integration

The framework includes comprehensive performance monitoring through Literal AI integration:

Features

Thread-level execution tracking
Step-by-step performance metrics
Provider usage patterns
Error rate monitoring
Resource utilization metrics

Usage

Enable monitoring by setting the Literal AI API key:

os.environ["LITERAL_API_KEY"] = "your-literal-api-key"

Run with monitoring enabled:

chunks, summary = await run_knowledge_agents(
    query=query,
    process_new=True,
    providers=providers,
    monitor_api_key=os.getenv("LITERAL_API_KEY")
)

Access monitoring data through the Literal AI dashboard:
- View thread execution timelines
- Analyze provider performance
- Track error patterns
- Monitor resource usage

Monitored Operations

Embedding generation
Content retrieval
Chunk analysis
Summary generation
Error handling and recovery

Data Gathering

For data collection functionality, you can utilize the data gathering tools from the chanscope-lambda repository. If you prefer not to set up a Lambda function, you can use the gather.py script directly from that repository for data collection purposes.

Using gather.py

Clone the chanscope-lambda repository
Navigate to the gather.py script
Follow the script's documentation for standalone data gathering functionality

Prompt System

The prompt.yaml file is a crucial component that defines the system's interaction patterns and analytical capabilities. It contains two main sections:

System Prompts

Objective Analysis
- Handles complex forecasting tasks combining numerical and textual data
- Performs structured analysis including numerical validation, contextual integration, and pattern recognition
- Generates multimodal forecasts with confidence metrics and contextual validation
Generate Chunks
- Specializes in processing and analyzing text segments
- Performs temporal analysis, information extraction, and context generation
- Maintains structured output format for consistency

User Prompts

Summary Generation
- Templates for comprehensive summaries with forecasting capabilities
- Integrates numerical data with contextual information
- Includes historical analysis, forecast generation, and risk assessment
Text Chunk Summary
- Templates for analyzing discrete text segments
- Extracts time series data and key information
- Generates domain context, background knowledge, and assumptions

Each prompt type is designed to maintain temporal awareness, preserve numerical precision, and provide comprehensive contextual analysis. The system uses these prompts to ensure consistent, high-quality output across different analytical tasks.

References

Data Gathering Lambda: chanscope-lambda
Prompt Engineering Research: Temporal-Aware Language Models for Temporal Knowledge Graph Question Answering; used for designing temporal-aware prompts and multimodal forecasting capabilities

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
data_processing		data_processing
.gitignore		.gitignore
README.md		README.md
config_template.ini		config_template.ini
data_ops.py		data_ops.py
embedding_ops.py		embedding_ops.py
inference_ops.py		inference_ops.py
knowledge_workbench.ipynb		knowledge_workbench.ipynb
model_ops.py		model_ops.py
monitoring.py		monitoring.py
prompt.yaml		prompt.yaml
requirements.txt		requirements.txt
run.py		run.py
stratified_ops.py		stratified_ops.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Knowledge Agents

Description

Key Capabilities

Supported Models

Technical Features

Model Operations

Performance Monitoring

Data Processing

Runtime Features

Analysis Capabilities

Running in Jupyter Notebook (knowledge_workbench)

Running from Terminal

Monitoring Integration

Features

Usage

Monitored Operations

Data Gathering

Using gather.py

Prompt System

System Prompts

User Prompts

References

About

Uh oh!

Releases

Packages

Languages

joelwk/knowledge-agents

Folders and files

Latest commit

History

Repository files navigation

Knowledge Agents

Description

Key Capabilities

Supported Models

Technical Features

Model Operations

Performance Monitoring

Data Processing

Runtime Features

Analysis Capabilities

Running in Jupyter Notebook (knowledge_workbench)

Running from Terminal

Monitoring Integration

Features

Usage

Monitored Operations

Data Gathering

Using gather.py

Prompt System

System Prompts

User Prompts

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages