This repository contains two complementary tools for software metadata management:
SAI: A lightweight CLI tool for executing software management actions using provider-based configurations.
Key Features:
- Provider-based Actions: Execute install, configure, start, stop, and other actions
- Multi-platform Support: Works across Linux, macOS, and Windows
- Extensible Providers: Support for package managers, containers, and custom providers
- Configuration Management: Flexible YAML/JSON configuration system
- Dry-run Mode: Preview actions before execution
SAIGEN: An AI-enhanced tool for generating, validating, and managing software metadata in YAML format with universal repository support.
Key Features:
- Universal Repository Support: 50+ package managers including apt, dnf, brew, winget, npm, pypi, cargo, and more
- YAML-Driven Configuration: Add new repositories without code changes using simple YAML configs
- AI-Enhanced Generation: Uses LLMs (OpenAI, Anthropic, Ollama) for intelligent metadata creation
- Advanced Repository Management: Concurrent operations, intelligent caching, and real-time statistics
- Schema Validation: Validates generated files against the official saidata schema
- RAG Support: Retrieval-Augmented Generation for improved accuracy
- Batch Processing: Generate metadata for multiple software packages efficiently
- Comprehensive CLI: Full-featured repository management and package search capabilities
pip install sai # Installs both sai and saigen
It's recommended to use a virtual environment for development:
git clone https://github.com/example42/sai.git
cd sai
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Upgrade pip to latest version (required for pyproject.toml editable installs)
pip install --upgrade pip
# Install in development mode with all dependencies
pip install -e ".[dev,llm,rag]"
To deactivate the virtual environment when done:
deactivate
SAIGEN now supports 50+ package managers across all major platforms:
- Debian/Ubuntu: apt
- Red Hat/Fedora: dnf, yum
- SUSE: zypper
- Arch Linux: pacman
- Alpine: apk
- Gentoo: emerge, portage
- Void Linux: xbps
- Universal: flatpak, snap
- Homebrew: brew (formulae and casks)
- MacPorts: macports
- Nix: nix, nixpkgs
- Microsoft: winget
- Community: chocolatey, scoop
- JavaScript: npm, yarn, pnpm
- Python: pypi, conda
- Rust: cargo (crates.io)
- Ruby: gem (rubygems)
- Go: go modules
- PHP: composer (packagist)
- Java: maven, gradle
- C#/.NET: nuget
- Containers: docker hub
- Kubernetes: helm charts
- Scientific: spack, conda-forge
- Execute software actions using providers:
# Install nginx using available providers
sai install nginx
# Start a service
sai start nginx
# Dry run to preview actions
sai install nginx --dry-run
- Execute multiple actions from a file:
# Apply actions from a YAML file
sai apply actions.yaml
# Apply with parallel execution
sai apply actions.yaml --parallel
# Apply and continue on errors
sai apply actions.yaml --continue-on-error
- View available providers and statistics:
sai providers list
sai actions nginx
sai stats --detailed
sai config-show
- Explore Available Repositories (50+ supported):
# List all repositories
saigen repositories list-repos
# Filter by platform
saigen repositories list-repos --platform linux
# Filter by type
saigen repositories list-repos --type npm
- Search Packages Across All Repositories:
# Search across all 50+ repositories
saigen repositories search "redis"
# Search with platform filter
saigen repositories search "nginx" --platform linux --limit 10
# Get detailed package information
saigen repositories info "docker" --platform linux
- Repository Statistics and Management:
# Show comprehensive statistics
saigen repositories stats
# Update repository caches (concurrent operations)
saigen repositories update-cache
# JSON output for automation
saigen repositories stats --format json
- Configure LLM Provider:
export OPENAI_API_KEY="your-api-key"
# or
export ANTHROPIC_API_KEY="your-api-key"
- Generate Saidata with Repository Data:
# Generate using repository data + AI
saigen generate nginx
# Generate with specific providers
saigen generate nginx --llm-provider openai --providers apt brew --output nginx.yaml
# View current configuration
saigen config --show
- Validate Generated Files:
saigen validate nginx.yaml
saigen validate --show-context --format json nginx.yaml
- sai install <software> - Install software using available providers
- sai uninstall <software> - Uninstall software using available providers
- sai start <software> - Start software/service
- sai stop <software> - Stop software/service
- sai restart <software> - Restart software/service
- sai status <software> - Show software service status
- sai info <software> - Show software information
- sai search <term> - Search for available software
- sai list - List installed software managed through sai
- sai logs <software> - Show software service logs
- sai version <software> - Show software version information
- sai apply <action_file> - Apply multiple actions from a YAML/JSON file
- sai providers list - List available providers
- sai providers detect - Detect and refresh provider availability
- sai providers info <provider> - Show detailed provider information
- sai providers clear-cache - Clear provider detection cache
- sai providers cache-status - Show provider cache status
- sai providers refresh-cache - Refresh provider detection cache
- sai config show - Display current SAI configuration
- sai config set <key> <value> - Set configuration value
- sai config reset [key] - Reset configuration to defaults
- sai config validate - Validate configuration file
- sai config paths - Show configuration file search paths
- sai history list - Show execution history
- sai history metrics - Show execution metrics and statistics
- sai history clear - Clear execution history
- sai completion install - Install shell completion
- sai completion uninstall - Uninstall shell completion
- sai scan <software> - Scan for packages/vulnerabilities (syft, grype)
- sai generate <software> - Generate SBOM or reports (syft)
- sai debug <software> - Start debugging session (gdb)
- sai attach <software> - Attach debugger to running process (gdb)
- sai report <software> - Generate vulnerability/security reports (grype)
- sai export <software> - Export data in multiple formats (syft, grype)
- sai update <software> - Update databases/signatures (grype)
- sai convert <software> - Convert between formats (syft)
- sai validate <software> - Validate generated files (syft)
- sai filter <software> - Apply filters to scan results (grype)
- sai check <software> - Check for specific issues/CVEs (grype)
- sai enable <software> - Enable service auto-start
- sai disable <software> - Disable service auto-start
- sai stats - Show comprehensive statistics about providers and actions
- sai validate <saidata-file> - Validate a saidata file against the schema
- sai --version - Show sai version information
- saigen generate <software> - Generate saidata for software with AI assistance
- saigen validate <file> - Validate saidata file against schema with detailed reporting
- saigen quality <file> - Assess quality metrics for saidata files with comprehensive scoring
- saigen update <file> - Update existing saidata with new information using intelligent merging
- saigen batch - Generate saidata for multiple software packages in parallel
- saigen config show - Display current configuration including LLM providers
- saigen config set <key> <value> - Set configuration values with dot notation
- saigen config validate - Validate configuration file syntax and settings
- saigen config init - Initialize new configuration file with defaults
- saigen repositories list-repos - List all 50+ supported repositories with filtering
- saigen repositories search <query> - Search packages across all repositories
- saigen repositories info <package> - Get detailed package information
- saigen repositories stats - Show comprehensive repository statistics and health
- saigen repositories update-cache - Update repository caches with concurrent operations
- --platform <linux|macos|windows|universal> - Filter by platform
- --type <apt|brew|npm|pypi|cargo|...> - Filter by repository type
- --limit <number> - Limit search results
- --format <table|json|yaml> - Choose output format
- saigen --help - Show all available commands and options
- saigen --version - Show version information
- --llm-provider - Choose LLM provider (openai, anthropic, ollama)
- --providers - Target package providers (50+ supported)
- --output - Output file path
- --dry-run - Preview generation without making API calls
- --verbose - Enable detailed logging
- --merge-strategy - Choose merge strategy (preserve, enhance, replace)
- --backup/--no-backup - Control backup creation (default: enabled)
- --interactive - Enable interactive conflict resolution
- --force-update - Regenerate completely ignoring existing content
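The option list names three merge strategies (preserve, enhance, replace) without defining them. The sketch below shows one plausible reading, not SAIGEN's documented behavior: `preserve` keeps existing values and only adds missing keys, `enhance` lets newly generated values override existing ones while keeping keys that only exist in the old file, and `replace` discards the old content. The `merge` function and the sample field names are hypothetical.

```python
from typing import Any, Dict


def merge(existing: Dict[str, Any], generated: Dict[str, Any], strategy: str) -> Dict[str, Any]:
    """Merge two saidata-like dicts. The strategy semantics are assumptions."""
    if strategy == "replace":
        # Assumed: the newly generated data wins wholesale.
        return dict(generated)
    if strategy not in ("preserve", "enhance"):
        raise ValueError(f"unknown merge strategy: {strategy}")

    merged = dict(existing)
    for key, new_value in generated.items():
        old_value = merged.get(key)
        if isinstance(old_value, dict) and isinstance(new_value, dict):
            # Recurse so nested sections are merged key by key.
            merged[key] = merge(old_value, new_value, strategy)
        elif key not in merged or strategy == "enhance":
            # "preserve" only fills gaps; "enhance" also overrides existing values.
            merged[key] = new_value
    return merged


# Hypothetical sample data, not taken from the saidata schema.
existing = {"version": "1.0", "metadata": {"name": "nginx"}}
generated = {"version": "1.1", "metadata": {"description": "web server"}}
```

With these assumptions, `preserve` keeps `version: "1.0"` while `enhance` moves it to `"1.1"`; both end up with the `name` and `description` keys under `metadata`.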
- --input-file - Input file containing software names
- --software-list - Software names from command line
- --max-concurrent - Maximum concurrent generations (default: 3)
- --category-filter - Filter by category using regex patterns
- --preview - Preview what would be processed without generating
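A limit such as --max-concurrent (default: 3) is commonly implemented with a semaphore that caps how many jobs run at once. The sketch below illustrates that general technique with `asyncio`; it is not SAIGEN's actual implementation, and `generate_one` is a hypothetical stand-in for a real generation call.

```python
import asyncio
from typing import List


async def generate_one(name: str) -> str:
    """Hypothetical placeholder for generating one saidata file."""
    await asyncio.sleep(0.01)  # simulate an LLM or repository request
    return f"{name}.yaml"


async def generate_batch(names: List[str], max_concurrent: int = 3) -> List[str]:
    """Run generate_one for every name, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(name: str) -> str:
        # Each job waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await generate_one(name)

    # gather() returns results in the same order as the input names.
    return list(await asyncio.gather(*(bounded(n) for n in names)))
```

Calling `asyncio.run(generate_batch(["nginx", "redis", "docker", "git"]))` processes four names with no more than three in flight at any moment.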
- --format - Output format (text, json, yaml)
- --show-context - Include detailed error context
- --strict - Enable strict validation mode
- --advanced - Enable advanced validation with quality metrics
- --no-repository-check - Skip repository accuracy checking
- --detailed - Show detailed quality metrics and suggestions
For common issues and solutions, see the Troubleshooting Guide.
SAI looks for configuration files in:
- ~/.sai/config.yaml or ~/.sai/config.json
- .sai.yaml or .sai.json (in current directory)
- sai.yaml or sai.json (in current directory)
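A lookup over these locations can be sketched as a first-match search. The file names come from the list above; the precedence (listed order, YAML before JSON) and the `find_config` helper are assumptions for illustration, not SAI's confirmed behavior.

```python
from pathlib import Path
from typing import List, Optional


def candidate_paths(home: Path, cwd: Path) -> List[Path]:
    """Config locations from the README, in an assumed order of precedence."""
    return [
        home / ".sai" / "config.yaml",
        home / ".sai" / "config.json",
        cwd / ".sai.yaml",
        cwd / ".sai.json",
        cwd / "sai.yaml",
        cwd / "sai.json",
    ]


def find_config(home: Path, cwd: Path) -> Optional[Path]:
    """Return the first existing config file, or None if there is none."""
    for path in candidate_paths(home, cwd):
        if path.is_file():
            return path
    return None
```

For a real lookup, call `find_config(Path.home(), Path.cwd())`. Passing both directories as arguments keeps the function easy to test against temporary directories.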
Example SAI configuration:
config_version: "0.1.0"
log_level: info
# Provider search paths
saidata_paths:
  - "."
  - "~/.sai/saidata"
  - "/usr/local/share/sai/saidata"
provider_paths:
  - "providers"
  - "~/.sai/providers"
  - "/usr/local/share/sai/providers"
# Provider priorities (lower number = higher priority)
provider_priorities:
  apt: 1
  brew: 2
  winget: 3
# Execution settings
max_concurrent_actions: 3
action_timeout: 300
require_confirmation: true
dry_run_default: false
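The provider_priorities setting states that a lower number means a higher priority. A minimal sketch of how such a rule could select a provider follows. The `pick_provider` helper, the default priority for unlisted providers, and the alphabetical tie-break are assumptions, not SAI's actual selection logic.

```python
from typing import Dict, Iterable, Optional


def pick_provider(
    priorities: Dict[str, int],
    available: Iterable[str],
    default_priority: int = 100,  # assumed value for providers with no entry
) -> Optional[str]:
    """Pick the available provider with the lowest priority number."""
    ranked = sorted(
        available,
        # Sort by priority first, then by name so ties are deterministic.
        key=lambda name: (priorities.get(name, default_priority), name),
    )
    return ranked[0] if ranked else None


priorities = {"apt": 1, "brew": 2, "winget": 3}
print(pick_provider(priorities, ["winget", "brew"]))  # prints "brew"
```

When apt is not available on the host, brew (priority 2) wins over winget (priority 3).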
SAIGEN looks for configuration files in:
- ~/.saigen/config.yaml or ~/.saigen/config.json
- .saigen.yaml or .saigen.json (in current directory)
- saigen.yaml or saigen.json (in current directory)
Configuration can also be set via environment variables:
- OPENAI_API_KEY - OpenAI API key
- ANTHROPIC_API_KEY - Anthropic API key
- SAIGEN_LOG_LEVEL - Logging level (debug, info, warning, error)
- SAIGEN_CACHE_DIR - Cache directory path
- SAIGEN_OUTPUT_DIR - Output directory path
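How an environment variable interacts with a value from the configuration file can be sketched as below. The variable name and the allowed levels come from the list above. The precedence (environment over file), the "info" fallback, the case-insensitive matching, and the `resolve_log_level` helper are assumptions.

```python
import os
from typing import Mapping, Optional

VALID_LEVELS = ("debug", "info", "warning", "error")


def resolve_log_level(
    file_value: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Resolve the log level, letting SAIGEN_LOG_LEVEL override the file (assumed)."""
    raw = env.get("SAIGEN_LOG_LEVEL") or file_value or "info"
    level = raw.lower()
    if level not in VALID_LEVELS:
        raise ValueError(f"unsupported log level: {raw!r}")
    return level
```

Passing the environment as a mapping makes the override easy to exercise without touching the real process environment.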
Example SAIGEN configuration:
config_version: "0.1.0"
log_level: info
llm_providers:
  openai:
    provider: openai
    model: gpt-3.5-turbo
    max_tokens: 4000
    temperature: 0.1
    timeout: 30
    max_retries: 3
    enabled: true
  anthropic:
    provider: anthropic
    model: claude-3-sonnet-20240229
    enabled: false
repositories:
  apt:
    type: apt
    enabled: true
    cache_ttl: 3600
    priority: 1
cache:
  directory: ~/.saigen/cache
  max_size_mb: 1000
  default_ttl: 3600
rag:
  enabled: true
  index_directory: ~/.saigen/rag_index
  embedding_model: sentence-transformers/all-MiniLM-L6-v2
  max_context_items: 5
generation:
  default_providers: [apt, brew, winget]
  output_directory: ./saidata
  parallel_requests: 3
  request_timeout: 120
validation:
  strict_mode: true
  auto_fix_common_issues: true
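The configuration above sets a per-repository cache_ttl and a cache-wide default_ttl. One way such settings could be applied is sketched below. It assumes the values are in seconds and that a repository's cache_ttl overrides default_ttl; the README states neither. The `effective_ttl` and `is_fresh` helpers are hypothetical.

```python
import time
from typing import Any, Mapping, Optional


def effective_ttl(repo_cfg: Mapping[str, Any], cache_cfg: Mapping[str, Any]) -> int:
    """Use the repository's cache_ttl if set, else the cache default (assumed rule)."""
    return int(repo_cfg.get("cache_ttl", cache_cfg.get("default_ttl", 3600)))


def is_fresh(fetched_at: float, ttl_seconds: int, now: Optional[float] = None) -> bool:
    """True while a cache entry fetched at `fetched_at` is younger than its TTL."""
    current = time.time() if now is None else now
    return (current - fetched_at) < ttl_seconds


# Values mirroring the example configuration.
apt_cfg = {"type": "apt", "enabled": True, "cache_ttl": 3600, "priority": 1}
cache_cfg = {"default_ttl": 3600}
ttl = effective_ttl(apt_cfg, cache_cfg)
```

An entry fetched 1000 seconds ago is still fresh under a 3600-second TTL, while one fetched 3600 or more seconds ago would be refreshed.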
├── sai/ # SAI CLI Tool
│ ├── cli/ # CLI interface and commands
│ ├── models/ # Data models (config, provider data)
│ └── utils/ # Utilities (config management)
├── saigen/ # SAIGEN AI Generation Tool
│ ├── cli/ # CLI interface and commands
│ ├── core/ # Core generation engine
│ ├── llm/ # LLM provider integrations
│ ├── models/ # Data models (saidata, generation)
│ ├── repositories/ # Package repository integrations
│ └── utils/ # Utilities and helpers
├── saidata/ # Generated saidata files
├── providers/ # Provider data files
├── schemas/ # JSON schema definitions
├── docs/ # Documentation
├── tests/ # Test suite
└── examples/ # Usage examples
- SAI consumes saidata files and provider configurations to execute software management actions
- SAIGEN generates saidata files using AI and repository data that SAI can then use
- Both tools share common schemas and data formats but operate independently
- SAI focuses on execution and action management
- SAIGEN focuses on metadata generation and validation
SAI includes specialized providers for security, debugging, and analysis tools:
- Port scanning: sai search <target> - Scan ports on target hosts
- Service discovery: sai info <target> - Detect services and versions
- Script scanning: sai logs <target> - Run NSE scripts
- Performance tuning: sai list <target> - Optimized timing scans
- Version detection: sai version <target> - Detect service versions on target
- Package scanning: sai scan <path> - Scan directories/images for packages
- SBOM generation: sai generate <image> - Generate SBOM from container images
- Format conversion: sai convert <sbom> - Convert between SBOM formats
- Multi-format export: sai export <target> - Export to SPDX, CycloneDX, table formats
- SBOM validation: sai validate <sbom> - Validate SBOM format compliance
- SBOM comparison: sai diff <baseline> <current> - Compare two SBOMs
- Security scanning: sai scan <target> - Scan for vulnerabilities
- Report generation: sai report <target> - Generate vulnerability reports
- Severity filtering: sai filter <target> - Filter by vulnerability severity
- Multi-format export: sai export <target> - Export to JSON, SARIF, table formats
- Database updates: sai update - Update vulnerability database
- CVE checking: sai check <target> - Check for specific CVEs
- Interactive debugging: sai debug <binary> - Start debugging session
- Process attachment: sai attach <process> - Attach to running process
- Core dump analysis: sai core_dump <binary> - Analyze core dumps
- Stack traces: sai backtrace <process> - Get stack trace from running process
- Breakpoint debugging: sai breakpoint <binary> - Set breakpoints and debug
- Variable watching: sai watch <binary> - Watch variable changes
- Memory inspection: sai inspect <process> - Inspect memory and variables
- Execution profiling: sai profile <binary> - Profile application execution
- Automated software deployment and configuration
- Cross-platform software management
- Infrastructure as Code implementations
- CI/CD pipeline integrations
- System administration automation
- Generating metadata for new software packages
- Updating existing saidata with latest information
- Bulk metadata generation for software catalogs
- AI-assisted software documentation
- Repository data analysis and enrichment
MIT License - see LICENSE file for details.