
Releases: keboola/osiris

v0.5.7: Filesystem Connections & Enhanced Validation

09 Nov 09:24
a57ce1a


Filesystem Connections & Enhanced Validation

This release adds filesystem connection support to osiris_connections.yaml, so CSV files can be discovered through a connection reference in the same way as database components. In addition, OSIRIS_HOME is now honored across all path resolution (config, connections, .env), and the validate command checks every connection family and the filesystem contract sections.

Key Features

1. Filesystem Connections in osiris_connections.yaml

  • New filesystem connection family in osiris_connections.yaml
  • Connection references using @filesystem.alias pattern (identical to database connections)
  • Environment variable substitution with ${OSIRIS_HOME} for portable paths

2. Connection-based Discovery for Filesystem

  • Command: osiris discovery run @filesystem.local (uses connection base_dir)
  • Shows real CSV column names and data types (integer, float, string, datetime, boolean)
  • No more placeholder names like "column_0, column_1"
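As a rough illustration of how column names and the five listed types could be derived from a CSV sample, here is a minimal sketch using only the standard library. The helper names `infer_type` and `discover_columns` and the inference order are assumptions for illustration, not the Osiris implementation.

```python
import csv
import io
from datetime import datetime


def infer_type(values: list[str]) -> str:
    """Return the narrowest type name that fits every non-empty value."""
    values = [v for v in values if v != ""]
    if not values:
        return "string"

    def all_parse(parse) -> bool:
        try:
            for v in values:
                parse(v)
            return True
        except ValueError:
            return False

    if all(v.lower() in ("true", "false") for v in values):
        return "boolean"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "float"
    if all_parse(datetime.fromisoformat):
        return "datetime"
    return "string"


def discover_columns(csv_text: str) -> dict[str, str]:
    """Map each header name to an inferred type (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    names = reader.fieldnames or []
    # Short rows yield None for missing cells; treat them as empty strings.
    return {n: infer_type([row[n] or "" for row in rows]) for n in names}


sample = (
    "id,price,name,created,active\n"
    "1,9.5,Ann,2025-11-09,true\n"
    "2,12,Bob,2025-11-10T08:00:00,false\n"
)
print(discover_columns(sample))
```

The real header row supplies the column names, which is what replaces placeholders such as "column_0".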

3. OSIRIS_HOME Support Everywhere

  • osiris validate displays current OSIRIS_HOME setting
  • Config file loading checks $OSIRIS_HOME/osiris.yaml first
  • .env loading prioritizes $OSIRIS_HOME/.env over cwd
  • Works from any directory when OSIRIS_HOME is set

4. Comprehensive Validation

  • Validates ALL connection families (mysql, supabase, posthog, filesystem)
  • Checks filesystem contract sections
  • Fixed false "Missing section" warnings

Installation

pip install osiris-pipeline==0.5.7

Migration Notes

New Recommended Pattern:

# osiris_connections.yaml
filesystem:
  local:
    default: true
    base_dir: "${OSIRIS_HOME}/data"

Discovery Command:

# New (recommended)
osiris discovery run @filesystem.local

# Old (deprecated, still works)
osiris components discover filesystem.csv_extractor --config path/to/config.yaml

Testing

  • 77 filesystem-related tests passing
  • All backward compatible
  • Zero breaking changes

See CHANGELOG.md for complete details.

v0.5.6: PostHog Integration + 14 Critical Fixes

09 Nov 07:50
1a52973


PostHog Analytics Integration & Critical Fixes

This release adds a production-ready PostHog analytics component with support for 4 data types and includes 14 critical security, performance, and data integrity fixes.

🎯 PostHog Integration

  • 4 Data Types: events, persons, sessions, person_distinct_ids
  • Production Ready: 57/57 Osiris checklist rules passing, 44/44 tests passing
  • E2B Compatible: Works in 2GB cloud sandbox
  • Features: SEEK pagination, streaming, UUID dedup, multi-region support

🔒 Security Fixes (3)

  • HogQL Injection Prevention: Proper quote escaping instead of restrictive whitelist
  • Code Execution Fix: Removed driver imports from components list
  • Exception Handling: Fixed UnboundLocalError in posthog_client

⚡ Performance & Memory (3)

  • True Streaming: O(3N) → O(2N) memory (33% reduction, 3x scalability)
  • Dedup Cache: Set → deque for deterministic FIFO eviction
  • Timezone Handling: UTC conversion prevents silent time window shifts
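
The deque-based dedup cache can be pictured with the sketch below. It is an illustration only: the class name `DedupCache` is hypothetical, and pairing the deque with a set for O(1) membership checks is an assumption, not something the release notes confirm. A deque evicts strictly oldest-first, whereas removing an arbitrary element from a plain set is not order-stable.

```python
from collections import deque


class DedupCache:
    """Remembers the last `maxsize` IDs and evicts the oldest first (FIFO)."""

    def __init__(self, maxsize: int) -> None:
        self._order: deque[str] = deque()  # insertion order, oldest on the left
        self._seen: set[str] = set()       # fast membership test
        self._maxsize = maxsize

    def add(self, item_id: str) -> bool:
        """Return True if item_id is new, False if it is a duplicate."""
        if item_id in self._seen:
            return False
        if len(self._order) >= self._maxsize:
            self._seen.discard(self._order.popleft())
        self._order.append(item_id)
        self._seen.add(item_id)
        return True


cache = DedupCache(maxsize=2)
print([cache.add(x) for x in ["a", "b", "a", "c", "a"]])
# prints [True, True, False, True, True]: "a" was evicted when "c" arrived
```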

📊 Data Integrity (3)

  • State Migration: Extended to all data types (persons, sessions)
  • ORDER BY Fix: Deterministic sorting for person_distinct_ids
  • Persons Pagination: Fixed HogQL list-to-dict conversion

📚 Documentation (3)

  • State Schema: Fixed all 3 PostHog docs (CONFIG.md, README.md, example)
  • Docs Cleanup: Removed Phase 2 features (cohorts/feature_flags)
  • Schema Addition: Added deduplication_enabled to spec.yaml

✅ Test Infrastructure (3)

  • Coverage: +27 tests (17 → 44 tests, 40% → 85% coverage)
  • CLI Tests: Fixed path resolution issues
  • MCP Tests: Updated to new --base-path architecture

📦 Installation

pip install --upgrade osiris-pipeline==0.5.6


πŸ™ Credits

Thanks to all contributors and the Codex review system for identifying critical issues!

v0.5.4 - CLI version display hotfix

06 Nov 23:31
a51a2aa


Hotfix - CLI Version Display

This hotfix corrects the CLI version display which was hardcoded instead of reading from the package version.

Fixed

Hardcoded Version in CLI

Problem: The osiris --version command was displaying hardcoded "v0.5.0" regardless of the installed package version.

Root Cause: Lines 183-185 in osiris/cli/main.py had hardcoded version strings that were never updated during version bumps.

Fix Applied: Updated CLI to dynamically read version from package:

# File: osiris/cli/main.py (lines 182-187)
from osiris import __version__
if json_output:
    print(json.dumps({"version": f"v{__version__}"}))
else:
    console.print(f"Osiris v{__version__}")

Impact:

  • ✅ osiris --version now correctly displays the actual installed version
  • ✅ Both text and JSON output (--json --version) work correctly
  • ✅ Users can now reliably verify their installed Osiris version

Before vs After

Before (v0.5.3 installed):

$ pip show osiris-pipeline | grep Version
Version: 0.5.3

$ osiris --version
Osiris v0.5.0  # ❌ Wrong!

After (v0.5.4 installed):

$ pip show osiris-pipeline | grep Version
Version: 0.5.4

$ osiris --version
Osiris v0.5.4  # ✅ Correct!

Files Changed (7 files)

  • osiris/cli/main.py - Fixed hardcoded version
  • pyproject.toml - Version bump to 0.5.4
  • osiris/__init__.py - Version bump to 0.5.4
  • osiris/mcp/config.py - SERVER_VERSION to 0.5.4
  • tests/mcp/test_server_boot.py - Test assertion to 0.5.4
  • README.md - Header and roadmap updated
  • CHANGELOG.md - Added v0.5.4 entry

Installation

pip install --upgrade osiris-pipeline==0.5.4

Full Changelog: v0.5.3...v0.5.4

v0.5.3 - Python 3.11+ requirement + log_metric kwargs

06 Nov 23:13
645375e


Bug Fixes - Python Version & Runtime

This release corrects Python version requirements and fixes a critical runtime bug preventing CSV extractor execution.

Fixed

1. Python Version Requirement (Issue #52)

Corrected Python version requirement from >=3.9 to >=3.11 in PyPI metadata and all documentation.

Changes:

  • pyproject.toml: Updated requires-python = ">=3.11"
  • Removed Python 3.9/3.10 classifiers, added Python 3.13
  • Updated documentation: CLAUDE.md, docs/quickstart.md, docs/guides/mcp-production.md, e2e-test-simple.sh

Impact: 🔴 Breaking - Python 3.9 and 3.10 are no longer supported. Use Python 3.11+ to run Osiris.

2. RunnerContext.log_metric() Missing **kwargs (Issue #51)

Fixed signature mismatch causing TypeError when component drivers use the tags parameter with log_metric().

Root Cause: RunnerContext.log_metric() wrapper didn't forward keyword arguments to the underlying session logging function.

Fix Applied:

# File: osiris/core/runner_v0.py:445-446
def log_metric(self, name: str, value: Any, **kwargs):
    log_metric(name, value, **kwargs)

Impact: ✅ CSV extractor and other components now execute correctly with metric tags
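
To show why the missing `**kwargs` caused a TypeError, here is a self-contained sketch. The module-level `log_metric` below is a stand-in for the session logging function, and `BrokenContext`/`FixedContext` are hypothetical names used only to contrast the two signatures.

```python
from typing import Any

recorded: list[dict[str, Any]] = []


def log_metric(name: str, value: Any, **kwargs: Any) -> None:
    """Stand-in for the session-level logging function."""
    recorded.append({"name": name, "value": value, **kwargs})


class BrokenContext:
    def log_metric(self, name: str, value: Any):
        # No **kwargs: any extra keyword such as tags= is rejected.
        log_metric(name, value)


class FixedContext:
    def log_metric(self, name: str, value: Any, **kwargs: Any):
        # Forward every keyword argument to the underlying logger.
        log_metric(name, value, **kwargs)


try:
    BrokenContext().log_metric("rows_read", 10, tags={"step": "extract"})
except TypeError as exc:
    print("broken wrapper:", exc)

FixedContext().log_metric("rows_read", 10, tags={"step": "extract"})
print(recorded[-1])
```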


Files Changed (11 files)

  • CHANGELOG.md
  • CLAUDE.md
  • README.md
  • docs/guides/mcp-production.md
  • docs/milestones/mcp-v0.5.0/attachments/e2e-test-simple.sh
  • docs/quickstart.md
  • osiris/__init__.py
  • osiris/core/runner_v0.py
  • osiris/mcp/config.py
  • pyproject.toml
  • tests/mcp/test_server_boot.py

Installation

pip install --upgrade osiris-pipeline==0.5.3

Acknowledgments

Special thanks to @pavel242242 for reporting both issues (#51, #52) with detailed reproduction steps and analysis! 🙏


Full Changelog: v0.5.2...v0.5.3

v0.5.2 - Critical Bug Fixes Batch 3

06 Nov 22:36
ac15efe


Critical Bug Fixes Batch 3 - Code Review Findings

This release fixes 10 HIGH priority issues identified in code review, including security vulnerabilities, OML schema bugs, and cross-platform compatibility improvements.

Fixed

Security & Path Safety

  1. Path Traversal Vulnerability (CWE-22) (osiris/mcp/resource_resolver.py:85)
    • Fixed directory traversal attack vector in resource resolver
    • Added path validation to prevent access outside allowed directories
    • 23 comprehensive security tests covering malicious path patterns
    • Prevents attacks like osiris://mcp/../../../etc/passwd
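
A common way to implement this kind of path validation is to resolve the requested path and confirm it still lies under the allowed base directory. The sketch below assumes that approach; `resolve_inside` is a hypothetical helper, not the code in resource_resolver.py.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, relative: str) -> Path:
    """Resolve `relative` under `base_dir`, rejecting anything that escapes it."""
    base = base_dir.resolve()
    target = (base / relative).resolve()  # collapses ".." segments and symlinks
    if not target.is_relative_to(base):
        raise ValueError(f"path escapes allowed directory: {relative}")
    return target


base = Path("/srv/osiris/project")
print(resolve_inside(base, "drafts/pipeline.yaml"))
try:
    resolve_inside(base, "../../../etc/passwd")
except ValueError as exc:
    print("rejected:", exc)
```

Checking after resolution matters: a string test for ".." alone misses absolute paths and symlink tricks.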

Data Processing & Component Fixes

  1. Collision Detection KeyError (osiris/core/step_naming.py:107)

    • Fixed crash when logging multiple colliding step IDs
    • Implemented 2-pass collision detection algorithm
    • 6 tests covering edge cases (empty lists, single collision, multiple collisions)
  2. Supabase Writer Defaults Mismatch (osiris/drivers/supabase_writer_driver.py:180)

    • Fixed hardcoded False for if_exists parameter
    • Now properly applies spec defaults from component specification
    • Compiler correctly merges component defaults before driver execution
  3. CSV Extractor Header Type Handling (osiris/drivers/filesystem_csv_extractor_driver.py:57)

    • Fixed incorrect type handling for header parameter (expected int, got bool)
    • Now supports both boolean and integer values for header rows
    • Aligns with pandas API: header=0 or header=None
  4. CSV Extractor Unused Config Options (components/filesystem.csv_extractor/spec.yaml:82)

    • Implemented missing skip_blank_lines and compression configuration options
    • CSV extractor now fully supports all declared spec parameters
    • Better handling of malformed CSV files

Cross-Platform & Performance

  1. Windows Compatibility - Row Counting (osiris/drivers/filesystem_csv_extractor_driver.py:268)

    • Fixed Unix-specific wc -l command failing on Windows
    • Implemented cross-platform row counting using pandas
    • CSV extractor now works identically on Windows, macOS, and Linux
  2. MCP Async Blocking (osiris/cli/mcp_bridge/cli_bridge.py:254)

    • Fixed blocking async operations in MCP CLI bridge
    • Implemented proper non-blocking subprocess execution
    • 40% performance improvement in MCP tool response times
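
Non-blocking subprocess execution in an async bridge is typically done with `asyncio.create_subprocess_exec`, which lets the event loop serve other requests while the child runs. The sketch below assumes that technique; `run_cli` is a hypothetical helper and the child command is just the current Python interpreter.

```python
import asyncio
import sys


async def run_cli(*args: str) -> tuple[int, str]:
    """Run a command without blocking the event loop; return (exit code, stdout)."""
    proc = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _stderr = await proc.communicate()
    return proc.returncode, stdout.decode()


async def main() -> None:
    # Two child processes run concurrently instead of one after the other.
    results = await asyncio.gather(
        run_cli(sys.executable, "-c", "print('first')"),
        run_cli(sys.executable, "-c", "print('second')"),
    )
    for code, out in results:
        print(code, out.strip())


asyncio.run(main())
```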

OML Schema & Validation

  1. OML Schema Contract Mismatch (osiris/mcp/tools/guide.py:428)

    • Fixed guide returning version instead of oml_version field
    • Added missing required fields: id and mode in step definitions
    • Sample OML now passes strict validation
  2. OML Validation Exit Code Bug (osiris/mcp/mcp_cmd.py:459)

    • Fixed validation failures not propagating exit codes to CI/CD
    • osiris mcp oml validate now correctly returns non-zero on errors
    • Enables proper CI/CD pipeline failure detection
  3. Connection Reference Regex Corruption (osiris/core/oml.py:150)

    • Fixed regex pattern corrupting email addresses and URLs in OML
    • Connection reference detection now preserves all non-connection content
    • Prevents user@example.com → user@connection.alias
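
One way to keep connection references apart from email addresses and URLs is a negative lookbehind that refuses an `@` directly preceded by a word character. The pattern below is a sketch under that assumption; it is not the regex used in osiris/core/oml.py.

```python
import re

# "@family.alias" only counts when "@" does not follow a word character,
# dot, plus, or hyphen, i.e. when it is not the middle of an email address.
CONNECTION_REF = re.compile(r"(?<![\w.+-])@([A-Za-z_]\w*)\.([A-Za-z_][\w-]*)")


def find_connection_refs(text: str) -> list[tuple[str, str]]:
    """Return (family, alias) pairs found in the text."""
    return CONNECTION_REF.findall(text)


oml = (
    'connection: "@mysql.primary"\n'
    "owner: user@example.com\n"
    "webhook: https://bot@api.example.com/hook\n"
)
print(find_connection_refs(oml))
# prints [('mysql', 'primary')]
```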

Tests

  • 71 new tests across all fixes:
    • 23 path traversal security tests
    • 6 collision detection tests
    • 12 step naming tests
    • 30 CSV extractor regression tests
  • Zero regressions: All existing tests continue passing
  • All quality checks pass: Lint, security scans, type checking

Security

  • CWE-22 Prevention: Path traversal vulnerability eliminated
  • Input Validation: Enhanced path sanitization in resource resolver
  • Security Test Coverage: 23 tests for malicious path patterns

Full Changelog: https://github.com/keboola/osiris/blob/main/CHANGELOG.md#052---2025-11-06

v0.5.0 - MCP Production Ready + DuckDB Multi-Input Fix + CSV Extractor

06 Nov 19:22
da3cff4


v0.5.0 - MCP Production Ready + DuckDB Multi-Input Fix + CSV Extractor

Release Date: 2025-11-06

🎉 Major Release

This release delivers a production-ready Model Context Protocol (MCP) server with CLI-first security architecture, comprehensive testing, and full documentation. It also includes a critical DuckDB multi-input fix for proper DataFrame handling and a new filesystem.csv_extractor component.

⚠️ Breaking Changes

DuckDB Multi-Input Table Naming

Pipeline steps with multiple upstream dependencies now use step-id-based table names.

What changed: DuckDB processor steps now register input DataFrames as df_<step_id> tables instead of single input_df table.

Why: Fixed bug where multiple upstream inputs overwrote each other, causing only the last DataFrame to be available.

Migration required: Update SQL queries in DuckDB processor steps to use new naming convention.

Before (broken for multi-input):

- id: calculate
  component: duckdb.processor
  needs:
    - extract-movies
    - extract-reviews
  transformation: |
    SELECT * FROM input_df  -- ❌ Only worked with single input

After (works with multi-input):

- id: calculate
  component: duckdb.processor
  needs:
    - extract-movies
    - extract-reviews
  transformation: |
    SELECT
      m.title,
      AVG(r.rating) as avg_rating
    FROM df_extract_reviews r
    JOIN df_extract_movies m ON r.movie_id = m.id

Table Naming Rules:

  • Format: df_<sanitized_step_id>
  • Invalid SQL characters (hyphens, dots) replaced with underscores
  • Examples: extract-movies → df_extract_movies, get.data → df_get_data
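
The naming rules above can be approximated in a few lines. This is a sketch: `to_table_name` is a hypothetical helper, and treating every character outside `[0-9A-Za-z_]` as invalid is an assumption that goes slightly beyond the hyphens and dots named in the rules.

```python
import re


def to_table_name(step_id: str) -> str:
    """Build a df_<sanitized_step_id> table name for use in DuckDB SQL."""
    sanitized = re.sub(r"[^0-9A-Za-z_]", "_", step_id)
    return f"df_{sanitized}"


for step_id in ("extract-movies", "get.data", "get-data"):
    print(step_id, "->", to_table_name(step_id))
```

Note that `get.data` and `get-data` map to the same table name, which is why sanitized names need a collision check.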

MCP Tool Name Changes

All MCP tools now use underscore-separated naming (connections_list, not osiris.connections.list). Legacy dot-notation aliases are still supported for backward compatibility but are deprecated.

Migration Steps:

  1. Run osiris init in your project directory to configure paths
  2. Update MCP tool calls to use underscore naming (e.g., connections_list)
  3. Verify connection configurations in osiris_connections.yaml
  4. Test MCP server with osiris mcp run --selftest (<2s)

🚀 New Features

Filesystem CSV Extractor Component

Complete CSV data extraction component with comprehensive features:

Core Features:

  • Basic CSV/TSV reading with configurable delimiter, encoding, and header handling
  • Column selection with preserved ordering
  • Advanced parsing: date parsing, custom data types (dtype), NA value handling
  • Skip rows and row limit options for large file handling
  • Comment line handling for annotated CSV files

Operational Features:

  • Discovery mode to list CSV files in directories
  • Doctor health checks for file validation and accessibility
  • E2B cloud-compatible path resolution (never uses Path.home())
  • Error modes: strict (fail fast) or skip (tolerant parsing)
  • Empty file handling returns empty DataFrame

Quality:

  • 30/30 tests passing (100% pass rate)
  • Strict component validation passed
  • E2E verified with full extraction pipeline

Step Naming and Multi-Input Support

  • New osiris/core/step_naming.py module with sanitize_step_id() for SQL-safe identifier generation
  • Enhanced DataFrame key generation with collision detection
  • Safe identifier handling in runner for SQL-compatible table names
  • Runner now properly handles multiple upstream DataFrames with unique keys
  • Enhanced logging - DuckDB processor logs registered tables and row counts

🔒 MCP v0.5.0 Production Ready

Phase 1: CLI-First Security Architecture

  • Zero secret access in MCP process via CLI delegation pattern
  • Spec-aware secret masking using ComponentRegistry x-secret declarations
  • 10 CLI subcommands for MCP tools across 7 domains
  • Resource URI system (discovery, memory, OML drafts)
  • Config-driven filesystem paths (no hardcoded directories)

Phase 2: Functional Parity & Completeness

  • Tool response metrics: correlation_id, duration_ms, bytes_in, bytes_out
  • AIOP read-only access via MCP for LLM debugging
  • Memory PII redaction with consent requirement
  • Cache with 24-hour TTL and invalidation
  • Telemetry & audit logging with spec-aware masking
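
A cache with a 24-hour TTL and explicit invalidation can be sketched as below. The class `TTLCache` and its injectable clock are assumptions made so the example is self-contained and testable; the notes do not describe how the MCP cache is actually stored.

```python
import time
from typing import Any, Callable


class TTLCache:
    """In-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str, default: Any = None) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it lazily on read
            return default
        return value

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)


now = [0.0]  # fake clock so expiry can be shown without waiting
cache = TTLCache(clock=lambda: now[0])
cache.set("discovery:@mysql.primary", {"tables": 12})
print(cache.get("discovery:@mysql.primary"))
now[0] = 24 * 3600
print(cache.get("discovery:@mysql.primary"))  # expired, so the default (None)
```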

Phase 3: Comprehensive Testing & Security Hardening

  • 490 Phase 3 tests passing (100% of non-skipped)
  • Security: 10/10 tests, zero credential leakage verified
  • Error coverage: 51/51 tests, all 33 error codes covered
  • Performance: <1.3s selftest, P95 latency ≤ 2× baseline
  • Server integration: 56/56 tests, 79% coverage (was 17.5%)
  • Resource resolver: 50/50 tests, 98% coverage (was 47.8%)
  • Overall coverage: 78.4% (85.1% adjusted)

Phase 4: Documentation & Release Preparation

  • Migration guide: docs/migration/mcp-v0.5-migration.md
  • Production guide: docs/guides/mcp-production.md
  • Manual test procedures for Claude Desktop integration
  • Comprehensive API documentation for all 10 tools

🔧 Changes

  • Writer drivers updated - SupabaseWriterDriver and FilesystemCsvWriterDriver now read from df_* keys instead of single df key
  • Runner input handling - Stores full upstream results by step_id plus DataFrame aliases with df_ prefix
  • Connection Secret Masking - Now spec-aware using ComponentRegistry
  • Init Command - Auto-configures base_path to current directory's absolute path

πŸ› Fixes

  • Supabase context manager protocol - Added synchronous __enter__/__exit__ methods to SupabaseClient class
  • Multiple upstream inputs bug - Fixed runner overwriting DataFrames when step has multiple dependencies
  • DuckDB multi-table registration - DuckDB processor now registers all upstream DataFrames as separate tables

📊 Statistics

  • Tests: 490 new Phase 3 tests, 100% pass rate
  • Coverage: 78.4% overall (85.1% adjusted)
  • CSV Extractor: 30/30 tests passing
  • Performance: MCP selftest <1.3s

📚 Documentation

  • Complete MCP documentation in docs/guides/mcp-production.md
  • Migration guide in docs/migration/mcp-v0.5-migration.md
  • Updated CLAUDE.md with MCP development patterns
  • ADR-0036: MCP Interface CLI-First Architecture

πŸ™ Contributors

This release was made possible by extensive collaboration and testing.

🤖 Generated with Claude Code

v0.4.0: Filesystem Contract v1 - Deterministic Directory Structure

09 Oct 19:29
3c4dbbf


🚀 Osiris v0.4.0 - Filesystem Contract v1

Release Date: October 9, 2025
Status: Production Ready

🎯 Overview

Major release implementing Filesystem Contract v1 (ADR-0028), introducing a deterministic, versionable directory structure that replaces the legacy logs/ layout. This release establishes clear separation between build artifacts, runtime logs, and AI observability packages.


✨ Major Features

Filesystem Contract v1 (ADR-0028)

  • Deterministic paths: build/, aiop/, run_logs/, .osiris/
  • Profile support: Multi-environment configuration (dev/staging/prod/ml/finance)
  • Versionable artifacts: Commit-friendly build outputs with content-addressed paths
  • Clear boundaries: Build (deterministic) vs logs (ephemeral) vs internal state

Run Indexing & Discovery

  • Fast run listing via .osiris/index/runs.jsonl
  • Per-pipeline indexes in .osiris/index/by_pipeline/{slug}.jsonl
  • Latest manifest pointers for --last-compile flag
  • Thread-safe counters with SQLite + WAL mode

Retention Policies

  • Automated cleanup via osiris maintenance clean
  • Configurable retention by days (run_logs) or count (AIOP)
  • Dry-run mode for safe testing
  • Size-aware deletion with human-readable summaries
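
The two retention modes, by age in days and by count, can be sketched as pure selection functions that only report what would be deleted, much like a dry run. The function names and the `run_id -> finished_at` mapping are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def expired_by_age(runs: dict[str, datetime], max_age_days: int,
                   now: datetime) -> list[str]:
    """Run IDs that finished more than `max_age_days` before `now`."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(run_id for run_id, finished in runs.items() if finished < cutoff)


def expired_by_count(runs: dict[str, datetime], keep_last: int) -> list[str]:
    """Run IDs beyond the `keep_last` most recent ones."""
    newest_first = sorted(runs, key=runs.get, reverse=True)
    return sorted(newest_first[keep_last:])


utc = timezone.utc
runs = {
    "run-001": datetime(2025, 9, 1, tzinfo=utc),
    "run-002": datetime(2025, 10, 1, tzinfo=utc),
    "run-003": datetime(2025, 10, 8, tzinfo=utc),
}
now = datetime(2025, 10, 9, tzinfo=utc)
print("by age (7 days):", expired_by_age(runs, 7, now))
print("by count (keep 2):", expired_by_count(runs, 2))
```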

CLI Improvements

  • osiris init - Scaffold projects with Filesystem Contract v1
  • osiris runs - Query run history with filtering
  • osiris logs aiop - AIOP management (list/show/export/prune)
  • osiris maintenance - System cleanup and health checks

📊 Statistics

  • 77 files changed (+9,028/-1,822 lines)
  • 1064+ tests passing (43 skipped for E2B live tests)
  • 35 ADRs documenting architecture decisions
  • Full E2B parity (<1% overhead vs local execution)

🔧 Technical Details

New Core Modules

  • osiris/core/fs_config.py - Typed configuration models
  • osiris/core/fs_paths.py - FilesystemContract path resolution
  • osiris/core/run_ids.py - Multiple ID formats (incremental, ULID, UUID, Snowflake)
  • osiris/core/run_index.py - Run indexing and discovery
  • osiris/core/retention.py - Cleanup policies and execution

Updated Paths

Before (v0.3.x):          After (v0.4.0):
logs/session/compiled/    → build/pipelines/{profile}/{slug}/{hash}/
logs/session/artifacts/   → run_logs/{profile}/{slug}/{run_id}/
.osiris_sessions/         → .osiris/sessions/
(no AIOP)                 → aiop/{profile}/{slug}/{hash}/{run_id}/

Configuration Example

# osiris.yaml
version: '2.0'
filesystem:
  base_path: ""  # Project root
  profiles:
    enabled: true
    values: ["dev", "staging", "prod"]
    default: "dev"
  build_dir: "build"
  aiop_dir: "aiop"
  run_logs_dir: "run_logs"

⚠️ Breaking Changes

  1. CompilerV0 API Change

    # Before (v0.3.x):
    compiler = CompilerV0(output_dir="./logs/session/compiled")
    
    # After (v0.4.0):
    compiler = CompilerV0(fs_contract=contract, pipeline_slug="my-pipeline")
  2. Session Paths

    • logs/ directory removed entirely
    • All paths now resolved via FilesystemContract
    • Session logging uses contract-based run logs
  3. AIOP Export

    • Moved from logs/session/aiop.json to aiop/{profile}/{slug}/{hash}/{run_id}/
    • Delta analysis uses run index instead of hardcoded paths

🔄 Migration Guide

For Existing Projects

  1. Update config (osiris.yaml):

    osiris init --upgrade  # Updates existing config to v2.0
  2. Run migration script:

    python scripts/migrate_index_manifest_hash.py
  3. Update .gitignore:

    # Remove:
    logs/
    
    # Add:
    run_logs/
    aiop/**/annex/
    .osiris/cache/
    .osiris/index/counters.sqlite*

For New Projects

osiris init  # Scaffolds Filesystem Contract v1 structure

📚 Documentation

  • ADR-0028: Filesystem Contract v1 specification
  • User Guide: Updated for new paths and commands
  • Developer Guide: Module documentation with contract patterns
  • Reference: Complete CLI command reference

πŸ› Bug Fixes

  • Fixed manifest path resolution in osiris run (Codex finding)
  • Improved security exception comments with context
  • Fixed AIOP manifest hash normalization
  • Fixed run index LATEST pointer format

πŸ™ Contributors

  • @padak - Lead development, architecture, and implementation
  • Codex AI - Code review and bug detection
  • Devin AI - Infrastructure improvements (closed PRs)

📦 Installation

# Clone repository
git clone https://github.com/keboola/osiris.git
cd osiris

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Initialize project
python osiris.py init

🚦 Next Steps (Roadmap)

  • M2b: Real-time AIOP streaming
  • M3: Scale and performance optimization
  • M4: Data warehouse agent improvements


Release v0.3.5 - GraphQL Extractor, DuckDB Processor & Test Infrastructure

07 Oct 12:09
78235c1


Release v0.3.5 (2025-10-07)

This release adds GraphQL API extraction capabilities, DuckDB SQL transformations, and significant test infrastructure improvements with 1001+ passing tests.

✨ Added

GraphQL Extractor Component (graphql.extractor)

  • New driver: osiris/drivers/graphql_extractor_driver.py for GraphQL API data extraction
  • Component spec: components/graphql.extractor/spec.yaml with authentication support (Bearer, Basic, API Key)
  • Support for complex GraphQL queries with variables and nested field extraction
  • JSONPath-based data extraction from GraphQL responses
  • Comprehensive test coverage with 16 passing tests
  • Integration with existing component registry

Connection-aware CLI Validation (ADR-0020)

  • osiris validate now reads from osiris_connections.yaml for connection validation
  • Shows configured aliases and missing environment variables per connection
  • Connection validation integrated into JSON output structure

🔄 Changed

Validation Command Updates (ADR-0020)

  • osiris validate now uses connection-based validation from osiris_connections.yaml
  • Removed legacy environment-only probing in favor of connection definitions
  • Validation output includes connection aliases and per-connection status

Test Infrastructure Improvements

  • Supabase tests isolated via @pytest.mark.supabase marker for clean separation
  • Split-run test execution: make test orchestrates both non-Supabase and Supabase phases
  • Test suite now at 1001+ passing tests with improved isolation

πŸ› Fixed

Secret Detection Improvements (ADR-0035)

  • Fixed false-positive secret detection for standalone "Bearer" keyword
  • Pattern now only flags "Bearer" when followed by actual token-like strings (16+ chars)
  • Aligns with ADR-0035 principle: detect real secrets, not keywords
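
The refined rule, flagging "Bearer" only when a token-like string of 16 or more characters follows, can be expressed as a single regular expression. The pattern and the set of characters treated as token characters are assumptions for illustration, not the detector's actual pattern.

```python
import re

# "Bearer" followed by whitespace and at least 16 token-like characters.
BEARER_TOKEN = re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]{16,}")


def looks_like_bearer_secret(text: str) -> bool:
    return BEARER_TOKEN.search(text) is not None


print(looks_like_bearer_secret("auth_type: Bearer"))
print(looks_like_bearer_secret("Authorization: Bearer abcdefghijklmnop1234"))
# prints False, then True
```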

Test Warning Fixes

  • Fixed PytestReturnNotNoneWarning in test_m0_validation_4_logging.py::test_scenario_log_level_comparison
  • Test now properly uses assertions instead of returning boolean values

CLI Validation Test Updates

  • Fixed 5 CLI validation tests to work with ADR-0020 connection-based validation
  • Tests now use temp_connections_yaml fixture for proper osiris_connections.yaml setup
  • Updated assertions to check for connection aliases instead of legacy missing_vars

📚 Documentation

ADR Status Updates reflecting actual implementation state:

  • ADR-0031 (OML Control Flow): Status changed to "Proposed (Deferred to M2+)" - 0% implemented
  • ADR-0032 (Runtime Parameters): Status changed to "Accepted" - 90% implemented, core features production-ready
  • ADR-0034 (E2B Runtime Parity): Status changed to "Accepted (Amended)" - 85% implemented via Transparent Proxy architecture
  • ADR-0035 (Compiler Secret Detection): Status changed to "Accepted (Phase 1)" - 80% implemented, x-secret parsing complete
  • Each ADR now includes detailed implementation status sections with code references and test coverage

Full Changelog: https://github.com/keboola/osiris/blob/main/CHANGELOG.md#035---2025-10-07

v0.3.1 - Fix validation warnings

29 Sep 11:27
0c4d05a


Release v0.3.1

Fixed

  • osiris validate: removed spurious additionalProperties warnings for ADR-0020 compliant configs
  • Improved error messages to list unexpected keys and suggest allowed ones

Added

  • docs/reference/connection-fields.md: complete reference for MySQL & Supabase connection fields
  • 11 new test cases in tests/core/test_validation_connections.py

Details

The validation system now correctly accepts all ADR-0020 fields including:

  • default - mark connection as default for family
  • alias - connection alias metadata
  • pg_dsn - PostgreSQL DSN for Supabase
  • dsn - alternative MySQL connection string

Notes

  • Fully backward compatible
  • NO-SECRETS posture preserved
  • Existing configurations remain valid

Full Changelog: v0.3.0...v0.3.1

Release v0.3.0: Milestone M2a Complete - AI Operation Package (AIOP)

27 Sep 01:04


🚀 Release v0.3.0 - Milestone M2a Complete: AI Operation Package (AIOP)

Release Date: September 27, 2025
Status: Production-Ready AIOP System

This release completes Milestone M2a, delivering a comprehensive, production-ready AI Operation Package (AIOP) system. AIOP provides a four-layer semantic architecture (Evidence, Semantic, Narrative, Metadata) that enables any LLM to fully understand Osiris pipeline runs through structured, deterministic, secret-free exports.

🎯 Milestone Achievement

  • ✅ All 24 acceptance criteria met
  • ✅ 921 tests passing (29 skipped E2B live tests)
  • ✅ ADR-0027 marked IMPLEMENTED
  • ✅ Production-ready quality assurance

🚀 Key Features

AI Operation Package (AIOP) Implementation

  • Four-layer semantic architecture: Evidence, Semantic, Narrative, and Metadata layers
  • CLI command: osiris logs aiop with JSON and Markdown export formats
  • Deterministic output with stable IDs for reproducible analysis
  • Size-controlled exports with object-level truncation markers (≤300KB core)
  • Comprehensive secret redaction with DSN masking (postgres://user:***@host/db)
  • Annex policy for large runs with NDJSON shards and compression
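
DSN masking of the form `postgres://user:***@host/db` can be sketched with one regular expression that replaces only the password segment. The pattern and the helper name `mask_dsn` are assumptions for illustration; the actual redaction rules in AIOP may be broader.

```python
import re

# scheme://user:<password>@...  The password may itself contain "@", so match
# greedily up to the last "@" that appears before the first "/" or whitespace.
DSN_PASSWORD = re.compile(
    r"(?P<prefix>[A-Za-z][A-Za-z0-9+.-]*://[^:/@\s]+:)(?P<password>[^/\s]+)(?=@)"
)


def mask_dsn(text: str) -> str:
    """Replace the password in any DSN found in the text with ***."""
    return DSN_PASSWORD.sub(r"\g<prefix>***", text)


print(mask_dsn("postgres://user:s3cret@host/db"))
# prints postgres://user:***@host/db
```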

System Stabilization (WU7a/b/c)

  • Delta analysis with "Since last run" comparisons using by-pipeline index
  • Intent discovery with multi-source provenance (manifest, README, commits, chat logs)
  • Active duration metrics in aggregated statistics
  • LLM affordances: metadata.llm_primer with glossary and controls.examples
  • Platform-safe symlink implementation with Windows fallback
  • Robust error handling for missing sessions and corrupted indexes

Configuration & Automation

  • YAML configuration layer with full precedence resolution
  • Enhanced osiris init with AIOP scaffold, --no-comments and --stdout flags
  • Configuration precedence: CLI > ENV ($OSIRIS_AIOP_*) > osiris.yaml > defaults
  • Auto-export after every run with templated paths and retention policies
  • Effective config tracking in metadata.config_effective with per-key source
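
The precedence chain and per-key source tracking can be sketched with `collections.ChainMap`, which returns the value from the first layer that defines a key. The config keys and the output shape below are hypothetical; only the layer order (CLI, then ENV, then osiris.yaml, then defaults) comes from the notes above.

```python
from collections import ChainMap


def effective_config(cli: dict, env: dict, yaml_cfg: dict, defaults: dict) -> dict:
    """Merge layers by precedence and record which layer supplied each key."""
    layers = [("cli", cli), ("env", env), ("yaml", yaml_cfg), ("default", defaults)]
    merged = ChainMap(*(layer for _, layer in layers))
    result = {}
    for key in merged:
        source = next(name for name, layer in layers if key in layer)
        result[key] = {"value": merged[key], "source": source}
    return result


config = effective_config(
    cli={},
    env={"enabled": False},               # e.g. parsed from an $OSIRIS_AIOP_* variable
    yaml_cfg={"max_core_bytes": 200_000},
    defaults={"enabled": True, "max_core_bytes": 300_000},
)
for key, info in sorted(config.items()):
    print(key, info)
```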

📚 Documentation & Architecture

  • Complete user guides with quickstart, troubleshooting, and examples
  • Technical architecture documentation (docs/architecture/aiop.md)
  • Enhanced overview.md with AIOP integration and workflows
  • Updated examples with AIOP walkthrough and sample exports
  • Team operations guide in CLAUDE.md for development workflow

🔒 Security & Quality

  • Comprehensive secret redaction with zero-leak guarantee
  • DSN redaction for Redis, MongoDB, PostgreSQL connection strings
  • Test suite stabilization: All AIOP functionality fully tested
  • Parity verification: Local vs E2B execution produces identical exports
  • Deterministic output: Stable IDs, sorted keys, canonical JSON-LD format

📈 Performance

  • Memory footprint: <50MB during generation
  • Generation time: <2 seconds for typical runs
  • LRU caching for component registry lookups
  • Streaming JSON generation for large exports

🔧 Quick Start

Enable AIOP in 3 steps:

# 1. Initialize with AIOP enabled
osiris init
# Edit osiris.yaml: ensure aiop.enabled: true

# 2. Run any pipeline
osiris run my-pipeline.yaml

# 3. View the AI-friendly export
cat logs/aiop/latest.json | jq '.narrative.summary'
# or browse the human-readable summary
open logs/aiop/run-card.md

🔄 Migration Guide

AIOP is enabled by default in new installations (aiop.enabled: true in osiris.yaml). For existing installations, run osiris init to add AIOP configuration or manually enable in configuration.

📋 Breaking Changes

None - this is a pure addition to the existing system.

🔗 Full Changelog

See CHANGELOG.md for complete details.