09 Nov 09:24

padak

a57ce1a

v0.5.7: Filesystem Connections & Enhanced Validation Latest

Filesystem Connections & Enhanced Validation

This release adds filesystem connection support to osiris_connections.yaml, enabling connection-based discovery for CSV files in the same way as for database components. OSIRIS_HOME is now honored consistently across all path resolution (config, connections, .env), and the validate command checks every connection family as well as the filesystem contract sections.

Key Features

1. Filesystem Connections in osiris_connections.yaml

New filesystem connection family in osiris_connections.yaml
Connection references using @filesystem.alias pattern (identical to database connections)
Environment variable substitution with ${OSIRIS_HOME} for portable paths
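
As a minimal sketch of how a reference such as @filesystem.local could be resolved, assume the connections file has already been parsed into a dictionary. The helper name resolve_connection and the dictionary layout are illustrative assumptions, not Osiris APIs.

```python
import re

# Hypothetical in-memory form of osiris_connections.yaml (layout assumed).
CONNECTIONS = {
    "filesystem": {
        "local": {"default": True, "base_dir": "${OSIRIS_HOME}/data"},
    },
}

# '@family.alias', for example '@filesystem.local'.
REF_PATTERN = re.compile(r"@(?P<family>[A-Za-z_]\w*)\.(?P<alias>[\w-]+)")


def resolve_connection(ref: str, connections: dict, env: dict) -> dict:
    """Look up '@family.alias' and expand ${VAR} placeholders in string values."""
    match = REF_PATTERN.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a connection reference: {ref!r}")
    config = connections[match["family"]][match["alias"]]

    def expand(value):
        if not isinstance(value, str):
            return value
        # Replace ${NAME} with env[NAME]; unknown names are left untouched.
        return re.sub(r"\$\{(\w+)\}", lambda m: env.get(m[1], m[0]), value)

    return {key: expand(value) for key, value in config.items()}


resolved = resolve_connection(
    "@filesystem.local", CONNECTIONS, {"OSIRIS_HOME": "/srv/osiris"}
)
print(resolved["base_dir"])
```

Passing the environment as a plain dictionary keeps the sketch testable; a real caller would likely pass os.environ.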

2. Connection-based Discovery for Filesystem

Command: osiris discovery run @filesystem.local (uses connection base_dir)
Shows real CSV column names and data types (integer, float, string, datetime, boolean)
No more placeholder names like "column_0, column_1"
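
The type names listed above can be illustrated with a small inference routine. This is only a sketch built on the standard library; the function names and the inference order are assumptions, and the actual discovery logic may differ.

```python
import csv
import io
from datetime import datetime


def infer_type(values: list[str]) -> str:
    """Return the narrowest type name that fits every non-empty value."""
    present = [v.strip() for v in values if v and v.strip()]
    if not present:
        return "string"

    def all_parse(parse) -> bool:
        try:
            for v in present:
                parse(v)
        except ValueError:
            return False
        return True

    if all(v.lower() in {"true", "false"} for v in present):
        return "boolean"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "float"
    if all_parse(datetime.fromisoformat):
        return "datetime"
    return "string"


def discover_columns(csv_text: str) -> dict[str, str]:
    """Map each real header name to an inferred type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        name: infer_type([row[name] for row in rows])
        for name in (reader.fieldnames or [])
    }


sample = (
    "id,price,title,released,in_stock\n"
    "1,9.5,Alien,1979-05-25,true\n"
    "2,12,Heat,1995-12-15,false\n"
)
columns = discover_columns(sample)
print(columns)
```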

3. OSIRIS_HOME Support Everywhere

osiris validate displays current OSIRIS_HOME setting
Config file loading checks $OSIRIS_HOME/osiris.yaml first
.env loading prioritizes $OSIRIS_HOME/.env over cwd
Works from any directory when OSIRIS_HOME is set
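
The lookup order described above can be sketched as follows. The function name and the fallback to the current working directory are assumptions made for illustration; only the "OSIRIS_HOME first" ordering comes from these notes.

```python
from pathlib import Path


def candidate_paths(filename: str, env: dict, cwd: Path) -> list[Path]:
    """Return lookup locations in priority order: $OSIRIS_HOME first, then cwd."""
    candidates = []
    home = env.get("OSIRIS_HOME")
    if home:
        candidates.append(Path(home) / filename)
    candidates.append(cwd / filename)
    return candidates


def resolve(filename: str, env: dict, cwd: Path):
    """Return the first candidate that exists on disk, or None."""
    for path in candidate_paths(filename, env, cwd):
        if path.is_file():
            return path
    return None


with_home = candidate_paths("osiris.yaml", {"OSIRIS_HOME": "/opt/osiris"}, Path("/work"))
without_home = candidate_paths(".env", {}, Path("/work"))
print(with_home)
print(without_home)
```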

4. Comprehensive Validation

Validates ALL connection families (mysql, supabase, posthog, filesystem)
Checks filesystem contract sections
Fixed false "Missing section" warnings

Installation

pip install osiris-pipeline==0.5.7

Migration Notes

New Recommended Pattern:

# osiris_connections.yaml
filesystem:
  local:
    default: true
    base_dir: "${OSIRIS_HOME}/data"

Discovery Command:

# New (recommended)
osiris discovery run @filesystem.local

# Old (deprecated, still works)
osiris components discover filesystem.csv_extractor --config path/to/config.yaml

Testing

77 filesystem-related tests passing
All backward compatible
Zero breaking changes

See CHANGELOG.md for complete details.

09 Nov 07:50

padak

v0.5.6

1a52973

v0.5.6: PostHog Integration + 14 Critical Fixes

PostHog Analytics Integration & Critical Fixes

This release adds a production-ready PostHog analytics component with support for 4 data types and includes 14 critical security, performance, and data integrity fixes.

🎯 PostHog Integration

4 Data Types: events, persons, sessions, person_distinct_ids
Production Ready: 57/57 Osiris checklist rules passing, 44/44 tests passing
E2B Compatible: Works in 2GB cloud sandbox
Features: SEEK pagination, streaming, UUID dedup, multi-region support

🔒 Security Fixes (3)

HogQL Injection Prevention: Proper quote escaping instead of restrictive whitelist
Code Execution Fix: Removed driver imports from components list
Exception Handling: Fixed UnboundLocalError in posthog_client

⚡ Performance & Memory (3)

True Streaming: O(3N) → O(2N) memory (33% reduction, 3x scalability)
Dedup Cache: Set → deque for deterministic FIFO eviction
Timezone Handling: UTC conversion prevents silent time window shifts
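
To illustrate why a deque gives deterministic eviction, here is a sketch of a bounded dedup cache. A plain set has no insertion order, so removing "the oldest" entry from it is not well defined. The class name is hypothetical, and pairing the deque with a set for fast membership checks is an assumption rather than a description of the actual driver.

```python
from collections import deque


class DedupCache:
    """Bounded record of recently seen IDs; evicts the oldest ID first (FIFO)."""

    def __init__(self, max_size: int):
        self._order = deque()   # insertion order, oldest on the left
        self._seen = set()      # same IDs, for O(1) membership tests
        self._max_size = max_size

    def add(self, item_id: str) -> bool:
        """Return True if item_id is new, False if it is a duplicate."""
        if item_id in self._seen:
            return False
        if len(self._order) >= self._max_size:
            oldest = self._order.popleft()
            self._seen.discard(oldest)
        self._order.append(item_id)
        self._seen.add(item_id)
        return True


cache = DedupCache(max_size=2)
results = [cache.add(uuid) for uuid in ["a", "b", "a", "c", "a"]]
print(results)
```

In this run the second "a" is rejected as a duplicate, while the third is accepted again because "a" was evicted when "c" arrived.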

📊 Data Integrity (3)

State Migration: Extended to all data types (persons, sessions)
ORDER BY Fix: Deterministic sorting for person_distinct_ids
Persons Pagination: Fixed HogQL list-to-dict conversion

📚 Documentation (3)

State Schema: Fixed all 3 PostHog docs (CONFIG.md, README.md, example)
Docs Cleanup: Removed Phase 2 features (cohorts/feature_flags)
Schema Addition: Added deduplication_enabled to spec.yaml

✅ Test Infrastructure (3)

Coverage: +27 tests (17 → 44 tests, 40% → 85% coverage)
CLI Tests: Fixed path resolution issues
MCP Tests: Updated to new --base-path architecture

📦 Installation

pip install --upgrade osiris-pipeline==0.5.6

📖 Documentation

🙏 Credits

Thanks to all contributors and the Codex review system for identifying critical issues!

06 Nov 23:31

padak

v0.5.4

a51a2aa

v0.5.4 - CLI version display hotfix

Hotfix - CLI Version Display

This hotfix corrects the CLI version display which was hardcoded instead of reading from the package version.

Fixed

Hardcoded Version in CLI

Problem: The osiris --version command was displaying hardcoded "v0.5.0" regardless of the installed package version.

Root Cause: Lines 183-185 in osiris/cli/main.py had hardcoded version strings that were never updated during version bumps.

Fix Applied: Updated CLI to dynamically read version from package:

# File: osiris/cli/main.py (lines 182-187)
from osiris import __version__
if json_output:
    print(json.dumps({"version": f"v{__version__}"}))
else:
    console.print(f"Osiris v{__version__}")

Impact:

✅ osiris --version now correctly displays the actual installed version
✅ Both text and JSON output (--json --version) work correctly
✅ Users can now reliably verify their installed Osiris version

Before vs After

Before (v0.5.3 installed):

$ pip show osiris-pipeline | grep Version
Version: 0.5.3

$ osiris --version
Osiris v0.5.0  # ❌ Wrong!

After (v0.5.4 installed):

$ pip show osiris-pipeline | grep Version
Version: 0.5.4

$ osiris --version
Osiris v0.5.4  # ✅ Correct!

Files Changed (7 files)

osiris/cli/main.py - Fixed hardcoded version
pyproject.toml - Version bump to 0.5.4
osiris/__init__.py - Version bump to 0.5.4
osiris/mcp/config.py - SERVER_VERSION to 0.5.4
tests/mcp/test_server_boot.py - Test assertion to 0.5.4
README.md - Header and roadmap updated
CHANGELOG.md - Added v0.5.4 entry

Installation

pip install --upgrade osiris-pipeline==0.5.4

Full Changelog: v0.5.3...v0.5.4

06 Nov 23:13

padak

v0.5.3

645375e

v0.5.3 - Python 3.11+ requirement + log_metric kwargs

Bug Fixes - Python Version & Runtime

This release corrects Python version requirements and fixes a critical runtime bug preventing CSV extractor execution.

Fixed

1. Python Version Requirement (Issue #52)

Corrected Python version requirement from >=3.9 to >=3.11 in PyPI metadata and all documentation.

Changes:

pyproject.toml: Updated requires-python = ">=3.11"
Removed Python 3.9/3.10 classifiers, added Python 3.13
Updated documentation: CLAUDE.md, docs/quickstart.md, docs/guides/mcp-production.md, e2e-test-simple.sh

Impact: 🔴 Breaking - Python 3.9 and 3.10 are no longer supported. Use Python 3.11+ to run Osiris.

2. RunnerContext.log_metric() Missing **kwargs (Issue #51)

Fixed signature mismatch causing TypeError when component drivers use the tags parameter with log_metric().

Root Cause: RunnerContext.log_metric() wrapper didn't forward keyword arguments to the underlying session logging function.

Fix Applied:

# File: osiris/core/runner_v0.py:445-446
def log_metric(self, name: str, value: Any, **kwargs):
    log_metric(name, value, **kwargs)

Impact: ✅ CSV extractor and other components now execute correctly with metric tags
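
The failure mode can be reproduced in isolation. In this sketch the module-level log_metric is a stand-in for the session logging function, and both context classes are simplified illustrations rather than the real RunnerContext.

```python
from typing import Any

recorded = []


def log_metric(name: str, value: Any, **kwargs) -> None:
    """Stand-in for the session-level logging function (body is hypothetical)."""
    recorded.append({"name": name, "value": value, **kwargs})


class BrokenContext:
    def log_metric(self, name: str, value: Any):
        # No **kwargs: any call that passes tags=... raises TypeError.
        log_metric(name, value)


class FixedContext:
    def log_metric(self, name: str, value: Any, **kwargs):
        # Keyword arguments are forwarded unchanged.
        log_metric(name, value, **kwargs)


try:
    BrokenContext().log_metric("rows_read", 10, tags={"step": "extract"})
except TypeError as exc:
    print(f"before the fix: {exc}")

FixedContext().log_metric("rows_read", 10, tags={"step": "extract"})
print(recorded[-1])
```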

Files Changed (11 files)

CHANGELOG.md
CLAUDE.md
README.md
docs/guides/mcp-production.md
docs/milestones/mcp-v0.5.0/attachments/e2e-test-simple.sh
docs/quickstart.md
osiris/__init__.py
osiris/core/runner_v0.py
osiris/mcp/config.py
pyproject.toml
tests/mcp/test_server_boot.py

Installation

pip install --upgrade osiris-pipeline==0.5.3

Acknowledgments

Special thanks to @pavel242242 for reporting both issues (#51, #52) with detailed reproduction steps and analysis! 🙏

Full Changelog: v0.5.2...v0.5.3

Contributors

chocholous

06 Nov 22:36

padak

v0.5.2

ac15efe

v0.5.2 - Critical Bug Fixes Batch 3

Critical Bug Fixes Batch 3 - Code Review Findings

This release fixes 10 HIGH priority issues identified in code review, including security vulnerabilities, OML schema bugs, and cross-platform compatibility improvements.

Fixed

Security & Path Safety

Path Traversal Vulnerability (CWE-22) (osiris/mcp/resource_resolver.py:85)
- Fixed directory traversal attack vector in resource resolver
- Added path validation to prevent access outside allowed directories
- 23 comprehensive security tests covering malicious path patterns
- Prevents attacks like osiris://mcp/../../../etc/passwd
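
A common way to block this class of attack is to resolve the requested path and confirm it still lies under the allowed base directory. The sketch below shows that general technique with pathlib; the function names are hypothetical and it is not a copy of the resolver's actual check.

```python
from pathlib import Path


def resolve_within(base_dir: Path, relative: str) -> Path:
    """Resolve `relative` under base_dir, rejecting anything that escapes it."""
    base = base_dir.resolve()
    candidate = (base / relative).resolve()  # collapses '..' segments
    if not candidate.is_relative_to(base):   # Python 3.9+
        raise PermissionError(f"path escapes base directory: {relative!r}")
    return candidate


def is_allowed(base_dir: Path, relative: str) -> bool:
    """Boolean wrapper around resolve_within for quick checks."""
    try:
        resolve_within(base_dir, relative)
    except PermissionError:
        return False
    return True


base = Path("/tmp/osiris-demo/resources")
for requested in ["discovery/cache.json", "../../../etc/passwd", "/etc/passwd"]:
    print(requested, is_allowed(base, requested))
```

Absolute inputs are rejected as well, because joining an absolute path onto the base replaces the base entirely.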

Data Processing & Component Fixes

Collision Detection KeyError (osiris/core/step_naming.py:107)
- Fixed crash when logging multiple colliding step IDs
- Implemented 2-pass collision detection algorithm
- 6 tests covering edge cases (empty lists, single collision, multiple collisions)

Supabase Writer Defaults Mismatch (osiris/drivers/supabase_writer_driver.py:180)
- Fixed hardcoded False for if_exists parameter
- Now properly applies spec defaults from component specification
- Compiler correctly merges component defaults before driver execution

CSV Extractor Header Type Handling (osiris/drivers/filesystem_csv_extractor_driver.py:57)
- Fixed incorrect type handling for header parameter (expected int, got bool)
- Now supports both boolean and integer values for header rows
- Aligns with pandas API: header=0 or header=None

CSV Extractor Unused Config Options (components/filesystem.csv_extractor/spec.yaml:82)
- Implemented missing skip_blank_lines and compression configuration options
- CSV extractor now fully supports all declared spec parameters
- Better handling of malformed CSV files
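
For the header type fix, one plausible mapping from a boolean-or-integer config value to the pandas convention (header=0 or header=None) looks like this. The function name and the exact mapping are assumptions for illustration.

```python
from typing import Optional, Union


def normalize_header(header: Union[bool, int, None]) -> Optional[int]:
    """Map a config value to the `header` argument pandas.read_csv expects."""
    # bool is a subclass of int, so the identity checks must come first.
    if header is None or header is False:
        return None   # file has no header row
    if header is True:
        return 0      # first row holds the column names
    if isinstance(header, int) and header >= 0:
        return header  # explicit zero-based row index
    raise ValueError(f"unsupported header value: {header!r}")


for value in (True, False, 0, 2, None):
    print(repr(value), "->", normalize_header(value))
```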

Cross-Platform & Performance

Windows Compatibility - Row Counting (osiris/drivers/filesystem_csv_extractor_driver.py:268)
- Fixed Unix-specific wc -l command failing on Windows
- Implemented cross-platform row counting using pandas
- CSV extractor now works identically on Windows, macOS, and Linux

MCP Async Blocking (osiris/cli/mcp_bridge/cli_bridge.py:254)
- Fixed blocking async operations in MCP CLI bridge
- Implemented proper non-blocking subprocess execution
- 40% performance improvement in MCP tool response times
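
Non-blocking subprocess execution in an async bridge is typically done with asyncio.create_subprocess_exec instead of a synchronous subprocess.run call. The sketch below shows the pattern with a placeholder command (the Python interpreter itself); run_cli is a hypothetical helper, not the bridge's real function.

```python
import asyncio
import sys


async def run_cli(*args: str) -> tuple[int, str]:
    """Run a command without blocking the event loop; return (exit code, stdout)."""
    proc = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _stderr = await proc.communicate()
    return proc.returncode, stdout.decode()


async def main():
    # Both commands are awaited concurrently; neither blocks the other.
    return await asyncio.gather(
        run_cli(sys.executable, "-c", "print('first')"),
        run_cli(sys.executable, "-c", "print('second')"),
    )


results = asyncio.run(main())
print(results)
```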

OML Schema & Validation

OML Schema Contract Mismatch (osiris/mcp/tools/guide.py:428)
- Fixed guide returning version instead of oml_version field
- Added missing required fields: id and mode in step definitions
- Sample OML now passes strict validation

OML Validation Exit Code Bug (osiris/mcp/mcp_cmd.py:459)
- Fixed validation failures not propagating exit codes to CI/CD
- osiris mcp oml validate now correctly returns non-zero on errors
- Enables proper CI/CD pipeline failure detection

Connection Reference Regex Corruption (osiris/core/oml.py:150)
- Fixed regex pattern corrupting email addresses and URLs in OML
- Connection reference detection now preserves all non-connection content
- Prevents user@example.com → user@connection.alias
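
One way to avoid touching email addresses and URLs is to treat a value as a connection reference only when the entire string has the @family.alias shape. The pattern below is an illustrative assumption, not the expression used in osiris/core/oml.py.

```python
import re

# Whole-value match only: '@family.alias'.
CONNECTION_REF = re.compile(r"@([A-Za-z_]\w*)\.([A-Za-z_][\w-]*)")


def parse_connection_ref(value: str):
    """Return (family, alias) if value is a connection reference, else None."""
    match = CONNECTION_REF.fullmatch(value)
    return (match[1], match[2]) if match else None


for value in ["@mysql.primary", "user@example.com", "https://user@host.example/path"]:
    print(value, "->", parse_connection_ref(value))
```

Using fullmatch rather than a substitution over arbitrary text means strings that merely contain an @ are never rewritten.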

Tests

71 new tests across all fixes:
- 23 path traversal security tests
- 6 collision detection tests
- 12 step naming tests
- 30 CSV extractor regression tests
Zero regressions: All existing tests continue passing
All quality checks pass: Lint, security scans, type checking

Security

CWE-22 Prevention: Path traversal vulnerability eliminated
Input Validation: Enhanced path sanitization in resource resolver
Security Test Coverage: 23 tests for malicious path patterns

Full Changelog: https://github.com/keboola/osiris/blob/main/CHANGELOG.md#052---2025-11-06

06 Nov 19:22

padak

v0.5.0

da3cff4

v0.5.0 - MCP Production Ready + DuckDB Multi-Input Fix + CSV Extractor

Release Date: 2025-11-06

🎉 Major Release

This release delivers a production-ready Model Context Protocol (MCP) server with CLI-first security architecture, comprehensive testing, and full documentation. It also includes a critical DuckDB multi-input fix for proper DataFrame handling and a new filesystem.csv_extractor component.

⚠️ Breaking Changes

DuckDB Multi-Input Table Naming

Pipeline steps with multiple upstream dependencies now use step-id-based table names.

What changed: DuckDB processor steps now register input DataFrames as df_<step_id> tables instead of single input_df table.

Why: Fixed bug where multiple upstream inputs overwrote each other, causing only the last DataFrame to be available.

Migration required: Update SQL queries in DuckDB processor steps to use new naming convention.

Before (broken for multi-input):

- id: calculate
  component: duckdb.processor
  needs:
    - extract-movies
    - extract-reviews
  transformation: |
    SELECT * FROM input_df  -- ❌ Only worked with single input

After (works with multi-input):

- id: calculate
  component: duckdb.processor
  needs:
    - extract-movies
    - extract-reviews
  transformation: |
    SELECT
      m.title,
      AVG(r.rating) as avg_rating
    FROM df_extract_reviews r
    JOIN df_extract_movies m ON r.movie_id = m.id

Table Naming Rules:

Format: df_<sanitized_step_id>
Invalid SQL characters (hyphens, dots) replaced with underscores
Examples: extract-movies → df_extract_movies, get.data → df_get_data
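
The naming rules above can be expressed in a few lines. The name sanitize_step_id matches the module mentioned in these notes, but the body below is an assumption reconstructed from the stated rules, and table_name is a hypothetical helper.

```python
import re


def sanitize_step_id(step_id: str) -> str:
    """Replace characters that are not valid in an unquoted SQL identifier."""
    return re.sub(r"[^0-9A-Za-z_]", "_", step_id)


def table_name(step_id: str) -> str:
    """Build the DuckDB table name registered for an upstream step."""
    return f"df_{sanitize_step_id(step_id)}"


for step_id in ["extract-movies", "get.data"]:
    print(step_id, "->", table_name(step_id))
```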

MCP Tool Name Changes

All MCP tools now use underscore-separated names (connections_list, not osiris.connections.list). Legacy dot-notation aliases are still supported for backward compatibility but are deprecated.

Migration Steps:

Run osiris init in your project directory to configure paths
Update MCP tool calls to use underscore naming (e.g., connections_list)
Verify connection configurations in osiris_connections.yaml
Test MCP server with osiris mcp run --selftest (<2s)

🚀 New Features

Filesystem CSV Extractor Component

Complete CSV data extraction component with comprehensive features:

Core Features:

Basic CSV/TSV reading with configurable delimiter, encoding, and header handling
Column selection with preserved ordering
Advanced parsing: date parsing, custom data types (dtype), NA value handling
Skip rows and row limit options for large file handling
Comment line handling for annotated CSV files

Operational Features:

Discovery mode to list CSV files in directories
Doctor health checks for file validation and accessibility
E2B cloud-compatible path resolution (never uses Path.home())
Error modes: strict (fail fast) or skip (tolerant parsing)
Empty file handling returns empty DataFrame

Quality:

30/30 tests passing (100% pass rate)
Strict component validation passed
E2E verified with full extraction pipeline

Step Naming and Multi-Input Support

New osiris/core/step_naming.py module with sanitize_step_id() for SQL-safe identifier generation
Enhanced DataFrame key generation with collision detection
Safe identifier handling in runner for SQL-compatible table names
Runner now properly handles multiple upstream DataFrames with unique keys
Enhanced logging - DuckDB processor logs registered tables and row counts
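
Sanitizing step IDs can make two distinct IDs map to the same table name (for example get.data and get-data), which is why collision detection matters. A rough sketch of such a check follows; find_collisions is a hypothetical name and the sanitizing rule is assumed.

```python
import re
from collections import defaultdict


def find_collisions(step_ids: list[str]) -> dict[str, list[str]]:
    """Return {sanitized_name: [original IDs]} for names shared by several steps."""
    # Pass 1: group the original IDs by their sanitized form.
    groups = defaultdict(list)
    for step_id in step_ids:
        groups[re.sub(r"[^0-9A-Za-z_]", "_", step_id)].append(step_id)
    # Pass 2: keep only the groups with more than one member.
    return {name: ids for name, ids in groups.items() if len(ids) > 1}


collisions = find_collisions(["get.data", "get-data", "load"])
print(collisions)
```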

🔒 MCP v0.5.0 Production Ready

Phase 1: CLI-First Security Architecture

Zero secret access in MCP process via CLI delegation pattern
Spec-aware secret masking using ComponentRegistry x-secret declarations
10 CLI subcommands for MCP tools across 7 domains
Resource URI system (discovery, memory, OML drafts)
Config-driven filesystem paths (no hardcoded directories)
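
Spec-aware masking means the list of secret fields comes from the component specification rather than from guessing by key name. The sketch below assumes the declarations are available as slash-separated paths such as "/password"; that representation and the function name are assumptions for illustration.

```python
def mask_secrets(config: dict, secret_paths: set[str], prefix: str = "") -> dict:
    """Return a copy of config with every declared secret replaced by '***'."""
    masked = {}
    for key, value in config.items():
        path = f"{prefix}/{key}"
        if path in secret_paths:
            masked[key] = "***"
        elif isinstance(value, dict):
            masked[key] = mask_secrets(value, secret_paths, path)
        else:
            masked[key] = value
    return masked


connection = {
    "host": "db.internal",
    "password": "hunter2",
    "auth": {"token": "abc123", "user": "etl"},
}
safe = mask_secrets(connection, {"/password", "/auth/token"})
print(safe)
```

The input dictionary is left unmodified, so the caller can still use the real values while logging only the masked copy.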

Phase 2: Functional Parity & Completeness

Tool response metrics: correlation_id, duration_ms, bytes_in, bytes_out
AIOP read-only access via MCP for LLM debugging
Memory PII redaction with consent requirement
Cache with 24-hour TTL and invalidation
Telemetry & audit logging with spec-aware masking
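
A cache with a 24-hour TTL and explicit invalidation can be sketched as below. The class is hypothetical, and the injectable clock exists only to make expiry easy to demonstrate without waiting.

```python
import time


class TTLCache:
    """Minimal key/value cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 24 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return default
        return value

    def invalidate(self, key):
        self._entries.pop(key, None)


now = [0.0]  # fake clock, in seconds
cache = TTLCache(clock=lambda: now[0])
cache.set("discovery:mysql.default", {"tables": 3})
fresh = cache.get("discovery:mysql.default")
now[0] += 24 * 60 * 60
stale = cache.get("discovery:mysql.default")
print(fresh, stale)
```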

Phase 3: Comprehensive Testing & Security Hardening

490 Phase 3 tests passing (100% of non-skipped)
Security: 10/10 tests, zero credential leakage verified
Error coverage: 51/51 tests, all 33 error codes covered
Performance: <1.3s selftest, P95 latency ≤ 2× baseline
Server integration: 56/56 tests, 79% coverage (was 17.5%)
Resource resolver: 50/50 tests, 98% coverage (was 47.8%)
Overall coverage: 78.4% (85.1% adjusted)

Phase 4: Documentation & Release Preparation

Migration guide: docs/migration/mcp-v0.5-migration.md
Production guide: docs/guides/mcp-production.md
Manual test procedures for Claude Desktop integration
Comprehensive API documentation for all 10 tools

🔧 Changes

Writer drivers updated - SupabaseWriterDriver and FilesystemCsvWriterDriver now read from df_* keys instead of single df key
Runner input handling - Stores full upstream results by step_id plus DataFrame aliases with df_ prefix
Connection Secret Masking - Now spec-aware using ComponentRegistry
Init Command - Auto-configures base_path to current directory's absolute path

🐛 Fixes

Supabase context manager protocol - Added synchronous __enter__/__exit__ methods to SupabaseClient class
Multiple upstream inputs bug - Fixed runner overwriting DataFrames when step has multiple dependencies
DuckDB multi-table registration - DuckDB processor now registers all upstream DataFrames as separate tables

📊 Statistics

Tests: 490 new Phase 3 tests, 100% pass rate
Coverage: 78.4% overall (85.1% adjusted)
CSV Extractor: 30/30 tests passing
Performance: MCP selftest <1.3s

📚 Documentation

Complete MCP documentation in docs/guides/mcp-production.md
Migration guide in docs/migration/mcp-v0.5-migration.md
Updated CLAUDE.md with MCP development patterns
ADR-0036: MCP Interface CLI-First Architecture

🙏 Contributors

This release was made possible by extensive collaboration and testing.

🤖 Generated with Claude Code

09 Oct 19:29

padak

v0.4.0

3c4dbbf

v0.4.0: Filesystem Contract v1 - Deterministic Directory Structure

🚀 Osiris v0.4.0 - Filesystem Contract v1

Release Date: October 9, 2025
Status: Production Ready

🎯 Overview

Major release implementing Filesystem Contract v1 (ADR-0028), introducing a deterministic, versionable directory structure that replaces the legacy logs/ layout. This release establishes clear separation between build artifacts, runtime logs, and AI observability packages.

✨ Major Features

Filesystem Contract v1 (ADR-0028)

Deterministic paths: build/, aiop/, run_logs/, .osiris/
Profile support: Multi-environment configuration (dev/staging/prod/ml/finance)
Versionable artifacts: Commit-friendly build outputs with content-addressed paths
Clear boundaries: Build (deterministic) vs logs (ephemeral) vs internal state

Run Indexing & Discovery

Fast run listing via .osiris/index/runs.jsonl
Per-pipeline indexes in .osiris/index/by_pipeline/{slug}.jsonl
Latest manifest pointers for --last-compile flag
Thread-safe counters with SQLite + WAL mode
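
A counter backed by SQLite in WAL mode might look like the following sketch. The table layout, function name, and the use of BEGIN IMMEDIATE to serialize writers are assumptions; only "SQLite + WAL" comes from these notes.

```python
import os
import sqlite3
import tempfile


def next_run_number(db_path: str, pipeline: str) -> int:
    """Atomically increment and return the counter for one pipeline."""
    # isolation_level=None puts the connection in autocommit mode so that
    # transactions are controlled explicitly below.
    conn = sqlite3.connect(db_path, timeout=10, isolation_level=None)
    try:
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
        conn.execute(
            "CREATE TABLE IF NOT EXISTS counters "
            "(name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
        )
        conn.execute(
            "INSERT INTO counters (name, value) VALUES (?, 1) "
            "ON CONFLICT(name) DO UPDATE SET value = value + 1",
            (pipeline,),
        )
        (value,) = conn.execute(
            "SELECT value FROM counters WHERE name = ?", (pipeline,)
        ).fetchone()
        conn.execute("COMMIT")
        return value
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    finally:
        conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "counters.sqlite")
numbers = [next_run_number(db_path, "orders-etl") for _ in range(3)]
print(numbers)
```

Each call opens its own connection, so concurrent threads or processes are serialized by SQLite's own locking rather than by shared Python state.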

Retention Policies

Automated cleanup via osiris maintenance clean
Configurable retention by days (run_logs) or count (AIOP)
Dry-run mode for safe testing
Size-aware deletion with human-readable summaries
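
The two retention styles (by age and by count) can be sketched as a pure selection function. Returning the list of candidates without deleting anything is effectively what a dry run reports. All names here are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def select_for_deletion(runs, now, max_age_days=None, keep_last=None):
    """runs: list of (run_id, finished_at). Return run IDs that should be removed."""
    ordered = sorted(runs, key=lambda run: run[1], reverse=True)  # newest first
    doomed = set()
    if max_age_days is not None:
        cutoff = now - timedelta(days=max_age_days)
        doomed.update(run_id for run_id, finished in ordered if finished < cutoff)
    if keep_last is not None:
        doomed.update(run_id for run_id, _ in ordered[keep_last:])
    return [run_id for run_id, _ in ordered if run_id in doomed]


now = datetime(2025, 10, 9, tzinfo=timezone.utc)
runs = [(f"run-{days}", now - timedelta(days=days)) for days in (1, 5, 10, 40)]

by_age = select_for_deletion(runs, now, max_age_days=7)
by_count = select_for_deletion(runs, now, keep_last=3)
print(by_age, by_count)
```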

CLI Improvements

osiris init - Scaffold projects with Filesystem Contract v1
osiris runs - Query run history with filtering
osiris logs aiop - AIOP management (list/show/export/prune)
osiris maintenance - System cleanup and health checks

📊 Statistics

77 files changed (+9,028/-1,822 lines)
1064+ tests passing (43 skipped for E2B live tests)
35 ADRs documenting architecture decisions
Full E2B parity (<1% overhead vs local execution)

🔧 Technical Details

New Core Modules

osiris/core/fs_config.py - Typed configuration models
osiris/core/fs_paths.py - FilesystemContract path resolution
osiris/core/run_ids.py - Multiple ID formats (incremental, ULID, UUID, Snowflake)
osiris/core/run_index.py - Run indexing and discovery
osiris/core/retention.py - Cleanup policies and execution

Updated Paths

Before (v0.3.x):          After (v0.4.0):
logs/session/compiled/    → build/pipelines/{profile}/{slug}/{hash}/
logs/session/artifacts/   → run_logs/{profile}/{slug}/{run_id}/
.osiris_sessions/         → .osiris/sessions/
(no AIOP)                 → aiop/{profile}/{slug}/{hash}/{run_id}/
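
The new layout can be read as three path templates. The helpers below only assemble the directory names shown in the mapping; the function names and the sample values (profile, slug, hash, run ID) are made up for illustration and do not represent the FilesystemContract API.

```python
from pathlib import PurePosixPath


def build_dir(profile: str, slug: str, manifest_hash: str) -> PurePosixPath:
    return PurePosixPath("build/pipelines") / profile / slug / manifest_hash


def run_logs_dir(profile: str, slug: str, run_id: str) -> PurePosixPath:
    return PurePosixPath("run_logs") / profile / slug / run_id


def aiop_dir(profile: str, slug: str, manifest_hash: str, run_id: str) -> PurePosixPath:
    return PurePosixPath("aiop") / profile / slug / manifest_hash / run_id


print(build_dir("dev", "orders-etl", "3f2a9c"))
print(run_logs_dir("dev", "orders-etl", "run-000042"))
print(aiop_dir("dev", "orders-etl", "3f2a9c", "run-000042"))
```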

Configuration Example

# osiris.yaml
version: '2.0'
filesystem:
  base_path: ""  # Project root
  profiles:
    enabled: true
    values: ["dev", "staging", "prod"]
    default: "dev"
  build_dir: "build"
  aiop_dir: "aiop"
  run_logs_dir: "run_logs"

⚠️ Breaking Changes

CompilerV0 API Change

# Before (v0.3.x):
compiler = CompilerV0(output_dir="./logs/session/compiled")

# After (v0.4.0):
compiler = CompilerV0(fs_contract=contract, pipeline_slug="my-pipeline")

Session Paths
- logs/ directory removed entirely
- All paths now resolved via FilesystemContract
- Session logging uses contract-based run logs

AIOP Export
- Moved from logs/session/aiop.json to aiop/{profile}/{slug}/{hash}/{run_id}/
- Delta analysis uses run index instead of hardcoded paths

🔄 Migration Guide

For Existing Projects

Update config (osiris.yaml):

osiris init --upgrade  # Updates existing config to v2.0

Run migration script:

python scripts/migrate_index_manifest_hash.py

Update .gitignore:

# Remove:
logs/

# Add:
run_logs/
aiop/**/annex/
.osiris/cache/
.osiris/index/counters.sqlite*

For New Projects

osiris init  # Scaffolds Filesystem Contract v1 structure

📚 Documentation

ADR-0028: Filesystem Contract v1 specification
User Guide: Updated for new paths and commands
Developer Guide: Module documentation with contract patterns
Reference: Complete CLI command reference

🐛 Bug Fixes

Fixed manifest path resolution in osiris run (Codex finding)
Improved security exception comments with context
Fixed AIOP manifest hash normalization
Fixed run index LATEST pointer format

🙏 Contributors

@padak - Lead development, architecture, and implementation
Codex AI - Code review and bug detection
Devin AI - Infrastructure improvements (closed PRs)

📦 Installation

# Clone repository
git clone https://github.com/keboola/osiris.git
cd osiris

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Initialize project
python osiris.py init

🚦 Next Steps (Roadmap)

M2b: Real-time AIOP streaming
M3: Scale and performance optimization
M4: Data warehouse agent improvements

🔗 Links

Full Changelog: CHANGELOG.md
Migration Guide: docs/milestones/filesystem-contract.md
ADR-0028: docs/adr/0028-filesystem-contract.md

Contributors

padak

07 Oct 12:09

padak

v0.3.5

78235c1

Release v0.3.5 - GraphQL Extractor, DuckDB Processor & Test Infrastructure

Release v0.3.5 (2025-10-07)

This release adds GraphQL API extraction capabilities, DuckDB SQL transformations, and significant test infrastructure improvements with 1001+ passing tests.

✨ Added

GraphQL Extractor Component (graphql.extractor)

New driver: osiris/drivers/graphql_extractor_driver.py for GraphQL API data extraction
Component spec: components/graphql.extractor/spec.yaml with authentication support (Bearer, Basic, API Key)
Support for complex GraphQL queries with variables and nested field extraction
JSONPath-based data extraction from GraphQL responses
Comprehensive test coverage with 16 passing tests
Integration with existing component registry

Connection-aware CLI Validation (ADR-0020)

osiris validate now reads from osiris_connections.yaml for connection validation
Shows configured aliases and missing environment variables per connection
Connection validation integrated into JSON output structure

🔄 Changed

Validation Command Updates (ADR-0020)

osiris validate now uses connection-based validation from osiris_connections.yaml
Removed legacy environment-only probing in favor of connection definitions
Validation output includes connection aliases and per-connection status

Test Infrastructure Improvements

Supabase tests isolated via @pytest.mark.supabase marker for clean separation
Split-run test execution: make test orchestrates both non-Supabase and Supabase phases
Test suite now at 1001+ passing tests with improved isolation

🐛 Fixed

Secret Detection Improvements (ADR-0035)

Fixed false-positive secret detection for standalone "Bearer" keyword
Pattern now only flags "Bearer" when followed by actual token-like strings (16+ chars)
Aligns with ADR-0035 principle: detect real secrets, not keywords
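
The rule "flag Bearer only when a token-like string of 16 or more characters follows" could be approximated with a regular expression like the one below. The character class chosen for "token-like" is an assumption, not the detector's actual pattern.

```python
import re

# 'Bearer' followed by whitespace and at least 16 token-like characters.
BEARER_TOKEN = re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]{16,}")


def looks_like_bearer_secret(text: str) -> bool:
    return BEARER_TOKEN.search(text) is not None


samples = [
    "auth_type: Bearer",
    "Authorization: Bearer abc",
    "Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.payload",
]
for sample in samples:
    print(sample, "->", looks_like_bearer_secret(sample))
```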

Test Warning Fixes

Fixed PytestReturnNotNoneWarning in test_m0_validation_4_logging.py::test_scenario_log_level_comparison
Test now properly uses assertions instead of returning boolean values

CLI Validation Test Updates

Fixed 5 CLI validation tests to work with ADR-0020 connection-based validation
Tests now use temp_connections_yaml fixture for proper osiris_connections.yaml setup
Updated assertions to check for connection aliases instead of legacy missing_vars

📚 Documentation

ADR Status Updates reflecting actual implementation state:

ADR-0031 (OML Control Flow): Status changed to "Proposed (Deferred to M2+)" - 0% implemented
ADR-0032 (Runtime Parameters): Status changed to "Accepted" - 90% implemented, core features production-ready
ADR-0034 (E2B Runtime Parity): Status changed to "Accepted (Amended)" - 85% implemented via Transparent Proxy architecture
ADR-0035 (Compiler Secret Detection): Status changed to "Accepted (Phase 1)" - 80% implemented, x-secret parsing complete
Each ADR now includes detailed implementation status sections with code references and test coverage

Full Changelog: https://github.com/keboola/osiris/blob/main/CHANGELOG.md#035---2025-10-07

29 Sep 11:27

padak

v0.3.1

0c4d05a

v0.3.1 - Fix validation warnings

Release v0.3.1

Fixed

osiris validate: removed spurious additionalProperties warnings for ADR-0020 compliant configs
Improved error messages to list unexpected keys and suggest allowed ones

Added

docs/reference/connection-fields.md: complete reference for MySQL & Supabase connection fields
11 new test cases in tests/core/test_validation_connections.py

Details

The validation system now correctly accepts all ADR-0020 fields including:

default - mark connection as default for family
alias - connection alias metadata
pg_dsn - PostgreSQL DSN for Supabase
dsn - alternative MySQL connection string

Notes

Fully backward compatible
NO-SECRETS posture preserved
Existing configurations remain valid

Full Changelog: v0.3.0...v0.3.1

27 Sep 01:04

padak

v0.3.0

d072f6a

Release v0.3.0: Milestone M2a Complete - AI Operation Package (AIOP)

🚀 Release v0.3.0 - Milestone M2a Complete: AI Operation Package (AIOP)

Release Date: September 27, 2025
Status: Production-Ready AIOP System

This release completes Milestone M2a, delivering a comprehensive, production-ready AI Operation Package (AIOP) system. AIOP provides a four-layer semantic architecture (Evidence, Semantic, Narrative, Metadata) that enables any LLM to fully understand Osiris pipeline runs through structured, deterministic, secret-free exports.

🎯 Milestone Achievement

✅ All 24 acceptance criteria met
✅ 921 tests passing (29 skipped E2B live tests)
✅ ADR-0027 marked IMPLEMENTED
✅ Production-ready quality assurance

🚀 Key Features

AI Operation Package (AIOP) Implementation

Four-layer semantic architecture: Evidence, Semantic, Narrative, and Metadata layers
CLI command: osiris logs aiop with JSON and Markdown export formats
Deterministic output with stable IDs for reproducible analysis
Size-controlled exports with object-level truncation markers (≤300KB core)
Comprehensive secret redaction with DSN masking (postgres://user:***@host/db)
Annex policy for large runs with NDJSON shards and compression
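
DSN masking of the form postgres://user:***@host/db can be sketched with a single substitution. The pattern is an illustrative assumption; it expects passwords containing "@" to be percent-encoded, as URL syntax requires.

```python
import re

# scheme://user:PASSWORD@...  (only the password part is replaced)
DSN_PASSWORD = re.compile(
    r"(?P<prefix>\b[A-Za-z][A-Za-z0-9+.-]*://[^:/@\s]+:)(?P<password>[^@\s]+)(?=@)"
)


def mask_dsn(text: str) -> str:
    """Replace the password component of any DSN found in text with '***'."""
    return DSN_PASSWORD.sub(lambda m: m["prefix"] + "***", text)


print(mask_dsn("postgres://user:secret@host/db"))
print(mask_dsn("redis://cache.internal:6379/0"))
```

A DSN without credentials, such as the Redis example, is returned unchanged because no "@" follows the port.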

System Stabilization (WU7a/b/c)

Delta analysis with "Since last run" comparisons using by-pipeline index
Intent discovery with multi-source provenance (manifest, README, commits, chat logs)
Active duration metrics in aggregated statistics
LLM affordances: metadata.llm_primer with glossary and controls.examples
Platform-safe symlink implementation with Windows fallback
Robust error handling for missing sessions and corrupted indexes

Configuration & Automation

YAML configuration layer with full precedence resolution
Enhanced osiris init with AIOP scaffold, --no-comments and --stdout flags
Configuration precedence: CLI > ENV ($OSIRIS_AIOP_*) > osiris.yaml > defaults
Auto-export after every run with templated paths and retention policies
Effective config tracking in metadata.config_effective with per-key source
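
The precedence chain and per-key source tracking can be sketched as a layered merge. The function name and the setting names (enabled, max_core_bytes) are hypothetical; only the ordering CLI > ENV > osiris.yaml > defaults comes from these notes.

```python
def resolve_config(defaults: dict, yaml_cfg: dict, env_cfg: dict, cli_cfg: dict) -> dict:
    """Merge four layers, later layers winning. Returns {key: (value, source)}."""
    effective = {}
    layers = [
        ("default", defaults),
        ("yaml", yaml_cfg),
        ("env", env_cfg),
        ("cli", cli_cfg),
    ]
    for source, layer in layers:
        for key, value in layer.items():
            effective[key] = (value, source)  # remember which layer set the key
    return effective


effective = resolve_config(
    defaults={"enabled": True, "max_core_bytes": 300_000},
    yaml_cfg={"max_core_bytes": 200_000},
    env_cfg={"enabled": False},
    cli_cfg={},
)
print(effective)
```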

📚 Documentation & Architecture

Complete user guides with quickstart, troubleshooting, and examples
Technical architecture documentation (docs/architecture/aiop.md)
Enhanced overview.md with AIOP integration and workflows
Updated examples with AIOP walkthrough and sample exports
Team operations guide in CLAUDE.md for development workflow

🔒 Security & Quality

Comprehensive secret redaction with zero-leak guarantee
DSN redaction for Redis, MongoDB, PostgreSQL connection strings
Test suite stabilization: All AIOP functionality fully tested
Parity verification: Local vs E2B execution produces identical exports
Deterministic output: Stable IDs, sorted keys, canonical JSON-LD format
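
Sorted keys are one ingredient of deterministic output: two dictionaries with the same content serialize to the same bytes regardless of insertion order. The sketch below covers only that aspect with the standard json module and does not attempt JSON-LD canonicalization.

```python
import json


def canonical_json(obj) -> str:
    """Serialize with sorted keys and fixed separators for stable output."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


first = {"b": 1, "a": {"d": 2, "c": 3}}
second = {"a": {"c": 3, "d": 2}, "b": 1}
print(canonical_json(first))
print(canonical_json(first) == canonical_json(second))
```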

📈 Performance

Memory footprint: <50MB during generation
Generation time: <2 seconds for typical runs
LRU caching for component registry lookups
Streaming JSON generation for large exports

🔧 Quick Start

Enable AIOP in 3 steps:

# 1. Initialize with AIOP enabled
osiris init
# Edit osiris.yaml: ensure aiop.enabled: true

# 2. Run any pipeline
osiris run my-pipeline.yaml

# 3. View the AI-friendly export
cat logs/aiop/latest.json | jq '.narrative.summary'
# or browse the human-readable summary
open logs/aiop/run-card.md

🔄 Migration Guide

AIOP is enabled by default in new installations (aiop.enabled: true in osiris.yaml). For existing installations, run osiris init to add the AIOP configuration, or enable it manually in the configuration.

📋 Breaking Changes

None - this is a pure addition to the existing system.

🔗 Full Changelog

See CHANGELOG.md for complete details.

Releases: keboola/osiris
