Try chDB v3.0.0

pip install -U chdb

Major Changes

Performance

Some performance improvement (12% faster than DuckDB 1.1.3 and chDB 2.1)

Session Mode and Memory Engine Support

Session mode in chDB 3.0 is now equivalent to the new connection-based implementation
Improved support for ClickHouse memory engine and stateful operations
Enhanced performance for INSERT and UPDATE operations
Better handling of background mutation threads and part compaction
New support for SQLite-style ":memory:" connection string mode
- Implemented using ClickHouse memory engine under the hood

New Default Query Mode: Connection-based Queries

Connection-based queries are now the recommended and default way to interact with chDB
Users should first create a connection using chDB connection or DBAPI
Queries can be executed through:
- Connection with cursor (fetch results via cursor)
- Direct connection query (simplified method, results returned directly)

New arch chart

What's Changed

Fix upload pypi and github release by @auxten in #278
Update PyPi info by @auxten in #279
Add citation just like ClickHouse, as chDB is mentioned in ClickHouse Paper by @auxten in #280
SQLite like API mode of chDB by @auxten in #283
Skip dup call build.sh in setup.py by @auxten in #285
Fix starving caused by GIL by @auxten in #289
Refactor query execution to use connection object by @auxten in #290
Patchset 3.0 by @auxten in #294

Full Changelog: v2.1.1...v3.0.0

Connection and Session Management

Connection String Format

# Supported formats with examples:
":memory:"                                    # In-memory database
"test.db"                                    # Relative path
"file:test.db"                               # Explicit file protocol
"/path/to/test.db"                           # Absolute path
"file:/path/to/test.db"                      # Absolute path with protocol
"file:test.db?param1=value1&param2=value2"   # With query parameters
"file::memory:?verbose&log-level=test"       # In-memory with parameters
"///path/to/test.db?param1=value1"           # Triple slash absolute path

Connection Parameters

Parameters in connection strings are passed to ClickHouse engine as startup args
Special parameter handling:
- mode=ro is converted to --readonly=1 for read-only mode
For full parameter list, refer to clickhouse local --help --verbose

Session Behavior

Sessions maintain query state throughout their lifecycle
Only one active session allowed at a time
Creating a new session automatically closes any existing session
Sessions can be temporary or persistent:
- Without specified path: Creates auto-cleaned temporary directory
- With specified path: Maintains persistent database state

Connection States

Only one global connection can exist at a time
Creating a new connection with the same connection string returns the existing connection
Creating a new connection with a different path before closing the existing one will trigger a user warning

API Reference

Detailed API Reference

Connection Class

class connect:
    def __init__(self, path: str = ":memory:"):
        """Initialize connection with optional path parameter"""
        
    def cursor(self) -> cursor:
        """Create and return a new cursor object"""
        
    def execute(self, query_str: str) -> Any:
        """Execute a query directly on the connection"""
        
    def query(self, query_str: str, format: str = "CSV") -> query_result:
        """Execute a query and return a query_result object
        
        Args:
            query_str: SQL query string
            format: Output format (default: "CSV")
        """
        
    def commit(self):
        """Commit current transaction"""
        
    def close(self):
        """Close the connection"""

Cursor Class

class cursor:
    def __init__(self, connection):
        """Initialize cursor with a connection object"""
        
    def execute(self, query: str) -> None:
        """Execute a SQL query"""
        
    def commit(self) -> None:
        """Commit current transaction"""
        
    def close(self) -> None:
        """Close the cursor"""
        
    def get_memview(self) -> memoryview:
        """Get memory view of result data"""
        
    def data_size(self) -> int:
        """Get size of result data in bytes"""
        
    def rows_read(self) -> int:
        """Get number of rows read"""
        
    def bytes_read(self) -> int:
        """Get number of bytes read"""
        
    def elapsed(self) -> float:
        """Get query execution time in seconds"""
        
    def has_error(self) -> bool:
        """Check if there was an error in execution"""
        
    def error_message(self) -> str:
        """Get error message if there was an error"""

Usage Examples

Basic Memory Engine Example

from chdb import connect

conn = connect(":memory:")
cursor = conn.cursor()

# Create a table
cursor.execute("""
    CREATE TABLE users (
        id Int32,
        name String,
        scores Array(UInt8)
    ) ENGINE = Memory
""")

# Insert data
cursor.execute("""
    INSERT INTO users VALUES
    (1, 'Alice', [95, 87, 92]),
    (2, 'Bob', [88, 85, 90]),
    (3, 'Charlie', [91, 89, 94])
""")

# Query data
cursor.execute("SELECT * FROM users WHERE id = 1")
row = cursor.fetchone()

Advanced DBAPI Usage

from chdb import dbapi

# Create connection
conn = dbapi.connect("path/to/db")
cur = conn.cursor()

# Create table
cur.execute("""
    CREATE TABLE rate (
        day Date,
        value Int64
    ) ENGINE = ReplacingMergeTree ORDER BY day
""")

# Insert data (single and multiple rows)
cur.execute("INSERT INTO rate VALUES (%s, %s)", ("2021-01-01", 24))
cur.executemany("INSERT INTO rate VALUES (%s, %s)", [
    ("2021-01-02", 128),
    ("2021-01-03", 256)
])

# Update data
cur.executemany(
    "ALTER TABLE rate UPDATE value = %s WHERE day = %s",
    [(72, "2021-01-02"), (96, "2021-01-03")]
)

# Fetch methods
cur.execute("SELECT value FROM rate ORDER BY day DESC")
row = cur.fetchone()           # Single row
rows = cur.fetchmany(2)        # Multiple rows
all_rows = cur.fetchall()      # All remaining rows

# Clean up
cur.close()
conn.close()

Best Practices

Create a single connection and reuse it throughout your project
Use connection-based queries instead of direct chDB queries
Close existing connections before creating new ones with different paths
Be mindful of thread usage when executing queries

Migration Guide

Users upgrading from pre-2.2 versions should:

Update code to use connection-based queries
Review and update connection string parameters
Update any code relying on the debug output format
Audit thread usage in applications using chDB

For more detailed examples and API usage, refer to:

For binding developers

New API avail:

/**
 * Creates a new chDB connection.
 * Only one active connection is allowed per process.
 * Creating a new connection with different path requires closing existing connection.
 * 
 * @param argc Number of command-line arguments
 * @param argv Command-line arguments array (--path=<db_path> to specify database location)
 * @return Pointer to connection pointer, or NULL on failure
 * @note Default path is ":memory:" if not specified
 */
CHDB_EXPORT struct chdb_conn ** connect_chdb(int argc, char ** argv);

/**
 * Closes an existing chDB connection and cleans up resources.
 * Thread-safe function that handles connection shutdown and cleanup.
 * 
 * @param conn Pointer to connection pointer to close
 */
CHDB_EXPORT void close_conn(struct chdb_conn ** conn);

/**
 * Executes a query on the given connection.
 * Thread-safe function that handles query execution in a separate thread.
 * 
 * @param conn Connection to execute query on
 * @param query SQL query string to execute
 * @param format Output format string (e.g., "CSV", default format)
 * @return Query result structure containing output or error message
 * @note Returns error result if connection is invalid or closed
 */
CHDB_EXPORT struct local_result_v2 * query_conn(struct chdb_conn * conn, const char * query, const char * format);

Please see programs/local/chdb.h

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v3.0.0

Try chDB v3.0.0

Major Changes

Performance

Session Mode and Memory Engine Support

New Default Query Mode: Connection-based Queries

New arch chart

What's Changed

Connection and Session Management

Connection String Format

Connection Parameters

Session Behavior

Connection States

API Reference

Detailed API Reference

Connection Class

Cursor Class

Usage Examples

Basic Memory Engine Example

Advanced DBAPI Usage

Best Practices

Migration Guide

For binding developers

Contributors