43 changes: 13 additions & 30 deletions examples/demo-snowflake-project/README.md
@@ -13,17 +13,17 @@ This project deploys the [Databao](https://github.com/JetBrains/databao-cli) Str
1. **`setup.sql`** provisions everything needed inside Snowflake:
- A dedicated database, warehouse, and compute pool (all named with a configurable suffix)
- Network rules and external access integrations for outbound HTTPS
- A service user with a permissive network policy
- A Git repository object pointing at `databao-cli` on GitHub
- Snowflake secrets for the OpenAI/Anthropic API keys and datasource credentials
- Snowflake secrets for the OpenAI/Anthropic API keys and datasource configuration (warehouse, database)
- A Python UDF (`get_secret`) that reads those secrets at runtime
- The Streamlit app itself, running on a container runtime (`CPU_X64_M`)

2. **`cleanup.sql`** removes all objects created by `setup.sql` for a given suffix.

3. **`app.py`** is the Streamlit entry point that adapts `databao-cli`'s UI for Snowflake:
- Detects whether it is running inside Snowflake (via `/snowflake/session/token`)
- Calls `get_secret()` through a Snowflake SQL session to load secrets into environment variables (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `SNOWFLAKE_DS_*`)
- Calls `get_secret()` through a Snowflake SQL session to load secrets into environment variables (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `SNOWFLAKE_DS_WAREHOUSE`, `SNOWFLAKE_DS_DATABASE`)
- Patches the Snowflake introspector to authenticate using the SiS OAuth session token (re-read from `/snowflake/session/token` on every connection to avoid expiry). This means the datasource connection runs as the logged-in user — no service user or stored credentials needed.
- Locates and configures the ADBC Snowflake driver shared library so DuckDB's Snowflake extension can find it
- Launches the standard Databao UI in **read-only domain** mode

@@ -40,35 +40,20 @@ Open `setup.sql` and fill in the placeholder values at the top:
| `suffix` | Name suffix appended to all Snowflake objects. Set to e.g. `V2` to create a fully independent copy (objects will be named `STREAMLIT_DATABAO_DB_V2`, etc.). Changing the suffix lets you run multiple independent instances side by side. |
| `openai_key` | OpenAI API key |
| `anthropic_key` | Anthropic API key |
| `sf_ds_account` | Snowflake datasource account identifier (see [below](#datasource-credentials-sf_ds_)) |
| `sf_ds_warehouse` | Warehouse for the datasource (see [below](#datasource-credentials-sf_ds_)) |
| `sf_ds_database` | Database to explore (see [below](#datasource-credentials-sf_ds_)) |
| `sf_ds_user` | Service user for the datasource (see [below](#datasource-credentials-sf_ds_)) |
| `sf_ds_password` | Password for that service user |
| `sf_ds_warehouse` | Warehouse for the datasource (see [below](#datasource-configuration-sf_ds_)) |
| `sf_ds_database` | Database to explore (see [below](#datasource-configuration-sf_ds_)) |
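For illustration, the suffix-to-object-name derivation used by `setup.sql` and `cleanup.sql` can be sketched as follows (a non-exhaustive selection of the derived names):

```python
def derived_names(suffix: str) -> dict[str, str]:
    """Per-instance Snowflake object names, as derived in setup.sql."""
    return {
        "database": f"STREAMLIT_DATABAO_DB_{suffix}",
        "warehouse": f"STREAMLIT_DATABAO_WAREHOUSE_{suffix}",
        "egress_rule": f"STREAMLIT_DATABAO_EGRESS_RULE_{suffix}",
        "git_integration": f"STREAMLIT_DATABAO_GIT_INTEGRATION_{suffix}",
        "external_access_integration": f"STREAMLIT_DATABAO_EAI_{suffix}",
        "secrets_access_integration": f"STREAMLIT_DATABAO_SECRETS_ACCESS_{suffix}",
    }
```

Running setup with `suffix = 'V2'` therefore yields `STREAMLIT_DATABAO_DB_V2` and friends, and running cleanup with the same suffix targets exactly that set of objects.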

#### Datasource credentials (`sf_ds_*`)
#### Datasource configuration (`sf_ds_*`)

These credentials are used by the Databao agent to connect to a Snowflake database via the Snowflake API. The agent reads data from this database to answer your questions.

- **`sf_ds_account`** — your Snowflake account identifier (e.g. `abc12345.us-east-1`). You can find it in Snowsight under your account menu.
These settings tell the Databao agent which warehouse and database to use when exploring data. Authentication is handled automatically via the SiS session token — the agent runs as the logged-in Snowflake user, so no service user or password is needed.

- **`sf_ds_warehouse`** — the warehouse the agent will use to run queries. If you don't have one, create it in **Snowsight → Admin → Warehouses → + Warehouse** (an `XSMALL` warehouse is sufficient).

- **`sf_ds_database`** — the database containing the data the agent will explore.

- **`sf_ds_user`** and **`sf_ds_password`** — a service user that the agent authenticates as. To create one:
1. Go to **Snowsight → Admin → Users & Roles → + User**
2. Enter a name (e.g. `STREAMLIT_SERVICE_USER`)
3. Set a password
4. Click **Create User**
5. Grant the user access to the target database and warehouse:

```sql
GRANT USAGE ON WAREHOUSE <your_warehouse> TO USER <your_service_user>;
GRANT USAGE ON DATABASE <your_database> TO USER <your_service_user>;
GRANT USAGE ON ALL SCHEMAS IN DATABASE <your_database> TO USER <your_service_user>;
GRANT SELECT ON ALL TABLES IN DATABASE <your_database> TO USER <your_service_user>;
```
The Snowflake account is detected automatically from the SPCS-provided `SNOWFLAKE_ACCOUNT` environment variable.

> **Note:** Users opening the Streamlit app must have `USAGE` grants on the configured warehouse and database, since the agent authenticates as their Snowflake identity.
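Under the hood, the SiS patch swaps any credential-based auth settings for the session token before connecting. A minimal Python approximation (not the exact implementation in `app.py`):

```python
def oauth_connection_kwargs(base_kwargs: dict, token: str) -> dict:
    """Replace stored-credential auth params with the SiS OAuth token."""
    kwargs = dict(base_kwargs)
    # Drop credential-based auth settings carried over from the YAML config
    for key in ("password", "private_key", "private_key_file",
                "private_key_file_pwd", "authenticator", "token"):
        kwargs.pop(key, None)
    # Authenticate with the session token instead
    kwargs["authenticator"] = "oauth"
    kwargs["token"] = token
    return kwargs
```

The token is re-read from `/snowflake/session/token` on each new connection, since SiS rotates the token file periodically.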

### 2. Run the Setup Script

@@ -80,7 +65,7 @@ Once the script finishes, navigate to **Streamlit** in Snowsight and open the ap

## Cleanup

To remove all Snowflake objects created by `setup.sql`, open `cleanup.sql`, set the same `suffix` you used during setup, and run the script as `ACCOUNTADMIN`. This drops the database (cascading to all database-scoped objects), compute pool, integrations, service user, network policy, and warehouse.
To remove all Snowflake objects created by `setup.sql`, open `cleanup.sql`, set the same `suffix` you used during setup, and run the script as `ACCOUNTADMIN`. This drops the database (cascading to all database-scoped objects), compute pool, integrations, and warehouse.

## Local Development

@@ -91,18 +76,16 @@ uv sync
# Set the required environment variables
export OPENAI_API_KEY="..."
export ANTHROPIC_API_KEY="..."
export SNOWFLAKE_DS_ACCOUNT="..."
export SNOWFLAKE_ACCOUNT="..."
export SNOWFLAKE_DS_WAREHOUSE="..."
export SNOWFLAKE_DS_DATABASE="..."
export SNOWFLAKE_DS_USER="..."
export SNOWFLAKE_DS_PASSWORD="..."

# Run the Streamlit app
uv run streamlit run src/databao_snowflake_demo/app.py -- \
--project-dir .
```

When running locally, the Snowflake secret-loading logic is skipped (it only activates inside a Snowflake container). Environment variables must be set manually.
When running locally, the SiS session-token patch and Snowflake secret-loading logic are skipped (they only activate inside a Snowflake container). Environment variables must be set manually. The datasource YAML uses `externalbrowser` auth by default, which opens your browser for Snowflake SSO. In SiS, this is overridden by the OAuth session token patch.
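The `{{ env_var('...') }}` placeholders in the datasource YAML are resolved against these environment variables by databao-cli's own templating; as a rough stand-in for that substitution (illustrative only, not the real template engine):

```python
import os
import re


def render_env_vars(template: str) -> str:
    """Resolve {{ env_var('NAME') }} placeholders against the environment."""
    return re.sub(
        r"\{\{\s*env_var\('([A-Za-z0-9_]+)'\)\s*\}\}",
        lambda m: os.environ.get(m.group(1), ""),
        template,
    )
```

With `SNOWFLAKE_ACCOUNT` exported, a line such as `account: {{ env_var('SNOWFLAKE_ACCOUNT') }}` renders to the concrete account identifier before the connection is opened.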

## Updating the `databao` Package

12 changes: 0 additions & 12 deletions examples/demo-snowflake-project/cleanup.sql
@@ -15,13 +15,9 @@ CREATE WAREHOUSE IF NOT EXISTS STREAMLIT_DATABAO_BOOTSTRAP_WH
USE WAREHOUSE STREAMLIT_DATABAO_BOOTSTRAP_WH;

DECLARE
_sql VARCHAR;

-- Derived object names (must match setup.sql)
_db VARCHAR DEFAULT 'STREAMLIT_DATABAO_DB_' || $suffix;
_wh VARCHAR DEFAULT 'STREAMLIT_DATABAO_WAREHOUSE_' || $suffix;
_user VARCHAR DEFAULT 'STREAMLIT_DATABAO_USER_' || $suffix;
_network_policy VARCHAR DEFAULT 'STREAMLIT_DATABAO_NETWORK_POLICY_' || $suffix;
_git_integration VARCHAR DEFAULT 'STREAMLIT_DATABAO_GIT_INTEGRATION_' || $suffix;
_eai VARCHAR DEFAULT 'STREAMLIT_DATABAO_EAI_' || $suffix;
_secrets_access VARCHAR DEFAULT 'STREAMLIT_DATABAO_SECRETS_ACCESS_' || $suffix;
@@ -40,14 +36,6 @@ BEGIN
-- API integration (git)
EXECUTE IMMEDIATE 'DROP INTEGRATION IF EXISTS ' || :_git_integration;

-- User (unset network policy first, then drop)
_sql := 'ALTER USER IF EXISTS ' || :_user || ' UNSET NETWORK_POLICY';
EXECUTE IMMEDIATE :_sql;
EXECUTE IMMEDIATE 'DROP USER IF EXISTS ' || :_user;

-- Network policy
EXECUTE IMMEDIATE 'DROP NETWORK POLICY IF EXISTS ' || :_network_policy;

-- Warehouse
EXECUTE IMMEDIATE 'DROP WAREHOUSE IF EXISTS ' || :_wh;
END;
@@ -1,9 +1,8 @@
type: snowflake
name: snowflake
connection:
account: {{ env_var('SNOWFLAKE_DS_ACCOUNT') }}
account: {{ env_var('SNOWFLAKE_ACCOUNT') }}
warehouse: {{ env_var('SNOWFLAKE_DS_WAREHOUSE') }}
database: {{ env_var('SNOWFLAKE_DS_DATABASE') }}
user: {{ env_var('SNOWFLAKE_DS_USER') }}
auth:
password: {{ env_var('SNOWFLAKE_DS_PASSWORD') }}
authenticator: externalbrowser
64 changes: 7 additions & 57 deletions examples/demo-snowflake-project/setup.sql
@@ -11,11 +11,8 @@ SET suffix = 'DEMO';
-- Secrets
SET openai_key = '<YOUR_OPENAI_API_KEY>';
SET anthropic_key = '<YOUR_ANTHROPIC_API_KEY>';
SET sf_ds_account = '<SNOWFLAKE_DATASOURCE_ACCOUNT>';
SET sf_ds_warehouse = '<SNOWFLAKE_DATASOURCE_WAREHOUSE>';
SET sf_ds_database = '<SNOWFLAKE_DATASOURCE_DATABASE>';
SET sf_ds_user = '<SNOWFLAKE_DATASOURCE_USER>';
SET sf_ds_password = '<SNOWFLAKE_DATASOURCE_PASSWORD>';

-- Git repository
SET git_repo_origin = 'https://github.com/JetBrains/databao-cli.git';
@@ -44,11 +41,8 @@ DECLARE
-- Configuration
_openai_key VARCHAR DEFAULT $openai_key;
_anthropic_key VARCHAR DEFAULT $anthropic_key;
_ds_account VARCHAR DEFAULT $sf_ds_account;
_ds_warehouse VARCHAR DEFAULT $sf_ds_warehouse;
_ds_database VARCHAR DEFAULT $sf_ds_database;
_ds_user VARCHAR DEFAULT $sf_ds_user;
_ds_password VARCHAR DEFAULT $sf_ds_password;
_git_origin VARCHAR DEFAULT $git_repo_origin;
_git_repo VARCHAR DEFAULT $git_repo_name;
_git_branch VARCHAR DEFAULT $git_branch;
@@ -58,8 +52,6 @@ DECLARE
_db VARCHAR DEFAULT 'STREAMLIT_DATABAO_DB_' || $suffix;
_wh VARCHAR DEFAULT 'STREAMLIT_DATABAO_WAREHOUSE_' || $suffix;
_egress_rule VARCHAR DEFAULT 'STREAMLIT_DATABAO_EGRESS_RULE_' || $suffix;
_user VARCHAR DEFAULT 'STREAMLIT_DATABAO_USER_' || $suffix;
_network_policy VARCHAR DEFAULT 'STREAMLIT_DATABAO_NETWORK_POLICY_' || $suffix;
_git_integration VARCHAR DEFAULT 'STREAMLIT_DATABAO_GIT_INTEGRATION_' || $suffix;
_eai VARCHAR DEFAULT 'STREAMLIT_DATABAO_EAI_' || $suffix;
_secrets_access VARCHAR DEFAULT 'STREAMLIT_DATABAO_SECRETS_ACCESS_' || $suffix;
@@ -89,29 +81,8 @@ BEGIN
|| ' VALUE_LIST = (''0.0.0.0:443'', ''0.0.0.0:80'')';
EXECUTE IMMEDIATE :_sql;

-- Unset policy from our user first so CREATE OR REPLACE succeeds on re-runs
_sql := 'ALTER USER IF EXISTS ' || :_user || ' UNSET NETWORK_POLICY';
EXECUTE IMMEDIATE :_sql;

_sql := 'CREATE OR REPLACE NETWORK POLICY ' || :_network_policy
|| ' ALLOWED_IP_LIST = (''0.0.0.0/0'')'
|| ' COMMENT = ''Allow all network connections''';
EXECUTE IMMEDIATE :_sql;

-- ==========================================================
-- 3. Service User
-- ==========================================================
_sql := 'CREATE OR REPLACE USER ' || :_user
|| ' TYPE = SERVICE'
|| ' DEFAULT_ROLE = ''PUBLIC''';
EXECUTE IMMEDIATE :_sql;

_sql := 'ALTER USER ' || :_user
|| ' SET NETWORK_POLICY = ''' || :_network_policy || '''';
EXECUTE IMMEDIATE :_sql;

-- ==========================================================
-- 4. Git Repository
-- 3. Git Repository
-- ==========================================================
_sql := 'CREATE OR REPLACE API INTEGRATION ' || :_git_integration
|| ' API_PROVIDER = git_https_api'
@@ -129,7 +100,7 @@ BEGIN
EXECUTE IMMEDIATE :_sql;

-- ==========================================================
-- 5. Application Secrets
-- 4. Application Secrets
-- ==========================================================
_sql := 'CREATE OR REPLACE SECRET ' || :_db || '.PUBLIC.openai_api_key'
|| ' TYPE = GENERIC_STRING'
@@ -141,11 +112,6 @@ BEGIN
|| ' SECRET_STRING = ''' || :_anthropic_key || '''';
EXECUTE IMMEDIATE :_sql;

_sql := 'CREATE OR REPLACE SECRET ' || :_db || '.PUBLIC.snowflake_ds_account'
|| ' TYPE = GENERIC_STRING'
|| ' SECRET_STRING = ''' || :_ds_account || '''';
EXECUTE IMMEDIATE :_sql;

_sql := 'CREATE OR REPLACE SECRET ' || :_db || '.PUBLIC.snowflake_ds_warehouse'
|| ' TYPE = GENERIC_STRING'
|| ' SECRET_STRING = ''' || :_ds_warehouse || '''';
@@ -156,18 +122,8 @@ BEGIN
|| ' SECRET_STRING = ''' || :_ds_database || '''';
EXECUTE IMMEDIATE :_sql;

_sql := 'CREATE OR REPLACE SECRET ' || :_db || '.PUBLIC.snowflake_ds_user'
|| ' TYPE = GENERIC_STRING'
|| ' SECRET_STRING = ''' || :_ds_user || '''';
EXECUTE IMMEDIATE :_sql;

_sql := 'CREATE OR REPLACE SECRET ' || :_db || '.PUBLIC.snowflake_ds_password'
|| ' TYPE = GENERIC_STRING'
|| ' SECRET_STRING = ''' || :_ds_password || '''';
EXECUTE IMMEDIATE :_sql;

-- ==========================================================
-- 6. External Access Integrations
-- 5. External Access Integrations
-- ==========================================================
_sql := 'CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION ' || :_eai
|| ' ALLOWED_NETWORK_RULES = (' || :_db || '.PUBLIC.' || :_egress_rule || ')'
@@ -179,16 +135,13 @@ BEGIN
|| ' ALLOWED_AUTHENTICATION_SECRETS = ('
|| :_db || '.PUBLIC.openai_api_key, '
|| :_db || '.PUBLIC.anthropic_api_key, '
|| :_db || '.PUBLIC.snowflake_ds_account, '
|| :_db || '.PUBLIC.snowflake_ds_warehouse, '
|| :_db || '.PUBLIC.snowflake_ds_database, '
|| :_db || '.PUBLIC.snowflake_ds_user, '
|| :_db || '.PUBLIC.snowflake_ds_password'
|| :_db || '.PUBLIC.snowflake_ds_database'
|| ') ENABLED = TRUE';
EXECUTE IMMEDIATE :_sql;

-- ==========================================================
-- 7. Compute Pool
-- 6. Compute Pool
-- ==========================================================
EXECUTE IMMEDIATE 'DROP COMPUTE POOL IF EXISTS ' || :_compute_pool;
_sql := 'CREATE COMPUTE POOL ' || :_compute_pool
@@ -200,7 +153,7 @@ BEGIN
EXECUTE IMMEDIATE :_sql;

-- ==========================================================
-- 8. UDF + Streamlit App
-- 7. UDF + Streamlit App
-- ==========================================================
_sql := 'CREATE OR REPLACE FUNCTION ' || :_db || '.PUBLIC.get_secret(secret_name STRING)'
|| ' RETURNS STRING'
@@ -211,11 +164,8 @@ BEGIN
|| ' SECRETS = ('
|| ' ''openai_api_key'' = ' || :_db || '.PUBLIC.openai_api_key,'
|| ' ''anthropic_api_key'' = ' || :_db || '.PUBLIC.anthropic_api_key,'
|| ' ''snowflake_ds_account'' = ' || :_db || '.PUBLIC.snowflake_ds_account,'
|| ' ''snowflake_ds_warehouse'' = ' || :_db || '.PUBLIC.snowflake_ds_warehouse,'
|| ' ''snowflake_ds_database'' = ' || :_db || '.PUBLIC.snowflake_ds_database,'
|| ' ''snowflake_ds_user'' = ' || :_db || '.PUBLIC.snowflake_ds_user,'
|| ' ''snowflake_ds_password'' = ' || :_db || '.PUBLIC.snowflake_ds_password'
|| ' ''snowflake_ds_database'' = ' || :_db || '.PUBLIC.snowflake_ds_database'
|| ') AS '
|| '$$' || CHR(10)
|| 'import _snowflake' || CHR(10)
51 changes: 47 additions & 4 deletions examples/demo-snowflake-project/src/databao_snowflake_demo/app.py
@@ -2,9 +2,16 @@
import logging
import os
import sys
from collections.abc import Generator
from contextlib import contextmanager
from pathlib import Path
from typing import Any

import snowflake.connector
import streamlit as st
from databao_context_engine.plugins.databases.snowflake.snowflake_introspector import (
SnowflakeIntrospector,
)

from databao_cli.ui.app import main

@@ -13,18 +20,17 @@
SNOWFLAKE_SECRETS: dict[str, str] = {
"openai_api_key": "OPENAI_API_KEY",
"anthropic_api_key": "ANTHROPIC_API_KEY",
"snowflake_ds_account": "SNOWFLAKE_DS_ACCOUNT",
"snowflake_ds_warehouse": "SNOWFLAKE_DS_WAREHOUSE",
"snowflake_ds_database": "SNOWFLAKE_DS_DATABASE",
"snowflake_ds_user": "SNOWFLAKE_DS_USER",
"snowflake_ds_password": "SNOWFLAKE_DS_PASSWORD",
}

SESSION_TOKEN_PATH = Path("/snowflake/session/token")

ADBC_LIB = "libadbc_driver_snowflake.so"


def _is_running_in_snowflake() -> bool:
return Path("/snowflake/session/token").exists()
return SESSION_TOKEN_PATH.exists()


def _ensure_adbc_driver() -> None:
@@ -72,9 +78,46 @@ def _load_snowflake_secrets() -> None:
logger.warning("Failed to load secret '%s'", secret_name, exc_info=True)


def _patch_snowflake_introspector_for_sis() -> None:
"""Monkey-patch SnowflakeIntrospector._connect for Streamlit-in-Snowflake (SiS).

In SiS, the runtime maintains an OAuth session token at
/snowflake/session/token. We re-read it on every connection to avoid expiry
(tokens are valid for roughly an hour; the file is refreshed every few minutes).

DCE's _connect must return a context manager because BaseIntrospector uses
``with self._connect(file_config) as conn:``.
"""
@contextmanager
def _sis_connect(self: Any, file_config: Any, *, catalog: str | None = None) -> Generator[Any, None, None]:
token = SESSION_TOKEN_PATH.read_text().strip()
snowflake.connector.paramstyle = "qmark"
kwargs = file_config.connection.to_snowflake_kwargs()
# Replace any existing auth params with OAuth token
kwargs.pop("password", None)
kwargs.pop("private_key", None)
kwargs.pop("private_key_file", None)
kwargs.pop("private_key_file_pwd", None)
kwargs.pop("authenticator", None)
kwargs.pop("token", None)
kwargs["authenticator"] = "oauth"
kwargs["token"] = token
if catalog:
kwargs["database"] = catalog
conn = snowflake.connector.connect(**kwargs)
try:
yield conn
finally:
conn.close()

SnowflakeIntrospector._connect = _sis_connect # type: ignore[assignment]
logger.info("Patched SnowflakeIntrospector._connect for SiS OAuth token auth")


_ensure_adbc_driver()
if _is_running_in_snowflake():
_load_snowflake_secrets()
_patch_snowflake_introspector_for_sis()

if "--project-dir" not in sys.argv:
sys.argv.extend(["--project-dir", "examples/demo-snowflake-project"])