Skip to content

feat: template connectors#8981

Open
royendo wants to merge 52 commits intomainfrom
feat-template-connector
Open

feat: template connectors#8981
royendo wants to merge 52 commits intomainfrom
feat-template-connector

Conversation

@royendo
Copy link
Contributor

@royendo royendo commented Mar 5, 2026

Move connector schemas, SQL generation, and YAML templating from hand-written TypeScript into declarative JSON template definitions served by a new Go runtime/templates/ package.

Excluding auto-generated code (proto, Orval), JSON definitions, and deleted TypeScript schemas, the actual implementation is ~2,100 lines of new Go + TypeScript across 46 files — replacing ~3,400 lines of removed TypeScript.

What changed

  • New runtime/templates/ Go package (7 files, ~1,100 lines): registry with embed.FS loading, property processor that extracts env var refs for secrets, text/template renderer with [[ ]] delimiters, and template helper functions (duckdbSQL, propVal, default, azureContainer, azureBlobPath, clickhouseURLSuffix)
  • 32 JSON template definitions (~3,300 lines): each definition declares its JSON Schema (form fields, validation, display hints), output file templates, and OLAP-specific SQL inline — no Go code changes needed to add or modify a connector
  • New proto RPCs: ListTemplates returns all definitions with schemas; GenerateFile renders templates server-side, optionally writing files and merging .env
  • Frontend rewired: AddDataModal fetches schemas from ListTemplates API instead of importing TypeScript schema files; form rendering, validation, and submission unchanged; icon resolution uses template-level icon/small_icon fields
  • Deleted: 16 TypeScript schema files (schemas/*.ts), sourceUtils.ts (SQL generation), connector-icon-mapping.ts, icons.ts — all replaced by JSON definitions
  • ClickHouse HTTPS headers: url() table function support with auto-detected format parameter and headers() syntax for authenticated endpoints
  • ClickHouse local_file/sqlite stubs: emptied with _reason — ClickHouse's file()/sqlite() require server-side user_files directory, unusable in Rill Cloud
  • OLAP detection fix: normalizeOlapForTemplate checks both connectors and projectConnectors to resolve arbitrarily-named OLAP connectors (e.g. clickhouse_1)

Code review fixes

  • .env injection hardening: appendEnvVar now strips newlines and quotes values containing spaces/special characters to prevent env var injection
  • Hardcoded managed skip replaced: generic x-omit-if-default schema annotation replaces the ClickHouse-specific key == "managed" check in render.go; removed dead toBool helper
  • Error logging: repo access failures in GenerateTemplate, GenerateFile, and writeRenderedFiles now log warnings instead of silently swallowing errors
  • defaultVal safety: added doc comment warning against pipeline syntax (only positional [[ default (expr) "fallback" ]] is safe)
  • JSON template consistency: normalized display_name across OLAP engines (Postgres, Azure); x-category values use camelCase (sourceOnly); added missing name field pattern validation to https-duckdb.json
  • DuckDB managed default: changed from false to true to match the allOf conditional that enforces const: true for rill-managed
  • Dead code removed: unused exports getSchemaSecretKeys, getSchemaStringKeys, getBackendConnectorName from schema-utils.ts

Line count breakdown

Category Files +/-
Implementation (Go + TS) 46 +2,124 / -868
Tests (Go + TS) 9 +2,325 / -1
JSON definitions 28 +3,293
Auto-generated (proto, Orval) 10 +5,393 / -1,784
Deleted TS schemas/utils 23 +7 / -3,394

QA: Clickhouse, with verification workaround (Aditya will work on proper changes here)
https://ui.rilldata.com/demo/ch-managed-sources/-/status

  • Public GCS ✅
  • HMAC GCS ✅
  • Azure Conn String ✅
  • Azure Storage key ✅
  • s3 Key/secret ✅
  • postgres ✅

Some of the error messages are quite confusing as it always complains about manually setting the columns when thats not the actual issue. will likely need a follow up PR that helps with error messaging.

Checklist:

  • Covered by tests
  • Ran it and it works as intended
  • Reviewed the diff before requesting a review
  • Checked for unhandled edge cases
  • Linked the issues it closes
  • Checked if the docs need to be updated. If so, create a separate Linear DOCS issue
  • Intend to cherry-pick into the release branch
  • I'm proud of this work!

Developed in collaboration with Claude Code

lovincyrus and others added 30 commits February 20, 2026 13:23
Adds a new GenerateTemplate RPC endpoint that serves as the single source
of truth for YAML format generation. Replaces frontend imperative YAML
builders with a centralized backend handler.

- Add EnvVarName field to PropertySpec for custom env var name mapping
- Populate EnvVarName values across 14 driver specs (S3, GCS, Azure,
  ClickHouse, Postgres, BigQuery, Snowflake, Redshift, MotherDuck,
  Athena, MySQL, Druid, Pinot, Salesforce)
- Define GenerateTemplate proto messages with validation annotations
- Implement handler: auth check, driver validation, property validation,
  DuckDB rewrite for object/file stores and sqlite
- Add comprehensive YAML renderers for connectors and models with precise
  formatting via yaml.Node tree construction
- Implement env var conflict resolution with _1, _2 suffix appending
- Add 54 passing test cases covering all drivers, error scenarios, and
  edge cases

Frontend migration to use this RPC will follow as a separate step.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace client-side YAML builders with backend RPC for single source of truth.

- New helper: `generateTemplate()` + `mergeEnvVars()` in generate-template.ts
- Updated submission paths: submitAddConnectorForm, submitAddSourceForm, saveConnectorAnyway
- Updated preview: computeYamlPreview now calls RPC (async, 150ms debounced)
- HTTPS connector path preserved (backend doesn't support headers array yet)
- Removed dead code: compileSourceYAML, prepareSourceFormData, maybeRewriteToDuckDb
- Backend now handles DuckDB rewrite and env var naming (conflict resolution)
- Tests: added mergeEnvVars (9 tests) and normalizeConnectorError (11 tests)
- Net: 234 lines removed (~27% reduction in submission/preview code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ector headers

Add HTTPS headers support to the backend GenerateTemplate RPC, enabling removal
of all client-side YAML generation code. The HTTPS connector now uses the same
RPC path as all other connectors.

Backend changes:
- Add `headers` property to HTTPS driver ConfigProperties
- Implement header rendering with sensitive value extraction to .env
- Support Authorization header scheme prefixes (Bearer, Basic, Token, Bot)
- Use `connector.{driver}.{key}` naming convention for header env vars
- Add 4 comprehensive test cases for header handling

Frontend changes:
- Convert key-value UI arrays to maps before RPC submission
- Remove HTTPS-specific YAML preview and submission code
- All connectors now use GenerateTemplate RPC exclusively
- Remove 13 dead functions (compileConnectorYAML, updateDotEnvWithSecrets, etc)
- Consolidate env var naming to backend-only responsibility

Schema cleanup:
- Remove 30 dead `x-env-var-name` annotations (frontend no longer reads them)
- Single source of truth for env var names: backend PropertySpec.EnvVarName

Results:
- -1,317 lines of dead code removed
- +221 lines of backend implementation
- 2,055 frontend tests pass, 62 backend tests pass (4 new HTTPS tests)
The mapping from property Key to env var name doesn't follow a mechanical
pattern. Well-known names are shared across drivers (AWS_ACCESS_KEY_ID),
some add infixes (AZURE_STORAGE_*), and others diverge entirely
(GCS key_id -> GCP_ACCESS_KEY_ID). Each driver must specify explicitly.

Also remove personal docs from .gitignore.
- Fix gofmt alignment in connectors.go PropertySpec struct
- Remove unused connectorName param from renderDuckDBModelYAML
- Use const instead of let for newEnvBlob and connectorYamlBlob
…o — Core types: Template, File, ProcessedProp registry.go — Registry with //go:embed loading of YAML definitions, List(), Get(), ListByTags(), LookupByDriver() render.go — Rendering engine: pre-processes properties (secret extraction, empty filtering, DuckDB derivation), renders Go text/template with [[ ]] delimiters funcmap.go — Template functions: renderProps, indent, quote env.go — Extracted: ResolveEnvVarName, ReadEnvKeys headers.go — Extracted: IsSensitiveHeaderKey, SplitAuthSchemePrefix, HeaderKeyToEnvSegment, ResolveHeaderEnvVarName (with fixed per-call regex compilation bug) duckdb.go — Extracted: BuildDuckDBQuery, matchesExt (with fixed false positive bug for paths like parquet-archive/readme.txt) 32 template definitions (runtime/templates/definitions/) 17 connector templates: s3, gcs, azure, https, postgres, bigquery, snowflake, mysql, athena, redshift, salesforce, clickhouse, duckdb, motherduck, druid, pinot, starrocks 7 DuckDB model templates: s3-duckdb, gcs-duckdb, azure-duckdb, https-duckdb, local-file-duckdb, sqlite-duckdb 8 warehouse model templates: clickhouse-model, postgres-model, mysql-model, bigquery-model, snowflake-model, athena-model, redshift-model, salesforce-model 1 new template: iceberg-duckdb (the motivating use case) Proto definitions Template, TemplateFile, ListTemplatesRequest/Response, GenerateFileRequest/Response, GeneratedFile messages ListTemplates and GenerateFile RPCs on RuntimeService Frontend client auto-generated via Orval Server handlers (runtime/server/templates.go) ListTemplates — returns templates filtered by tags GenerateFile — renders template, optionally writes files + merges .env Tests (4 test files, 32+ test cases) Registry: loading, lookup, tag filtering, sorted output, all definitions valid Render: S3 connector, S3-DuckDB model, Snowflake warehouse model, Redshift no-dev, Iceberg-DuckDB, env var conflicts, empty filtering, output filtering, local file, SQLite Env: explicit name, fallback, conflict, double conflict DuckDB: query building for all formats, false positive fix Headers: sensitive detection, auth scheme splitting, env segment naming Bug fixes from PRD containsExt false positive fixed (basename suffix check) headerKeyToEnvSegment regex compiled at package level Render errors propagated (not silently empty)
royendo and others added 18 commits March 5, 2026 17:40
…PI-driven schemas

- Add empty stub JSON definitions for future connectors (ClickHouse + DuckDB)
- Update connector-schemas.spec.ts, AddDataFormManager.spec.ts, and
  add-source-visibility.spec.ts to use `populateSchemaCache` instead of
  the removed `multiStepFormSchemas` static dict

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests that depend on `getConnectorSchema`, `isMultiStepConnector`, and
`hasExplorerStep` now call `populateSchemaCache` with test fixtures
instead of relying on the removed `multiStepFormSchemas` static dict.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix undefined `connectors` → `$connectorsStore` in reactive block
- Remove duplicate `selectedSchema` declaration by inlining into `formWidthClass`
- Prefix unused `existingEnvBlob` param with underscore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cascading changes from the orval→connectRPC merge: replace
`instanceId` / `runtime` store usage with `RuntimeClient` from
`useRuntimeClient()` in generate-template, AddDataFormManager,
connector-schemas, and submitAddDataForm.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor Author

@royendo royendo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High severity

1. .env value injection in appendEnvVar (templates.go:161-171)

appendEnvVar writes values to .env without quoting or sanitizing newlines. A property value containing \n would inject arbitrary key-value pairs:

# Input value: "secret\nSTEAL_KEY=evil"
# Produces:
CONNECTOR_S3_KEY=secret
STEAL_KEY=evil

Values with spaces or = characters will also produce malformed entries. Standard .env parsers expect KEY="value with spaces". Consider quoting values or at minimum stripping newlines.

2. defaultVal parameter ordering (funcmap.go:84)

func defaultVal(val, fallback string) string

Go's text/template pipes the pipeline value to the last parameter. So {{ .foo | default "fallback" }} would pass .foo as fallback and "fallback" as val — the opposite of intent. If this function is only used with positional syntax ({{ default .foo "bar" }}), it works correctly. Worth verifying all usages in the JSON template definitions, or swapping the parameter order to (fallback, val string) for pipeline safety.

Medium severity

3. Silent error swallowing on repo access (generate_template.go:67, templates.go:89, templates.go:140)

Three places discard errors from repo.Get / s.runtime.Repo. Permission errors, I/O failures, and corruption are silently ignored, meaning env var conflicts will not be detected. At minimum these should be logged.

4. ListTemplates has no auth check (templates.go:19)

GenerateFile checks auth.GetClaims, but ListTemplates does not. Template definitions include schema structure and docs URLs. Is this intentional?

5. Hardcoded managed: false skip in generic renderer (render.go:175-177)

ClickHouse-specific business logic (key == "managed") is embedded in the generic processPropertiesFromSchema. This could be driven by a schema annotation (e.g., x-skip-if-default: true) rather than a hardcoded key name.

6. JSON template inconsistencies across OLAP engines

Issue Details
S3 auth enum DuckDB: "access_keys" (plural) vs ClickHouse: "access_key" (singular)
Display names "Postgres" vs "PostgreSQL", "Azure" vs "Azure Blob Storage"
x-category casing Mix of camelCase (objectStore, fileStore) and snake_case (source_only)
HTTPS name validation https-duckdb.json name field lacks the ^[a-zA-Z0-9_]+$ pattern present in all other templates
Template variable renderProps .props vs renderProps .config_props — functionally identical today, fragile if they diverge

7. DuckDB managed field default mismatch (olap/duckdb.json)

Defaults to false but the allOf rule requires true when connector_type is "rill-managed". The ClickHouse equivalent defaults to true. A user selecting "Rill Managed" DuckDB would initially see an invalid default.

Low severity

8. Template duplication opportunity

DuckDB model and connector templates share nearly identical boilerplate across 10+ JSON files. The File struct already supports code_template_file — shared .tmpl files could reduce maintenance burden. Not blocking, but worth considering as a follow-up.

9. funcmap.go helper functions lack direct unit tests

s3ToHTTPS, gcsToHTTPS, azureContainer, azureBlobPath, and azureEndpoint are only tested indirectly through full template rendering. Direct unit tests would catch edge cases (e.g., path-style S3 URLs, Azurite HTTP endpoints) more reliably.


Developed in collaboration with Claude Code

Copy link
Contributor Author

@royendo royendo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI Status: All 14 checks passing.

This is a large, well-organized PR that migrates YAML generation for connector/source templates from imperative TypeScript to declarative Go templates served via RPC. The deleted frontend code is fully replaced by the new backend-driven approach, and test coverage is comprehensive (30+ Go tests, 30+ frontend tests).

Issues (ordered by significance)

1. ListTemplates missing auth checkruntime/server/templates.go:21

Unlike GenerateFile and GenerateTemplate, the ListTemplates endpoint does not verify permissions. The data is template metadata (not user secrets), so the risk is low, but it's inconsistent with the other two endpoints in the same file.

2. Backslash escaping gap in appendEnvVarruntime/server/templates.go:170-177

The function escapes double quotes in env var values but does not first escape backslashes. A value containing a literal \" sequence would be double-escaped incorrectly. Theoretical for credentials, but worth hardening:

val = strings.ReplaceAll(val, `\`, `\\`)  // escape backslashes first
val = strings.ReplaceAll(val, `"`, `\"`)   // then escape quotes

3. Template re-parsing on every renderruntime/templates/render.go:342-356

renderString creates a new template.Template and funcMap() on every call. For preview mode (called on each keystroke with debounce), caching the func map at the struct level would reduce allocations.

4. Empty headers array edge caseruntime/templates/render.go:183-197

An empty []any headers field would render as "[]" instead of being filtered out. The frontend currently prevents this, but the backend could be more defensive.

Verdict

None of these are blocking. The architecture is clean, test coverage is solid, and the security hardening for env var injection is appropriate.


Developed in collaboration with Claude Code

@royendo
Copy link
Contributor Author

royendo commented Mar 6, 2026

All three fixes are done:

Backslash escaping (templates.go:176-177) — Now escapes backslashes before double quotes, and includes \ in the ContainsAny check so values with backslashes also get quoted.

Cached funcMap (funcmap.go:11-25) — Replaced funcMap() function with a package-level sharedFuncMap var. All entries are stateless functions so sharing is safe.

Empty array edge case (render.go:188-191) — When kvArrayToMap returns nil (empty/malformed array), we now continue instead of falling through to fmt.Sprintf which would render "[]".

Skipped issue 1 (ListTemplates auth) since the request has no instance_id and returns only static template metadata.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants