Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
25c94e5
AI-2161 feat: add complex component config search tool
mariankrotil Jan 6, 2026
7478604
AI-2161 feat: add instruction examples
mariankrotil Jan 8, 2026
5f3449d
Merge branch 'AI-2161-include-links-in-get-tables' into AI-2161-inclu…
mariankrotil Jan 12, 2026
82ee62e
Merge AI-2161
mariankrotil Jan 12, 2026
2eeacb4
Merge AI-2161
mariankrotil Jan 12, 2026
c8a43cb
AI-2161 fix: ignore private matches in SearchHit equality
mariankrotil Jan 12, 2026
88d6615
AI-2161 test: use regex mode in search regex test
mariankrotil Jan 12, 2026
ff2f68b
AI-2161 refactor: remove usage tool registration
mariankrotil Jan 12, 2026
dc1da3e
AI-2161 fix: return extractor configs as configuration items
mariankrotil Jan 12, 2026
4ef683b
AI-2161 fix: include configurations in table usage search
mariankrotil Jan 12, 2026
da22a09
AI-2161 test: add config-based search integration test
mariankrotil Jan 12, 2026
cbe58d7
AI-2161 style: apply tox
mariankrotil Jan 12, 2026
fbb3aca
Merge AI-2161-storage-ref into AI-2161-search-tool-update
mariankrotil Jan 21, 2026
e10c074
AI-2161 chore: update version
mariankrotil Jan 21, 2026
e8998f1
Merge branch 'AI-2161-include-links-in-get-tables' into AI-2161-inclu…
mariankrotil Jan 21, 2026
e421ed8
AI-2161 refactor: add component to search type
mariankrotil Jan 21, 2026
5ca04b5
AI-2161 fix: update import
mariankrotil Jan 22, 2026
81610a9
Merge branch 'AI-2161-include-links-in-get-tables' into AI-2161-inclu…
mariankrotil Jan 22, 2026
3e6c3e4
Merge branch 'main' into AI-2161-include-links-in-get-tables-exp
mariankrotil Jan 23, 2026
d05fe02
Merge branch 'main' into AI-2161-include-links-in-get-tables-exp
mariankrotil Jan 26, 2026
c1fc287
AI-2161 chore: update version
mariankrotil Jan 26, 2026
05b868e
AI-2161 fix: missing return in validator, doc defaults, typos, and No…
Feb 6, 2026
dcdad7c
Merge branch 'main' into AI-2161-include-links-in-get-tables-exp
Feb 8, 2026
6de7ea4
AI-2161: fix search tool's docstring
Feb 8, 2026
0e6686e
AI-2161 feat: simplify search API and improve config scope matching
mariankrotil Feb 18, 2026
ac4e665
AI-2161 test: add config-based search scope coverage
mariankrotil Feb 18, 2026
b390a73
AI-2161 docs: refresh search tool documentation
mariankrotil Feb 18, 2026
19ba658
AI-2161 docs: add draft search description
mariankrotil Feb 18, 2026
f2a489e
AI-2161 chore: update lockfile package version
mariankrotil Feb 18, 2026
6f4e672
AI-2161 chore: remove draft search description file
mariankrotil Feb 18, 2026
5cc8b08
Merge main into AI-2161
mariankrotil Feb 18, 2026
d6240cb
AI-2161 docs: mention config-based search in project system prompt
mariankrotil Feb 18, 2026
52c32ea
AI-2161 docs: rename project prompt section to finding items
mariankrotil Feb 18, 2026
b2e104f
Merge main into AI-2161
mariankrotil Feb 23, 2026
241fbb2
AI-2161 feat: address search review feedback
mariankrotil Feb 23, 2026
6eb6277
AI-2161 feat: group matched patterns by scope in search results
mariankrotil Feb 23, 2026
26e8869
AI-2161 feat: adapt usage to grouped match scopes
mariankrotil Feb 23, 2026
a25546c
AI-2161 docs: update TOOLS reference
mariankrotil Feb 23, 2026
42eac29
AI-2161 style: apply flake
mariankrotil Feb 23, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
144 changes: 119 additions & 25 deletions TOOLS.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ including essential context and base instructions for working with it

### Search Tools
- [find_component_id](#find_component_id): Returns list of component IDs that match the given query.
- [search](#search): Searches for Keboola items (tables, buckets, configurations, transformations, flows, etc.
- [search](#search): Searches for Keboola items (tables, buckets, components, configurations, transformations, flows, data-apps, etc.

### Storage Tools
- [get_buckets](#get_buckets): Lists buckets or retrieves full details of specific buckets, including descriptions,
Expand Down Expand Up @@ -2619,79 +2619,147 @@ USAGE EXAMPLES:

**Description**:

Searches for Keboola items (tables, buckets, configurations, transformations, flows, etc.) in the current project
by matching patterns against item ID, name, display name, or description. Returns matching items grouped by type
with their IDs and metadata.
Searches for Keboola items (tables, buckets, components, configurations, transformations, flows, data-apps, etc.)
in the current project and returns matching ID + metadata.

This tool supports two complementary search types:

1) textual
- Searches item metadata fields by matching patterns against id, name, displayName, and description.
- For tables, also searches column names and column descriptions.

2) config-based
- Searches item configurations (JSON objects) by matching patterns against the configuration values ​​converted
to a string, optionally narrowed by JSON path `scopes`.
- Returns also `match_scopes` with JSON paths and matched patterns per scope.

THIS IS THE PRIMARY DISCOVERY TOOL. Always use it BEFORE any get_* tool when you need to find items
by name. Do NOT enumerate items with get_buckets, get_tables, get_configs, get_flows, or get_data_apps
just to locate a specific item — use this tool instead.
by name or specific configuration content. Do NOT enumerate items with get_buckets, get_tables, get_configs,
get_flows, or get_data_apps just to locate a specific item — use this tool instead.

WHEN TO USE:
- User asks to "find", "locate", or "search for" something by name, keyword, or text pattern
- User asks to "find", "locate", or "search for" something by name, keyword, text pattern, configuration content or
value
- User mentions a partial name and you need to find the full item (e.g., "find the customer table")
- User asks "what tables/configs/flows do I have with X in the name?"
- You need to discover items before performing operations on them
- User asks to "list all items with [name] in it"
- User asks to "list all items with [name] or [configuration value/part] in it"
- User asks where a value, table, component, specific configuration ID, or specific settings is used in components,
data-apps, flows, or transformations
- You need to trace lineage by searching for IDs referenced in configurations, or to find flows using a
specific component, or find usage of a bucket/table in transformations or components, or to find items with
specific parameters.
- User asks to "what is the genesis of this item?" or "explain me business logic of this item?"

HOW IT WORKS:
- Searches by regex pattern matching against id, name, displayName, and description fields
- For tables, also searches column names and column descriptions
- Case-insensitive search
- Supports two types:
- search_type="textual": matches against id, name, displayName, and description, for tables also column names
and column descriptions
- search_type="config-based": matches inside configuration JSON objects, optionally narrowed by JSON path `scopes`
- case-insensitive search
- mode for pattern search: `literal` (default) or `regex`
- Multiple patterns work as OR condition - matches items containing ANY of the patterns
- Returns grouped results by item type (tables, buckets, configurations, flows, etc.)
- Each result includes the item's ID, name, creation date, and relevant metadata
- scopes (config-based) narrow matching to specific JSONPath areas within configurations; matching is performed
against the stringified JSON node content in those areas.
- config-based always returns all matched paths per item in `match_scopes` (including matched patterns)

IMPORTANT:
- Always use this tool when the user mentions a name but you don't have the exact ID
- The search returns IDs that you can use with other tools (e.g., get_tables, get_configs, get_flows)
- Results are ordered by update time. The most recently updated items are returned first.
- Fill `item_types` to make the search more efficient when you know the item type; scanning buckets and tables can
be expensive
- For exact ID lookups, use specific tools like get_tables, get_configs, get_flows instead
- Use specific `scopes` only when you know the config structure (schema or real example); otherwise run config-based
search without scopes.
- Use find_component_id and get_configs tools to find configurations related to a specific component
- If results are too numerous or empty, ask the user to refine their query rather than enumerating all items.

USAGE EXAMPLES:
1) textual search examples:
- user_input: "Find all tables with 'customer' in the name"
→ patterns=["customer"], item_types=["table"]
→ Returns all tables whose id, name, displayName, or description contains "customer"
→ patterns=["customer"], item_types=["table"]
→ Returns all tables whose id, name, displayName, or description contains "customer"

- user_input: "Find tables with 'email' column"
→ patterns=["email"], item_types=["table"]
→ Returns all tables that have a column named "email" or with "email" in column description
→ patterns=["email"], item_types=["table"]
→ Returns all tables that have a column named "email" or with "email" in column description

- user_input: "Search for the sales transformation"
→ patterns=["sales"], item_types=["transformation"]
→ Returns transformations with "sales" in any searchable field
→ patterns=["sales"], item_types=["transformation"]
→ Returns transformations with "sales" in any searchable field

- user_input: "Find items named 'daily report' or 'weekly summary'"
→ patterns=["daily.*report", "weekly.*summary"], item_types=[]
→ Returns all items matching any of these patterns
→ patterns=["daily.*report", "weekly.*summary"], item_types=[], mode="regex"
→ Returns all items matching any of these patterns

- user_input: "Show me all configurations related to Google Analytics"
→ patterns=["google.*analytics"], item_types=["configuration"]
→ Returns configurations with matching patterns
→ patterns=["google.*analytics"], item_types=["configuration"], mode="regex"
→ Returns configurations with matching patterns

2) config-based search examples:
- user_input: "Find transformations/configs/components referencing table in.c-prod.customers"
-> patterns=["in.c-prod.customers"], item_types=["transformation", "configuration"],
search_type="config-based"
-> No scopes = search whole stringified config; result includes `match_scopes` with exact paths + patterns

- user_input: "Find configurations/transformations (etc.) using specific setting / id anywhere"
-> patterns=["setting", "id"], item_types=["configuration", "transformations"], search_type="config-based",

- user_input: "Find configurations/transformations (etc.) using specific setting / id in parameters"
-> patterns=["setting", "id"], item_types=["configuration", "transformations"], search_type="config-based",
scopes=["parameters"]

- user_input: "Find configurations/transformations (etc.) using specific setting / id in storage"
-> patterns=["setting", "id"], item_types=["configuration", "transformations"], search_type="config-based",
scopes=["storage"]

- user_input: "Find configurations/transformations (etc.) using specific setting / id in authorization"
-> patterns=["setting", "id"], item_types=["configuration", "transformations"], search_type="config-based",
scopes=["parameters.authorization", "authorization"]

- user_input: "Find components/transformations using my_bucket in input or output mappings"
-> patterns=["my_bucket"], item_types=["configuration", "transformation"], search_type="config-based",
scopes=["storage.input", "storage.output"]
-> Returns matches with paths like `storage.input.tables[0].source`, `storage.input.files[0].source`,
or `storage.output.tables[0].destination`

- user_input: "Find flows using configuration ID 01k9cz233cvd1rga3zzx40g8qj"
-> patterns=["01k9cz233cvd1rga3zzx40g8qj"], item_types=["flow"], search_type="config-based",
scopes=["tasks", "phases"]

- user_input: "Find transformations using this table / column / specific code in its script"
-> patterns=["element"], item_types=["transformation"], search_type="config-based",
scopes=["parameters", "storage"]

- user_input: "Find data apps using something in its config / python code / setting"
-> patterns=["something"], item_types=["data-app"], search_type="config-based"
-> Returns data apps where script/config sections contain the keyword and includes `match_scopes`


**Input JSON Schema**:
```json
{
"properties": {
"patterns": {
"description": "One or more search patterns to match against item ID, name, display name, or description. Supports regex patterns. Case-insensitive. Examples: [\"customer\"], [\"sales\", \"revenue\"], [\"test.*table\"]. Do not use empty strings or empty lists.",
"description": "One or more search patterns to match against item ID, name, display name, description, or configuration JSON objects. Case-insensitive by default. Examples: [\"customer\"], [\"sales\", \"revenue\"], [\"my_bucket\"]. Do not use empty strings or empty lists.",
"items": {
"type": "string"
},
"type": "array"
},
"item_types": {
"default": [],
"description": "Optional filter for specific Keboola item types. Leave empty to search all types. Common values: \"table\" (data tables), \"bucket\" (table containers), \"transformation\" (SQL/Python transformations), \"configuration\" (extractor/writer configs), \"flow\" (orchestration flows). Use when you know what type of item you're looking for.",
"description": "Filter for specific Keboola item types. Common values: \"table\" (data tables), \"bucket\" (table containers), \"transformation\" (SQL/Python transformations), \"component\" (extractor/writer/application components), \"data-app\" (data apps), \"flow\" (orchestration flows). Use when you know what type of item you're looking for or leave empty to search all types.",
"items": {
"enum": [
"flow",
"bucket",
"table",
"data-app",
"flow",
"transformation",
"component",
"configuration",
"configuration-row",
"workspace",
Expand All @@ -2703,6 +2771,32 @@ USAGE EXAMPLES:
},
"type": "array"
},
"search_type": {
"default": "textual",
"description": "Search mode: \"textual\" (name/id/description) or \"config-based\" (stringified configuration payloads). (default: \"textual\")",
"enum": [
"textual",
"config-based"
],
"type": "string"
},
"scopes": {
"default": [],
"description": "JSONPath expressions to narrow config-based search to specific parts of the configuration. Simple dot-notation (e.g. \"parameters\", \"storage.input\") and full JSONPath (e.g. \"$.tasks[*]\") are both supported (e.g. \"parameters.host\", \"storage.input[0].source\"). Leave empty to search the whole configuration.",
"items": {
"type": "string"
},
"type": "array"
},
"mode": {
"default": "literal",
"description": "How to interpret patterns: \"regex\" for regular expressions or \"literal\" for exact text (default: \"literal\").",
"enum": [
"regex",
"literal"
],
"type": "string"
},
"limit": {
"default": 50,
"description": "Maximum number of items to return (default: 50, max: 100).",
Expand Down
29 changes: 29 additions & 0 deletions integtests/tools/test_search.py
Original file line number Diff line number Diff line change
Expand Up @@ -103,3 +103,32 @@ async def test_find_component_id(mcp_client: Client):
assert full_result.content[0].type == 'text'
decoded_toon = toon_format.decode(full_result.content[0].text)
assert decoded_toon == result


@pytest.mark.asyncio
async def test_search_config_based_simple_query(
mcp_client: Client,
configs: list[ConfigDef],
) -> None:
"""
Test config-based search with a simple scoped query.
"""
config = next(cfg for cfg in configs if cfg.component_id == 'ex-generic-v2')
full_result = await mcp_client.call_tool(
'search',
{
'patterns': ['wttr.in'],
'item_types': ['configuration'],
'search_type': 'config-based',
'scopes': ['parameters.api.baseUrl'],
'limit': 20,
'offset': 0,
},
)

assert full_result.structured_content is not None
result = [SearchHit.model_validate(hit) for hit in full_result.structured_content['result']]

assert any(
hit.component_id == 'ex-generic-v2' and hit.configuration_id == config.configuration_id for hit in result
), f'Expected config {config.configuration_id} to be returned. Found: {result}'
2 changes: 1 addition & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "keboola-mcp-server"
version = "1.44.8"
version = "1.45.0"
description = "MCP server for interacting with Keboola Connection"
readme = "README.md"
requires-python = ">=3.10"
Expand Down
Original file line number Diff line number Diff line change
@@ -1,13 +1,18 @@
### Finding Items by Name
### Finding Items

When looking for specific items (tables, buckets, configurations, flows, data apps) by name, description,
or partial match, **always use the `search` tool first** rather than listing all items with `get_*` tools.
partial match, or configuration content/reference, **always use the `search` tool first** rather than listing all
items with `get_*` tools.

- `search` matches by regex against names, IDs, descriptions, and (for tables) column names.
- `search` supports:
- textual search over names, IDs, descriptions, and (for tables) column names
- config-based search over item configuration JSON contents, including scoped JSONPath search when useful
- Listing all items with empty IDs (e.g., `get_buckets(bucket_ids=[])`, `get_configs()`, `get_flows(flow_ids=[])`)
is wasteful on large projects and should only be used when you genuinely need a complete inventory.
- If the user mentions a name but you do not have the exact ID, call `search` with an appropriate pattern
and `item_types` filter.
- If the user asks where a table/component/config ID/value is used, call `search` with
`search_type="config-based"` (and use `scopes` when you know the config structure).
- If `search` returns too many results or zero results, ask the user to be more specific rather than
falling back to enumerating all items.

Expand Down
Loading