21 changes: 21 additions & 0 deletions src/strands_tools/retrieve.py
@@ -166,6 +166,14 @@
                    ),
                    "default": False,
                },
                "overrideSearchType": {
Member

Curious about the naming here:

  1. Are you using the override prefix so that the agent sets the field less frequently?

  2. Would the usage differ if you just named this searchType? The default behavior would still be the same.

  3. Can all knowledge bases support both types, or can this break older knowledge bases?

Author

Great questions. I should have attached the relevant documentation:
https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseVectorSearchConfiguration.html

In answer to your questions:

  1. The name is taken as is from the parameter used in the Bedrock API call. It is also what you see in the Bedrock console.
  2. I can rename it, but I am following the convention of keeping the name the same as the underlying Bedrock API search parameter. The rest of the class follows the same convention.
  3. The documentation says that right now only Amazon OpenSearch-based stores support it, with support for other KB types coming soon. I have not tested the API with other KB types, but I have another KB backed by Kendra and the override search option does show up for it, so I am assuming it doesn't break older KBs. That said, this is an optional parameter, and should the API reject the request for other KBs, the user may simply opt not to set this field.
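
For reference, a minimal sketch of how the parameter maps onto the request payload. The helper name `build_retrieval_configuration` is hypothetical and exists only for illustration; the key names are the ones that appear in this PR's diff and tests.

```python
from typing import Any, Dict, Optional


def build_retrieval_configuration(
    number_of_results: int,
    override_search_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the retrievalConfiguration payload (hypothetical helper)."""
    vector_config: Dict[str, Any] = {"numberOfResults": number_of_results}
    # The key is only added when a value was supplied, so omitting it
    # leaves the knowledge base's configured search type in effect.
    if override_search_type is not None:
        vector_config["overrideSearchType"] = override_search_type
    return {"vectorSearchConfiguration": vector_config}


if __name__ == "__main__":
    print(build_retrieval_configuration(10, "HYBRID"))
    print(build_retrieval_configuration(10))
```

The returned dict is what the tool passes as `retrievalConfiguration` to the client's `retrieve` call, as the assertions in tests/test_retrieve.py show.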

Member

@dbschmigelski Nov 20, 2025

Sounds good on the name.

But on compatibility: in most cases the user doesn't set the field at all; the agent does. The agent can recover from an error message (if it's a good one), but I don't want to waste customer tokens if we know this will happen a lot.

Author

I validated an API call with a KB that doesn't support hybrid search and it outputs the following:

An error occurred (ValidationException) when calling the Retrieve operation: HYBRID search type is not supported for search operation on index NF4WREDULM. Retry your request with a different search type.

What would you suggest we do here? Would additional documentation on the tool that explicitly states "only supported by Amazon OpenSearch and Kendra knowledge bases, do not use with other sources" suffice?

Author

Alternatively, I can add a fallback that removes this parameter when a ValidationException occurs, but that doesn't seem right to me.
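
To make that alternative concrete, here is a rough sketch of what such a fallback could look like. It is not part of this PR. `SearchTypeNotSupportedError` is a hypothetical stand-in for the ValidationException shown above, and `call_retrieve` is an assumed callable wrapping the actual client call.

```python
from typing import Any, Callable, Dict


class SearchTypeNotSupportedError(Exception):
    """Hypothetical stand-in for the ValidationException returned by Retrieve."""


def retrieve_with_fallback(
    call_retrieve: Callable[[Dict[str, Any]], Any],
    retrieval_config: Dict[str, Any],
) -> Any:
    """Call Retrieve once; on a search-type rejection, retry without the override."""
    try:
        return call_retrieve(retrieval_config)
    except SearchTypeNotSupportedError:
        # Copy the nested dict so the caller's configuration is not mutated.
        vector_config = dict(retrieval_config["vectorSearchConfiguration"])
        if "overrideSearchType" not in vector_config:
            # Nothing to strip, so the error has some other cause: re-raise it.
            raise
        del vector_config["overrideSearchType"]
        return call_retrieve({**retrieval_config, "vectorSearchConfiguration": vector_config})
```

The downside is visible in the sketch: the retry is silent, so the agent never learns that the search type it asked for was dropped.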

Member

Hey,

ToolSpecs can be dynamic. So what about making this opt-in, so that we only include overrideSearchType if the user has opted in?

I would not want to go with "only supported by Amazon OpenSearch and Kendra knowledge bases".
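
A minimal sketch of the opt-in idea, under stated assumptions: the environment variable name `RETRIEVE_ENABLE_OVERRIDE_SEARCH_TYPE` and the helper `build_input_properties` are hypothetical, and how the resulting properties get wired into the actual ToolSpec is left out.

```python
import os
from typing import Any, Dict, Mapping

# Hypothetical opt-in switch; the name is an assumption for illustration only.
OPT_IN_ENV_VAR = "RETRIEVE_ENABLE_OVERRIDE_SEARCH_TYPE"


def build_input_properties(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Return the schema properties, adding overrideSearchType only when opted in."""
    properties: Dict[str, Any] = {
        "text": {"type": "string", "description": "The query to retrieve relevant knowledge."},
    }
    if env.get(OPT_IN_ENV_VAR, "").lower() == "true":
        properties["overrideSearchType"] = {
            "type": "string",
            "description": (
                "Override the search type for the knowledge base query. Supported values: 'HYBRID', "
                "'SEMANTIC'. Default behavior uses the knowledge base's configured search type."
            ),
            "enum": ["HYBRID", "SEMANTIC"],
        }
    return properties
```

With this shape, an agent running against a knowledge base that rejects HYBRID never sees the field unless the user has explicitly enabled it, so no tokens are spent on a call that is known to fail.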

                    "type": "string",
                    "description": (
                        "Override the search type for the knowledge base query. Supported values: 'HYBRID', "
                        "'SEMANTIC'. Default behavior uses the knowledge base's configured search type."
                    ),
                    "enum": ["HYBRID", "SEMANTIC"],
                },
            },
            "required": ["text"],
        }
@@ -306,6 +314,16 @@ def retrieve(tool: ToolUse, **kwargs: Any) -> ToolResult:
        min_score = tool_input.get("score", default_min_score)
        enable_metadata = tool_input.get("enableMetadata", default_enable_metadata)
        retrieve_filter = tool_input.get("retrieveFilter")
        override_search_type = tool_input.get("overrideSearchType")

        # Validate overrideSearchType if provided
        if override_search_type and override_search_type not in ["HYBRID", "SEMANTIC"]:
            return {
                "toolUseId": tool_use_id,
                "status": "error",
                "content": [{"text": f"Invalid overrideSearchType: {override_search_type}. "
                             f"Supported values: HYBRID, SEMANTIC"}],
            }

        # Initialize Bedrock client with optional profile name
        profile_name = tool_input.get("profile_name")
@@ -321,6 +339,9 @@ def retrieve(tool: ToolUse, **kwargs: Any) -> ToolResult:
        # Default retrieval configuration
        retrieval_config = {"vectorSearchConfiguration": {"numberOfResults": number_of_results}}

        if override_search_type:
            retrieval_config["vectorSearchConfiguration"]["overrideSearchType"] = override_search_type

        if retrieve_filter:
            try:
                if _validate_filter(retrieve_filter):
167 changes: 167 additions & 0 deletions tests/test_retrieve.py
@@ -656,6 +656,171 @@ def test_retrieve_with_environment_variable_default(mock_boto3_client):
    assert "test-source-1" not in result_text


def test_retrieve_with_override_search_type_hybrid(mock_boto3_client):
    """Test retrieve with overrideSearchType set to HYBRID."""
    tool_use = {
        "toolUseId": "test-tool-use-id",
        "input": {
            "text": "test query",
            "knowledgeBaseId": "test-kb-id",
            "overrideSearchType": "HYBRID",
        },
    }

    result = retrieve.retrieve(tool=tool_use)

    # Verify the result is successful
    assert result["status"] == "success"
    assert "Retrieved 2 results with score >= 0.4" in result["content"][0]["text"]

    # Verify that boto3 client was called with overrideSearchType
    mock_boto3_client.return_value.retrieve.assert_called_once_with(
        retrievalQuery={"text": "test query"},
        knowledgeBaseId="test-kb-id",
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": 10,
                "overrideSearchType": "HYBRID"
            }
        },
    )


def test_retrieve_with_override_search_type_semantic(mock_boto3_client):
    """Test retrieve with overrideSearchType set to SEMANTIC."""
    tool_use = {
        "toolUseId": "test-tool-use-id",
        "input": {
            "text": "test query",
            "knowledgeBaseId": "test-kb-id",
            "overrideSearchType": "SEMANTIC",
        },
    }

    result = retrieve.retrieve(tool=tool_use)

    # Verify the result is successful
    assert result["status"] == "success"

    # Verify that boto3 client was called with overrideSearchType
    mock_boto3_client.return_value.retrieve.assert_called_once_with(
        retrievalQuery={"text": "test query"},
        knowledgeBaseId="test-kb-id",
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": 10,
                "overrideSearchType": "SEMANTIC"
            }
        },
    )


def test_retrieve_with_invalid_override_search_type(mock_boto3_client):
    """Test retrieve with invalid overrideSearchType."""
    tool_use = {
        "toolUseId": "test-tool-use-id",
        "input": {
            "text": "test query",
            "knowledgeBaseId": "test-kb-id",
            "overrideSearchType": "INVALID_TYPE",
        },
    }

    result = retrieve.retrieve(tool=tool_use)

    # Verify the result is an error
    assert result["status"] == "error"
    assert "Invalid overrideSearchType: INVALID_TYPE" in result["content"][0]["text"]
    assert "Supported values: HYBRID, SEMANTIC" in result["content"][0]["text"]

    # Verify that boto3 client was not called
    mock_boto3_client.return_value.retrieve.assert_not_called()


def test_retrieve_without_override_search_type(mock_boto3_client):
    """Test retrieve without overrideSearchType (default behavior)."""
    tool_use = {
        "toolUseId": "test-tool-use-id",
        "input": {
            "text": "test query",
            "knowledgeBaseId": "test-kb-id",
        },
    }

    result = retrieve.retrieve(tool=tool_use)

    # Verify the result is successful
    assert result["status"] == "success"

    # Verify that boto3 client was called without overrideSearchType
    mock_boto3_client.return_value.retrieve.assert_called_once_with(
        retrievalQuery={"text": "test query"},
        knowledgeBaseId="test-kb-id",
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": 10
            }
        },
    )


def test_retrieve_with_override_search_type_and_filter(mock_boto3_client):
    """Test retrieve with both overrideSearchType and retrieveFilter."""
    tool_use = {
        "toolUseId": "test-tool-use-id",
        "input": {
            "text": "test query",
            "knowledgeBaseId": "test-kb-id",
            "overrideSearchType": "HYBRID",
            "retrieveFilter": {"equals": {"key": "category", "value": "security"}},
        },
    }

    result = retrieve.retrieve(tool=tool_use)

    # Verify the result is successful
    assert result["status"] == "success"

    # Verify that boto3 client was called with both overrideSearchType and filter
    mock_boto3_client.return_value.retrieve.assert_called_once_with(
        retrievalQuery={"text": "test query"},
        knowledgeBaseId="test-kb-id",
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": 10,
                "overrideSearchType": "HYBRID",
                "filter": {"equals": {"key": "category", "value": "security"}}
            }
        },
    )


def test_retrieve_via_agent_with_override_search_type(agent, mock_boto3_client):
    """Test retrieving via the agent interface with overrideSearchType."""
    with mock.patch.dict(os.environ, {"KNOWLEDGE_BASE_ID": "agent-kb-id"}):
        result = agent.tool.retrieve(
            text="agent query",
            knowledgeBaseId="test-kb-id",
            overrideSearchType="HYBRID"
        )

    result_text = extract_result_text(result)
    assert "Retrieved" in result_text
    assert "results with score >=" in result_text

    # Verify the boto3 client was called with overrideSearchType
    mock_boto3_client.return_value.retrieve.assert_called_once_with(
        retrievalQuery={"text": "agent query"},
        knowledgeBaseId="test-kb-id",
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": 10,
                "overrideSearchType": "HYBRID"
            }
        },
    )


def test_retrieve_via_agent_with_enable_metadata(agent, mock_boto3_client):
    """Test retrieving via the agent interface with enableMetadata."""
    with mock.patch.dict(os.environ, {"KNOWLEDGE_BASE_ID": "agent-kb-id"}):
@@ -677,3 +842,5 @@ def test_retrieve_via_agent_with_enable_metadata(agent, mock_boto3_client):
    assert "results with score >=" in result_text
    assert "Metadata:" not in result_text
    assert "test-source" not in result_text