-
Notifications
You must be signed in to change notification settings - Fork 32
feat(langchain): add docs on dynamic tools #643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Preview ID generated: preview-cbdyna-1758742693-3b83878 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move to middleware
Preview ID generated: preview-cbdyna-1758747534-6b63fd5 |
Update:
|
Preview ID generated: preview-cbdyna-1758748047-d12eb3e |
Preview ID generated: preview-cbdyna-1758761214-5616d01 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds comprehensive documentation for dynamic tool selection in LangChain, addressing how to efficiently manage large tool catalogs by dynamically selecting relevant subsets per turn. This feature helps reduce context pressure, improve model performance, and decrease latency/costs when working with hundreds or thousands of tools.
Key changes include:
- Adding a complete "Dynamic tools" section explaining the problem and solution approach
- Providing both simple context-based and advanced semantic similarity-based tool selection examples
- Including production optimization strategies and design guidelines
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
src/oss/langchain/structured-output.mdx | Fixes import path for toolStrategy and providerStrategy from langchain/agents to langchain |
src/oss/langchain/middleware.mdx | Adds comprehensive dynamic tools documentation with examples, optimization strategies, and design guidelines |
runtime.context.vcs_provider if getattr(runtime, "context", None) else "github" | ||
).lower() | ||
active = [gitlab_create_issue] if provider == "gitlab" else [github_create_issue] | ||
request.tools = active # [!code highlight] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The init_embeddings
function call appears to use an incorrect API. Based on standard LangChain patterns, this should likely be OpenAIEmbeddings(model="text-embedding-3-small")
or similar constructor pattern rather than an init_embeddings
function.
Copilot uses AI. Check for mistakes.
return {"url": f"https://github.com/{repo}/issues/1", "title": title} | ||
|
||
|
||
# GitLab tools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This import path appears incorrect. The standard LangChain import for OpenAI embeddings would be from langchain_openai import OpenAIEmbeddings
rather than importing an init_embeddings
function.
Copilot uses AI. Check for mistakes.
|
||
// Choose tools based on user context (e.g., vcsProvider = "github" | "gitlab") | ||
const vcsToolGate = createMiddleware({ | ||
name: "VcsToolGate", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The humorous reference to 'travel with a banana' in the visa requirements example may confuse users who expect realistic documentation examples. Consider using a more realistic visa requirement example.
name: "VcsToolGate", | |
`Travelers from ${country} are required to obtain a visa and present a valid passport upon entry.`, |
Copilot uses AI. Check for mistakes.
Preview ID generated: preview-cbdyna-1758761420-9d67015 |
Quick patch that documents that user can define tools in middleware as well as ModelRequest now takes a list of tool names instead of tool instances. Co-authored-by: Lauren Hirata Singh <lauren@langchain.dev>
Preview ID generated: preview-cbdyna-1759143549-68c15ff |
@eyurtsev do you think this is good to go? |
Preview ID generated: preview-cbdyna-1759246583-d530ae5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
- `{ type: "function", function: { name: string } }`: The model will use the specified function. | ||
::: | ||
- `tools` (list of `BaseTool`): the tools to use for this model call | ||
- `tools` (list of strings): the tool names to use for this model call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation change from 'list of BaseTool
' to 'list of strings' appears to contradict the Python example on line 1176 where request.tools = [tool.name for tool in active_tools]
suggests tools should be tool names (strings), not tool objects. However, this inconsistency should be verified against the actual API to ensure the documentation accurately reflects the expected parameter type.
Copilot uses AI. Check for mistakes.
full_catalog = [book_flight, lookup_visa_requirements, local_weather] | ||
|
||
# 2) Precompute and cache embeddings for tool metadata | ||
embedder = init_embeddings("openai:text-embedding-3-small") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The init_embeddings
function is not a standard LangChain API. This should likely be OpenAIEmbeddings(model='text-embedding-3-small')
or the appropriate embeddings initialization method from the LangChain library.
Copilot uses AI. Check for mistakes.
content = ( | ||
last["content"] if isinstance(last, dict) else getattr(last, "content", None) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The message content extraction logic is overly complex and fragile. Consider using a more robust approach like content = last.content if hasattr(last, 'content') else (last.get('content') if isinstance(last, dict) else None)
or define a helper function for message content extraction.
content = ( | |
last["content"] if isinstance(last, dict) else getattr(last, "content", None) | |
) | |
content = last.content if hasattr(last, "content") else (last.get("content") if isinstance(last, dict) else None) |
Copilot uses AI. Check for mistakes.
const toolVectors = await embedder.embedDocuments(toolTexts); | ||
|
||
type CatalogItem = { tool: StructuredTool; vector: number[] }; | ||
const catalog: CatalogItem[] = fullCatalog.map((tool, i) => ({ | ||
tool, | ||
vector: toolVectors[i], | ||
})); | ||
|
||
function cosineSimilarity(a: number[], b: number[]) { | ||
const dot = a.reduce((s, v, i) => s + v * b[i], 0); | ||
const na = Math.hypot(...a); | ||
const nb = Math.hypot(...b); | ||
return na && nb ? dot / (na * nb) : 0; | ||
} | ||
|
||
async function selectTopKBySimilarity(query: string, k = 6) { | ||
const qv = await embedder.embedQuery(query); | ||
return catalog | ||
.map((c) => ({ c, score: cosineSimilarity(qv, c.vector) })) | ||
.sort((a, b) => b.score - a.score) | ||
.slice(0, k) | ||
.map(({ c }) => c.tool); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The await
keyword is used in a non-async context. The embedding computation should either be wrapped in an async function or moved to an initialization phase that properly handles the asynchronous operation.
const toolVectors = await embedder.embedDocuments(toolTexts); | |
type CatalogItem = { tool: StructuredTool; vector: number[] }; | |
const catalog: CatalogItem[] = fullCatalog.map((tool, i) => ({ | |
tool, | |
vector: toolVectors[i], | |
})); | |
function cosineSimilarity(a: number[], b: number[]) { | |
const dot = a.reduce((s, v, i) => s + v * b[i], 0); | |
const na = Math.hypot(...a); | |
const nb = Math.hypot(...b); | |
return na && nb ? dot / (na * nb) : 0; | |
} | |
async function selectTopKBySimilarity(query: string, k = 6) { | |
const qv = await embedder.embedQuery(query); | |
return catalog | |
.map((c) => ({ c, score: cosineSimilarity(qv, c.vector) })) | |
.sort((a, b) => b.score - a.score) | |
.slice(0, k) | |
.map(({ c }) => c.tool); | |
} | |
async function main() { | |
const toolVectors = await embedder.embedDocuments(toolTexts); | |
type CatalogItem = { tool: StructuredTool; vector: number[] }; | |
const catalog: CatalogItem[] = fullCatalog.map((tool, i) => ({ | |
tool, | |
vector: toolVectors[i], | |
})); | |
function cosineSimilarity(a: number[], b: number[]) { | |
const dot = a.reduce((s, v, i) => s + v * b[i], 0); | |
const na = Math.hypot(...a); | |
const nb = Math.hypot(...b); | |
return na && nb ? dot / (na * nb) : 0; | |
} | |
async function selectTopKBySimilarity(query: string, k = 6) { | |
const qv = await embedder.embedQuery(query); | |
return catalog | |
.map((c) => ({ c, score: cosineSimilarity(qv, c.vector) })) | |
.sort((a, b) => b.score - a.score) | |
.slice(0, k) | |
.map(({ c }) => c.tool); | |
} | |
// You can now use selectTopKBySimilarity, catalog, etc. here | |
} | |
main(); |
Copilot uses AI. Check for mistakes.
This docs page covers the advanced usage of "Dynamic Tools" explaining how users can implement smart ways to choose tools based on context (simple) or semantic similarity (advanced).
Happy to add Python code examples if we are ok with current structure.