diff --git a/src/langsmith/images/chat-model-dark.png b/src/langsmith/images/chat-model-dark.png new file mode 100644 index 000000000..f49541d74 Binary files /dev/null and b/src/langsmith/images/chat-model-dark.png differ diff --git a/src/langsmith/images/chat-model-light.png b/src/langsmith/images/chat-model-light.png new file mode 100644 index 000000000..e27eedd3c Binary files /dev/null and b/src/langsmith/images/chat-model-light.png differ diff --git a/src/langsmith/images/chat-model.png b/src/langsmith/images/chat-model.png deleted file mode 100644 index 7a332caaf..000000000 Binary files a/src/langsmith/images/chat-model.png and /dev/null differ diff --git a/src/langsmith/log-llm-trace.mdx b/src/langsmith/log-llm-trace.mdx index e9183819a..1e8e11fc1 100644 --- a/src/langsmith/log-llm-trace.mdx +++ b/src/langsmith/log-llm-trace.mdx @@ -1,72 +1,70 @@ --- -title: Log custom LLM traces -sidebarTitle: Log custom LLM traces +title: Guidelines for logging LLM calls +sidebarTitle: Guidelines for logging LLM calls --- - -Nothing will break if you don't log LLM traces in the correct format - data will still be logged. However, the data will not be processed or rendered in a way that is specific to LLMs. - +LangSmith provides special rendering and processing for LLM traces. This includes pretty rendering for the list of messages, token counting (assuming token counts are not available from the model provider) and token-based cost calculation. -LangSmith provides special rendering and processing for LLM traces, including token counting (assuming token counts are not available from the model provider) and token-based cost calculation. In order to make the most of this feature, you must log your LLM traces in a specific format. +In order to make the most of LangSmith's LLM trace processing, **we recommend logging your LLM traces in one of the specified formats**. +If you don't log your LLM traces in the suggested formats, you will still be able to log the data to LangSmith, but it may not be processed or rendered in expected ways. - -The examples below uses the `traceable` decorator/wrapper to log the model run (which is the recommended approach for Python and JS/TS). However, the same idea applies if you are using the [RunTree](/langsmith/annotate-code#use-the-runtree-api) or [API](https://api.smith.langchain.com/redoc) directly. - -## Chat-style models -### Using LangChain OSS or LangSmith wrappers -If you are using LangChain OSS or LangSmith wrappers, you don't need to do anything special. The wrappers will automatically log traces in the correct format. + +The examples below use the `traceable` decorator/wrapper to log the model run (which is the recommended approach for Python and JS/TS). However, the same idea applies if you are using the [RunTree](/langsmith/annotate-code#use-the-runtree-api) or [API](https://api.smith.langchain.com/redoc) directly. + -### Implementing your own custom chat-model +## Using LangChain OSS or LangSmith wrappers -If you are implementing your own custom chat-model, you need to ensure that your inputs contain a key `messages` with a list of dictionaries/objects. Each dictionary/object must contain the keys `role` and `content` with string values. The output must return an object that, when serialized, contains the key `choices` with a list of dictionaries/objects. Each must contain the key `message` with a dictionary/object that contains the keys `role` and `content` with string values. 
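+As a minimal sketch (assuming the `openai` and `langsmith` packages are installed and your tracing environment variables are configured), the LangSmith OpenAI wrapper wraps the client so that every chat completion call is logged as an LLM run in the expected format:
+
+```python
+from openai import OpenAI
+from langsmith.wrappers import wrap_openai
+
+# Wrap the OpenAI client; calls made through it are traced to LangSmith
+# with correctly formatted messages, outputs, and token usage.
+client = wrap_openai(OpenAI())
+
+client.chat.completions.create(
+    model="gpt-4o-mini",
+    messages=[
+        {"role": "system", "content": "You are a helpful assistant."},
+        {"role": "user", "content": "What's the weather like in San Francisco?"},
+    ],
+)
+```
+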
+If you are using [LangChain OSS to call language models](https://python.langchain.com/docs/tutorials/llm_chain/) or LangSmith wrappers ([OpenAI](/langsmith/trace-openai), [Anthropic](/langsmith/trace-anthropic)), then you're all set! These approaches will automatically log traces in the correct format. -To make your custom LLM traces appear well-formatted in the LangSmith UI, your trace inputs and outputs must conform to a format LangSmith recognizes: +## Tracing a Model with a Custom Input/Output Format -* A list of messages in [OpenAI](https://platform.openai.com/docs/api-reference/messages) or [Anthropic](https://docs.anthropic.com/en/api/messages) format, represented as Python dictionaries or TypeScript objects. +When tracing a custom model, follow the guidelines below to ensure your LLM traces are rendered properly and features such as token tracking and cost calculation work as expected. - * Each message must contain the key `role` and `content`. - * Messages with the `"assistant"` role may optionally contain `tool_calls`. These `tool_calls` may be in [OpenAI](https://platform.openai.com/docs/guides/function-calling?api-mode=chat) format or [LangChain's format](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolCall.html#langchain_core.messages.tool.ToolCall). +### Input Format -* An dict/object containing `"messages"` key with a list of messages in the above format. - * LangSmith may use additional parameters in this input dict that match OpenAI's [chat completion endpoint](https://platform.openai.com/docs/guides/text?api-mode=chat) for rendering in the trace view, such as a list of available `tools` for the model to call. +A Python dictionary or TypeScript object containing a `"messages"` key with a list of messages in [LangChain](https://python.langchain.com/docs/concepts/messages), [OpenAI](https://platform.openai.com/docs/api-reference/messages) (chat completions) or [Anthropic](https://docs.anthropic.com/en/api/messages) format. The messages key must be in the top level of the input field. + +* Each message must contain the key `"role"` and `"content"`. + * `"role"`: `"system" | "user" | "assistant" | "tool"` + * `"content"`: string +* Messages with the `"assistant"` role may contain `"tool_calls"`. These `"tool_calls"` may be in [OpenAI](https://platform.openai.com/docs/guides/function-calling?api-mode=chat) format or [LangChain's format](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolCall.html#langchain_core.messages.tool.ToolCall). +* LangSmith may use additional parameters in this input dict that match OpenAI's [chat completion endpoint](https://platform.openai.com/docs/guides/text?api-mode=chat) for rendering in the trace view, such as a list of available `tools` for the model to call. Here are some examples: + -```python List of messages -# Format 1: List of messages +```python LangChain format: List of messages inputs = [ {"role": "system", "content": "You are a helpful assistant."}, - {"role": "user", "content": "What's the weather like?"}, + {"role": "user", "content": "What's the weather like in San Francisco?"}, { "role": "assistant", "content": "I need to check the weather for you.", "tool_calls": [ { - "id": "call_123", + "id": "call_XXX", "type": "function", "function": { "name": "get_weather", - "arguments": '{"location": "current"}' + "arguments": '{"location": "San Francisco"}' } } ] + }, + { + "role": "tool", + "tool_call_id": "call_XXXX", + "content": "." 
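+        # Note: a "tool" message's "tool_call_id" must match the "id" of the
+        # corresponding tool call in the preceding assistant message, and its
+        # "content" carries the tool's result as a string.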
} ] - -@traceable(run_type="llm") -def chat_model(messages: list): - ... - -chat_model(inputs) ``` -```python Messages dict -# Format 2: Object with messages key +```python OpenAI completions format: Messages dict inputs = { "messages": [ {"role": "system", "content": "You are a helpful assistant."}, @@ -86,45 +84,65 @@ inputs = { } } } - ], - "temperature": 0.7 + ] } +``` -@traceable(run_type="llm") -def chat_model(messages: dict): - ... - -chat_model(inputs) +```python Anthropic format: Messages dict +inputs = { + "messages": [ + { + "role": "user", + "content": "What is the weather like in San Francisco?" + } + ], + "tools": [ + { + "name": "get_weather", + "description": "Get the current weather in a given location", + "input_schema": { + "type": "object", + "properties": { + "location": { + "type": "string", + "description": "The city and state, e.g. San Francisco, CA" + } + }, + "required": ["location"] + } + } + ] + } ``` +### Output Format + The output is accepted in any of the following formats: -* A dictionary/object that contains the key `choices` with a value that is a list of dictionaries/objects. Each dictionary/object must contain the key `message`, which maps to a message object with the keys `role` and `content`. * A dictionary/object that contains the key `message` with a value that is a message object with the keys `role` and `content`. +* A dictionary/object that contains the key `choices` with a value that is a list of dictionaries/objects. Each dictionary/object must contain the key `message`, which maps to a message object with the keys `role` and `content`. * A tuple/array of two elements, where the first element is the role and the second element is the content. * A dictionary/object that contains the key `role` and `content`. +* Similar to the input format, outputs may contain `"tool_calls"`. These `"tool_calls"` may be in OpenAI format or LangChain’s format. + Here are some examples: -```python Choices format +```python Message format from langsmith import traceable @traceable(run_type="llm") -def chat_model_choices(messages): +def chat_model_message(messages): # Your model logic here return { - "choices": [ - { - "message": { - "role": "assistant", - "content": "Sure, what time would you like to book the table for?" - } - } - ] + "message": { + "role": "assistant", + "content": "Sure, what time would you like to book the table for?" + } } # Usage @@ -132,20 +150,24 @@ inputs = [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "I'd like to book a table for two."} ] -chat_model_choices(inputs) +chat_model_message(inputs) ``` -```python Message format +```python OpenAI completions format: Choices from langsmith import traceable @traceable(run_type="llm") -def chat_model_message(messages): +def chat_model_choices(messages): # Your model logic here return { - "message": { - "role": "assistant", - "content": "Sure, what time would you like to book the table for?" - } + "choices": [ + { + "message": { + "role": "assistant", + "content": "Sure, what time would you like to book the table for?" 
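+                    # Like input messages, the assistant message may also
+                    # include "tool_calls" (OpenAI or LangChain format).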
+ } + } + ] } # Usage @@ -153,7 +175,7 @@ inputs = [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "I'd like to book a table for two."} ] -chat_model_message(inputs) +chat_model_choices(inputs) ``` ```python Tuple format @@ -193,7 +215,51 @@ chat_model_direct(inputs) -You can also provide the following `metadata` fields to help LangSmith identify the model - which if recognized, LangSmith will use to automatically calculate costs. To learn more about how to use the `metadata` fields, see [this guide](/langsmith/add-metadata-tags). + +### Converting custom I/O formats into LangSmith compatible formats + +If you're using a custom input or output format, you can convert it to a LangSmith compatible format using `process_inputs` and `process_outputs` functions on the [`@traceable` decorator](https://docs.smith.langchain.com/reference/python/run_helpers/langsmith.run_helpers.traceable). Note that these parameters are only available in the Python SDK. + +`process_inputs` and `process_outputs` accept functions that allow you to transform the inputs and outputs of a specific trace before they are logged to LangSmith. They have access to the trace's inputs and outputs, and can return a new dictionary with the processed data. + +Here's a boilerplate example of how to use `process_inputs` and `process_outputs` to convert a custom I/O format into a LangSmith compatible format: + + + +```python Python +class OriginalInputs(BaseModel): + """Your app's custom request shape""" + +class OriginalOutputs(BaseModel): + """Your app's custom response shape.""" + +class LangSmithInputs(BaseModel): + """The input format LangSmith expects.""" + +class LangSmithOutputs(BaseModel): + """The output format LangSmith expects.""" + +def process_inputs(inputs: dict) -> dict: + """Dict -> OriginalInputs -> LangSmithInputs -> dict""" + +def process_outputs(output: Any) -> dict: + """OriginalOutputs -> LangSmithOutputs -> dict""" + + +@traceable(run_type="llm", process_inputs=process_inputs, process_outputs=process_outputs) +def chat_model(inputs: dict) -> dict: + """ + Your app's model call. Keeps your custom I/O shape. + The decorators call process_* to log LangSmith-compatible format. + """ + +``` + + + +### Identifying a custom model in traces + +When using a custom model, it is recommended to also provide the following `metadata` fields to identify the model when viewing traces and when filtering. * `ls_provider`: The provider of the model, eg "openai", "anthropic", etc. * `ls_model_name`: The name of the model, eg "gpt-4o-mini", "claude-3-opus-20240307", etc. @@ -218,21 +284,6 @@ output = { ] } -# Can also use one of: -# output = { -# "message": { -# "role": "assistant", -# "content": "Sure, what time would you like to book the table for?" -# } -# } -# -# output = { -# "role": "assistant", -# "content": "Sure, what time would you like to book the table for?" -# } -# -# output = ["assistant", "Sure, what time would you like to book the table for?"] - @traceable( run_type="llm", metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"} @@ -302,8 +353,19 @@ await chatModel({ messages }); The above code will log the following trace: -![](/langsmith/images/chat-model.png) - +
+<Frame>
+  <img src="/langsmith/images/chat-model-light.png" alt="LangSmith UI showing an LLM call trace called ChatOpenAI with a system and human input followed by an AI Output." className="block dark:hidden" />
+  <img src="/langsmith/images/chat-model-dark.png" alt="LangSmith UI showing an LLM call trace called ChatOpenAI with a system and human input followed by an AI Output." className="hidden dark:block" />
+</Frame>
+
If you implement a custom streaming chat_model, you can "reduce" the outputs into the same format as the non-streaming version. This is currently only supported in Python. ```python @@ -347,7 +409,10 @@ If `ls_model_name` is not present in `extra.metadata`, other fields might be use 3. `inputs.model_name` -## Provide token and cost information +To learn more about how to use the `metadata` fields, see [this guide](/langsmith/add-metadata-tags). + + +### Provide token and cost information By default, LangSmith uses [tiktoken](https://github.com/openai/tiktoken) to count tokens, utilizing a best guess at the model's tokenizer based on the `ls_model_name` provided. It also calculates costs automatically by using the [model pricing table](https://smith.langchain.com/settings/workspaces/models). To learn how LangSmith calculates token-based costs, see [this guide](/langsmith/calculate-token-based-costs). @@ -388,7 +453,7 @@ class UsageMetadata(TypedDict, total=False): Note that the usage data can also include cost information, in case you do not want to rely on LangSmith's token-based cost formula. This is useful for models with pricing that is not linear by token type. -### Setting run metadata +#### Setting run metadata You can [modify the current run's metadata](/langsmith/add-metadata-tags) with usage information within your traced function. The advantage of this approach is that you do not need to change your traced function's runtime outputs. Here's an example: @@ -487,7 +552,7 @@ await chatModel({ messages }); -### Setting run outputs +#### Setting run outputs You can add a `usage_metadata` key to the function's response to set manual token counts and costs. @@ -630,56 +695,3 @@ await runTree.patchRun(); ``` - -## Instruct-style models - -For instruct-style models (string in, string out), your inputs must contain a key `prompt` with a string value. Other inputs are also permitted. The output must return an object that, when serialized, contains the key `choices` with a list of dictionaries/objects. Each must contain the key `text` with a string value. The same rules for `metadata` and `usage_metadata` apply as for chat-style models. - - - -```python Python -from langsmith import traceable - -@traceable( - run_type="llm", - metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"} -) -def hello_llm(prompt: str): - return { - "choices": [ - {"text": "Hello, " + prompt} - ], - "usage_metadata": { - "input_tokens": 4, - "output_tokens": 5, - "total_tokens": 9, - }, - } - -hello_llm("polly the parrot\n") -``` - -```typescript TypeScript -import { traceable } from "langsmith/traceable"; - -const helloLLM = traceable(({ prompt }: { prompt: string }) => { - return { - choices: [ - { text: "Hello, " + prompt } - ], - usage_metadata: { - input_tokens: 4, - output_tokens: 5, - total_tokens: 9, - }, - }; -}, { run_type: "llm", name: "hello_llm", metadata: { ls_provider: "my_provider", ls_model_name: "my_model" } }); - -await helloLLM({ prompt: "polly the parrot\n" }); -``` - - - -The above code will log the following trace: - -![](/langsmith/images/hello-llm.png)
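+### Putting it all together
+
+As a recap, here is a minimal sketch of a custom chat model that combines the guidelines above: a top-level `"messages"` input, a `choices`-style output, `ls_provider`/`ls_model_name` metadata, and manual `usage_metadata`. The provider name, model name, and token counts are placeholders for your own values.
+
+```python
+from langsmith import traceable
+
+@traceable(
+    run_type="llm",
+    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"},
+)
+def my_chat_model(inputs: dict) -> dict:
+    # Call your model here using inputs["messages"], then return the
+    # response in a format LangSmith recognizes.
+    return {
+        "choices": [
+            {
+                "message": {
+                    "role": "assistant",
+                    "content": "Sure, what time would you like to book the table for?",
+                }
+            }
+        ],
+        "usage_metadata": {
+            "input_tokens": 27,
+            "output_tokens": 13,
+            "total_tokens": 40,
+        },
+    }
+
+my_chat_model({
+    "messages": [
+        {"role": "system", "content": "You are a helpful assistant."},
+        {"role": "user", "content": "I'd like to book a table for two."},
+    ]
+})
+```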