diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chatmodel.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chatmodel.adoc
index b754b21de2..47b7c1459b 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chatmodel.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chatmodel.adoc
@@ -23,8 +23,7 @@ Here is the link:https://github.com/spring-projects/spring-ai/blob/main/spring-a
----
public interface ChatModel extends Model<Prompt, ChatResponse> {

-    default String call(String message) {// implementation omitted
-    }
+    default String call(String message) {...}

    @Override
    ChatResponse call(Prompt prompt);
@@ -32,8 +31,8 @@ public interface ChatModel extends Model<Prompt, ChatResponse> {
----

-The `call` method with a `String` parameter simplifies initial use, avoiding the complexities of the more sophisticated `Prompt` and `ChatResponse` classes.
-In real-world applications, it is more common to use the `call` method that takes a `Prompt` instance and returns an `ChatResponse`.
+The `call()` method with a `String` parameter simplifies initial use, avoiding the complexities of the more sophisticated `Prompt` and `ChatResponse` classes.
+In real-world applications, it is more common to use the `call()` method that takes a `Prompt` instance and returns a `ChatResponse`.

=== StreamingChatModel

@@ -42,17 +41,20 @@ Here is the link:https://github.com/spring-projects/spring-ai/blob/main/spring-a
[source,java]
----
public interface StreamingChatModel extends StreamingModel<Prompt, ChatResponse> {
+
+    default Flux<String> stream(String message) {...}
+
    @Override
    Flux<ChatResponse> stream(Prompt prompt);
}
----

-The `stream` method takes a `Prompt` request similar to `ChatModel` but it streams the responses using the reactive Flux API.
+The `stream()` method takes a `String` or `Prompt` parameter similar to `ChatModel`, but it streams the responses using the reactive Flux API.

=== Prompt

The https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/Prompt.java[Prompt] is a `ModelRequest` that encapsulates a list of https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/messages/Message.java[Message] objects and optional model request options.
-The following listing shows a truncated version of the Prompt class, excluding constructors and other utility methods:
+The following listing shows a truncated version of the `Prompt` class, excluding constructors and other utility methods:

[source,java]
----
@@ -63,7 +65,7 @@ public class Prompt implements ModelRequest<List<Message>> {
    private ChatOptions modelOptions;

    @Override
-    public ChatOptions getOptions() {..}
+    public ChatOptions getOptions() {...}

    @Override
    public List<Message> getInstructions() {...}
@@ -74,7 +76,7 @@ public class Prompt implements ModelRequest<List<Message>> {

==== Message

-The `Message` interface encapsulates a Prompt textual content and a collection of metadata attributes and a categorization known as `MessageType`.
+The `Message` interface encapsulates a `Prompt` textual content, a collection of metadata attributes, and a categorization known as `MessageType`.

The interface is defined as follows:

@@ -108,7 +110,7 @@ image::spring-ai-message-api.jpg[Spring AI Message API, width=800, align="center
The chat completion endpoint, distinguish between message categories based on conversational roles, effectively mapped by the `MessageType`.

-For instance, OpenAI recognizes message categories for distinct conversational roles such as `system`,`user`, `function` or `assistant`.
+For instance, OpenAI recognizes message categories for distinct conversational roles such as `system`, `user`, `function`, or `assistant`.

While the term `MessageType` might imply a specific message format, in this context it effectively designates the role a message plays in the dialogue.
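+
+To make these roles concrete, here is a minimal sketch of a prompt that combines a `system` and a `user` message before invoking the model (the message texts are illustrative only):
+
+[source,java]
+----
+Message systemMessage = new SystemMessage("You are a helpful assistant.");
+Message userMessage = new UserMessage("What is the capital of Denmark?");
+
+ChatResponse response = chatModel.call(new Prompt(List.of(systemMessage, userMessage)));
+----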
@@ -120,23 +122,26 @@ To understand the practical application and the relationship between `Prompt` an
Represents the options that can be passed to the AI model. The `ChatOptions` class is a subclass of `ModelOptions` and is used to define few portable options that can be passed to the AI model.
The `ChatOptions` class is defined as follows:
-
[source,java]
----
public interface ChatOptions extends ModelOptions {

+    String getModel();
+    Float getFrequencyPenalty();
+    Integer getMaxTokens();
+    Float getPresencePenalty();
+    List<String> getStopSequences();
    Float getTemperature();
-    void setTemperature(Float temperature);
-    Float getTopP();
-    void setTopP(Float topP);
    Integer getTopK();
-    void setTopK(Integer topK);
+    Float getTopP();
+    ChatOptions copy();
+
}
----

-Additionally, every model specific ChatModel/StreamingChatModel implementation can have its own options that can be passed to the AI model. For example, the OpenAI Chat Completion model has its own options like `presencePenalty`, `frequencyPenalty`, `bestOf` etc.
+Additionally, every model-specific ChatModel/StreamingChatModel implementation can have its own options that can be passed to the AI model. For example, the OpenAI Chat Completion model has its own options like `logitBias`, `seed`, and `user`.

-This is a powerful feature that allows developers to use model specific options when starting the application and then override them with at runtime using the Prompt request:
+This is a powerful feature that allows developers to use model-specific options when starting the application and then override them at runtime using the `Prompt` request:

image::chat-options-flow.jpg[align="center", width="800px"]

@@ -169,13 +174,13 @@ The `ChatResponse` class also carries a `ChatResponseMetadata` metadata about th
[[Generation]]
=== Generation

-Finally, the https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/model/Generation.java[Generation] class extends from the `ModelResult` to represent the output assistant message response and related metadata about this result:
+Finally, the https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/model/Generation.java[Generation] class extends from the `ModelResult` to represent the model output (assistant message) and related metadata:

[source,java]
----
public class Generation implements ModelResult<AssistantMessage> {

-    private AssistantMessage assistantMessage;
+    private final AssistantMessage assistantMessage;
    private ChatGenerationMetadata chatGenerationMetadata;

    @Override
@@ -194,9 +199,9 @@ The `ChatModel` and `StreamingChatModel` implementations are provided for the fo
image::spring-ai-chat-completions-clients.jpg[align="center", width="800px"]

-* xref:api/chat/openai-chat.adoc[OpenAI Chat Completion] (streaming & function-calling support)
+* xref:api/chat/openai-chat.adoc[OpenAI Chat Completion] (streaming, multi-modality & function-calling support)
* xref:api/chat/azure-openai-chat.adoc[Microsoft Azure Open AI Chat Completion] (streaming & function-calling support)
-* xref:api/chat/ollama-chat.adoc[Ollama Chat Completion]
+* xref:api/chat/ollama-chat.adoc[Ollama Chat Completion] (streaming, multi-modality & function-calling support)
* xref:api/chat/huggingface.adoc[Hugging Face Chat Completion] (no streaming support)
* xref:api/chat/vertexai-palm2-chat.adoc[Google Vertex AI PaLM2 Chat Completion] (no streaming support)
* xref:api/chat/vertexai-gemini-chat.adoc[Google Vertex AI Gemini Chat Completion] (streaming, multi-modality & function-calling support)
@@ -207,11 +212,11 @@ image::spring-ai-chat-completions-clients.jpg[align="center", width="800px"]
** xref:api/chat/bedrock/bedrock-anthropic.adoc[Anthropic Chat Completion]
** xref:api/chat/bedrock/bedrock-jurassic2.adoc[Jurassic2 Chat Completion]
* xref:api/chat/mistralai-chat.adoc[Mistral AI Chat Completion] (streaming & function-calling support)
-* xref:api/chat/anthropic-chat.adoc[Anthropic Chat Completion] (streaming)
+* xref:api/chat/anthropic-chat.adoc[Anthropic Chat Completion] (streaming & function-calling support)

== Chat Model API

-The Spring AI Chat Model API is build on top of the Spring AI `Generic Model API` providing Chat specific abstractions and implementations. Following class diagram illustrates the main classes and interfaces of the Spring AI Chat Model API.
+The Spring AI Chat Model API is built on top of the Spring AI `Generic Model API`, providing Chat-specific abstractions and implementations. The following class diagram illustrates the main classes and interfaces of the Spring AI Chat Model API.

image::spring-ai-chat-api.jpg[align="center", width="900px"]

diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/functions.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/functions.adoc
index bd0d10f049..79e23f7102 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/functions.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/functions.adoc
@@ -5,7 +5,7 @@ The integration of function support in AI models, permits the model to request t
image::function-calling-basic-flow.jpg[Function calling, width=700, align="center"]

-Spring AI currently supports Function invocation for the following AI Models
+Spring AI currently supports function invocation for the following AI Models:

* Anthropic Claude: Refer to the xref:api/chat/functions/anthropic-chat-functions.adoc[Anthropic Claude function invocation docs].
* Azure OpenAI: Refer to the xref:api/chat/functions/azure-open-ai-chat-functions.adoc[Azure OpenAI function invocation docs].
@@ -14,5 +14,5 @@ Spring AI currently supports Function invocation for the following AI Models
* Mistral AI: Refer to the xref:api/chat/functions/mistralai-chat-functions.adoc[Mistral AI function invocation docs].
// * MiniMax : Refer to the xref:api/chat/functions/minimax-chat-functions.adoc[MiniMax function invocation docs].
* Ollama: Refer to the xref:api/chat/functions/ollama-chat-functions.adoc[Ollama function invocation docs] (streaming not supported yet).
-* OpenAI: Refer to the xref:api/chat/functions/openai-chat-functions.adoc[Open AI function invocation docs].
+* OpenAI: Refer to the xref:api/chat/functions/openai-chat-functions.adoc[OpenAI function invocation docs].
// * ZhiPu AI : Refer to the xref:api/chat/functions/zhipuai-chat-functions.adoc[ZhiPu AI function invocation docs].
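+
+Independent of the model provider, registration usually boils down to exposing a `java.util.Function` bean that the model can call by name. A minimal sketch, where the request/response records and the weather lookup are hypothetical placeholders:
+
+[source,java]
+----
+record WeatherRequest(String city) {}
+record WeatherResponse(double temperatureCelsius) {}
+
+@Bean
+@Description("Get the current temperature for a city") // helps the model decide when to call it
+public Function<WeatherRequest, WeatherResponse> currentWeather() {
+    return request -> new WeatherResponse(22.0); // placeholder implementation
+}
+----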
diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/generic-model.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/generic-model.adoc
index 602bb4b951..f3a488464a 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/generic-model.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/generic-model.adoc
@@ -11,7 +11,7 @@ image::spring-ai-generic-model-api.jpg[width=900, align="center"]

== Model

-The Model interface provides a generic API for invoking AI models. It is designed to handle the interaction with various types of AI models by abstracting the process of sending requests and receiving responses. The interface uses Java generics to accommodate different types of requests and responses, enhancing flexibility and adaptability across different AI model implementations.
+The `Model` interface provides a generic API for invoking AI models. It is designed to handle the interaction with various types of AI models by abstracting the process of sending requests and receiving responses. The interface uses Java generics to accommodate different types of requests and responses, enhancing flexibility and adaptability across different AI model implementations.

The interface is defined below:

[source,java]
----
@@ -31,7 +31,7 @@ public interface Model<TReq extends ModelRequest<?>, TRes extends ModelResponse<

== StreamingModel

-The StreamingModel interface provides a generic API for invoking an AI model with streaming response. It abstracts the process of sending requests and receiving a streaming response. The interface uses Java generics to accommodate different types of requests and responses, enhancing flexibility and adaptability across different AI model implementations.
+The `StreamingModel` interface provides a generic API for invoking an AI model with streaming response. It abstracts the process of sending requests and receiving a streaming response. The interface uses Java generics to accommodate different types of requests and responses, enhancing flexibility and adaptability across different AI model implementations.

[source,java]
----
@@ -50,7 +50,7 @@ public interface StreamingModel<TReq extends ModelRequest<?>, TResChunk extends

== ModelRequest

-Interface representing a request to an AI model. This interface encapsulates the necessary information required to interact with an AI model, including instructions or inputs (of generic type T) and additional model options. It provides a standardized way to send requests to AI models, ensuring that all necessary details are included and can be easily managed.
+The `ModelRequest` interface represents a request to an AI model. It encapsulates the necessary information required to interact with an AI model, including instructions or inputs (of generic type `T`) and additional model options. It provides a standardized way to send requests to AI models, ensuring that all necessary details are included and can be easily managed.

[source,java]
----
@@ -73,7 +73,7 @@ public interface ModelRequest<T> {
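+To make the generics concrete: on the chat side, `Prompt` is the `ModelRequest<List<Message>>` implementation. A sketch of building one, assuming the portable `ChatOptionsBuilder` helper and an illustrative message text:
+
+[source,java]
+----
+Prompt request = new Prompt(
+    List.of(new UserMessage("Tell me a joke")),
+    ChatOptionsBuilder.builder().withTemperature(0.7f).build());
+
+List<Message> instructions = request.getInstructions(); // the inputs
+ChatOptions options = request.getOptions();             // the per-request options
+----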

== ModelOptions

-Interface representing the customizable options for AI model interactions. This marker interface allows for the specification of various settings and parameters that can influence the behavior and output of AI models. It is designed to provide flexibility and adaptability in different AI scenarios, ensuring that the AI models can be fine-tuned according to specific requirements.
+The `ModelOptions` interface represents the customizable options for AI model interactions. This marker interface allows for the specification of various settings and parameters that can influence the behavior and output of AI models. It is designed to provide flexibility and adaptability in different AI scenarios, ensuring that the AI models can be fine-tuned according to specific requirements.

[source,java]
----
@@ -84,7 +84,7 @@ public interface ModelOptions {

== ModelResponse

-Interface representing the response received from an AI model. This interface provides methods to access the main result or a list of results generated by the AI model, along with the response metadata. It serves as a standardized way to encapsulate and manage the output from AI models, ensuring easy retrieval and processing of the generated information.
+The `ModelResponse` interface represents the response received from an AI model. This interface provides methods to access the main result or a list of results generated by the AI model, along with the response metadata. It serves as a standardized way to encapsulate and manage the output from AI models, ensuring easy retrieval and processing of the generated information.

[source,java]
----
@@ -111,10 +111,9 @@ public interface ModelResponse<T extends ModelResult<?>> {
}
----
-
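+On the chat side, `ChatResponse` is the `ModelResponse` implementation and `Generation` the corresponding result type. A short sketch of reading a response through this contract:
+
+[source,java]
+----
+ChatResponse response = chatModel.call(prompt);
+
+Generation result = response.getResult();               // the first/most relevant result
+String text = result.getOutput().getContent();          // the AssistantMessage content
+ChatResponseMetadata metadata = response.getMetadata(); // e.g. usage information
+----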

== ModelResult

-This interface provides methods to access the main output of the AI model and the metadata associated with this result. It is designed to offer a standardized and comprehensive way to handle and interpret the outputs generated by AI models.
+The `ModelResult` interface provides methods to access the main output of the AI model and the metadata associated with this result. It is designed to offer a standardized and comprehensive way to handle and interpret the outputs generated by AI models.

[source,java]
----
diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/multimodality.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/multimodality.adoc
index fc389b2fdd..17d9ae7d5b 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/multimodality.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/multimodality.adoc
@@ -27,28 +27,27 @@ The Spring AI Message API provides all necessary abstractions to support multimo

image::spring-ai-message-api.jpg[Spring AI Message API, width=800, align="center"]

-The UserMessage’s `content` field is used as primarily text inputs, while the, optional, `media` field allows adding one or more additional content of different modalities such as images, audio and video.
+The UserMessage’s `content` field is used primarily for text inputs, while the optional `media` field allows adding one or more additional content of different modalities such as images, audio and video.
The `MimeType` specifies the modality type.
-Depending on the used LLMs the Media's data field can be either encoded raw media content or an URI to the content.
+Depending on the used LLMs, the `Media` data field can be either the raw media content as a `Resource` object or a `URI` to the content.

-NOTE: The media field is currently applicable only for user input messages (e.g., `UserMessage`). It does not hold significance for system messages. The `AssistantMessage`, which includes the LLM response, provides text content only. To generate non-text media outputs, you should utilize one of dedicated, single modality models.*
+NOTE: The media field is currently applicable only for user input messages (e.g., `UserMessage`). It does not hold significance for system messages. The `AssistantMessage`, which includes the LLM response, provides text content only. To generate non-text media outputs, you should utilize one of the dedicated, single-modality models.
-
-For example we can take the following picture (*multimodal.test.png*) as an input and ask the LLM to explain what it sees in the picture.
+For example, we can take the following picture (`multimodal.test.png`) as an input and ask the LLM to explain what it sees.

image::multimodal.test.png[Multimodal Test Image, 200, 200, align="left"]

-From most of the multimodal LLMs, the Spring AI code would look something like this:
+For most of the multimodal LLMs, the Spring AI code would look something like this:

[source,java]
----
-byte[] imageData = new ClassPathResource("/multimodal.test.png").getContentAsByteArray();
+var imageResource = new ClassPathResource("/multimodal.test.png");

var userMessage = new UserMessage(
    "Explain what do you see in this picture?", // content
-    List.of(new Media(MimeTypeUtils.IMAGE_PNG, imageData))); // media
+    new Media(MimeTypeUtils.IMAGE_PNG, imageResource)); // media

-ChatResponse response = chatModel.call(new Prompt(List.of(userMessage)));
+ChatResponse response = chatModel.call(new Prompt(userMessage));
----

or with the fluent xref::api/chatclient.adoc[ChatClient] API:

[source,java]
----
String response = ChatClient.create(chatModel).prompt()
        .user(u -> u.text("Explain what do you see on this picture?")
-            .media(MimeTypeUtils.IMAGE_PNG, new ClassPathResource("/multimodal.test.png")))
+                .media(MimeTypeUtils.IMAGE_PNG, new ClassPathResource("/multimodal.test.png")))
        .call()
        .content();
----
-
and produce a response like:

> This is an image of a fruit bowl with a simple design. The bowl is made of metal with curved wire edges that create an open structure, allowing the fruit to be visible from all angles. Inside the bowl, there are two yellow bananas resting on top of what appears to be a red apple. The bananas are slightly overripe, as indicated by the brown spots on their peels. The bowl has a metal ring at the top, likely to serve as a handle for carrying. The bowl is placed on a flat surface with a neutral-colored background that provides a clear view of the fruit inside.

-Latest version of Spring AI provides multimodal support for the following Chat Clients:
+Spring AI provides multimodal support for the following chat models:

-* xref:api/chat/openai-chat.adoc#_multimodal[Open AI - (GPT-4-Vision and GPT-4o models)]
-* xref:api/chat/ollama-chat.adoc#_multimodal[Ollama - (LlaVa and Baklava models)]
-* xref:api/chat/vertexai-gemini-chat.adoc#_multimodal[Vertex AI Gemini - (gemini-1.5-pro-001, gemini-1.5-flash-001 models)]
+* xref:api/chat/openai-chat.adoc#_multimodal[OpenAI (e.g. GPT-4 and GPT-4o models)]
+* xref:api/chat/ollama-chat.adoc#_multimodal[Ollama (e.g. LlaVa and Baklava models)]
+* xref:api/chat/vertexai-gemini-chat.adoc#_multimodal[Vertex AI Gemini (e.g. gemini-1.5-pro-001, gemini-1.5-flash-001 models)]
* xref:api/chat/anthropic-chat.adoc#_multimodal[Anthropic Claude 3]
* xref:api/chat/bedrock/bedrock-anthropic3.adoc#_multimodal[AWS Bedrock Anthropic Claude 3]
-* xref:api/chat/azure-openai-chat.adoc#_multimodal[Azure Open AI - (GPT-4o models)]
\ No newline at end of file
+* xref:api/chat/azure-openai-chat.adoc#_multimodal[Azure Open AI (e.g. GPT-4o models)]
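+
+Where a model accepts references instead of inline data, the same user message can point the `Media` at the content location. A sketch assuming a publicly reachable image URL; constructor availability varies by Spring AI version and model:
+
+[source,java]
+----
+var userMessage = new UserMessage(
+    "Explain what do you see in this picture?",
+    new Media(MimeTypeUtils.IMAGE_PNG, new URL("https://example.com/multimodal.test.png")));
+----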
diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/prompt.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/prompt.adoc
index d12b514390..7e11009f5a 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/prompt.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/prompt.adoc
@@ -12,7 +12,8 @@ Another analogy is a SQL statement that contain placeholders for certain express
As Spring AI evolves, it will introduce higher levels of abstraction for interacting with AI models.
The foundational classes described in this section can be likened to JDBC in terms of their role and functionality.
The `ChatModel` class, for instance, is analogous to the core JDBC library in the JDK.
-Building upon this, Spring AI can provide helper classes similar to `JdbcTemplate`, Spring Data Repositories, and eventually, more advanced constructs like ChatEngines and Agents that consider past interactions with the model.
+The `ChatClient` class can be likened to the `JdbcClient`, built on top of `ChatModel` and providing more advanced constructs via `Advisor`
+to consider past interactions with the model, augment the prompt with additional contextual documents, and introduce agentic behavior.

The structure of prompts has evolved over time within the AI field.
Initially, prompts were simple strings.
@@ -24,11 +25,11 @@ OpenAI have introduced even more structure to prompts by categorizing multiple m

=== Prompt

-It is common to use the `call` method of `ChatModel` that takes a `Prompt` instance and returns an `ChatResponse`.
+It is common to use the `call()` method of `ChatModel` that takes a `Prompt` instance and returns a `ChatResponse`.

-The Prompt class functions as a container for an organized series of `Message` objects and a request `ChatOptions`.
-Every Message embodies a unique role within the prompt, differing in its content and intent.
-These roles can encompass a variety of elements, from user inquiries to AI-generated responses or relevant background information.
+The `Prompt` class functions as a container for an organized series of `Message` objects and a request `ChatOptions`.
+Every `Message` embodies a unique role within the prompt, differing in its content and intent.
+These roles can encompass a variety of elements, from user inquiries to AI-generated responses to relevant background information.

This arrangement enables intricate and detailed interactions with AI models, as the prompt is constructed from multiple messages, each assigned a specific role to play in the dialogue.

Below is a truncated version of the Prompt class, with constructors and utility methods omitted for brevity:
@@ -44,7 +45,7 @@ public class Prompt implements ModelRequest<List<Message>> {

=== Message

-The `Message` interface encapsulates a Prompt textual content and a collection of metadata attributes and a categorization known as `MessageType`.
+The `Message` interface encapsulates a `Prompt` textual content, a collection of metadata attributes, and a categorization known as `MessageType`.

The interface is defined as follows:

@@ -94,7 +95,7 @@ More than just an answer or reaction, it's crucial for maintaining the flow of t
By tracking the AI's previous responses (its 'Assistant Role' messages), the system ensures coherent and contextually relevant interactions.
The Assistant message may contain Function Tool Call request information as well.
It's like a special feature in the AI, used when needed to perform specific functions such as calculations, fetching data, or other tasks beyond just talking.

-* Tool/Function Role: The Too/Function Role focuses on returning additional information in response to Tool Call Aisstnat Messages.
+* Tool/Function Role: The Tool/Function Role focuses on returning additional information in response to Tool Call Assistant Messages.

Roles are represented as an enumeration in Spring AI as shown below

```java
@@ -113,7 +114,6 @@ public enum MessageType {
}
```
-

=== PromptTemplate

A key component for prompt templating in Spring AI is the `PromptTemplate` class.
@@ -131,9 +131,9 @@ The interfaces implemented by this class support different aspects of prompt cre

`PromptTemplateStringActions` focuses on creating and rendering prompt strings, representing the most basic form of prompt generation.

-`PromptTemplateMessageActions` is tailored for prompt creation through the generation and manipulation of Message objects.
+`PromptTemplateMessageActions` is tailored for prompt creation through the generation and manipulation of `Message` objects.

-`PromptTemplateActions` is designed to return the Prompt object, which can be passed to ChatModel for generating a response.
+`PromptTemplateActions` is designed to return the `Prompt` object, which can be passed to `ChatModel` for generating a response.

While these interfaces might not be used extensively in many projects, they show the different approaches to prompt creation.

@@ -151,21 +151,25 @@ public interface PromptTemplateStringActions {
}
```

The method `String render()`: Renders a prompt template into a final string format without external input, suitable for templates without placeholders or dynamic content.

-The method `String render(Map<String, Object> model)`: Enhances rendering functionality to include dynamic content. It uses a Map<String, Object> where map keys are placeholder names in the prompt template, and values are the dynamic content to be inserted.
+The method `String render(Map<String, Object> model)`: Enhances rendering functionality to include dynamic content. It uses a `Map<String, Object>` where map keys are placeholder names in the prompt template, and values are the dynamic content to be inserted.

```java
public interface PromptTemplateMessageActions {

    Message createMessage();
+
+    Message createMessage(List<Media> mediaList);
+
    Message createMessage(Map<String, Object> model);

}
```

-The method `Message createMessage()`: Creates a Message object without additional data, used for static or predefined message content.
+The method `Message createMessage()`: Creates a `Message` object without additional data, used for static or predefined message content.

-The method `Message createMessage(Map<String, Object> model)`: Extends message creation to integrate dynamic content, accepting a Map<String, Object> where each entry represents a placeholder in the message template and its corresponding dynamic value.
+The method `Message createMessage(List<Media> mediaList)`: Creates a `Message` object with static textual and media content.
+
+The method `Message createMessage(Map<String, Object> model)`: Extends message creation to integrate dynamic content, accepting a `Map<String, Object>` where each entry represents a placeholder in the message template and its corresponding dynamic value.
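+
+A short sketch of the message-oriented methods in action (the template texts are illustrative):
+
+```java
+Message staticMessage = new PromptTemplate("Tell me about Spring AI.").createMessage();
+
+Message dynamicMessage = new PromptTemplate("Tell me about {topic}.")
+        .createMessage(Map.of("topic", "prompt templating"));
+```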

```java
@@ -173,21 +177,27 @@ public interface PromptTemplateActions extends PromptTemplateStringActions {

    Prompt create();
+
+    Prompt create(ChatOptions modelOptions);
+
    Prompt create(Map<String, Object> model);
+
+    Prompt create(Map<String, Object> model, ChatOptions modelOptions);
+
}
```

-The method `Prompt create()`: Generates a Prompt object without external data inputs, ideal for static or predefined prompts.
+The method `Prompt create()`: Generates a `Prompt` object without external data inputs, ideal for static or predefined prompts.
+
+The method `Prompt create(ChatOptions modelOptions)`: Generates a `Prompt` object without external data inputs and with specific options for the chat request.

-The method `Prompt create(Map<String, Object> model)`: Expands prompt creation capabilities to include dynamic content, taking a Map<String, Object> where each map entry is a placeholder in the prompt template and its associated dynamic value.
+The method `Prompt create(Map<String, Object> model)`: Expands prompt creation capabilities to include dynamic content, taking a `Map<String, Object>` where each map entry is a placeholder in the prompt template and its associated dynamic value.
+
+The method `Prompt create(Map<String, Object> model, ChatOptions modelOptions)`: Expands prompt creation capabilities to include dynamic content, taking a `Map<String, Object>` where each map entry is a placeholder in the prompt template and its associated dynamic value, and specific options for the chat request.
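+
+Because the `ChatOptions` overloads are new, here is a quick sketch of attaching per-request options while expanding a template, assuming the portable `ChatOptionsBuilder` helper and illustrative variable values:
+
+```java
+PromptTemplate promptTemplate = new PromptTemplate("Tell me a {adjective} joke about {topic}");
+
+Prompt prompt = promptTemplate.create(
+        Map.of("adjective", "witty", "topic", "compilers"),
+        ChatOptionsBuilder.builder().withTemperature(0.9f).build());
+```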

== Example Usage

A simple example taken from the https://github.com/Azure-Samples/spring-ai-azure-workshop/blob/main/2-README-prompt-templating.md[AI Workshop on PromptTemplates] is shown below.

-
```java
PromptTemplate promptTemplate = new PromptTemplate("Tell me a {adjective} joke about {topic}");
@@ -226,11 +236,10 @@ This shows how you can build up the `Prompt` instance by using the `SystemPrompt
The message with the role `user` is then combined with the message of the role `system` to form the prompt.
The prompt is then passed to the ChatModel to get a generative response.

-
=== Using resources instead of raw Strings

-Spring AI supports the `org.springframework.core.io.Resource` abstraction so you can put prompt data in a file that can directly be used in PromptTemplates.
-For example, you can define a field in your Spring managed component to retrieve the Resource.
+Spring AI supports the `org.springframework.core.io.Resource` abstraction, so you can put prompt data in a file that can directly be used in a `PromptTemplate`.
+For example, you can define a field in your Spring managed component to retrieve the `Resource`.

```java
@Value("classpath:/prompts/system-message.st")
@@ -239,14 +248,10 @@ private Resource systemResource;

and then pass that resource to the `SystemPromptTemplate` directly.

-
```java
SystemPromptTemplate systemPromptTemplate = new SystemPromptTemplate(systemResource);
```
-
-
-
== Prompt Engineering

In generative AI, the creation of prompts is a crucial task for developers.
@@ -267,21 +272,21 @@ You should recognize the importance of prompt engineering and consider using ins

When developing prompts, it's important to integrate several key components to ensure clarity and effectiveness:

-* *Instructions*: Offer clear and direct instructions to the AI, similar to how you would communicate with a person. This clarity is essential for helping the AI 'understand' what is expected.
+* *Instructions*: Offer clear and direct instructions to the AI, similar to how you would communicate with a person. This clarity is essential for helping the AI "understand" what is expected.

-* *External Context*: Include relevant background information or specific guidance for the AI's response when necessary. This 'external context' frames the prompt and aids the AI in grasping the overall scenario.
+* *External Context*: Include relevant background information or specific guidance for the AI's response when necessary. This "external context" frames the prompt and aids the AI in grasping the overall scenario.

* *User Input*: This is the straightforward part - the user's direct request or question forming the core of the prompt.

* *Output Indicator*: This aspect can be tricky. It involves specifying the desired format for the AI's response, such as JSON. However, be aware that the AI might not always adhere strictly to this format. For instance, it might prepend a phrase like "here is your JSON" before the actual JSON data, or sometimes generate a JSON-like structure that is not accurate.

Providing the AI with examples of the anticipated question and answer format can be highly beneficial when crafting prompts.
-This practice helps the AI 'understand' the structure and intent of your query, leading to more precise and relevant responses.
+This practice helps the AI "understand" the structure and intent of your query, leading to more precise and relevant responses.
While this documentation does not delve deeply into these techniques, they provide a starting point for further exploration in AI prompt engineering.

Following is a list of resources for further investigation.

-== Simple Techniques
+==== Simple Techniques

* *https://www.promptingguide.ai/introduction/examples.en#text-summarization[Text Summarization]*: +
Reduces extensive text into concise summaries, capturing key points and main ideas while omitting less critical details.

@@ -298,7 +303,7 @@ Creates interactive dialogues where the AI can engage in back-and-forth communic
* *https://www.promptingguide.ai/introduction/examples.en#code-generation[Code Generation]*: +
Generates functional code snippets based on specific user requirements or descriptions, translating natural language instructions into executable code.

-== Advanced Techniques
+==== Advanced Techniques

* *https://www.promptingguide.ai/techniques/zeroshot[Zero-shot], https://www.promptingguide.ai/techniques/fewshot[Few-shot Learning]*: +
Enables the model to make accurate predictions or responses with minimal to no prior examples of the specific problem type, understanding and acting on new tasks using learned generalizations.

@@ -309,13 +314,11 @@ Links multiple AI responses to create a coherent and contextually aware conversa
* *https://www.promptingguide.ai/techniques/react[ReAct (Reason + Act)]*: +
In this method, the AI first analyzes (reasons about) the input, then determines the most appropriate course of action or response. It combines understanding with decision-making.

-== Microsoft Guidance
+==== Microsoft Guidance

* *https://github.com/microsoft/guidance[Framework for Prompt Creation and Optimization]*: +
Microsoft offers a structured approach to developing and refining prompts. This framework guides users in creating effective prompts that elicit the desired responses from AI models, optimizing the interaction for clarity and efficiency.

-
-
== Tokens

Tokens are essential in how AI models process text, acting as a bridge that converts words (as we understand them) into a format that AI models can process.
@@ -337,5 +340,3 @@ Tokens have practical implications beyond their technical role in AI processing,
* Context Window: A model's token limit determines its context window. Inputs exceeding this limit are not processed by the model. It's crucial to send only the minimal effective set of information for processing. For example, when inquiring about "Hamlet," there's no need to include tokens from all of Shakespeare's other works.

* Response Metadata: The metadata of a response from an AI model includes the number of tokens used, a vital piece of information for managing usage and costs.
-
-
diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/structured-output-converter.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/structured-output-converter.adoc
index 2f29b8392b..90dd0c9443 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/structured-output-converter.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/structured-output-converter.adoc
@@ -6,7 +6,7 @@ NOTE: As of 02.05.2024 the old `OutputParser`, `BeanOutputParser`, `ListOutputPa
the latter are drop-in replacements for the former ones and provide the same functionality. The reason for the change was primarily naming, as there isn't any parsing being done, but also have aligned with the Spring `org.springframework.core.convert.converter` package brining in some improved functionality.

The ability of LLMs to produce structured outputs is important for downstream applications that rely on reliably parsing output values.
-Developers want to quickly turn results from an AI model into data types, such as JSON, XML or Java Classes, that can be passed to other application functions and methods.
+Developers want to quickly turn results from an AI model into data types, such as JSON, XML or Java classes, that can be passed to other application functions and methods.

The Spring AI `Structured Output Converters` help to convert the LLM output into a structured format. As shown in the following diagram, this approach operates around the LLM text completion endpoint:

@@ -78,7 +78,6 @@ The format instructions are most often appended to the end of the user input usi

The Converter is responsible to transform output text from the model into instances of the specified type `T`.
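+
+In code, the two halves of this contract look roughly as follows; a sketch using the `BeanOutputConverter` and the `ActorsFilms` record introduced below, where `modelOutputText` stands for the model's raw text reply:
+
+[source,java]
+----
+BeanOutputConverter<ActorsFilms> converter = new BeanOutputConverter<>(ActorsFilms.class);
+
+String format = converter.getFormat();                       // format instructions to append to the prompt
+ActorsFilms actorsFilms = converter.convert(modelOutputText); // parse the model's text output
+----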

-
=== Available Converters

Currently, Spring AI provides `AbstractConversionServiceOutputConverter`, `AbstractMessageOutputConverter`, `BeanOutputConverter`, `MapOutputConverter` and `ListOutputConverter` implementations:

@@ -107,18 +106,18 @@ record ActorsFilms(String actor, List<String> movies) {
}
----

-Here is how to apply the BeanOutputConverter using the new, fluent ChatClient API:
+Here is how to apply the `BeanOutputConverter` using the high-level, fluent `ChatClient` API:

[source,java]
----
ActorsFilms actorsFilms = ChatClient.create(chatModel).prompt()
        .user(u -> u.text("Generate the filmography of 5 movies for {actor}.")
-            .param("actor", "Tom Hanks"))
+                .param("actor", "Tom Hanks"))
        .call()
        .entity(ActorsFilms.class);
----

-or using the low-level, ChatModel API directly:
+or using the low-level `ChatModel` API directly:

[source,java]
----
@@ -135,7 +134,7 @@ String template = """
        """;

Generation generation = chatModel.call(
-    new Prompt(new PromptTemplate(template, Map.of("actor", actor, "format", format)).createMessage())).getResult();
+    new PromptTemplate(template, Map.of("actor", actor, "format", format)).create()).getResult();

ActorsFilms actorsFilms = beanOutputConverter.convert(generation.getOutput().getContent());
----
@@ -150,11 +149,10 @@ For example, to represent a list of actors and their filmographies:
List<ActorsFilms> actorsFilms = ChatClient.create(chatModel).prompt()
        .user("Generate the filmography of 5 movies for Tom Hanks and Bill Murray.")
        .call()
-        .entity(new ParameterizedTypeReference<List<ActorsFilms>>() {
-        });
+        .entity(new ParameterizedTypeReference<List<ActorsFilms>>() {});
----

-or using the low-level, ChatModel API directly:
+or using the low-level `ChatModel` API directly:

[source,java]
----
@@ -167,7 +165,7 @@ String template = """
        {format}
        """;

-Prompt prompt = new Prompt(new PromptTemplate(template, Map.of("format", format)).createMessage());
+Prompt prompt = new PromptTemplate(template, Map.of("format", format)).create();

Generation generation = chatModel.call(prompt).getResult();

@@ -176,19 +174,18 @@ List<ActorsFilms> actorsFilms = outputConverter.convert(generation.getOutput().g

=== Map Output Converter

-Following sniped shows how to use `MapOutputConverter` to generate a list of numbers.
+The following snippet shows how to use `MapOutputConverter` to convert the model output to a list of numbers in a map.

[source,java]
----
Map<String, Object> result = ChatClient.create(chatModel).prompt()
        .user(u -> u.text("Provide me a List of {subject}")
-            .param("subject", "an array of numbers from 1 to 9 under they key name 'numbers'"))
+                .param("subject", "an array of numbers from 1 to 9 under the key name 'numbers'"))
        .call()
-        .entity(new ParameterizedTypeReference<Map<String, Object>>() {
-        });
+        .entity(new ParameterizedTypeReference<Map<String, Object>>() {});
----

-or using the low-level, ChatModel API directly:
+or using the low-level `ChatModel` API directly:

[source,java]
----
@@ -199,9 +196,10 @@ String template = """
        Provide me a List of {subject}
        {format}
        """;

-PromptTemplate promptTemplate = new PromptTemplate(template,
-        Map.of("subject", "an array of numbers from 1 to 9 under they key name 'numbers'", "format", format));
-Prompt prompt = new Prompt(promptTemplate.createMessage());
+
+Prompt prompt = new PromptTemplate(template,
+        Map.of("subject", "an array of numbers from 1 to 9 under the key name 'numbers'", "format", format)).create();
+
Generation generation = chatModel.call(prompt).getResult();

Map<String, Object> result = mapOutputConverter.convert(generation.getOutput().getContent());
@@ -209,18 +207,18 @@

=== List Output Converter

-Following snippet shows how to use `ListOutputConverter` to generate a list of ice cream flavors.
+The following snippet shows how to use `ListOutputConverter` to convert the model output into a list of ice cream flavors.

[source,java]
----
List<String> flavors = ChatClient.create(chatModel).prompt()
        .user(u -> u.text("List five {subject}")
-            .param("subject", "ice cream flavors"))
+                .param("subject", "ice cream flavors"))
        .call()
        .entity(new ListOutputConverter(new DefaultConversionService()));
----

-or using the low-level, ChatModel API directly:
+or using the low-level `ChatModel` API directly:

[source,java]
----
@@ -231,9 +229,10 @@ String template = """
        List five {subject}
        {format}
        """;

-PromptTemplate promptTemplate = new PromptTemplate(template,
-        Map.of("subject", "ice cream flavors", "format", format));
-Prompt prompt = new Prompt(promptTemplate.createMessage());
+
+Prompt prompt = new PromptTemplate(template,
+        Map.of("subject", "ice cream flavors", "format", format)).create();
+
Generation generation = this.chatModel.call(prompt).getResult();

List<String> list = listOutputConverter.convert(generation.getOutput().getContent());
@@ -258,17 +257,11 @@ The following AI Models have been tested to support List, Map and Bean structure

| xref:api/chat/bedrock/bedrock-llama.adoc[Bedrock Llama] | link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-bedrock/src/test/java/org/springframework/ai/bedrock/llama/BedrockLlamaChatModelIT.java[BedrockLlamaChatModelIT.java.java]
|====

-== Build-in JSON mode
+== Built-in JSON mode

Some AI Models provide dedicated configuration options to generate structured (usually JSON) output.

-* xref:api/chat/openai-chat.adoc#_structured_outputs[OpenAI Structured Outputs] can ensure your model generates responses conforming strictly to your provided JSON Schema. You can choose between the `JSON_OBJECT` that guarantees the message the model generates is valid JSON or `JSON_SCHEMA` with a supplied schema that guarantees the model will generate a response that matches your supplied schema.
+* xref:api/chat/openai-chat.adoc#_structured_outputs[OpenAI Structured Outputs] can ensure your model generates responses conforming strictly to your provided JSON Schema. You can choose between the `JSON_OBJECT` that guarantees the message the model generates is valid JSON or `JSON_SCHEMA` with a supplied schema that guarantees the model will generate a response that matches your supplied schema (`spring.ai.openai.chat.options.responseFormat` option).
* xref:api/chat/azure-openai-chat.adoc[Azure OpenAI] - provides a `spring.ai.azure.openai.chat.options.responseFormat` options specifying the format that the model must output. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON.
-* xref:api/chat/ollama-chat.adoc[Ollama] - provides a `spring.ai.ollama.chat.options.format` option to specify the format to return a response in. Currently the only accepted value is `json`.
-* xref:api/chat/mistralai-chat.adoc[Mistral AI] - provides a `spring.ai.mistralai.chat.options.responseFormat` option to specify the format to return a response in. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON.
-
-
-
-
-
-
+* xref:api/chat/ollama-chat.adoc[Ollama] - provides a `spring.ai.ollama.chat.options.format` option to specify the format to return a response in. Currently, the only accepted value is `json`.
+* xref:api/chat/mistralai-chat.adoc[Mistral AI] - provides a `spring.ai.mistralai.chat.options.responseFormat` option to specify the format to return a response in. Setting it to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON.
diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/concepts.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/concepts.adoc
index d43a965c50..3d923662ff 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/concepts.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/concepts.adoc
@@ -18,7 +18,7 @@ image::spring-ai-concepts-model-types.jpg[Model types, width=600, align="center"

Spring AI currently supports models that process input and output as language, image, and audio.
The last row in the previous table, which accepts text as input and outputs numbers, is more commonly known as embedding text and represents the internal data structures used in an AI model.
-Spring AI has support for embeddings to support more advanced use cases.
+Spring AI has support for embeddings to enable more advanced use cases.

What sets models like GPT apart is their pre-trained nature, as indicated by the "P" in GPT—Chat Generative Pre-trained Transformer.
This pre-training feature transforms AI into a general developer tool that does not require an extensive machine learning or model training background.
@@ -36,7 +36,7 @@ There is also the user role, which is typically the input from the user.

Crafting effective prompts is both an art and a science.
ChatGPT was designed for human conversations.
-This is quite a departure from using something like SQL to "'ask a question.'"
+This is quite a departure from using something like SQL to "ask a question".
One must communicate with the AI model akin to conversing with another person.

Such is the importance of this interaction style that the term "Prompt Engineering" has emerged as its own discipline.
@@ -61,9 +61,9 @@ For instance, consider the simple prompt template:
Tell me a {adjective} joke about {content}.
```

-In Spring AI, prompt templates can be likened to the "'View'" in Spring MVC architecture.
+In Spring AI, prompt templates can be likened to the "View" in Spring MVC architecture.
A model object, typically a `java.util.Map`, is provided to populate placeholders within the template.
-The "'rendered'" string becomes the content of the prompt supplied to the AI model.
+The "rendered" string becomes the content of the prompt supplied to the AI model.
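+
+To see the analogy in code, this is roughly what rendering the template above looks like (the variable values are illustrative):
+
+```java
+PromptTemplate promptTemplate = new PromptTemplate("Tell me a {adjective} joke about {content}.");
+
+String renderedPrompt = promptTemplate.render(Map.of("adjective", "dry", "content", "linkers"));
+```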

There is considerable variability in the specific data format of the prompt sent to the model.
Initially starting as simple strings, prompts have evolved to include multiple messages, where each string in each message represents a distinct role for the model.
@@ -87,7 +87,7 @@ Embeddings are particularly relevant in practical applications like the Retrieva
They enable the representation of data as points in a semantic space, which is akin to the 2-D space of Euclidean geometry, but in higher dimensions.
This means just like how points on a plane in Euclidean geometry can be close or far based on their coordinates, in a semantic space, the proximity of points reflects the similarity in meaning.
Sentences about similar topics are positioned closer in this multi-dimensional space, much like points lying close to each other on a graph.
-This proximity aids in tasks like text classification, semantic search, and even product recommendations, as it allows the AI to discern and group related concepts based on their 'location' in this expanded semantic landscape.
+This proximity aids in tasks like text classification, semantic search, and even product recommendations, as it allows the AI to discern and group related concepts based on their "location" in this expanded semantic landscape.

You can think of this semantic space as a vector.

@@ -104,7 +104,7 @@ Perhaps more important is that Tokens = Money.
In the context of hosted AI models, your charges are determined by the number of tokens used.
Both input and output contribute to the overall token count.

Also, models are subject to token limits, which restrict the amount of text processed in a single API call.
-This threshold is often referred to as the 'context window'. The model does not process any text that exceeds this limit.
+This threshold is often referred to as the "context window". The model does not process any text that exceeds this limit.

For instance, ChatGPT3 has a 4K token limit, while GPT4 offers varying options, such as 8K, 16K, and 32K.
Anthropic's Claude AI model features a 100K token limit, and Meta's recent research yielded a 1M token limit model.

@@ -134,16 +134,16 @@ An interesting bit of trivia is that this dataset is around 650GB.

Three techniques exist for customizing the AI model to incorporate your data:

-* `Fine Tuning`: This traditional machine learning technique involves tailoring the model and changing its internal weighting.
+* **Fine Tuning**: This traditional machine learning technique involves tailoring the model and changing its internal weighting.
However, it is a challenging process for machine learning experts and extremely resource-intensive for models like GPT due to their size. Additionally, some models might not offer this option.

-* `Prompt Stuffing`: A more practical alternative involves embedding your data within the prompt provided to the model. Given a model's token limits, techniques are required to present relevant data within the model's context window.
+* **Prompt Stuffing**: A more practical alternative involves embedding your data within the prompt provided to the model. Given a model's token limits, techniques are required to present relevant data within the model's context window.
This approach is colloquially referred to as "`stuffing the prompt.`"
The Spring AI library helps you implement solutions based on the "`stuffing the prompt`" technique otherwise known as xref::concepts.adoc#concept-rag[Retrieval Augmented Generation (RAG)].

image::spring-ai-prompt-stuffing.jpg[Prompt stuffing, width=700, align="center"]

-* xref::concepts.adoc#concept-fc[Function Calling]: This technique allows registering custom, user functions that connect the large language models to the APIs of external systems.
+* **xref::concepts.adoc#concept-fc[Function Calling]**: This technique allows registering custom, user functions that connect the large language models to the APIs of external systems.
Spring AI greatly simplifies code you need to write to support xref:api/functions.adoc[function calling].

[[concept-rag]]
=== Retrieval Augmented Generation

@@ -169,8 +169,8 @@ This is the reason to use a vector database. It is very good at finding similar

image::spring-ai-rag.jpg[Spring AI RAG, width=1000, align="center"]

-* The xref::api/etl-pipeline.adoc[ETL pipeline] provides further information about orchestrating the flow of extracting data from data sources and storing it in a structured vector store, ensuring data is in the optimal format for retrieval when passing it to the AI model.
-* The xref::api/chatclient.adoc#_retrieval_augmented_generation[ChatClient - RAG] explains how to use the `QuestionAnswerAdvisor` advisor to enable the RAG capability in your application.
+* The xref::api/etl-pipeline.adoc[ETL Pipeline] provides further information about orchestrating the flow of extracting data from data sources and storing it in a structured vector store, ensuring data is in the optimal format for retrieval when passing it to the AI model.
+* The xref::api/chatclient.adoc#_retrieval_augmented_generation[ChatClient - RAG] explains how to use the `QuestionAnswerAdvisor` to enable the RAG capability in your application.

[[concept-fc]]
=== Function Calling

@@ -188,13 +188,13 @@ Additionally, you can define and reference multiple functions in a single prompt

image::function-calling-basic-flow.jpg[Function calling, width=700, align="center"]

-* (1) perform a chat request sending along function definition information.
+1. Perform a chat request sending along function definition information.
The latter provides the `name`, `description` (e.g. explaining when the Model should call the function), and `input parameters` (e.g. the function's input parameters schema).
-* (2) when the Model decides to call the function, it will call the function with the input parameters and return the output to the model.
-* (3) Spring AI handles this conversation for you.
+2. When the Model decides to call the function, it will call the function with the input parameters and return the output to the model.
+3. Spring AI handles this conversation for you.
It dispatches the function call to the appropriate function and returns the result to the model.
-* (4) Model can perform multiple function calls to retrieve all the information it needs.
-* (5) once all information needed is acquired, the Model will generate a response.
+4. The Model can perform multiple function calls to retrieve all the information it needs.
+5. Once all information needed is acquired, the Model will generate a response, as sketched in the example below.

Follow the xref::api/functions.adoc[Function Calling] documentation for further information on how to use this feature with different AI models.
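+
+Put into code, the flow above can look roughly like this. This is a sketch only: `MockWeatherService` and its request/response types are hypothetical placeholders (a `java.util.Function` implementation), and the options class shown is model-specific (OpenAI):
+
+```java
+ChatResponse response = chatModel.call(new Prompt(
+    "What is the weather in Amsterdam?",
+    OpenAiChatOptions.builder()
+        .withFunctionCallbacks(List.of(
+            FunctionCallbackWrapper.builder(new MockWeatherService())
+                .withName("currentWeather")                   // (1) definition sent with the request
+                .withDescription("Get the weather for a city")
+                .build()))
+        .build()));
+// (2)-(5) happen inside the call: Spring AI dispatches the function
+// invocation(s) and returns the model's final response.
+```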
@@ -210,4 +210,5 @@ One approach involves presenting both the user's request and the AI model's resp

Furthermore, leveraging the information stored in the vector database as supplementary data can enhance the evaluation process, aiding in the determination of response relevance.

-The Spring AI project currently provides some very basic examples of how you can evaluate the responses in the form of prompts to include in a JUnit test.
+The Spring AI project provides an `Evaluator` API which currently gives access to basic strategies to evaluate model responses.
+Follow the xref::api/testing.adoc[Evaluation Testing] documentation for further information.
diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/getting-started.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/getting-started.adoc
index fa4c82c1dd..a97f8faa51 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/getting-started.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/getting-started.adoc
@@ -97,8 +97,8 @@ dependencies {
Each of the following sections in the documentation shows which dependencies you need to add to your project build system.

-* xref:api/embeddings.adoc[Embeddings Models]
* xref:api/chatmodel.adoc[Chat Models]
+* xref:api/embeddings.adoc[Embeddings Models]
* xref:api/imageclient.adoc[Image Generation Models]
* xref:api/audio/transcriptions.adoc[Transcription Models]
* xref:api/audio/speech.adoc[Text-To-Speech (TTS) Models]
diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/index.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/index.adoc
index 02c6dbe19b..d9c534a259 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/index.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/index.adoc
@@ -16,7 +16,7 @@ These abstractions have multiple implementations, enabling easy component swappi

Spring AI provides the following features:

* Support for all major Model providers such as OpenAI, Microsoft, Amazon, Google, and Hugging Face.
-* Supported Model types are Chat, Text to Image, Audio Transcription, Text to Speech, and more on the way.
+* Supported Model types are Chat, Text to Image, Audio Transcription, Text to Speech, Moderation, and more on the way.
* Portable API across AI providers for all models. Both synchronous and stream API options are supported. Dropping down to access model specific features is also supported.
* Mapping of AI Model output to POJOs.
* Support for all major Vector Database providers such as Apache Cassandra, Azure Vector Search, Chroma, Milvus, MongoDB Atlas, Neo4j, Oracle, PostgreSQL/PGVector, PineCone, Qdrant, Redis, and Weaviate.