diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/images/spring-ai-nvidia-llm-api-1.jpg b/spring-ai-docs/src/main/antora/modules/ROOT/images/spring-ai-nvidia-llm-api-1.jpg
new file mode 100644
index 0000000000..b055220921
Binary files /dev/null and b/spring-ai-docs/src/main/antora/modules/ROOT/images/spring-ai-nvidia-llm-api-1.jpg differ
diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/images/spring-ai-nvidia-llm-api.jpg b/spring-ai-docs/src/main/antora/modules/ROOT/images/spring-ai-nvidia-llm-api.jpg
deleted file mode 100644
index cf2631b2b3..0000000000
Binary files a/spring-ai-docs/src/main/antora/modules/ROOT/images/spring-ai-nvidia-llm-api.jpg and /dev/null differ
diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/nvidia-chat.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/nvidia-chat.adoc
index fc1a33697c..56b135a72b 100644
--- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/nvidia-chat.adoc
+++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/nvidia-chat.adoc
@@ -5,9 +5,9 @@ https://docs.api.nvidia.com/nim/reference/llm-apis[NVIDIA LLM API] is a proxy AI
 
 Spring AI integrates with the NVIDIA LLM API by reusing the existing xref::api/chat/openai-chat.adoc[OpenAI] client. For this you need to set the base-url to `https://integrate.api.nvidia.com`, select one of the provided https://docs.api.nvidia.com/nim/reference/llm-apis#model[LLM models] and get an `api-key` for it.
 
-image::spring-ai-nvidia-llm-api.jpg[w=800,align="center"]
+image::spring-ai-nvidia-llm-api-1.jpg[w=800,align="center"]
 
-NOTE: NVIDIA LLM API requires the `max-token` parameter to be explicitly set or server error will be thrown.
+NOTE: NVIDIA LLM API requires the `max-tokens` parameter to be explicitly set or server error will be thrown.
 
 Check the https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/test/java/org/springframework/ai/openai/chat/proxy/NvidiaWithOpenAiChatModelIT.java[NvidiaWithOpenAiChatModelIT.java] tests for examples of using NVIDIA LLM API with Spring AI.
 
@@ -89,7 +89,7 @@ The prefix `spring.ai.openai.chat` is the property prefix that lets you configur
 | spring.ai.openai.chat.options.model | The link:https://docs.api.nvidia.com/nim/reference/llm-apis#models[NVIDIA LLM model] to use | -
 | spring.ai.openai.chat.options.temperature | The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify temperature and top_p for the same completions request as the interaction of these two settings is difficult to predict. | 0.8
 | spring.ai.openai.chat.options.frequencyPenalty | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | 0.0f
-| spring.ai.openai.chat.options.maxTokens | The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. | -
+| spring.ai.openai.chat.options.maxTokens | The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. | NOTE: NVIDIA LLM API requires the `max-tokens` parameter to be explicitly set or server error will be thrown.
 | spring.ai.openai.chat.options.n | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. | 1
 | spring.ai.openai.chat.options.presencePenalty | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | -
 | spring.ai.openai.chat.options.responseFormat | An object specifying the format that the model must output. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON.| -
diff --git a/vector-stores/spring-ai-mongodb-atlas-store/pom.xml b/vector-stores/spring-ai-mongodb-atlas-store/pom.xml
index bbef8b359d..3bd7a4b944 100644
--- a/vector-stores/spring-ai-mongodb-atlas-store/pom.xml
+++ b/vector-stores/spring-ai-mongodb-atlas-store/pom.xml
@@ -55,6 +55,13 @@
 			<scope>test</scope>
 		</dependency>
 
+		<dependency>
+			<groupId>org.springframework.ai</groupId>
+			<artifactId>spring-ai-test</artifactId>
+			<version>${parent.version}</version>
+			<scope>test</scope>
+		</dependency>
+
 		<dependency>
 			<groupId>io.micrometer</groupId>
 			<artifactId>micrometer-observation-test</artifactId>