diff --git a/.mock/definition/api.yml b/.mock/definition/api.yml new file mode 100644 index 00000000..4ae27d7b --- /dev/null +++ b/.mock/definition/api.yml @@ -0,0 +1,18 @@ +name: api +error-discrimination: + strategy: status-code +default-environment: prod +default-url: Base +environments: + prod: + urls: + Base: https://api.hume.ai/ + evi: wss://api.hume.ai/v0/evi + tts: wss://api.hume.ai/v0/tts + stream: wss://api.hume.ai/v0/stream +auth: HeaderAuthScheme +auth-schemes: + HeaderAuthScheme: + header: X-Hume-Api-Key + type: optional + name: apiKey diff --git a/.mock/definition/empathic-voice/__package__.yml b/.mock/definition/empathic-voice/__package__.yml new file mode 100644 index 00000000..88526340 --- /dev/null +++ b/.mock/definition/empathic-voice/__package__.yml @@ -0,0 +1,3251 @@ +errors: + UnprocessableEntityError: + status-code: 422 + type: HTTPValidationError + docs: Validation Error + examples: + - value: {} + BadRequestError: + status-code: 400 + type: ErrorResponse + docs: Bad Request + examples: + - value: {} +types: + AssistantEnd: + docs: When provided, the output is an assistant end message. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + type: + type: literal<"assistant_end"> + docs: >- + The type of message sent through the socket; for an Assistant End + message, this must be `assistant_end`. + + + This message indicates the conclusion of the assistant's response, + signaling that the assistant has finished speaking for the current + conversational turn. + source: + openapi: evi-asyncapi.json + AssistantInput: + docs: When provided, the input is spoken by EVI. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. 
+ text: + type: string + docs: >- + Assistant text to synthesize into spoken audio and insert into the + conversation. + + + EVI uses this text to generate spoken audio using our proprietary + expressive text-to-speech model. Our model adds appropriate emotional + inflections and tones to the text based on the user's expressions and + the context of the conversation. The synthesized audio is streamed + back to the user as an [Assistant + Message](/reference/speech-to-speech-evi/chat#receive.AssistantMessage). + type: + type: literal<"assistant_input"> + docs: >- + The type of message sent through the socket; must be `assistant_input` + for our server to correctly identify and process it as an Assistant + Input message. + source: + openapi: evi-openapi.json + AssistantMessage: + docs: When provided, the output is an assistant message. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + from_text: + type: boolean + docs: >- + Indicates if this message was inserted into the conversation as text + from an [Assistant Input + message](/reference/speech-to-speech-evi/chat#send.AssistantInput.text). + id: + type: optional + docs: >- + ID of the assistant message. Allows the Assistant Message to be + tracked and referenced. + language: + type: optional + docs: Detected language of the message text. + message: + type: ChatMessage + docs: Transcript of the message. + models: + type: Inference + docs: Inference model results. + type: + type: literal<"assistant_message"> + docs: >- + The type of message sent through the socket; for an Assistant Message, + this must be `assistant_message`. + + + This message contains both a transcript of the assistant's response + and the expression measurement predictions of the assistant's audio + output. 
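The AssistantInput type defined above corresponds to a small JSON payload sent over the chat WebSocket. A minimal sketch of constructing one, not part of the definition itself; the helper name and use of the standard `json` module are my own, while the `type`, `text`, and `custom_session_id` fields come from the schema:

```python
import json

def build_assistant_input(text, custom_session_id=None):
    """Build an `assistant_input` payload per the AssistantInput schema."""
    msg = {"type": "assistant_input", "text": text}
    if custom_session_id is not None:  # optional correlation ID
        msg["custom_session_id"] = custom_session_id
    return json.dumps(msg)

payload = build_assistant_input("Hello! How can I help?", "session-123")
```

The serialized string would then be sent as a text frame on an open EVI chat socket.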
+ source: + openapi: evi-asyncapi.json + AssistantProsody: + docs: When provided, the output is an Assistant Prosody message. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + id: + type: optional + docs: Unique identifier for the segment. + models: + type: Inference + docs: Inference model results. + type: + type: literal<"assistant_prosody"> + docs: >- + The type of message sent through the socket; for an Assistant Prosody + message, this must be `assistant_prosody`. + + + This message contains the expression measurement predictions of the + assistant's audio output. + source: + openapi: evi-openapi.json + AudioConfiguration: + properties: + channels: + type: integer + docs: Number of audio channels. + codec: + type: optional + docs: Optional codec information. + encoding: + type: Encoding + docs: Encoding format of the audio input, such as `linear16`. + sample_rate: + type: integer + docs: >- + Audio sample rate. Number of samples per second in the audio input, + measured in Hertz. + source: + openapi: evi-openapi.json + AudioInput: + docs: When provided, the input is audio. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + data: + type: string + docs: >- + Base64 encoded audio input to insert into the conversation. + + + The content of an Audio Input message is treated as the user's speech + to EVI and must be streamed continuously. Pre-recorded audio files are + not supported. + + + For optimal transcription quality, the audio data should be + transmitted in small chunks. + + + Hume recommends streaming audio with a buffer window of 20 + milliseconds (ms), or 100 milliseconds (ms) for web applications.
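The recommended 20 ms buffer window above translates to a fixed byte count per chunk once the audio format is known. A sketch under assumed values (16 kHz mono linear16, so 16000 × 2 bytes × 0.02 s = 640 bytes per chunk); the helper name and constants are illustrative, not part of the definition:

```python
import base64

SAMPLE_RATE = 16000   # assumed session sample rate (Hz)
BYTES_PER_SAMPLE = 2  # linear16 is 16-bit PCM
CHUNK_MS = 20         # Hume's recommended buffer window

def audio_input_chunks(pcm: bytes):
    """Yield one `audio_input` message per 20 ms slice of raw PCM."""
    chunk_size = SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_MS // 1000  # 640 bytes
    for i in range(0, len(pcm), chunk_size):
        yield {
            "type": "audio_input",
            "data": base64.b64encode(pcm[i:i + chunk_size]).decode("ascii"),
        }

msgs = list(audio_input_chunks(b"\x00" * 1280))  # two 20 ms chunks of silence
```

Each yielded dict would be JSON-serialized and sent continuously, since pre-recorded file uploads are not supported.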
+ type: + type: literal<"audio_input"> + docs: >- + The type of message sent through the socket; must be `audio_input` for + our server to correctly identify and process it as an Audio Input + message. + + + This message is used for sending audio input data to EVI for + processing and expression measurement. Audio data should be sent as a + continuous stream, encoded in Base64. + source: + openapi: evi-openapi.json + AudioOutput: + docs: When provided, the output is audio. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + data: + type: string + docs: >- + Base64 encoded audio output. This encoded audio is transmitted to the + client, where it can be decoded and played back as part of the user + interaction. + id: + type: string + docs: >- + ID of the audio output. Allows the Audio Output message to be tracked + and referenced. + index: + type: integer + docs: Index of the chunk of audio relative to the whole audio segment. + type: + type: literal<"audio_output"> + docs: >- + The type of message sent through the socket; for an Audio Output + message, this must be `audio_output`. + source: + openapi: evi-asyncapi.json + BuiltInTool: + enum: + - web_search + - hang_up + source: + openapi: evi-openapi.json + BuiltinToolConfig: + properties: + fallback_content: + type: optional + docs: >- + Optional text passed to the supplemental LLM if the tool call fails. + The LLM then uses this text to generate a response back to the user, + ensuring continuity in the conversation. + name: + type: BuiltInTool + source: + openapi: evi-openapi.json + ChatMessageToolResult: + discriminated: false + docs: Function call response from client.
+ union: + - type: ToolResponseMessage + - type: ToolErrorMessage + source: + openapi: evi-asyncapi.json + inline: true + ChatMessage: + properties: + content: + type: optional + docs: Transcript of the message. + role: + type: Role + docs: Role of who is providing the message. + tool_call: + type: optional + docs: Function call name and arguments. + tool_result: + type: optional + docs: Function call response from client. + source: + openapi: evi-asyncapi.json + ChatMetadata: + docs: When provided, the output is a chat metadata message. + properties: + chat_group_id: + type: string + docs: >- + ID of the Chat Group. + + + Used to resume a Chat when passed in the + [resumed_chat_group_id](/reference/speech-to-speech-evi/chat#request.query.resumed_chat_group_id) + query parameter of a subsequent connection request. This allows EVI to + continue the conversation from where it left off within the Chat + Group. + + + Learn more about [supporting chat + resumability](/docs/speech-to-speech-evi/faq#does-evi-support-chat-resumability) + from the EVI FAQ. + chat_id: + type: string + docs: >- + ID of the Chat session. Allows the Chat session to be tracked and + referenced. + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + request_id: + type: optional + docs: ID of the initiating request. + type: + type: literal<"chat_metadata"> + docs: >- + The type of message sent through the socket; for a Chat Metadata + message, this must be `chat_metadata`. + + + The Chat Metadata message is the first message you receive after + establishing a connection with EVI and contains important identifiers + for the current Chat session. + source: + openapi: evi-asyncapi.json + Context: + properties: + text: + type: string + docs: >- + The context to be injected into the conversation. 
Helps inform the + LLM's response by providing relevant information about the ongoing + conversation. + + + This text will be appended to the end of + [user_messages](/reference/speech-to-speech-evi/chat#receive.UserMessage.message.content) + based on the chosen persistence level. For example, if you want to + remind EVI of its role as a helpful weather assistant, the context you + insert will be appended to the end of user messages as `{Context: You + are a helpful weather assistant}`. + type: + type: optional + docs: >- + The persistence level of the injected context. Specifies how long the + injected context will remain active in the session. + + + - **Temporary**: Context that is only applied to the following + assistant response. + + + - **Persistent**: Context that is applied to all subsequent assistant + responses for the remainder of the Chat. + source: + openapi: evi-openapi.json + ContextType: + enum: + - persistent + - temporary + source: + openapi: evi-openapi.json + EmotionScores: + properties: + Admiration: double + Adoration: double + Aesthetic Appreciation: double + Amusement: double + Anger: double + Anxiety: double + Awe: double + Awkwardness: double + Boredom: double + Calmness: double + Concentration: double + Confusion: double + Contemplation: double + Contempt: double + Contentment: double + Craving: double + Desire: double + Determination: double + Disappointment: double + Disgust: double + Distress: double + Doubt: double + Ecstasy: double + Embarrassment: double + Empathic Pain: double + Entrancement: double + Envy: double + Excitement: double + Fear: double + Guilt: double + Horror: double + Interest: double + Joy: double + Love: double + Nostalgia: double + Pain: double + Pride: double + Realization: double + Relief: double + Romance: double + Sadness: double + Satisfaction: double + Shame: double + Surprise (negative): double + Surprise (positive): double + Sympathy: double + Tiredness: double + Triumph: double + source: + openapi: 
evi-openapi.json + Encoding: + type: literal<"linear16"> + WebSocketError: + docs: When provided, the output is an error message. + properties: + code: + type: string + docs: Error code. Identifies the type of error encountered. + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + message: + type: string + docs: Detailed description of the error. + request_id: + type: optional + docs: ID of the initiating request. + slug: + type: string + docs: >- + Short, human-readable identifier and description for the error. See a + complete list of error slugs on the [Errors + page](/docs/resources/errors). + type: + type: literal<"error"> + docs: >- + The type of message sent through the socket; for a Web Socket Error + message, this must be `error`. + + + This message indicates a disruption in the WebSocket connection, such + as an unexpected disconnection, protocol error, or data transmission + issue. + source: + openapi: evi-asyncapi.json + ErrorLevel: + type: literal<"warn"> + Inference: + properties: + prosody: + type: optional + docs: >- + Prosody model inference results. + + + EVI uses the prosody model to measure 48 emotions related to speech + and vocal characteristics within a given expression. + source: + openapi: evi-openapi.json + MillisecondInterval: + properties: + begin: + type: integer + docs: Start time of the interval in milliseconds. + end: + type: integer + docs: End time of the interval in milliseconds. + source: + openapi: evi-openapi.json + PauseAssistantMessage: + docs: >- + Pause responses from EVI. Chat history is still saved and sent after + resuming. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. 
+ type: + type: literal<"pause_assistant_message"> + docs: >- + The type of message sent through the socket; must be + `pause_assistant_message` for our server to correctly identify and + process it as a Pause Assistant message. + + + Once this message is sent, EVI will not respond until a [Resume + Assistant + message](/reference/speech-to-speech-evi/chat#send.ResumeAssistantMessage) + is sent. While paused, transcriptions of your audio inputs will still + be recorded. + source: + openapi: evi-openapi.json + ProsodyInference: + properties: + scores: + type: EmotionScores + docs: >- + The confidence scores for 48 emotions within the detected expression + of an audio sample. + + + Scores typically range from 0 to 1, with higher values indicating a + stronger confidence level in the measured attribute. + + + See our guide on [interpreting expression measurement + results](/docs/expression-measurement/faq#how-do-i-interpret-my-results) + to learn more. + source: + openapi: evi-openapi.json + ResumeAssistantMessage: + docs: >- + Resume responses from EVI. Chat history collected while paused will be + sent upon resuming. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + type: + type: literal<"resume_assistant_message"> + docs: >- + The type of message sent through the socket; must be + `resume_assistant_message` for our server to correctly identify and + process it as a Resume Assistant message. + + + Upon resuming, if any audio input was sent during the pause, EVI will + retain context from all messages sent but only respond to the last + user message. (For example, if you ask EVI two questions while paused + and then send a `resume_assistant_message`, EVI will respond to the + second question, having added the first question to its conversation + context.)
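The pause/resume pair described above requires no fields beyond `type` and the optional `custom_session_id`. A hedged sketch of message builders (helper names are mine; only the field names come from the schema):

```python
def pause_message(custom_session_id=None):
    """Build a Pause Assistant message; EVI stops responding until resumed."""
    msg = {"type": "pause_assistant_message"}
    if custom_session_id is not None:
        msg["custom_session_id"] = custom_session_id
    return msg

def resume_message(custom_session_id=None):
    """Build a Resume Assistant message; EVI then answers only the last
    user message received while paused."""
    msg = {"type": "resume_assistant_message"}
    if custom_session_id is not None:
        msg["custom_session_id"] = custom_session_id
    return msg
```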
+ source: + openapi: evi-openapi.json + Role: + enum: + - assistant + - system + - user + - all + - tool + - context + source: + openapi: evi-openapi.json + SessionSettingsVariablesValue: + discriminated: false + union: + - string + - double + - boolean + source: + openapi: evi-openapi.json + inline: true + SessionSettings: + docs: Settings for this chat session. + properties: + audio: + type: optional + docs: >- + Configuration details for the audio input used during the session. + Ensures the audio is being correctly set up for processing. + + + This optional field is only required when the audio input is encoded + in PCM Linear 16 (16-bit, little-endian, signed PCM WAV data). For + detailed instructions on how to configure session settings for PCM + Linear 16 audio, please refer to the [Session Settings + guide](/docs/speech-to-speech-evi/configuration/session-settings). + builtin_tools: + type: optional> + docs: >- + List of built-in tools to enable for the session. + + + Tools are resources used by EVI to perform various tasks, such as + searching the web or calling external APIs. Built-in tools, like web + search, are natively integrated, while user-defined tools are created + and invoked by the user. To learn more, see our [Tool Use + Guide](/docs/speech-to-speech-evi/features/tool-use). + + + Currently, the only built-in tool Hume provides is **Web Search**. + When enabled, Web Search equips EVI with the ability to search the web + for up-to-date information. + context: + type: optional + docs: >- + Field for injecting additional context into the conversation, which is + appended to the end of user messages for the session. + + + When included in a Session Settings message, the provided context can + be used to remind the LLM of its role in every user message, prevent + it from forgetting important details, or add new relevant information + to the conversation. + + + Set to `null` to clear injected context. 
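The `context` field described above, together with the Context and ContextType definitions earlier, can be exercised with plain dicts. A sketch; the helper names are my own, and the weather-assistant text is only an example:

```python
def session_settings_with_context(text, persistence="temporary"):
    """Inject context via a Session Settings message (ContextType is
    `temporary` or `persistent`)."""
    if persistence not in ("temporary", "persistent"):
        raise ValueError("persistence must be 'temporary' or 'persistent'")
    return {"type": "session_settings",
            "context": {"text": text, "type": persistence}}

def clear_context():
    """Setting `context` to null clears previously injected context."""
    return {"type": "session_settings", "context": None}
```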
+ custom_session_id: + type: optional + docs: >- + Unique identifier for the session. Used to manage conversational + state, correlate frontend and backend data, and persist conversations + across EVI sessions. + + + If included, the response sent from Hume to your backend will include + this ID. This allows you to correlate frontend users with their + incoming messages. + + + It is recommended to pass a `custom_session_id` if you are using a + Custom Language Model. Please see our guide to [using a custom + language + model](/docs/speech-to-speech-evi/guides/custom-language-model) with + EVI to learn more. + language_model_api_key: + type: optional + docs: >- + Third party API key for the supplemental language model. + + + When provided, EVI will use this key instead of Hume's API key for the + supplemental LLM. This allows you to bypass rate limits and utilize + your own API key as needed. + metadata: optional> + system_prompt: + type: optional + docs: >- + Instructions used to shape EVI's behavior, responses, and style for + the session. + + + When included in a Session Settings message, the provided Prompt + overrides the existing one specified in the EVI configuration. If no + Prompt was defined in the configuration, this Prompt will be the one + used for the session. + + + You can use the Prompt to define a specific goal or role for EVI, + specifying how it should act or what it should focus on during the + conversation. For example, EVI can be instructed to act as a customer + support representative, a fitness coach, or a travel advisor, each + with its own set of behaviors and response styles. + + + For help writing a system prompt, see our [Prompting + Guide](/docs/speech-to-speech-evi/guides/prompting). + tools: + type: optional> + docs: >- + List of user-defined tools to enable for the session. + + + Tools are resources used by EVI to perform various tasks, such as + searching the web or calling external APIs. 
Built-in tools, like web + search, are natively integrated, while user-defined tools are created + and invoked by the user. To learn more, see our [Tool Use + Guide](/docs/speech-to-speech-evi/features/tool-use). + type: + type: literal<"session_settings"> + docs: >- + The type of message sent through the socket; must be + `session_settings` for our server to correctly identify and process it + as a Session Settings message. + + + Session settings are temporary and apply only to the current Chat + session. These settings can be adjusted dynamically based on the + requirements of each session to ensure optimal performance and user + experience. + + + For more information, please refer to the [Session Settings + guide](/docs/speech-to-speech-evi/configuration/session-settings). + variables: + type: optional> + docs: >- + This field allows you to assign values to dynamic variables referenced + in your system prompt. + + + Each key represents the variable name, and the corresponding value is + the specific content you wish to assign to that variable within the + session. While the values for variables can be strings, numbers, or + booleans, the value will ultimately be converted to a string when + injected into your system prompt. + + + Using this field, you can personalize responses based on + session-specific details. For more guidance, see our [guide on using + dynamic + variables](/docs/speech-to-speech-evi/features/dynamic-variables). + voice_id: + type: optional + docs: >- + Allows you to change the voice during an active chat. Updating the + voice does not affect chat context or conversation history. + source: + openapi: evi-openapi.json + Tool: + properties: + description: + type: optional + docs: >- + An optional description of what the tool does, used by the + supplemental LLM to choose when and how to call the function. + fallback_content: + type: optional + docs: >- + Optional text passed to the supplemental LLM if the tool call fails. 
+ The LLM then uses this text to generate a response back to the user, + ensuring continuity in the conversation. + name: + type: string + docs: Name of the user-defined tool to be enabled. + parameters: + type: string + docs: >- + Parameters of the tool, as a stringified JSON schema. + + + These parameters define the inputs needed for the tool's execution, + including the expected data type and description for each input field. + Structured as a JSON schema, this format ensures the tool receives + data in the expected format. + type: + type: ToolType + docs: Type of tool. Set to `function` for user-defined tools. + source: + openapi: evi-openapi.json + ToolCallMessage: + docs: When provided, the output is a tool call. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + name: + type: string + docs: Name of the tool called. + parameters: + type: string + docs: >- + Parameters of the tool. + + + These parameters define the inputs needed for the tool's execution, + including the expected data type and description for each input field. + Structured as a stringified JSON schema, this format ensures the tool + receives data in the expected format. + response_required: + type: boolean + docs: >- + Indicates whether a response to the tool call is required from the + developer, either in the form of a [Tool Response + message](/reference/speech-to-speech-evi/chat#send.ToolResponseMessage) + or a [Tool Error + message](/reference/speech-to-speech-evi/chat#send.ToolErrorMessage). + tool_call_id: + type: string + docs: >- + The unique identifier for a specific tool call instance. + + + This ID is used to track the request and response of a particular tool + invocation, ensuring that the correct response is linked to the + appropriate request. + tool_type: + type: optional + docs: >- + Type of tool called.
Either `builtin` for natively implemented tools, + like web search, or `function` for user-defined tools. + type: + type: literal<"tool_call"> + docs: >- + The type of message sent through the socket; for a Tool Call message, + this must be `tool_call`. + + + This message indicates that the supplemental LLM has detected a need + to invoke the specified tool. + source: + openapi: evi-openapi.json + ToolErrorMessage: + docs: When provided, the output is a function call error. + properties: + code: + type: optional + docs: Error code. Identifies the type of error encountered. + content: + type: optional + docs: >- + Optional text passed to the supplemental LLM in place of the tool call + result. The LLM then uses this text to generate a response back to the + user, ensuring continuity in the conversation if the tool errors. + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + error: + type: string + docs: Error message from the tool call, not exposed to the LLM or user. + level: + type: optional + docs: >- + Indicates the severity of an error; for a Tool Error message, this + must be `warn` to signal an unexpected event. + tool_call_id: + type: string + docs: >- + The unique identifier for a specific tool call instance. + + + This ID is used to track the request and response of a particular tool + invocation, ensuring that the Tool Error message is linked to the + appropriate tool call request. The specified `tool_call_id` must match + the one received in the [Tool Call + message](/reference/speech-to-speech-evi/chat#receive.ToolCallMessage). + tool_type: + type: optional + docs: >- + Type of tool called. Either `builtin` for natively implemented tools, + like web search, or `function` for user-defined tools. 
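The tool_call / tool_response / tool_error exchange described by these types can be sketched as a small dispatcher. Assumptions: the incoming `parameters` string carries the stringified JSON arguments for the call, the tool registry and fallback `content` text are illustrative, and error handling is reduced to a single `except`:

```python
import json

def handle_tool_call(tool_call, tool_impls):
    """Given a `tool_call` message, return a `tool_response` on success
    or a `tool_error` on failure, echoing the same `tool_call_id`."""
    tool_call_id = tool_call["tool_call_id"]
    try:
        args = json.loads(tool_call["parameters"])
        result = tool_impls[tool_call["name"]](**args)
        return {
            "type": "tool_response",
            "tool_call_id": tool_call_id,
            "content": json.dumps(result),
        }
    except Exception as exc:
        return {
            "type": "tool_error",
            "tool_call_id": tool_call_id,
            "error": str(exc),  # not exposed to the LLM or user
            "content": "The tool failed; apologize and offer to retry.",
        }
```

The `tool_call_id` must match the one received, so the response can be linked back to the originating request.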
+ type: + type: literal<"tool_error"> + docs: >- + The type of message sent through the socket; for a Tool Error message, + this must be `tool_error`. + + + Upon receiving a [Tool Call + message](/reference/speech-to-speech-evi/chat#receive.ToolCallMessage) + and failing to invoke the function, this message is sent to notify EVI + of the tool's failure. + source: + openapi: evi-openapi.json + ToolResponseMessage: + docs: When provided, the output is a function call response. + properties: + content: + type: string + docs: >- + Return value of the tool call. Contains the output generated by the + tool to pass back to EVI. + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + tool_call_id: + type: string + docs: >- + The unique identifier for a specific tool call instance. + + + This ID is used to track the request and response of a particular tool + invocation, ensuring that the correct response is linked to the + appropriate request. The specified `tool_call_id` must match the one + received in the [Tool Call + message](/reference/speech-to-speech-evi/chat#receive.ToolCallMessage.tool_call_id). + tool_name: + type: optional + docs: >- + Name of the tool. + + + Include this optional field to help the supplemental LLM identify + which tool generated the response. The specified `tool_name` must + match the one received in the [Tool Call + message](/reference/speech-to-speech-evi/chat#receive.ToolCallMessage). + tool_type: + type: optional + docs: >- + Type of tool called. Either `builtin` for natively implemented tools, + like web search, or `function` for user-defined tools. + type: + type: literal<"tool_response"> + docs: >- + The type of message sent through the socket; for a Tool Response + message, this must be `tool_response`. 
+ + + Upon receiving a [Tool Call + message](/reference/speech-to-speech-evi/chat#receive.ToolCallMessage) + and successfully invoking the function, this message is sent to convey + the result of the function call back to EVI. + source: + openapi: evi-openapi.json + ToolType: + enum: + - builtin + - function + source: + openapi: evi-openapi.json + UserInput: + docs: >- + User text to insert into the conversation. Text sent through a User Input + message is treated as the user's speech to EVI. EVI processes this input + and provides a corresponding response. + + + Expression measurement results are not available for User Input messages, + as the prosody model relies on audio input and cannot process text alone. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + text: + type: string + docs: >- + User text to insert into the conversation. Text sent through a User + Input message is treated as the user's speech to EVI. EVI processes + this input and provides a corresponding response. + + + Expression measurement results are not available for User Input + messages, as the prosody model relies on audio input and cannot + process text alone. + type: + type: literal<"user_input"> + docs: >- + The type of message sent through the socket; must be `user_input` for + our server to correctly identify and process it as a User Input + message. + source: + openapi: evi-openapi.json + UserInterruption: + docs: When provided, the output is an interruption. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + time: + type: integer + docs: Unix timestamp of the detected user interruption. 
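A client reacting to the UserInterruption message above is expected to stop playback and clear its audio queue. A minimal client-side dispatch sketch (the function name, queue representation, and return labels are all my own):

```python
def handle_evi_event(msg, playback_queue):
    """On `user_interruption`, drop queued audio; on `audio_output`, queue it."""
    kind = msg.get("type")
    if kind == "audio_output":
        playback_queue.append(msg["data"])  # queue Base64 audio for playback
        return "queued"
    if kind == "user_interruption":
        playback_queue.clear()              # stop playback, drop unplayed audio
        return "cleared"
    return "ignored"
```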
+ type: + type: literal<"user_interruption"> + docs: >- + The type of message sent through the socket; for a User Interruption + message, this must be `user_interruption`. + + + This message indicates the user has interrupted the assistant's + response. EVI detects the interruption in real-time and sends this + message to signal the interruption event. This message allows the + system to stop the current audio playback, clear the audio queue, and + prepare to handle new user input. + source: + openapi: evi-asyncapi.json + UserMessage: + docs: When provided, the output is a user message. + properties: + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + from_text: + type: boolean + docs: >- + Indicates if this message was inserted into the conversation as text + from a [User + Input](/reference/speech-to-speech-evi/chat#send.UserInput.text) + message. + interim: + type: boolean + docs: >- + Indicates whether this `UserMessage` contains an interim (unfinalized) + transcript. + + + - `true`: the transcript is provisional; words may be repeated or + refined in subsequent `UserMessage` responses as additional audio is + processed. + + - `false`: the transcript is final and complete. + + + Interim transcripts are only sent when the + [`verbose_transcription`](/reference/speech-to-speech-evi/chat#request.query.verbose_transcription) + query parameter is set to `true` in the initial handshake. + language: + type: optional + docs: Detected language of the message text. + message: + type: ChatMessage + docs: Transcript of the message. + models: + type: Inference + docs: Inference model results. + time: + type: MillisecondInterval + docs: Start and End time of user message. + type: + type: literal<"user_message"> + docs: >- + The type of message sent through the socket; for a User Message, this + must be `user_message`. 
+ + + This message contains both a transcript of the user's input and the + expression measurement predictions if the input was sent as an [Audio + Input message](/reference/speech-to-speech-evi/chat#send.AudioInput). + Expression measurement predictions are not provided for a [User Input + message](/reference/speech-to-speech-evi/chat#send.UserInput), as the + prosody model relies on audio input and cannot process text alone. + source: + openapi: evi-asyncapi.json + SubscribeEvent: + discriminated: false + union: + - type: AssistantEnd + - type: AssistantMessage + - type: AssistantProsody + - type: AudioOutput + - type: ChatMetadata + - type: WebSocketError + - type: UserInterruption + - type: UserMessage + - type: ToolCallMessage + - type: ToolResponseMessage + - type: ToolErrorMessage + source: + openapi: evi-asyncapi.json + JsonMessage: + discriminated: false + union: + - type: AssistantEnd + - type: AssistantMessage + - type: AssistantProsody + - type: ChatMetadata + - type: WebSocketError + - type: UserInterruption + - type: UserMessage + - type: ToolCallMessage + - type: ToolResponseMessage + - type: ToolErrorMessage + source: + openapi: evi-asyncapi.json + ConnectSessionSettingsAudio: + docs: >- + Configuration details for the audio input used during the session. Ensures + the audio is being correctly set up for processing. + + + This optional field is only required when the audio input is encoded in + PCM Linear 16 (16-bit, little-endian, signed PCM WAV data). For detailed + instructions on how to configure session settings for PCM Linear 16 audio, + please refer to the [Session Settings + section](/docs/empathic-voice-interface-evi/configuration#session-settings) + on the EVI Configuration page. + properties: + channels: + type: optional + docs: Sets number of audio channels for audio input. + encoding: + type: optional + docs: Sets encoding format of the audio input, such as `linear16`. 
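The PCM Linear 16 configuration these audio settings describe can be declared in a Session Settings message using the AudioConfiguration fields defined earlier. A sketch with assumed default values (16 kHz mono); the helper name is mine:

```python
def linear16_audio_settings(sample_rate=16000, channels=1):
    """Session Settings payload declaring linear16 PCM input."""
    return {
        "type": "session_settings",
        "audio": {
            "encoding": "linear16",
            "sample_rate": sample_rate,  # samples per second, in Hertz
            "channels": channels,
        },
    }
```

This message would typically be sent once, immediately after the socket opens, before any `audio_input` messages.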
+ sample_rate: + type: optional + docs: >- + Sets the sample rate for audio input. (Number of samples per second in + the audio input, measured in Hertz.) + source: + openapi: evi-asyncapi.json + inline: true + ConnectSessionSettingsContext: + docs: >- + Allows developers to inject additional context into the conversation, + which is appended to the end of user messages for the session. + + + When included in a Session Settings message, the provided context can be + used to remind the LLM of its role in every user message, prevent it from + forgetting important details, or add new relevant information to the + conversation. + + + Set to `null` to disable context injection. + properties: + text: + type: optional + docs: >- + The context to be injected into the conversation. Helps inform the + LLM's response by providing relevant information about the ongoing + conversation. + + + This text will be appended to the end of + [user_messages](/reference/speech-to-speech-evi/chat#receive.UserMessage.message.content) + based on the chosen persistence level. For example, if you want to + remind EVI of its role as a helpful weather assistant, the context you + insert will be appended to the end of user messages as `{Context: You + are a helpful weather assistant}`. + type: + type: optional + docs: >- + The persistence level of the injected context. Specifies how long the + injected context will remain active in the session. + + + - **Temporary**: Context that is only applied to the following + assistant response. + + + - **Persistent**: Context that is applied to all subsequent assistant + responses for the remainder of the Chat. 
+ source: + openapi: evi-asyncapi.json + inline: true + ConnectSessionSettingsVariablesValue: + discriminated: false + union: + - string + - double + - boolean + source: + openapi: evi-asyncapi.json + inline: true + ConnectSessionSettings: + properties: + audio: + type: optional + docs: >- + Configuration details for the audio input used during the session. + Ensures the audio is being correctly set up for processing. + + + This optional field is only required when the audio input is encoded + in PCM Linear 16 (16-bit, little-endian, signed PCM WAV data). For + detailed instructions on how to configure session settings for PCM + Linear 16 audio, please refer to the [Session Settings + section](/docs/empathic-voice-interface-evi/configuration#session-settings) + on the EVI Configuration page. + context: + type: optional + docs: >- + Allows developers to inject additional context into the conversation, + which is appended to the end of user messages for the session. + + + When included in a Session Settings message, the provided context can + be used to remind the LLM of its role in every user message, prevent + it from forgetting important details, or add new relevant information + to the conversation. + + + Set to `null` to disable context injection. + custom_session_id: + type: optional + docs: >- + Used to manage conversational state, correlate frontend and backend + data, and persist conversations across EVI sessions. + event_limit: + type: optional + docs: >- + The maximum number of chat events to return from chat history. By + default, the system returns up to 300 events (100 events per page × 3 + pages). Set this parameter to a smaller value to limit the number of + events returned. + language_model_api_key: + type: optional + docs: >- + Third party API key for the supplemental language model. + + + When provided, EVI will use this key instead of Hume's API key for the + supplemental LLM. 
This allows you to bypass rate limits and utilize + your own API key as needed. + system_prompt: + type: optional + docs: >- + Instructions used to shape EVI's behavior, responses, and style for + the session. + + + When included in a Session Settings message, the provided Prompt + overrides the existing one specified in the EVI configuration. If no + Prompt was defined in the configuration, this Prompt will be the one + used for the session. + + + You can use the Prompt to define a specific goal or role for EVI, + specifying how it should act or what it should focus on during the + conversation. For example, EVI can be instructed to act as a customer + support representative, a fitness coach, or a travel advisor, each + with its own set of behaviors and response styles. + + + For help writing a system prompt, see our [Prompting + Guide](/docs/speech-to-speech-evi/guides/prompting). + variables: + type: optional> + docs: >- + This field allows you to assign values to dynamic variables referenced + in your system prompt. + + + Each key represents the variable name, and the corresponding value is + the specific content you wish to assign to that variable within the + session. While the values for variables can be strings, numbers, or + booleans, the value will ultimately be converted to a string when + injected into your system prompt. + + + Using this field, you can personalize responses based on + session-specific details. For more guidance, see our [guide on using + dynamic + variables](/docs/speech-to-speech-evi/features/dynamic-variables). + voice_id: + type: optional + docs: >- + The name or ID of the voice from the `Voice Library` to be used as the + speaker for this EVI session. This will override the speaker set in + the selected configuration. 
+ source: + openapi: evi-asyncapi.json + ControlPlanePublishEvent: + discriminated: false + union: + - type: SessionSettings + - type: UserInput + - type: AssistantInput + - type: ToolResponseMessage + - type: ToolErrorMessage + - type: PauseAssistantMessage + - type: ResumeAssistantMessage + source: + openapi: evi-openapi.json + HTTPValidationError: + properties: + detail: + type: optional> + source: + openapi: evi-openapi.json + LanguageModelType: + enum: + - value: claude-3-7-sonnet-latest + name: Claude37SonnetLatest + - value: claude-3-5-sonnet-latest + name: Claude35SonnetLatest + - value: claude-3-5-haiku-latest + name: Claude35HaikuLatest + - value: claude-3-5-sonnet-20240620 + name: Claude35Sonnet20240620 + - value: claude-3-opus-20240229 + name: Claude3Opus20240229 + - value: claude-3-sonnet-20240229 + name: Claude3Sonnet20240229 + - value: claude-3-haiku-20240307 + name: Claude3Haiku20240307 + - value: claude-sonnet-4-20250514 + name: ClaudeSonnet420250514 + - value: claude-sonnet-4-5-20250929 + name: ClaudeSonnet4520250929 + - value: claude-haiku-4-5-20251001 + name: ClaudeHaiku4520251001 + - value: us.anthropic.claude-3-5-haiku-20241022-v1:0 + name: UsAnthropicClaude35Haiku20241022V10 + - value: us.anthropic.claude-3-5-sonnet-20240620-v1:0 + name: UsAnthropicClaude35Sonnet20240620V10 + - value: us.anthropic.claude-3-haiku-20240307-v1:0 + name: UsAnthropicClaude3Haiku20240307V10 + - value: gpt-oss-120b + name: GptOss120B + - value: qwen-3-235b-a22b + name: Qwen3235BA22B + - value: qwen-3-235b-a22b-instruct-2507 + name: Qwen3235BA22BInstruct2507 + - value: qwen-3-235b-a22b-thinking-2507 + name: Qwen3235BA22BThinking2507 + - value: gemini-1.5-pro + name: Gemini15Pro + - value: gemini-1.5-flash + name: Gemini15Flash + - value: gemini-1.5-pro-002 + name: Gemini15Pro002 + - value: gemini-1.5-flash-002 + name: Gemini15Flash002 + - value: gemini-2.0-flash + name: Gemini20Flash + - value: gemini-2.5-flash + name: Gemini25Flash + - value: 
gemini-2.5-flash-preview-04-17 + name: Gemini25FlashPreview0417 + - value: gpt-4-turbo + name: Gpt4Turbo + - value: gpt-4-turbo-preview + name: Gpt4TurboPreview + - value: gpt-3.5-turbo-0125 + name: Gpt35Turbo0125 + - value: gpt-3.5-turbo + name: Gpt35Turbo + - value: gpt-4o + name: Gpt4O + - value: gpt-4o-mini + name: Gpt4OMini + - value: gpt-4.1 + name: Gpt41 + - value: gpt-5 + name: Gpt5 + - value: gpt-5-mini + name: Gpt5Mini + - value: gpt-5-nano + name: Gpt5Nano + - value: gpt-4o-priority + name: Gpt4OPriority + - value: gpt-4o-mini-priority + name: Gpt4OMiniPriority + - value: gpt-4.1-priority + name: Gpt41Priority + - value: gpt-5-priority + name: Gpt5Priority + - value: gpt-5-mini-priority + name: Gpt5MiniPriority + - value: gpt-5-nano-priority + name: Gpt5NanoPriority + - value: gemma-7b-it + name: Gemma7BIt + - value: llama3-8b-8192 + name: Llama38B8192 + - value: llama3-70b-8192 + name: Llama370B8192 + - value: llama-3.1-70b-versatile + name: Llama3170BVersatile + - value: llama-3.3-70b-versatile + name: Llama3370BVersatile + - value: llama-3.1-8b-instant + name: Llama318BInstant + - value: moonshotai/kimi-k2-instruct + name: MoonshotaiKimiK2Instruct + - value: accounts/fireworks/models/mixtral-8x7b-instruct + name: AccountsFireworksModelsMixtral8X7BInstruct + - value: accounts/fireworks/models/llama-v3p1-405b-instruct + name: AccountsFireworksModelsLlamaV3P1405BInstruct + - value: accounts/fireworks/models/llama-v3p1-70b-instruct + name: AccountsFireworksModelsLlamaV3P170BInstruct + - value: accounts/fireworks/models/llama-v3p1-8b-instruct + name: AccountsFireworksModelsLlamaV3P18BInstruct + - sonar + - value: sonar-pro + name: SonarPro + - sambanova + - value: DeepSeek-R1-Distill-Llama-70B + name: DeepSeekR1DistillLlama70B + - value: Llama-4-Maverick-17B-128E-Instruct + name: Llama4Maverick17B128EInstruct + - value: Qwen3-32B + name: Qwen332B + - value: grok-4-fast-non-reasoning-latest + name: Grok4FastNonReasoningLatest + - ellm + - value: 
custom-language-model + name: CustomLanguageModel + - value: hume-evi-3-web-search + name: HumeEvi3WebSearch + source: + openapi: evi-openapi.json + ModelProviderEnum: + enum: + - GROQ + - OPEN_AI + - FIREWORKS + - ANTHROPIC + - CUSTOM_LANGUAGE_MODEL + - GOOGLE + - HUME_AI + - AMAZON_BEDROCK + - PERPLEXITY + - SAMBANOVA + - CEREBRAS + source: + openapi: evi-openapi.json + ValidationErrorLocItem: + discriminated: false + union: + - string + - integer + source: + openapi: evi-openapi.json + inline: true + ValidationError: + properties: + loc: + type: list + msg: string + type: string + source: + openapi: evi-openapi.json + WebhookEventBase: + docs: Represents the fields common to all webhook events. + properties: + chat_group_id: + type: string + docs: Unique ID of the **Chat Group** associated with the **Chat** session. + chat_id: + type: string + docs: Unique ID of the **Chat** session. + config_id: + type: optional + docs: Unique ID of the EVI **Config** used for the session. + source: + openapi: evi-openapi.json + WebhookEvent: + discriminated: false + union: + - WebhookEventChatStarted + - WebhookEventChatEnded + - WebhookEventToolCall + source: + openapi: evi-openapi.json + WebhookEventChatEnded: + properties: + caller_number: + type: optional + docs: >- + Phone number of the caller in E.164 format (e.g., `+12223333333`). + This field is included only if the Chat was created via the [Twilio + phone calling](/docs/empathic-voice-interface-evi/phone-calling) + integration. + custom_session_id: + type: optional + docs: >- + User-defined session ID. Relevant only when employing a [custom + language + model](/docs/empathic-voice-interface-evi/custom-language-model) in + the EVI Config. + duration_seconds: + type: integer + docs: Total duration of the session in seconds. + end_reason: + type: WebhookEventChatStatus + docs: Reason for the session's termination. + end_time: + type: integer + docs: Unix timestamp (in milliseconds) indicating when the session ended. 
+ event_name: + type: optional> + docs: Always `chat_ended`. + extends: + - WebhookEventBase + source: + openapi: evi-openapi.json + WebhookEventChatStartType: + enum: + - new_chat_group + - resumed_chat_group + source: + openapi: evi-openapi.json + WebhookEventChatStarted: + properties: + caller_number: + type: optional + docs: >- + Phone number of the caller in E.164 format (e.g., `+12223333333`). + This field is included only if the Chat was created via the [Twilio + phone calling](/docs/empathic-voice-interface-evi/phone-calling) + integration. + chat_start_type: + type: WebhookEventChatStartType + docs: >- + Indicates whether the chat is the first in a new Chat Group + (`new_chat_group`) or the continuation of an existing chat group + (`resumed_chat_group`). + custom_session_id: + type: optional + docs: >- + User-defined session ID. Relevant only when employing a [custom + language + model](/docs/empathic-voice-interface-evi/custom-language-model) in + the EVI Config. + event_name: + type: optional> + docs: Always `chat_started`. + start_time: + type: integer + docs: Unix timestamp (in milliseconds) indicating when the session started. + extends: + - WebhookEventBase + source: + openapi: evi-openapi.json + WebhookEventChatStatus: + enum: + - ACTIVE + - USER_ENDED + - USER_TIMEOUT + - INACTIVITY_TIMEOUT + - MAX_DURATION_TIMEOUT + - SILENCE_TIMEOUT + - ERROR + source: + openapi: evi-openapi.json + WebhookEventToolCall: + properties: + caller_number: + type: optional + docs: >- + Phone number of the caller in E.164 format (e.g., `+12223333333`). + This field is included only if the Chat was created via the [Twilio + phone calling](/docs/empathic-voice-interface-evi/phone-calling) + integration. + custom_session_id: + type: optional + docs: >- + User-defined session ID. Relevant only when employing a [custom + language + model](/docs/empathic-voice-interface-evi/custom-language-model) in + the EVI Config. + event_name: + type: optional> + docs: Always `tool_call`. 
+ timestamp: + type: integer + docs: >- + Unix timestamp (in milliseconds) indicating when the tool call was + triggered. + tool_call_message: + type: ToolCallMessage + docs: The tool call. + extends: + - WebhookEventBase + source: + openapi: evi-openapi.json + ErrorResponse: + properties: + error: optional + message: optional + code: optional + source: + openapi: evi-openapi.json + ReturnPagedUserDefinedTools: + docs: A paginated list of user defined tool versions returned from the server + properties: + page_number: + type: integer + docs: >- + The page number of the returned list. + + + This value corresponds to the `page_number` parameter specified in the + request. Pagination uses zero-based indexing. + page_size: + type: integer + docs: >- + The maximum number of items returned per page. + + + This value corresponds to the `page_size` parameter specified in the + request. + total_pages: + type: integer + docs: The total number of pages in the collection. + tools_page: + docs: >- + List of tools returned for the specified `page_number` and + `page_size`. + type: list> + source: + openapi: evi-openapi.json + ReturnUserDefinedToolToolType: + enum: + - BUILTIN + - FUNCTION + docs: >- + Type of Tool. Either `BUILTIN` for natively implemented tools, like web + search, or `FUNCTION` for user-defined tools. + inline: true + source: + openapi: evi-openapi.json + ReturnUserDefinedToolVersionType: + enum: + - FIXED + - LATEST + docs: >- + Versioning method for a Tool. Either `FIXED` for using a fixed version + number or `LATEST` for auto-updating to the latest version. + inline: true + source: + openapi: evi-openapi.json + ReturnUserDefinedTool: + docs: A specific tool version returned from the server + properties: + tool_type: + type: ReturnUserDefinedToolToolType + docs: >- + Type of Tool. Either `BUILTIN` for natively implemented tools, like + web search, or `FUNCTION` for user-defined tools. + id: + type: string + docs: Identifier for a Tool. Formatted as a UUID. 
+ version: + type: integer + docs: >- + Version number for a Tool. + + + Tools, Configs, Custom Voices, and Prompts are versioned. This + versioning system supports iterative development, allowing you to + progressively refine tools and revert to previous versions if needed. + + + Version numbers are integer values representing different iterations + of the Tool. Each update to the Tool increments its version number. + version_type: + type: ReturnUserDefinedToolVersionType + docs: >- + Versioning method for a Tool. Either `FIXED` for using a fixed version + number or `LATEST` for auto-updating to the latest version. + version_description: + type: optional + docs: An optional description of the Tool version. + name: + type: string + docs: Name applied to all versions of a particular Tool. + created_on: + type: long + docs: >- + Time at which the Tool was created. Measured in seconds since the Unix + epoch. + modified_on: + type: long + docs: >- + Time at which the Tool was last modified. Measured in seconds since + the Unix epoch. + fallback_content: + type: optional + docs: >- + Optional text passed to the supplemental LLM in place of the tool call + result. The LLM then uses this text to generate a response back to the + user, ensuring continuity in the conversation if the Tool errors. + description: + type: optional + docs: >- + An optional description of what the Tool does, used by the + supplemental LLM to choose when and how to call the function. + parameters: + type: string + docs: >- + Stringified JSON defining the parameters used by this version of the + Tool. + + + These parameters define the inputs needed for the Tool's execution, + including the expected data type and description for each input field. + Structured as a stringified JSON schema, this format ensures the tool + receives data in the expected format. 
+ source: + openapi: evi-openapi.json + ReturnPagedPrompts: + docs: A paginated list of prompt versions returned from the server + properties: + page_number: + type: integer + docs: >- + The page number of the returned list. + + + This value corresponds to the `page_number` parameter specified in the + request. Pagination uses zero-based indexing. + page_size: + type: integer + docs: >- + The maximum number of items returned per page. + + + This value corresponds to the `page_size` parameter specified in the + request. + total_pages: + type: integer + docs: The total number of pages in the collection. + prompts_page: + docs: >- + List of prompts returned for the specified `page_number` and + `page_size`. + type: list> + source: + openapi: evi-openapi.json + ReturnPrompt: + docs: A Prompt associated with this Config. + properties: + name: + type: string + docs: Name applied to all versions of a particular Prompt. + id: + type: string + docs: Identifier for a Prompt. Formatted as a UUID. + text: + type: string + docs: >- + Instructions used to shape EVI's behavior, responses, and style. + + + You can use the Prompt to define a specific goal or role for EVI, + specifying how it should act or what it should focus on during the + conversation. For example, EVI can be instructed to act as a customer + support representative, a fitness coach, or a travel advisor, each + with its own set of behaviors and response styles. For help writing a + system prompt, see our [Prompting + Guide](/docs/speech-to-speech-evi/guides/prompting). + version: + type: integer + docs: >- + Version number for a Prompt. + + + Prompts, Configs, Custom Voices, and Tools are versioned. This + versioning system supports iterative development, allowing you to + progressively refine prompts and revert to previous versions if + needed. + + + Version numbers are integer values representing different iterations + of the Prompt. Each update to the Prompt increments its version + number. 
+ version_type: + type: ReturnPromptVersionType + docs: >- + Versioning method for a Prompt. Either `FIXED` for using a fixed + version number or `LATEST` for auto-updating to the latest version. + created_on: + type: long + docs: >- + Time at which the Prompt was created. Measured in seconds since the + Unix epoch. + modified_on: + type: long + docs: >- + Time at which the Prompt was last modified. Measured in seconds since + the Unix epoch. + version_description: + type: optional + docs: An optional description of the Prompt version. + source: + openapi: evi-openapi.json + ReturnPagedConfigs: + docs: A paginated list of config versions returned from the server + properties: + page_number: + type: optional + docs: >- + The page number of the returned list. + + + This value corresponds to the `page_number` parameter specified in the + request. Pagination uses zero-based indexing. + page_size: + type: optional + docs: >- + The maximum number of items returned per page. + + + This value corresponds to the `page_size` parameter specified in the + request. + total_pages: + type: integer + docs: The total number of pages in the collection. + configs_page: + type: optional> + docs: >- + List of configs returned for the specified `page_number` and + `page_size`. + source: + openapi: evi-openapi.json + ReturnConfig: + docs: A specific config version returned from the server + properties: + name: + type: optional + docs: Name applied to all versions of a particular Config. + id: + type: optional + docs: Identifier for a Config. Formatted as a UUID. + version: + type: optional + docs: >- + Version number for a Config. + + + Configs, Prompts, Custom Voices, and Tools are versioned. This + versioning system supports iterative development, allowing you to + progressively refine configurations and revert to previous versions if + needed. + + + Version numbers are integer values representing different iterations + of the Config. 
Each update to the Config increments its version + number. + language_model: + type: optional + docs: >- + The supplemental language model associated with this Config. + + + This model is used to generate longer, more detailed responses from + EVI. Choosing an appropriate supplemental language model for your use + case is crucial for generating fast, high-quality responses from EVI. + builtin_tools: + type: optional>> + docs: List of built-in tools associated with this Config. + evi_version: + type: optional + docs: >- + Specifies the EVI version to use. See our [EVI Version + Guide](/docs/speech-to-speech-evi/configuration/evi-version) for + differences between versions. + + + **We're officially sunsetting EVI versions 1 and 2 on August 30, + 2025**. To keep things running smoothly, be sure to [migrate to EVI + 3](/docs/speech-to-speech-evi/configuration/evi-version#migrating-to-evi-3) + before then. + timeouts: optional + nudges: optional + event_messages: optional + ellm_model: + type: optional + docs: >- + The eLLM setup associated with this Config. + + + Hume's eLLM (empathic Large Language Model) is a multimodal language + model that takes into account both expression measures and language. + The eLLM generates short, empathic language responses and guides + text-to-speech (TTS) prosody. + voice: + type: optional + docs: A voice specification associated with this Config. + prompt: optional + webhooks: + type: optional>> + docs: Map of webhooks associated with this config. + created_on: + type: optional + docs: >- + Time at which the Config was created. Measured in seconds since the + Unix epoch. + modified_on: + type: optional + docs: >- + Time at which the Config was last modified. Measured in seconds since + the Unix epoch. + version_description: + type: optional + docs: An optional description of the Config version. + tools: + type: optional>> + docs: List of user-defined tools associated with this Config. 
+ source: + openapi: evi-openapi.json + ReturnPagedChatsPaginationDirection: + enum: + - ASC + - DESC + docs: >- + Indicates the order in which the paginated results are presented, based on + their creation date. + + + It shows `ASC` for ascending order (chronological, with the oldest records + first) or `DESC` for descending order (reverse-chronological, with the + newest records first). This value corresponds to the `ascending_order` + query parameter used in the request. + inline: true + source: + openapi: evi-openapi.json + ReturnPagedChats: + docs: A paginated list of chats returned from the server + properties: + page_number: + type: integer + docs: >- + The page number of the returned list. + + + This value corresponds to the `page_number` parameter specified in the + request. Pagination uses zero-based indexing. + page_size: + type: integer + docs: >- + The maximum number of items returned per page. + + + This value corresponds to the `page_size` parameter specified in the + request. + total_pages: + type: integer + docs: The total number of pages in the collection. + pagination_direction: + type: ReturnPagedChatsPaginationDirection + docs: >- + Indicates the order in which the paginated results are presented, + based on their creation date. + + + It shows `ASC` for ascending order (chronological, with the oldest + records first) or `DESC` for descending order (reverse-chronological, + with the newest records first). This value corresponds to the + `ascending_order` query parameter used in the request. + chats_page: + docs: >- + List of Chats and their metadata returned for the specified + `page_number` and `page_size`. + type: list + source: + openapi: evi-openapi.json + ReturnChatPagedEventsStatus: + enum: + - ACTIVE + - USER_ENDED + - USER_TIMEOUT + - MAX_DURATION_TIMEOUT + - INACTIVITY_TIMEOUT + - ERROR + docs: >- + Indicates the current state of the chat. There are six possible statuses: + + + - `ACTIVE`: The chat is currently active and ongoing. 
+ + + - `USER_ENDED`: The chat was manually ended by the user. + + + - `USER_TIMEOUT`: The chat ended due to a user-defined timeout. + + + - `MAX_DURATION_TIMEOUT`: The chat ended because it reached the maximum + allowed duration. + + + - `INACTIVITY_TIMEOUT`: The chat ended due to an inactivity timeout. + + + - `ERROR`: The chat ended unexpectedly due to an error. + inline: true + source: + openapi: evi-openapi.json + ReturnChatPagedEventsPaginationDirection: + enum: + - ASC + - DESC + docs: >- + Indicates the order in which the paginated results are presented, based on + their creation date. + + + It shows `ASC` for ascending order (chronological, with the oldest records + first) or `DESC` for descending order (reverse-chronological, with the + newest records first). This value corresponds to the `ascending_order` + query parameter used in the request. + inline: true + source: + openapi: evi-openapi.json + ReturnChatPagedEvents: + docs: >- + A description of chat status with a paginated list of chat events returned + from the server + properties: + id: + type: string + docs: Identifier for a Chat. Formatted as a UUID. + chat_group_id: + type: string + docs: >- + Identifier for the Chat Group. Any chat resumed from this Chat will + have the same `chat_group_id`. Formatted as a UUID. + status: + type: ReturnChatPagedEventsStatus + docs: >- + Indicates the current state of the chat. There are six possible + statuses: + + + - `ACTIVE`: The chat is currently active and ongoing. + + + - `USER_ENDED`: The chat was manually ended by the user. + + + - `USER_TIMEOUT`: The chat ended due to a user-defined timeout. + + + - `MAX_DURATION_TIMEOUT`: The chat ended because it reached the + maximum allowed duration. + + + - `INACTIVITY_TIMEOUT`: The chat ended due to an inactivity timeout. + + + - `ERROR`: The chat ended unexpectedly due to an error. + start_timestamp: + type: long + docs: >- + Time at which the Chat started. Measured in seconds since the Unix + epoch. 
+ end_timestamp:
+ type: optional
+ docs: >-
+ Time at which the Chat ended. Measured in seconds since the Unix
+ epoch.
+ pagination_direction:
+ type: ReturnChatPagedEventsPaginationDirection
+ docs: >-
+ Indicates the order in which the paginated results are presented,
+ based on their creation date.
+
+
+ It shows `ASC` for ascending order (chronological, with the oldest
+ records first) or `DESC` for descending order (reverse-chronological,
+ with the newest records first). This value corresponds to the
+ `ascending_order` query parameter used in the request.
+ events_page:
+ docs: List of Chat Events for the specified `page_number` and `page_size`.
+ type: list
+ metadata:
+ type: optional
+ docs: Stringified JSON with additional metadata about the chat.
+ page_number:
+ type: integer
+ docs: >-
+ The page number of the returned list.
+
+
+ This value corresponds to the `page_number` parameter specified in the
+ request. Pagination uses zero-based indexing.
+ page_size:
+ type: integer
+ docs: >-
+ The maximum number of items returned per page.
+
+
+ This value corresponds to the `page_size` parameter specified in the
+ request.
+ total_pages:
+ type: integer
+ docs: The total number of pages in the collection.
+ config: optional
+ source:
+ openapi: evi-openapi.json
+ ReturnChatAudioReconstructionStatus:
+ enum:
+ - QUEUED
+ - IN_PROGRESS
+ - COMPLETE
+ - ERROR
+ - CANCELLED
+ docs: >-
+ Indicates the current state of the audio reconstruction job. There are
+ five possible statuses:
+
+
+ - `QUEUED`: The reconstruction job is waiting to be processed.
+
+
+ - `IN_PROGRESS`: The reconstruction is currently being processed.
+
+
+ - `COMPLETE`: The audio reconstruction is finished and ready for download.
+
+
+ - `ERROR`: An error occurred during the reconstruction process.
+
+
+ - `CANCELLED`: The reconstruction job has been canceled.
+ inline: true
+ source:
+ openapi: evi-openapi.json
+ ReturnChatAudioReconstruction:
+ docs: >-
+ List of chat audio reconstructions returned for the specified page number
+ and page size.
+ properties:
+ id:
+ type: string
+ docs: Identifier for the chat. Formatted as a UUID.
+ user_id:
+ type: string
+ docs: Identifier for the user that owns this chat. Formatted as a UUID.
+ status:
+ type: ReturnChatAudioReconstructionStatus
+ docs: >-
+ Indicates the current state of the audio reconstruction job. There are
+ five possible statuses:
+
+
+ - `QUEUED`: The reconstruction job is waiting to be processed.
+
+
+ - `IN_PROGRESS`: The reconstruction is currently being processed.
+
+
+ - `COMPLETE`: The audio reconstruction is finished and ready for
+ download.
+
+
+ - `ERROR`: An error occurred during the reconstruction process.
+
+
+ - `CANCELLED`: The reconstruction job has been canceled.
+ filename:
+ type: optional
+ docs: Name of the chat audio reconstruction file.
+ modified_at:
+ type: optional
+ docs: >-
+ The timestamp of the most recent status change for this audio
+ reconstruction, formatted as milliseconds since the Unix epoch.
+ signed_audio_url:
+ type: optional
+ docs: Signed URL used to download the chat audio reconstruction file.
+ signed_url_expiration_timestamp_millis:
+ type: optional
+ docs: >-
+ The timestamp when the signed URL will expire, formatted as Unix
+ epoch milliseconds.
+ source:
+ openapi: evi-openapi.json
+ ReturnPagedChatGroupsPaginationDirection:
+ enum:
+ - ASC
+ - DESC
+ docs: >-
+ Indicates the order in which the paginated results are presented, based on
+ their creation date.
+
+
+ It shows `ASC` for ascending order (chronological, with the oldest records
+ first) or `DESC` for descending order (reverse-chronological, with the
+ newest records first). This value corresponds to the `ascending_order`
+ query parameter used in the request.
+ inline: true + source: + openapi: evi-openapi.json + ReturnPagedChatGroups: + docs: A paginated list of chat_groups returned from the server + properties: + page_number: + type: integer + docs: >- + The page number of the returned list. + + + This value corresponds to the `page_number` parameter specified in the + request. Pagination uses zero-based indexing. + page_size: + type: integer + docs: >- + The maximum number of items returned per page. + + + This value corresponds to the `page_size` parameter specified in the + request. + total_pages: + type: integer + docs: The total number of pages in the collection. + pagination_direction: + type: ReturnPagedChatGroupsPaginationDirection + docs: >- + Indicates the order in which the paginated results are presented, + based on their creation date. + + + It shows `ASC` for ascending order (chronological, with the oldest + records first) or `DESC` for descending order (reverse-chronological, + with the newest records first). This value corresponds to the + `ascending_order` query parameter used in the request. + chat_groups_page: + docs: >- + List of Chat Groups and their metadata returned for the specified + `page_number` and `page_size`. + type: list + source: + openapi: evi-openapi.json + ReturnChatGroupPagedChatsPaginationDirection: + enum: + - ASC + - DESC + docs: >- + Indicates the order in which the paginated results are presented, based on + their creation date. + + + It shows `ASC` for ascending order (chronological, with the oldest records + first) or `DESC` for descending order (reverse-chronological, with the + newest records first). This value corresponds to the `ascending_order` + query parameter used in the request. + inline: true + source: + openapi: evi-openapi.json + ReturnChatGroupPagedChats: + docs: >- + A description of chat_group and its status with a paginated list of each + chat in the chat_group + properties: + id: + type: string + docs: >- + Identifier for the Chat Group. 
Any Chat resumed from this Chat Group + will have the same `chat_group_id`. Formatted as a UUID. + first_start_timestamp: + type: long + docs: >- + Time at which the first Chat in this Chat Group was created. Measured + in seconds since the Unix epoch. + most_recent_start_timestamp: + type: long + docs: >- + Time at which the most recent Chat in this Chat Group was created. + Measured in seconds since the Unix epoch. + num_chats: + type: integer + docs: The total number of Chats associated with this Chat Group. + page_number: + type: integer + docs: >- + The page number of the returned list. + + + This value corresponds to the `page_number` parameter specified in the + request. Pagination uses zero-based indexing. + page_size: + type: integer + docs: >- + The maximum number of items returned per page. + + + This value corresponds to the `page_size` parameter specified in the + request. + total_pages: + type: integer + docs: The total number of pages in the collection. + pagination_direction: + type: ReturnChatGroupPagedChatsPaginationDirection + docs: >- + Indicates the order in which the paginated results are presented, + based on their creation date. + + + It shows `ASC` for ascending order (chronological, with the oldest + records first) or `DESC` for descending order (reverse-chronological, + with the newest records first). This value corresponds to the + `ascending_order` query parameter used in the request. + chats_page: + docs: List of Chats for the specified `page_number` and `page_size`. + type: list + active: + type: optional + docs: >- + Denotes whether there is an active Chat associated with this Chat + Group. + source: + openapi: evi-openapi.json + ReturnChatGroupPagedEventsPaginationDirection: + enum: + - ASC + - DESC + docs: >- + Indicates the order in which the paginated results are presented, based on + their creation date. 
+ + + It shows `ASC` for ascending order (chronological, with the oldest records + first) or `DESC` for descending order (reverse-chronological, with the + newest records first). This value corresponds to the `ascending_order` + query parameter used in the request. + inline: true + source: + openapi: evi-openapi.json + ReturnChatGroupPagedEvents: + docs: >- + A paginated list of chat events that occurred across chats in this + chat_group from the server + properties: + id: + type: string + docs: >- + Identifier for the Chat Group. Any Chat resumed from this Chat Group + will have the same `chat_group_id`. Formatted as a UUID. + page_number: + type: integer + docs: >- + The page number of the returned list. + + + This value corresponds to the `page_number` parameter specified in the + request. Pagination uses zero-based indexing. + page_size: + type: integer + docs: >- + The maximum number of items returned per page. + + + This value corresponds to the `page_size` parameter specified in the + request. + total_pages: + type: integer + docs: The total number of pages in the collection. + pagination_direction: + type: ReturnChatGroupPagedEventsPaginationDirection + docs: >- + Indicates the order in which the paginated results are presented, + based on their creation date. + + + It shows `ASC` for ascending order (chronological, with the oldest + records first) or `DESC` for descending order (reverse-chronological, + with the newest records first). This value corresponds to the + `ascending_order` query parameter used in the request. + events_page: + docs: List of Chat Events for the specified `page_number` and `page_size`. + type: list + source: + openapi: evi-openapi.json + ReturnChatGroupPagedAudioReconstructionsPaginationDirection: + enum: + - ASC + - DESC + docs: >- + Indicates the order in which the paginated results are presented, based on + their creation date. 
+ + + It shows `ASC` for ascending order (chronological, with the oldest records + first) or `DESC` for descending order (reverse-chronological, with the + newest records first). This value corresponds to the `ascending_order` + query parameter used in the request. + inline: true + source: + openapi: evi-openapi.json + ReturnChatGroupPagedAudioReconstructions: + docs: A paginated list of chat reconstructions for a particular chatgroup + properties: + id: + type: string + docs: Identifier for the chat group. Formatted as a UUID. + user_id: + type: string + docs: Identifier for the user that owns this chat. Formatted as a UUID. + num_chats: + type: integer + docs: Total number of chats in this chatgroup + page_number: + type: integer + docs: >- + The page number of the returned list. + + + This value corresponds to the `page_number` parameter specified in the + request. Pagination uses zero-based indexing. + page_size: + type: integer + docs: >- + The maximum number of items returned per page. + + + This value corresponds to the `page_size` parameter specified in the + request. + total_pages: + type: integer + docs: The total number of pages in the collection. + pagination_direction: + type: ReturnChatGroupPagedAudioReconstructionsPaginationDirection + docs: >- + Indicates the order in which the paginated results are presented, + based on their creation date. + + + It shows `ASC` for ascending order (chronological, with the oldest + records first) or `DESC` for descending order (reverse-chronological, + with the newest records first). This value corresponds to the + `ascending_order` query parameter used in the request. + audio_reconstructions_page: + docs: >- + List of chat audio reconstructions returned for the specified page + number and page size. + type: list + source: + openapi: evi-openapi.json + ReturnPromptVersionType: + enum: + - FIXED + - LATEST + docs: >- + Versioning method for a Prompt. 
Either `FIXED` for using a fixed version + number or `LATEST` for auto-updating to the latest version. + inline: true + source: + openapi: evi-openapi.json + PostedConfigPromptSpec: + docs: >- + Identifies which prompt to use in a config OR how to create a new prompt + to use in the config + properties: + id: + type: optional + docs: Identifier for a Prompt. Formatted as a UUID. + version: + type: optional + docs: >- + Version number for a Prompt. Version numbers should be integers. The + combination of configId and version number is unique. + text: + type: optional + docs: Text used to create a new prompt for a particular config. + source: + openapi: evi-openapi.json + PostedLanguageModel: + docs: A LanguageModel to be posted to the server + properties: + model_provider: + type: optional + docs: The provider of the supplemental language model. + model_resource: + type: optional + docs: String that specifies the language model to use with `model_provider`. + temperature: + type: optional + docs: >- + The model temperature, with values between 0 to 1 (inclusive). + + + Controls the randomness of the LLM's output, with values closer to 0 + yielding focused, deterministic responses and values closer to 1 + producing more creative, diverse responses. + source: + openapi: evi-openapi.json + PostedEllmModel: + docs: An eLLM model configuration to be posted to the server + properties: + allow_short_responses: + type: optional + docs: |- + Boolean indicating if the eLLM is allowed to generate short responses. + + If omitted, short responses from the eLLM are enabled by default. + source: + openapi: evi-openapi.json + PostedUserDefinedToolSpec: + docs: A specific tool identifier to be posted to the server + properties: + id: + type: string + docs: Identifier for a Tool. Formatted as a UUID. + version: + type: optional + docs: >- + Version number for a Tool. + + + Tools, Configs, Custom Voices, and Prompts are versioned.
This + versioning system supports iterative development, allowing you to + progressively refine tools and revert to previous versions if needed. + + + Version numbers are integer values representing different iterations + of the Tool. Each update to the Tool increments its version number. + source: + openapi: evi-openapi.json + PostedBuiltinToolName: + enum: + - web_search + - hang_up + docs: >- + Name of the built-in tool to use. Hume supports the following built-in + tools: + + + - **web_search:** enables EVI to search the web for up-to-date information + when applicable. + + - **hang_up:** closes the WebSocket connection when appropriate (e.g., + after detecting a farewell in the conversation). + + + For more information, see our guide on [using built-in + tools](/docs/speech-to-speech-evi/features/tool-use#using-built-in-tools). + inline: true + source: + openapi: evi-openapi.json + PostedBuiltinTool: + docs: A configuration of a built-in tool to be posted to the server + properties: + name: + type: PostedBuiltinToolName + docs: >- + Name of the built-in tool to use. Hume supports the following built-in + tools: + + + - **web_search:** enables EVI to search the web for up-to-date + information when applicable. + + - **hang_up:** closes the WebSocket connection when appropriate (e.g., + after detecting a farewell in the conversation). + + + For more information, see our guide on [using built-in + tools](/docs/speech-to-speech-evi/features/tool-use#using-built-in-tools). + fallback_content: + type: optional + docs: >- + Optional text passed to the supplemental LLM in place of the tool call + result. The LLM then uses this text to generate a response back to the + user, ensuring continuity in the conversation if the Tool errors. + source: + openapi: evi-openapi.json + PostedEventMessageSpecs: + docs: >- + Collection of event messages returned by the server. + + + Event messages are sent by the server when specific events occur during a + chat session. 
These messages are used to configure behaviors for EVI, such + as controlling how EVI starts a new conversation. + properties: + on_new_chat: + type: optional + docs: >- + Specifies the initial message EVI provides when a new chat is started, + such as a greeting or welcome message. + on_inactivity_timeout: + type: optional + docs: >- + Specifies the message EVI provides when the chat is about to be + disconnected due to a user inactivity timeout, such as a message + mentioning a lack of user input for a period of time. + + + Enabling an inactivity message allows developers to use this message + event for "checking in" with the user if they are not responding to + see if they are still active. + + + If the user does not respond in the number of seconds specified in the + `inactivity_timeout` field, then EVI will say the message and the user + has 15 seconds to respond. If they respond in time, the conversation + will continue; if not, the conversation will end. + + + However, if the inactivity message is not enabled, then reaching the + inactivity timeout will immediately end the connection. + on_max_duration_timeout: + type: optional + docs: >- + Specifies the message EVI provides when the chat is disconnected due + to reaching the maximum chat duration, such as a message mentioning + the time limit for the chat has been reached. + source: + openapi: evi-openapi.json + PostedNudgeSpec: + docs: A nudge specification posted to the server + properties: + enabled: + type: optional + docs: >- + If true, EVI will 'nudge' the user to speak after a determined + interval of silence. + interval_secs: + type: optional + docs: The interval of inactivity (in seconds) before a nudge is triggered. + source: + openapi: evi-openapi.json + PostedTimeoutSpecsInactivity: + docs: >- + Specifies the duration of user inactivity (in seconds) after which the EVI + WebSocket connection will be automatically disconnected. Default is 600 + seconds (10 minutes). 
+ + + Accepts a minimum value of 30 seconds and a maximum value of 1,800 + seconds. + properties: + duration_secs: + type: optional + docs: >- + Duration in seconds for the timeout (e.g. 600 seconds represents 10 + minutes). + enabled: + type: boolean + docs: >- + Boolean indicating if this timeout is enabled. + + + If set to false, EVI will not timeout due to a specified duration of + user inactivity being reached. However, the conversation will + eventually disconnect after 1,800 seconds (30 minutes), which is the + maximum WebSocket duration limit for EVI. + source: + openapi: evi-openapi.json + inline: true + PostedTimeoutSpecsMaxDuration: + docs: >- + Specifies the maximum allowed duration (in seconds) for an EVI WebSocket + connection before it is automatically disconnected. Default is 1,800 + seconds (30 minutes). + + + Accepts a minimum value of 30 seconds and a maximum value of 1,800 + seconds. + properties: + duration_secs: + type: optional + docs: >- + Duration in seconds for the timeout (e.g. 600 seconds represents 10 + minutes). + enabled: + type: boolean + docs: >- + Boolean indicating if this timeout is enabled. + + + If set to false, EVI will not timeout due to a specified maximum + duration being reached. However, the conversation will eventually + disconnect after 1,800 seconds (30 minutes), which is the maximum + WebSocket duration limit for EVI. + source: + openapi: evi-openapi.json + inline: true + PostedTimeoutSpecs: + docs: >- + Collection of timeout specifications returned by the server. + + + Timeouts are sent by the server when specific time-based events occur + during a chat session. These specifications set the inactivity timeout and + the maximum duration an EVI WebSocket connection can stay open before it + is automatically disconnected. + properties: + inactivity: + type: optional + docs: >- + Specifies the duration of user inactivity (in seconds) after which the + EVI WebSocket connection will be automatically disconnected. 
Default + is 600 seconds (10 minutes). + + + Accepts a minimum value of 30 seconds and a maximum value of 1,800 + seconds. + max_duration: + type: optional + docs: >- + Specifies the maximum allowed duration (in seconds) for an EVI + WebSocket connection before it is automatically disconnected. Default + is 1,800 seconds (30 minutes). + + + Accepts a minimum value of 30 seconds and a maximum value of 1,800 + seconds. + source: + openapi: evi-openapi.json + PostedWebhookEventType: + enum: + - chat_started + - chat_ended + docs: Events this URL is subscribed to + inline: true + source: + openapi: evi-openapi.json + PostedWebhookSpec: + docs: URL and settings for a specific webhook to be posted to the server + properties: + url: + type: string + docs: >- + The URL where event payloads will be sent. This must be a valid https + URL to ensure secure communication. The server at this URL must accept + POST requests with a JSON payload. + events: + docs: >- + The list of events the specified URL is subscribed to. + + + See our [webhooks + guide](/docs/speech-to-speech-evi/configuration/build-a-configuration#supported-events) + for more information on supported events. + type: list + source: + openapi: evi-openapi.json + ReturnLanguageModel: + docs: A specific LanguageModel + properties: + model_provider: + type: optional + docs: The provider of the supplemental language model. + model_resource: + type: optional + docs: String that specifies the language model to use with `model_provider`. + temperature: + type: optional + docs: >- + The model temperature, with values between 0 to 1 (inclusive). + + + Controls the randomness of the LLM's output, with values closer to 0 + yielding focused, deterministic responses and values closer to 1 + producing more creative, diverse responses. 
+ source: + openapi: evi-openapi.json + ReturnEllmModel: + docs: A specific eLLM Model configuration + properties: + allow_short_responses: + type: boolean + docs: |- + Boolean indicating if the eLLM is allowed to generate short responses. + + If omitted, short responses from the eLLM are enabled by default. + source: + openapi: evi-openapi.json + ReturnBuiltinToolToolType: + enum: + - BUILTIN + - FUNCTION + docs: >- + Type of Tool. Either `BUILTIN` for natively implemented tools, like web + search, or `FUNCTION` for user-defined tools. + inline: true + source: + openapi: evi-openapi.json + ReturnBuiltinTool: + docs: A specific builtin tool version returned from the server + properties: + tool_type: + type: ReturnBuiltinToolToolType + docs: >- + Type of Tool. Either `BUILTIN` for natively implemented tools, like + web search, or `FUNCTION` for user-defined tools. + name: + type: string + docs: Name applied to all versions of a particular Tool. + fallback_content: + type: optional + docs: >- + Optional text passed to the supplemental LLM in place of the tool call + result. The LLM then uses this text to generate a response back to the + user, ensuring continuity in the conversation if the Tool errors. + source: + openapi: evi-openapi.json + ReturnEventMessageSpecs: + docs: >- + Collection of event messages returned by the server. + + + Event messages are sent by the server when specific events occur during a + chat session. These messages are used to configure behaviors for EVI, such + as controlling how EVI starts a new conversation. + properties: + on_new_chat: + type: optional + docs: >- + Specifies the initial message EVI provides when a new chat is started, + such as a greeting or welcome message. + on_inactivity_timeout: + type: optional + docs: >- + Specifies the message EVI provides when the chat is about to be + disconnected due to a user inactivity timeout, such as a message + mentioning a lack of user input for a period of time. 
+ + + Enabling an inactivity message allows developers to use this message + event for "checking in" with the user if they are not responding to + see if they are still active. + + + If the user does not respond in the number of seconds specified in the + `inactivity_timeout` field, then EVI will say the message and the user + has 15 seconds to respond. If they respond in time, the conversation + will continue; if not, the conversation will end. + + + However, if the inactivity message is not enabled, then reaching the + inactivity timeout will immediately end the connection. + on_max_duration_timeout: + type: optional + docs: >- + Specifies the message EVI provides when the chat is disconnected due + to reaching the maximum chat duration, such as a message mentioning + the time limit for the chat has been reached. + source: + openapi: evi-openapi.json + ReturnTimeoutSpecs: + docs: >- + Collection of timeout specifications returned by the server. + + + Timeouts are sent by the server when specific time-based events occur + during a chat session. These specifications set the inactivity timeout and + the maximum duration an EVI WebSocket connection can stay open before it + is automatically disconnected. + properties: + inactivity: + type: ReturnTimeoutSpec + docs: >- + Specifies the duration of user inactivity (in seconds) after which the + EVI WebSocket connection will be automatically disconnected. Default + is 600 seconds (10 minutes). + + + Accepts a minimum value of 30 seconds and a maximum value of 1,800 + seconds. + max_duration: + type: ReturnTimeoutSpec + docs: >- + Specifies the maximum allowed duration (in seconds) for an EVI + WebSocket connection before it is automatically disconnected. Default + is 1,800 seconds (30 minutes). + + + Accepts a minimum value of 30 seconds and a maximum value of 1,800 + seconds. 
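The timeout bounds above (30 to 1,800 seconds, with documented defaults of 600 for inactivity and 1,800 for max duration) can be checked client-side before posting a configuration. A rough sketch, assuming only the field names and bounds from the schema — the helper itself is hypothetical, not an SDK API:

```python
# Documented bounds shared by both EVI timeout specs.
MIN_SECS, MAX_SECS = 30, 1_800
DEFAULTS = {"inactivity": 600, "max_duration": 1_800}

def normalize_timeout(kind, enabled=True, duration_secs=None):
    """Fill the documented default and range-check one timeout spec (illustrative)."""
    if duration_secs is None:
        duration_secs = DEFAULTS[kind]
    if not MIN_SECS <= duration_secs <= MAX_SECS:
        raise ValueError(f"{kind}: duration_secs must be in [{MIN_SECS}, {MAX_SECS}]")
    return {"enabled": enabled, "duration_secs": duration_secs}

timeouts = {
    "inactivity": normalize_timeout("inactivity"),
    "max_duration": normalize_timeout("max_duration", duration_secs=900),
}
```

Disabling a timeout (`enabled: false`) does not lift the 1,800-second WebSocket ceiling noted in the docs; the check above only guards the posted values.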
+ source: + openapi: evi-openapi.json + ReturnNudgeSpec: + docs: A specific nudge configuration returned from the server + properties: + enabled: + type: boolean + docs: EVI will nudge the user after inactivity + interval_secs: + type: optional + docs: Time interval in seconds after which the nudge will be sent. + source: + openapi: evi-openapi.json + ReturnWebhookEventType: + enum: + - chat_started + - chat_ended + docs: Events this URL is subscribed to + inline: true + source: + openapi: evi-openapi.json + ReturnWebhookSpec: + docs: Collection of webhook URL endpoints to be returned from the server + properties: + url: + type: string + docs: >- + The URL where event payloads will be sent. This must be a valid https + URL to ensure secure communication. The server at this URL must accept + POST requests with a JSON payload. + events: + docs: >- + The list of events the specified URL is subscribed to. + + + See our [webhooks + guide](/docs/speech-to-speech-evi/configuration/build-a-configuration#supported-events) + for more information on supported events. + type: list + source: + openapi: evi-openapi.json + ReturnChatStatus: + enum: + - ACTIVE + - USER_ENDED + - USER_TIMEOUT + - MAX_DURATION_TIMEOUT + - INACTIVITY_TIMEOUT + - ERROR + docs: >- + Indicates the current state of the chat. There are six possible statuses: + + + - `ACTIVE`: The chat is currently active and ongoing. + + + - `USER_ENDED`: The chat was manually ended by the user. + + + - `USER_TIMEOUT`: The chat ended due to a user-defined timeout. + + + - `MAX_DURATION_TIMEOUT`: The chat ended because it reached the maximum + allowed duration. + + + - `INACTIVITY_TIMEOUT`: The chat ended due to an inactivity timeout. + + + - `ERROR`: The chat ended unexpectedly due to an error. + inline: true + source: + openapi: evi-openapi.json + ReturnChat: + docs: A description of a chat and its status + properties: + id: + type: string + docs: Identifier for a Chat. Formatted as a UUID.
+ chat_group_id: + type: string + docs: >- + Identifier for the Chat Group. Any chat resumed from this Chat will + have the same `chat_group_id`. Formatted as a UUID. + status: + type: ReturnChatStatus + docs: >- + Indicates the current state of the chat. There are six possible + statuses: + + + - `ACTIVE`: The chat is currently active and ongoing. + + + - `USER_ENDED`: The chat was manually ended by the user. + + + - `USER_TIMEOUT`: The chat ended due to a user-defined timeout. + + + - `MAX_DURATION_TIMEOUT`: The chat ended because it reached the + maximum allowed duration. + + + - `INACTIVITY_TIMEOUT`: The chat ended due to an inactivity timeout. + + + - `ERROR`: The chat ended unexpectedly due to an error. + start_timestamp: + type: long + docs: >- + Time at which the Chat started. Measured in seconds since the Unix + epoch. + end_timestamp: + type: optional + docs: >- + Time at which the Chat ended. Measured in seconds since the Unix + epoch. + event_count: + type: optional + docs: The total number of events currently in this chat. + metadata: + type: optional + docs: Stringified JSON with additional metadata about the chat. + config: optional + source: + openapi: evi-openapi.json + ReturnChatEventRole: + enum: + - USER + - AGENT + - SYSTEM + - TOOL + docs: >- + The role of the entity which generated the Chat Event. There are four + possible values: + + - `USER`: The user, capable of sending user messages and interruptions. + + - `AGENT`: The assistant, capable of sending agent messages. + + - `SYSTEM`: The backend server, capable of transmitting errors. + + - `TOOL`: The function calling mechanism. + inline: true + source: + openapi: evi-openapi.json + ReturnChatEventType: + enum: + - FUNCTION_CALL + - FUNCTION_CALL_RESPONSE + - CHAT_END_MESSAGE + - AGENT_MESSAGE + - SYSTEM_PROMPT + - USER_RECORDING_START_MESSAGE + - RESUME_ONSET + - USER_INTERRUPTION + - CHAT_START_MESSAGE + - PAUSE_ONSET + - USER_MESSAGE + docs: >- + Type of Chat Event. 
There are eleven Chat Event types: + + - `SYSTEM_PROMPT`: The system prompt used to initialize the session. + + - `CHAT_START_MESSAGE`: Marks the beginning of the chat session. + + - `USER_RECORDING_START_MESSAGE`: Marks when the client began streaming + audio and the start of audio processing. + + - `USER_MESSAGE`: A message sent by the user. + + - `USER_INTERRUPTION`: A user-initiated interruption while the assistant + is speaking. + + - `AGENT_MESSAGE`: A response generated by the assistant. + + - `FUNCTION_CALL`: A record of a tool invocation by the assistant. + + - `FUNCTION_CALL_RESPONSE`: The result of a previously invoked function or + tool. + + - `PAUSE_ONSET`: Marks when the client sent a `pause_assistant_message` to + pause the assistant. + + - `RESUME_ONSET`: Marks when the client sent a `resume_assistant_message` + to resume the assistant. + + - `CHAT_END_MESSAGE`: Indicates the end of the chat session. + inline: true + source: + openapi: evi-openapi.json + ReturnChatEvent: + docs: A description of a single event in a chat returned from the server + properties: + id: + type: string + docs: Identifier for a Chat Event. Formatted as a UUID. + chat_id: + type: string + docs: Identifier for the Chat this event occurred in. Formatted as a UUID. + timestamp: + type: long + docs: >- + Time at which the Chat Event occurred. Measured in seconds since the + Unix epoch. + role: + type: ReturnChatEventRole + docs: >- + The role of the entity which generated the Chat Event. There are four + possible values: + + - `USER`: The user, capable of sending user messages and + interruptions. + + - `AGENT`: The assistant, capable of sending agent messages. + + - `SYSTEM`: The backend server, capable of transmitting errors. + + - `TOOL`: The function calling mechanism. + type: + type: ReturnChatEventType + docs: >- + Type of Chat Event. There are eleven Chat Event types: + + - `SYSTEM_PROMPT`: The system prompt used to initialize the session. 
+ + - `CHAT_START_MESSAGE`: Marks the beginning of the chat session. + + - `USER_RECORDING_START_MESSAGE`: Marks when the client began + streaming audio and the start of audio processing. + + - `USER_MESSAGE`: A message sent by the user. + + - `USER_INTERRUPTION`: A user-initiated interruption while the + assistant is speaking. + + - `AGENT_MESSAGE`: A response generated by the assistant. + + - `FUNCTION_CALL`: A record of a tool invocation by the assistant. + + - `FUNCTION_CALL_RESPONSE`: The result of a previously invoked + function or tool. + + - `PAUSE_ONSET`: Marks when the client sent a + `pause_assistant_message` to pause the assistant. + + - `RESUME_ONSET`: Marks when the client sent a + `resume_assistant_message` to resume the assistant. + + - `CHAT_END_MESSAGE`: Indicates the end of the chat session. + message_text: + type: optional + docs: >- + The text of the Chat Event. This field contains the message content + for each event type listed in the `type` field. + emotion_features: + type: optional + docs: >- + Stringified JSON containing the prosody model inference results. + + + EVI uses the prosody model to measure 48 expressions related to speech + and vocal characteristics. These results contain a detailed emotional + and tonal analysis of the audio. Scores typically range from 0 to 1, + with higher values indicating a stronger confidence level in the + measured attribute. + metadata: + type: optional + docs: Stringified JSON with additional metadata about the chat event. + related_event_id: + type: optional + docs: >- + Identifier for a related chat event. Currently only seen on + ASSISTANT_PROSODY events, to point back to the ASSISTANT_MESSAGE that + generated these prosody scores + source: + openapi: evi-openapi.json + ReturnConfigSpec: + docs: The Config associated with this Chat. + properties: + id: + type: string + docs: Identifier for a Config. Formatted as a UUID. + version: + type: optional + docs: >- + Version number for a Config. 
+ + + Configs, Prompts, Custom Voices, and Tools are versioned. This + versioning system supports iterative development, allowing you to + progressively refine configurations and revert to previous versions if + needed. + + + Version numbers are integer values representing different iterations + of the Config. Each update to the Config increments its version + number. + source: + openapi: evi-openapi.json + ReturnChatGroup: + docs: A description of chat_group and its status + properties: + id: + type: string + docs: >- + Identifier for the Chat Group. Any Chat resumed from this Chat Group + will have the same `chat_group_id`. Formatted as a UUID. + first_start_timestamp: + type: long + docs: >- + Time at which the first Chat in this Chat Group was created. Measured + in seconds since the Unix epoch. + most_recent_start_timestamp: + type: long + docs: >- + Time at which the most recent Chat in this Chat Group was created. + Measured in seconds since the Unix epoch. + most_recent_chat_id: + type: optional + docs: >- + The `chat_id` of the most recent Chat in this Chat Group. Formatted as + a UUID. + most_recent_config: optional + num_chats: + type: integer + docs: The total number of Chats in this Chat Group. + active: + type: optional + docs: >- + Denotes whether there is an active Chat associated with this Chat + Group. + source: + openapi: evi-openapi.json + PostedEventMessageSpec: + docs: Settings for a specific event_message to be posted to the server + properties: + enabled: + type: boolean + docs: >- + Boolean indicating if this event message is enabled. + + + If set to `true`, a message will be sent when the circumstances for + the specific event are met. + text: + type: optional + docs: >- + Text to use as the event message when the corresponding event occurs. + If no text is specified, EVI will generate an appropriate message + based on its current context and the system prompt. 
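As a concrete illustration of the event-message fields above — the greeting and check-in strings are invented, but the keys mirror `PostedEventMessageSpecs` and `PostedEventMessageSpec` (`enabled`, optional `text`):

```python
import json

# Sketch of an `event_messages` fragment for a posted EVI config.
# Field names follow the schema; all text values are made up.
event_messages = {
    "on_new_chat": {
        "enabled": True,
        "text": "Hi there! How can I help today?",
    },
    "on_inactivity_timeout": {
        "enabled": True,
        # Omitting `text` lets EVI generate a message from its current context.
    },
    "on_max_duration_timeout": {"enabled": False},
}

payload = json.dumps({"event_messages": event_messages}, indent=2)
```

Per the docs above, enabling `on_inactivity_timeout` gives the user a 15-second window to respond to the check-in before the conversation ends.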
+ source: + openapi: evi-openapi.json + PostedTimeoutSpec: + docs: Settings for a specific timeout to be posted to the server + properties: + enabled: + type: boolean + docs: Boolean indicating if this timeout is enabled. + duration_secs: + type: optional + docs: Duration in seconds for the timeout. + source: + openapi: evi-openapi.json + ReturnEventMessageSpec: + docs: A specific event message configuration to be returned from the server + properties: + enabled: + type: boolean + docs: >- + Boolean indicating if this event message is enabled. + + + If set to `true`, a message will be sent when the circumstances for + the specific event are met. + text: + type: optional + docs: >- + Text to use as the event message when the corresponding event occurs. + If no text is specified, EVI will generate an appropriate message + based on its current context and the system prompt. + source: + openapi: evi-openapi.json + ReturnTimeoutSpec: + docs: A specific timeout configuration to be returned from the server + properties: + enabled: + type: boolean + docs: >- + Boolean indicating if this timeout is enabled. + + + If set to false, EVI will not timeout due to a specified duration + being reached. However, the conversation will eventually disconnect + after 1,800 seconds (30 minutes), which is the maximum WebSocket + duration limit for EVI. + duration_secs: + type: optional + docs: >- + Duration in seconds for the timeout (e.g. 600 seconds represents 10 + minutes). + source: + openapi: evi-openapi.json + VoiceId: + properties: + id: + type: string + docs: ID of the voice in the `Voice Library`. + provider: + type: optional + docs: Model provider associated with this Voice ID. + source: + openapi: evi-openapi.json + VoiceName: + properties: + name: + type: string + docs: Name of the voice in the `Voice Library`. + provider: + type: optional + docs: Model provider associated with this Voice Name.
+ source: + openapi: evi-openapi.json + VoiceRef: + discriminated: false + union: + - type: VoiceId + - type: VoiceName + source: + openapi: evi-openapi.json + ReturnVoice: + docs: An Octave voice available for text-to-speech + properties: + id: optional + name: optional + provider: optional + compatible_octave_models: optional> + source: + openapi: evi-openapi.json + VoiceProvider: + enum: + - HUME_AI + - CUSTOM_VOICE + source: + openapi: evi-openapi.json diff --git a/.mock/definition/empathic-voice/chat.yml b/.mock/definition/empathic-voice/chat.yml new file mode 100644 index 00000000..c69e2ab8 --- /dev/null +++ b/.mock/definition/empathic-voice/chat.yml @@ -0,0 +1,149 @@ +imports: + root: __package__.yml +channel: + path: /chat + url: evi + auth: false + docs: Chat with Empathic Voice Interface (EVI) + query-parameters: + access_token: + type: optional + default: '' + docs: >- + Access token used for authenticating the client. If not provided, an + `api_key` must be provided to authenticate. + + + The access token is generated using both an API key and a Secret key, + which provides an additional layer of security compared to using just an + API key. + + + For more details, refer to the [Authentication Strategies + Guide](/docs/introduction/api-key#authentication-strategies). + allow_connection: + type: optional + default: false + docs: Allows external connections to this chat via the /connect endpoint. + config_id: + type: optional + docs: >- + The unique identifier for an EVI configuration. + + + Include this ID in your connection request to equip EVI with the Prompt, + Language Model, Voice, and Tools associated with the specified + configuration. If omitted, EVI will apply [default configuration + settings](/docs/speech-to-speech-evi/configuration/build-a-configuration#default-configuration). + + + For help obtaining this ID, see our [Configuration + Guide](/docs/speech-to-speech-evi/configuration). 
+ config_version: + type: optional + docs: >- + The version number of the EVI configuration specified by the + `config_id`. + + + Configs, as well as Prompts and Tools, are versioned. This versioning + system supports iterative development, allowing you to progressively + refine configurations and revert to previous versions if needed. + + + Include this parameter to apply a specific version of an EVI + configuration. If omitted, the latest version will be applied. + event_limit: + type: optional + docs: >- + The maximum number of chat events to return from chat history. By + default, the system returns up to 300 events (100 events per page × 3 + pages). Set this parameter to a smaller value to limit the number of + events returned. + resumed_chat_group_id: + type: optional + docs: >- + The unique identifier for a Chat Group. Use this field to preserve + context from a previous Chat session. + + + A Chat represents a single session from opening to closing a WebSocket + connection. In contrast, a Chat Group is a series of resumed Chats that + collectively represent a single conversation spanning multiple sessions. + Each Chat includes a Chat Group ID, which is used to preserve the + context of previous Chat sessions when starting a new one. + + + Including the Chat Group ID in the `resumed_chat_group_id` query + parameter is useful for seamlessly resuming a Chat after unexpected + network disconnections and for picking up conversations exactly where + you left off at a later time. This ensures preserved context across + multiple sessions. + + + There are three ways to obtain the Chat Group ID: + + + - [Chat + Metadata](/reference/speech-to-speech-evi/chat#receive.ChatMetadata): + Upon establishing a WebSocket connection with EVI, the user receives a + Chat Metadata message. This message contains a `chat_group_id`, which + can be used to resume conversations within this chat group in future + sessions. 
+ + + - [List Chats + endpoint](/reference/speech-to-speech-evi/chats/list-chats): Use the GET + `/v0/evi/chats` endpoint to obtain the Chat Group ID of individual Chat + sessions. This endpoint lists all available Chat sessions and their + associated Chat Group ID. + + + - [List Chat Groups + endpoint](/reference/speech-to-speech-evi/chat-groups/list-chat-groups): + Use the GET `/v0/evi/chat_groups` endpoint to obtain the Chat Group IDs + of all Chat Groups associated with an API key. This endpoint returns a + list of all available chat groups. + verbose_transcription: + type: optional + default: false + docs: >- + A flag to enable verbose transcription. Set this query parameter to + `true` to have unfinalized user transcripts be sent to the client as + interim UserMessage messages. The + [interim](/reference/speech-to-speech-evi/chat#receive.UserMessage.interim) + field on a + [UserMessage](/reference/speech-to-speech-evi/chat#receive.UserMessage) + denotes whether the message is "interim" or "final." + api_key: + type: optional + default: '' + docs: >- + API key used for authenticating the client. If not provided, an + `access_token` must be provided to authenticate. + + + For more details, refer to the [Authentication Strategies + Guide](/docs/introduction/api-key#authentication-strategies). 
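Taken together, the query parameters above determine how a client opens the `/chat` WebSocket: exactly one of `api_key` or `access_token` authenticates the connection, with `config_id`, `resumed_chat_group_id`, and `verbose_transcription` layered on top. A minimal sketch of composing the connection URL from these parameters (the helper name is hypothetical and not part of this spec; the base URL comes from the `evi` environment defined in `api.yml`):

```python
from urllib.parse import urlencode

EVI_BASE = "wss://api.hume.ai/v0/evi"  # `evi` environment URL from api.yml


def build_chat_url(api_key=None, access_token=None, config_id=None,
                   resumed_chat_group_id=None, verbose_transcription=False):
    """Compose the /chat WebSocket URL from the channel's query parameters.

    Exactly one of `api_key` or `access_token` must be supplied, mirroring
    the authentication rules documented for this channel.
    """
    if bool(api_key) == bool(access_token):
        raise ValueError("provide exactly one of api_key or access_token")
    params = {}
    if api_key:
        params["api_key"] = api_key
    else:
        params["access_token"] = access_token
    if config_id:
        params["config_id"] = config_id
    if resumed_chat_group_id:
        params["resumed_chat_group_id"] = resumed_chat_group_id
    if verbose_transcription:
        params["verbose_transcription"] = "true"
    return f"{EVI_BASE}/chat?{urlencode(params)}"
```

The resulting URL would then be passed to whatever WebSocket client the application uses; omitted optional parameters fall back to the defaults described above (latest config version, no resumed chat group).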
+ session_settings: root.ConnectSessionSettings + messages: + publish: + origin: client + body: PublishEvent + subscribe: + origin: server + body: root.SubscribeEvent +types: + PublishEvent: + discriminated: false + union: + - type: root.AudioInput + - type: root.SessionSettings + - type: root.UserInput + - type: root.AssistantInput + - type: root.ToolResponseMessage + - type: root.ToolErrorMessage + - type: root.PauseAssistantMessage + - type: root.ResumeAssistantMessage + source: + openapi: evi-asyncapi.json diff --git a/.mock/definition/empathic-voice/chatGroups.yml b/.mock/definition/empathic-voice/chatGroups.yml new file mode 100644 index 00000000..9233b0de --- /dev/null +++ b/.mock/definition/empathic-voice/chatGroups.yml @@ -0,0 +1,623 @@ +imports: + root: __package__.yml +service: + auth: false + base-path: '' + endpoints: + list-chat-groups: + path: /v0/evi/chat_groups + method: GET + docs: Fetches a paginated list of **Chat Groups**. + pagination: + offset: $request.page_number + results: $response.chat_groups_page + source: + openapi: evi-openapi.json + display-name: List chat_groups + request: + name: ChatGroupsListChatGroupsRequest + query-parameters: + page_number: + type: optional + default: 0 + docs: >- + Specifies the page number to retrieve, enabling pagination. + + + This parameter uses zero-based indexing. For example, setting + `page_number` to 0 retrieves the first page of results (items 0-9 + if `page_size` is 10), setting `page_number` to 1 retrieves the + second page (items 10-19), and so on. Defaults to 0, which + retrieves the first page. + page_size: + type: optional + docs: >- + Specifies the maximum number of results to include per page, + enabling pagination. The value must be between 1 and 100, + inclusive. + + + For example, if `page_size` is set to 10, each page will include + up to 10 items. Defaults to 10. + ascending_order: + type: optional + docs: >- + Specifies the sorting order of the results based on their creation + date. 
Set to true for ascending order (chronological, with the + oldest records first) and false for descending order + (reverse-chronological, with the newest records first). Defaults + to true. + config_id: + type: optional + docs: >- + The unique identifier for an EVI configuration. + + + Filter Chat Groups to only include Chats that used this + `config_id` in their most recent Chat. + validation: + format: uuid + response: + docs: Success + type: root.ReturnPagedChatGroups + status-code: 200 + errors: + - root.BadRequestError + examples: + - query-parameters: + page_number: 0 + page_size: 1 + ascending_order: true + config_id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + response: + body: + page_number: 0 + page_size: 1 + total_pages: 1 + pagination_direction: ASC + chat_groups_page: + - id: 697056f0-6c7e-487d-9bd8-9c19df79f05f + first_start_timestamp: 1721844196397 + most_recent_start_timestamp: 1721861821717 + active: false + most_recent_chat_id: dfdbdd4d-0ddf-418b-8fc4-80a266579d36 + num_chats: 5 + get-chat-group: + path: /v0/evi/chat_groups/{id} + method: GET + docs: >- + Fetches a **ChatGroup** by ID, including a paginated list of **Chats** + associated with the **ChatGroup**. + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Chat Group. Formatted as a UUID. + display-name: Get chat_group + request: + name: ChatGroupsGetChatGroupRequest + query-parameters: + page_size: + type: optional + docs: >- + Specifies the maximum number of results to include per page, + enabling pagination. The value must be between 1 and 100, + inclusive. + + + For example, if `page_size` is set to 10, each page will include + up to 10 items. Defaults to 10. + page_number: + type: optional + default: 0 + docs: >- + Specifies the page number to retrieve, enabling pagination. + + + This parameter uses zero-based indexing. 
For example, setting + `page_number` to 0 retrieves the first page of results (items 0-9 + if `page_size` is 10), setting `page_number` to 1 retrieves the + second page (items 10-19), and so on. Defaults to 0, which + retrieves the first page. + ascending_order: + type: optional + docs: >- + Specifies the sorting order of the results based on their creation + date. Set to true for ascending order (chronological, with the + oldest records first) and false for descending order + (reverse-chronological, with the newest records first). Defaults + to true. + response: + docs: Success + type: root.ReturnChatGroupPagedChats + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 697056f0-6c7e-487d-9bd8-9c19df79f05f + query-parameters: + page_number: 0 + page_size: 1 + ascending_order: true + response: + body: + id: 369846cf-6ad5-404d-905e-a8acb5cdfc78 + first_start_timestamp: 1712334213647 + most_recent_start_timestamp: 1712334213647 + num_chats: 1 + page_number: 0 + page_size: 1 + total_pages: 1 + pagination_direction: ASC + chats_page: + - id: 6375d4f8-cd3e-4d6b-b13b-ace66b7c8aaa + chat_group_id: 369846cf-6ad5-404d-905e-a8acb5cdfc78 + status: USER_ENDED + start_timestamp: 1712334213647 + end_timestamp: 1712334332571 + event_count: 0 + metadata: null + config: null + active: false + list-chat-group-events: + path: /v0/evi/chat_groups/{id}/events + method: GET + docs: >- + Fetches a paginated list of **Chat** events associated with a **Chat + Group**. + pagination: + offset: $request.page_number + results: $response.events_page + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Chat Group. Formatted as a UUID. + display-name: List chat events from a specific chat_group + request: + name: ChatGroupsListChatGroupEventsRequest + query-parameters: + page_size: + type: optional + docs: >- + Specifies the maximum number of results to include per page, + enabling pagination. 
The value must be between 1 and 100, + inclusive. + + + For example, if `page_size` is set to 10, each page will include + up to 10 items. Defaults to 10. + page_number: + type: optional + default: 0 + docs: >- + Specifies the page number to retrieve, enabling pagination. + + + This parameter uses zero-based indexing. For example, setting + `page_number` to 0 retrieves the first page of results (items 0-9 + if `page_size` is 10), setting `page_number` to 1 retrieves the + second page (items 10-19), and so on. Defaults to 0, which + retrieves the first page. + ascending_order: + type: optional + docs: >- + Specifies the sorting order of the results based on their creation + date. Set to true for ascending order (chronological, with the + oldest records first) and false for descending order + (reverse-chronological, with the newest records first). Defaults + to true. + response: + docs: Success + type: root.ReturnChatGroupPagedEvents + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 697056f0-6c7e-487d-9bd8-9c19df79f05f + query-parameters: + page_number: 0 + page_size: 3 + ascending_order: true + response: + body: + id: 697056f0-6c7e-487d-9bd8-9c19df79f05f + page_number: 0 + page_size: 3 + total_pages: 1 + pagination_direction: ASC + events_page: + - id: 5d44bdbb-49a3-40fb-871d-32bf7e76efe7 + chat_id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + timestamp: 1716244940762 + role: SYSTEM + type: SYSTEM_PROMPT + message_text: >- + You are an AI weather assistant providing users with + accurate and up-to-date weather information. Respond to user + queries concisely and clearly. Use simple language and avoid + technical jargon. Provide temperature, precipitation, wind + conditions, and any weather alerts. Include helpful tips if + severe weather is expected. 
+ emotion_features: '' + metadata: '' + - id: 5976ddf6-d093-4bb9-ba60-8f6c25832dde + chat_id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + timestamp: 1716244956278 + role: USER + type: USER_MESSAGE + message_text: Hello. + emotion_features: >- + {"Admiration": 0.09906005859375, "Adoration": + 0.12213134765625, "Aesthetic Appreciation": + 0.05035400390625, "Amusement": 0.16552734375, "Anger": + 0.0037384033203125, "Anxiety": 0.010101318359375, "Awe": + 0.058197021484375, "Awkwardness": 0.10552978515625, + "Boredom": 0.1141357421875, "Calmness": 0.115234375, + "Concentration": 0.00444793701171875, "Confusion": + 0.0343017578125, "Contemplation": 0.00812530517578125, + "Contempt": 0.009002685546875, "Contentment": + 0.087158203125, "Craving": 0.00818634033203125, "Desire": + 0.018310546875, "Determination": 0.003238677978515625, + "Disappointment": 0.024169921875, "Disgust": + 0.00702667236328125, "Distress": 0.00936126708984375, + "Doubt": 0.00632476806640625, "Ecstasy": 0.0293731689453125, + "Embarrassment": 0.01800537109375, "Empathic Pain": + 0.0088348388671875, "Entrancement": 0.013397216796875, + "Envy": 0.02557373046875, "Excitement": 0.12109375, "Fear": + 0.004413604736328125, "Guilt": 0.016571044921875, "Horror": + 0.00274658203125, "Interest": 0.2142333984375, "Joy": + 0.29638671875, "Love": 0.16015625, "Nostalgia": + 0.007843017578125, "Pain": 0.007160186767578125, "Pride": + 0.00508880615234375, "Realization": 0.054229736328125, + "Relief": 0.048736572265625, "Romance": 0.026397705078125, + "Sadness": 0.0265350341796875, "Satisfaction": + 0.051361083984375, "Shame": 0.00974273681640625, "Surprise + (negative)": 0.0218963623046875, "Surprise (positive)": + 0.216064453125, "Sympathy": 0.021728515625, "Tiredness": + 0.0173797607421875, "Triumph": 0.004520416259765625} + metadata: >- + {"segments": [{"content": "Hello.", "embedding": + [0.6181640625, 0.1763916015625, -30.921875, 1.2705078125, + 0.927734375, 0.63720703125, 2.865234375, 0.1080322265625, + 
0.2978515625, 1.0107421875, 1.34375, 0.74560546875, + 0.416259765625, 0.99462890625, -0.333740234375, + 0.361083984375, -1.388671875, 1.0107421875, 1.3173828125, + 0.55615234375, 0.541015625, -0.1837158203125, 1.697265625, + 0.228515625, 2.087890625, -0.311767578125, + 0.053680419921875, 1.3349609375, 0.95068359375, + 0.00441741943359375, 0.705078125, 1.8916015625, + -0.939453125, 0.93701171875, -0.28955078125, 1.513671875, + 0.5595703125, 1.0126953125, -0.1624755859375, 1.4072265625, + -0.28857421875, -0.4560546875, -0.1500244140625, + -0.1102294921875, -0.222412109375, 0.8779296875, + 1.275390625, 1.6689453125, 0.80712890625, -0.34814453125, + -0.325439453125, 0.412841796875, 0.81689453125, + 0.55126953125, 1.671875, 0.6611328125, 0.7451171875, + 1.50390625, 1.0224609375, -1.671875, 0.7373046875, + 2.1328125, 2.166015625, 0.41015625, -0.127685546875, + 1.9345703125, -4.2734375, 0.332275390625, 0.26171875, + 0.76708984375, 0.2685546875, 0.468017578125, 1.208984375, + -1.517578125, 1.083984375, 0.84814453125, 1.0244140625, + -0.0072174072265625, 1.34375, 1.0712890625, 1.517578125, + -0.52001953125, 0.59228515625, 0.8154296875, -0.951171875, + -0.07757568359375, 1.3330078125, 1.125, 0.61181640625, + 1.494140625, 0.357421875, 1.1796875, 1.482421875, 0.8046875, + 0.1536865234375, 1.8076171875, 0.68115234375, -15.171875, + 1.2294921875, 0.319091796875, 0.499755859375, 1.5771484375, + 0.94677734375, -0.2490234375, 0.88525390625, 3.47265625, + 0.75927734375, 0.71044921875, 1.2333984375, 1.4169921875, + -0.56640625, -1.8095703125, 1.37109375, 0.428955078125, + 1.89453125, -0.39013671875, 0.1734619140625, 1.5595703125, + -1.2294921875, 2.552734375, 0.58349609375, 0.2156982421875, + -0.00984954833984375, -0.6865234375, -0.0272979736328125, + -0.2264404296875, 2.853515625, 1.3896484375, 0.52978515625, + 0.783203125, 3.0390625, 0.75537109375, 0.219970703125, + 0.384521484375, 0.385986328125, 2.0546875, + -0.10443115234375, 1.5146484375, 1.4296875, 1.9716796875, + 
1.1318359375, 0.31591796875, 0.338623046875, 1.654296875, + -0.88037109375, -0.21484375, 1.45703125, 1.0380859375, + -0.52294921875, -0.47802734375, 0.1650390625, 1.2392578125, + -1.138671875, 0.56787109375, 1.318359375, 0.4287109375, + 0.1981201171875, 2.4375, 0.281005859375, 0.89404296875, + -0.1552734375, 0.6474609375, -0.08331298828125, + 0.00740814208984375, -0.045501708984375, -0.578125, + 2.02734375, 0.59228515625, 0.35693359375, 1.2919921875, + 1.22265625, 1.0537109375, 0.145263671875, 1.05859375, + -0.369140625, 0.207275390625, 0.78857421875, 0.599609375, + 0.99072265625, 0.24462890625, 1.26953125, 0.08404541015625, + 1.349609375, 0.73291015625, 1.3212890625, 0.388916015625, + 1.0869140625, 0.9931640625, -1.5673828125, 0.0462646484375, + 0.650390625, 0.253662109375, 0.58251953125, 1.8134765625, + 0.8642578125, 2.591796875, 0.7314453125, 0.85986328125, + 0.5615234375, 0.9296875, 0.04144287109375, 1.66015625, + 1.99609375, 1.171875, 1.181640625, 1.5126953125, + 0.0224456787109375, 0.58349609375, -1.4931640625, + 0.81884765625, 0.732421875, -0.6455078125, -0.62451171875, + 1.7802734375, 0.01526641845703125, -0.423095703125, + 0.461669921875, 4.87890625, 1.2392578125, -0.6953125, + 0.6689453125, 0.62451171875, -1.521484375, 1.7685546875, + 0.810546875, 0.65478515625, 0.26123046875, 1.6396484375, + 0.87548828125, 1.7353515625, 2.046875, 1.5634765625, + 0.69384765625, 1.375, 0.8916015625, 1.0107421875, + 0.1304931640625, 2.009765625, 0.06402587890625, + -0.08428955078125, 0.04351806640625, -1.7529296875, + 2.02734375, 3.521484375, 0.404541015625, 1.6337890625, + -0.276611328125, 0.8837890625, -0.1287841796875, + 0.91064453125, 0.8193359375, 0.701171875, 0.036529541015625, + 1.26171875, 1.0478515625, -0.1422119140625, 1.0634765625, + 0.61083984375, 1.3505859375, 1.208984375, 0.57275390625, + 1.3623046875, 2.267578125, 0.484375, 0.9150390625, + 0.56787109375, -0.70068359375, 0.27587890625, + -0.70654296875, 0.8466796875, 0.57568359375, 1.6162109375, + 
0.87939453125, 2.248046875, -0.5458984375, 1.7744140625, + 1.328125, 1.232421875, 0.6806640625, 0.9365234375, + 1.052734375, -1.08984375, 1.8330078125, -0.4033203125, + 1.0673828125, 0.297607421875, 1.5703125, 1.67578125, + 1.34765625, 2.8203125, 2.025390625, -0.48583984375, + 0.7626953125, 0.01007843017578125, 1.435546875, + 0.007205963134765625, 0.05157470703125, -0.9853515625, + 0.26708984375, 1.16796875, 1.2041015625, 1.99609375, + -0.07916259765625, 1.244140625, -0.32080078125, + 0.6748046875, 0.419921875, 1.3212890625, 1.291015625, + 0.599609375, 0.0550537109375, 0.9599609375, 0.93505859375, + 0.111083984375, 1.302734375, 0.0833740234375, 2.244140625, + 1.25390625, 1.6015625, 0.58349609375, 1.7568359375, + -0.263427734375, -0.019866943359375, -0.24658203125, + -0.1871337890625, 0.927734375, 0.62255859375, + 0.275146484375, 0.79541015625, 1.1796875, 1.1767578125, + -0.26123046875, -0.268310546875, 1.8994140625, 1.318359375, + 2.1875, 0.2469482421875, 1.41015625, 0.03973388671875, + 1.2685546875, 1.1025390625, 0.9560546875, 0.865234375, + -1.92578125, 1.154296875, 0.389892578125, 1.130859375, + 0.95947265625, 0.72314453125, 2.244140625, + 0.048553466796875, 0.626953125, 0.42919921875, + 0.82275390625, 0.311767578125, -0.320556640625, + 0.01041412353515625, 0.1483154296875, 0.10809326171875, + -0.3173828125, 1.1337890625, -0.8642578125, 1.4033203125, + 0.048828125, 1.1787109375, 0.98779296875, 1.818359375, + 1.1552734375, 0.6015625, 1.2392578125, -1.2685546875, + 0.39208984375, 0.83251953125, 0.224365234375, + 0.0019989013671875, 0.87548828125, 1.6572265625, + 1.107421875, 0.434814453125, 1.8251953125, 0.442626953125, + 1.2587890625, 0.09320068359375, -0.896484375, 1.8017578125, + 1.451171875, -0.0755615234375, 0.6083984375, 2.06640625, + 0.673828125, -0.33740234375, 0.192138671875, 0.21435546875, + 0.80224609375, -1.490234375, 0.9501953125, 0.86083984375, + -0.40283203125, 4.109375, 2.533203125, 1.2529296875, + 0.8271484375, 0.225830078125, 1.0478515625, 
-1.9755859375, + 0.841796875, 0.392822265625, 0.525390625, 0.33935546875, + -0.79443359375, 0.71630859375, 0.97998046875, + -0.175537109375, 0.97705078125, 1.705078125, 0.29638671875, + 0.68359375, 0.54150390625, 0.435791015625, 0.99755859375, + -0.369140625, 1.009765625, -0.140380859375, 0.426513671875, + 0.189697265625, 1.8193359375, 1.1201171875, -0.5009765625, + -0.331298828125, 0.759765625, -0.09442138671875, 0.74609375, + -1.947265625, 1.3544921875, -3.935546875, 2.544921875, + 1.359375, 0.1363525390625, 0.79296875, 0.79931640625, + -0.3466796875, 1.1396484375, -0.33447265625, 2.0078125, + -0.241455078125, 0.6318359375, 0.365234375, 0.296142578125, + 0.830078125, 1.0458984375, 0.5830078125, 0.61572265625, + 14.0703125, -2.0078125, -0.381591796875, 1.228515625, + 0.08282470703125, -0.67822265625, -0.04339599609375, + 0.397216796875, 0.1656494140625, 0.137451171875, + 0.244873046875, 1.1611328125, -1.3818359375, 0.8447265625, + 1.171875, 0.36328125, 0.252685546875, 0.1197509765625, + 0.232177734375, -0.020172119140625, 0.64404296875, + -0.01100921630859375, -1.9267578125, 0.222412109375, + 0.56005859375, 1.3046875, 1.1630859375, 1.197265625, + 1.02734375, 1.6806640625, -0.043731689453125, 1.4697265625, + 0.81201171875, 1.5390625, 1.240234375, -0.7353515625, + 1.828125, 1.115234375, 1.931640625, -0.517578125, + 0.77880859375, 1.0546875, 0.95361328125, 3.42578125, + 0.0160369873046875, 0.875, 0.56005859375, 1.2421875, + 1.986328125, 1.4814453125, 0.0948486328125, 1.115234375, + 0.00665283203125, 2.09375, 0.3544921875, -0.52783203125, + 1.2099609375, 0.45068359375, 0.65625, 0.1112060546875, + 1.0751953125, -0.9521484375, -0.30029296875, 1.4462890625, + 2.046875, 3.212890625, 1.68359375, 1.07421875, + -0.5263671875, 0.74560546875, 1.37890625, 0.15283203125, + 0.2440185546875, 0.62646484375, -0.1280517578125, + 0.7646484375, -0.515625, -0.35693359375, 1.2958984375, + 0.96923828125, 0.58935546875, 1.3701171875, 1.0673828125, + 0.2337646484375, 0.93115234375, 
0.66357421875, 6.0, + 1.1025390625, -0.51708984375, -0.38330078125, 0.7197265625, + 0.246826171875, -0.45166015625, 1.9521484375, 0.5546875, + 0.08807373046875, 0.18505859375, 0.8857421875, + -0.57177734375, 0.251708984375, 0.234375, 2.57421875, + 0.9599609375, 0.5029296875, 0.10382080078125, + 0.08331298828125, 0.66748046875, -0.349609375, 1.287109375, + 0.259765625, 2.015625, 2.828125, -0.3095703125, + -0.164306640625, -0.3408203125, 0.486572265625, + 0.8466796875, 1.9130859375, 0.09088134765625, 0.66552734375, + 0.00972747802734375, -0.83154296875, 1.755859375, + 0.654296875, 0.173828125, 0.27587890625, -0.47607421875, + -0.264404296875, 0.7529296875, 0.6533203125, 0.7275390625, + 0.499755859375, 0.833984375, -0.44775390625, -0.05078125, + -0.454833984375, 0.75439453125, 0.68505859375, + 0.210693359375, -0.283935546875, -0.53564453125, + 0.96826171875, 0.861328125, -3.33984375, -0.26171875, + 0.77734375, 0.26513671875, -0.14111328125, -0.042236328125, + -0.84814453125, 0.2137451171875, 0.94921875, 0.65185546875, + -0.5380859375, 0.1529541015625, -0.360595703125, + -0.0333251953125, -0.69189453125, 0.8974609375, 0.7109375, + 0.81494140625, -0.259521484375, 1.1904296875, 0.62158203125, + 1.345703125, 0.89404296875, 0.70556640625, 1.0673828125, + 1.392578125, 0.5068359375, 0.962890625, 0.736328125, + 1.55078125, 0.50390625, -0.398681640625, 2.361328125, + 0.345947265625, -0.61962890625, 0.330078125, 0.75439453125, + -0.673828125, -0.2379150390625, 1.5673828125, 1.369140625, + 0.1119384765625, -0.1834716796875, 1.4599609375, + -0.77587890625, 0.5556640625, 0.09954833984375, + 0.0285186767578125, 0.58935546875, -0.501953125, + 0.212890625, 0.02679443359375, 0.1715087890625, + 0.03466796875, -0.564453125, 2.029296875, 2.45703125, + -0.72216796875, 2.138671875, 0.50830078125, + -0.09356689453125, 0.230224609375, 1.6943359375, + 1.5126953125, 0.39453125, 0.411376953125, 1.07421875, + -0.8046875, 0.51416015625, 0.2271728515625, -0.283447265625, + 0.38427734375, 
0.73388671875, 0.6962890625, 1.4990234375, + 0.02813720703125, 0.40478515625, 1.2451171875, 1.1162109375, + -5.5703125, 0.76171875, 0.322021484375, 1.0361328125, + 1.197265625, 0.1163330078125, 0.2425537109375, 1.5595703125, + 1.5791015625, -0.0921630859375, 0.484619140625, + 1.9052734375, 5.31640625, 1.6337890625, 0.95947265625, + -0.1751708984375, 0.466552734375, 0.8330078125, 1.03125, + 0.2044677734375, 0.31298828125, -1.1220703125, 0.5517578125, + 0.93505859375, 0.45166015625, 1.951171875, 0.65478515625, + 1.30859375, 1.0859375, 0.56494140625, 2.322265625, + 0.242919921875, 1.81640625, -0.469970703125, -0.841796875, + 0.90869140625, 1.5361328125, 0.923828125, 1.0595703125, + 0.356689453125, -0.46142578125, 2.134765625, 1.3037109375, + -0.32373046875, -9.2265625, 0.4521484375, 0.88037109375, + -0.53955078125, 0.96484375, 0.7705078125, 0.84521484375, + 1.580078125, -0.1448974609375, 0.7607421875, 1.0166015625, + -0.086669921875, 1.611328125, 0.05938720703125, 0.5078125, + 0.8427734375, 2.431640625, 0.66357421875, 3.203125, + 0.132080078125, 0.461181640625, 0.779296875, 1.9482421875, + 1.8720703125, 0.845703125, -1.3837890625, -0.138916015625, + 0.35546875, 0.2457275390625, 0.75341796875, 1.828125, + 1.4169921875, 0.60791015625, 1.0068359375, 1.109375, + 0.484130859375, -0.302001953125, 0.4951171875, 0.802734375, + 1.9482421875, 0.916015625, 0.1646728515625, 2.599609375, + 1.7177734375, -0.2374267578125, 0.98046875, 0.39306640625, + -1.1396484375, 1.6533203125, 0.375244140625], "scores": + [0.09906005859375, 0.12213134765625, 0.05035400390625, + 0.16552734375, 0.0037384033203125, 0.010101318359375, + 0.058197021484375, 0.10552978515625, 0.1141357421875, + 0.115234375, 0.00444793701171875, 0.00812530517578125, + 0.0343017578125, 0.009002685546875, 0.087158203125, + 0.00818634033203125, 0.003238677978515625, 0.024169921875, + 0.00702667236328125, 0.00936126708984375, + 0.00632476806640625, 0.0293731689453125, 0.01800537109375, + 0.0088348388671875, 
0.013397216796875, 0.02557373046875, + 0.12109375, 0.004413604736328125, 0.016571044921875, + 0.00274658203125, 0.2142333984375, 0.29638671875, + 0.16015625, 0.007843017578125, 0.007160186767578125, + 0.00508880615234375, 0.054229736328125, 0.048736572265625, + 0.026397705078125, 0.0265350341796875, 0.051361083984375, + 0.018310546875, 0.00974273681640625, 0.0218963623046875, + 0.216064453125, 0.021728515625, 0.0173797607421875, + 0.004520416259765625], "stoks": [52, 52, 52, 52, 52, 41, 41, + 374, 303, 303, 303, 427], "time": {"begin_ms": 640, + "end_ms": 1140}}]} + - id: 7645a0d1-2e64-410d-83a8-b96040432e9a + chat_id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + timestamp: 1716244957031 + role: AGENT + type: AGENT_MESSAGE + message_text: Hello! + emotion_features: >- + {"Admiration": 0.044921875, "Adoration": 0.0253753662109375, + "Aesthetic Appreciation": 0.03265380859375, "Amusement": + 0.118408203125, "Anger": 0.06719970703125, "Anxiety": + 0.0411376953125, "Awe": 0.03802490234375, "Awkwardness": + 0.056549072265625, "Boredom": 0.04217529296875, "Calmness": + 0.08709716796875, "Concentration": 0.070556640625, + "Confusion": 0.06964111328125, "Contemplation": + 0.0343017578125, "Contempt": 0.037689208984375, + "Contentment": 0.059417724609375, "Craving": + 0.01132965087890625, "Desire": 0.01406097412109375, + "Determination": 0.1143798828125, "Disappointment": + 0.051177978515625, "Disgust": 0.028594970703125, "Distress": + 0.054901123046875, "Doubt": 0.04638671875, "Ecstasy": + 0.0258026123046875, "Embarrassment": 0.0222015380859375, + "Empathic Pain": 0.015777587890625, "Entrancement": + 0.0160980224609375, "Envy": 0.0163421630859375, + "Excitement": 0.129638671875, "Fear": 0.03125, "Guilt": + 0.01483917236328125, "Horror": 0.0194549560546875, + "Interest": 0.1341552734375, "Joy": 0.0738525390625, "Love": + 0.0216522216796875, "Nostalgia": 0.0210418701171875, "Pain": + 0.020721435546875, "Pride": 0.05499267578125, "Realization": + 0.0728759765625, "Relief": 
0.04052734375, "Romance": + 0.0129241943359375, "Sadness": 0.0254669189453125, + "Satisfaction": 0.07159423828125, "Shame": 0.01495361328125, + "Surprise (negative)": 0.05560302734375, "Surprise + (positive)": 0.07965087890625, "Sympathy": + 0.022247314453125, "Tiredness": 0.0194549560546875, + "Triumph": 0.04107666015625} + metadata: '' + get-audio: + path: /v0/evi/chat_groups/{id}/audio + method: GET + docs: >- + Fetches a paginated list of audio for each **Chat** within the specified + **Chat Group**. For more details, see our guide on audio reconstruction + [here](/docs/speech-to-speech-evi/faq#can-i-access-the-audio-of-previous-conversations-with-evi). + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Chat Group. Formatted as a UUID. + display-name: Get chat group audio + request: + name: ChatGroupsGetAudioRequest + query-parameters: + page_number: + type: optional + default: 0 + docs: >- + Specifies the page number to retrieve, enabling pagination. + + + This parameter uses zero-based indexing. For example, setting + `page_number` to 0 retrieves the first page of results (items 0-9 + if `page_size` is 10), setting `page_number` to 1 retrieves the + second page (items 10-19), and so on. Defaults to 0, which + retrieves the first page. + page_size: + type: optional + docs: >- + Specifies the maximum number of results to include per page, + enabling pagination. The value must be between 1 and 100, + inclusive. + + + For example, if `page_size` is set to 10, each page will include + up to 10 items. Defaults to 10. + ascending_order: + type: optional + docs: >- + Specifies the sorting order of the results based on their creation + date. Set to true for ascending order (chronological, with the + oldest records first) and false for descending order + (reverse-chronological, with the newest records first). Defaults + to true. 
+ response: + docs: Success + type: root.ReturnChatGroupPagedAudioReconstructions + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 369846cf-6ad5-404d-905e-a8acb5cdfc78 + query-parameters: + page_number: 0 + page_size: 10 + ascending_order: true + response: + body: + id: 369846cf-6ad5-404d-905e-a8acb5cdfc78 + user_id: e6235940-cfda-3988-9147-ff531627cf42 + num_chats: 1 + page_number: 0 + page_size: 10 + total_pages: 1 + pagination_direction: ASC + audio_reconstructions_page: + - id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + user_id: e6235940-cfda-3988-9147-ff531627cf42 + status: COMPLETE + filename: >- + e6235940-cfda-3988-9147-ff531627cf42/470a49f6-1dec-4afe-8b61-035d3b2d63b0/reconstructed_audio.mp4 + modified_at: 1729875432555 + signed_audio_url: https://storage.googleapis.com/...etc. + signed_url_expiration_timestamp_millis: 1730232816964 + source: + openapi: evi-openapi.json diff --git a/.mock/definition/empathic-voice/chatWebhooks.yml b/.mock/definition/empathic-voice/chatWebhooks.yml new file mode 100644 index 00000000..cc3c858c --- /dev/null +++ b/.mock/definition/empathic-voice/chatWebhooks.yml @@ -0,0 +1,58 @@ +imports: + root: __package__.yml +webhooks: + chatEnded: + audiences: [] + method: POST + display-name: Chat Ended + headers: {} + payload: root.WebhookEventChatEnded + examples: + - payload: + chat_group_id: chat_group_id + chat_id: chat_id + config_id: null + caller_number: null + custom_session_id: null + duration_seconds: 1 + end_reason: ACTIVE + end_time: 1 + docs: Sent when an EVI chat ends. + chatStarted: + audiences: [] + method: POST + display-name: Chat Started + headers: {} + payload: root.WebhookEventChatStarted + examples: + - payload: + chat_group_id: chat_group_id + chat_id: chat_id + config_id: null + caller_number: null + chat_start_type: new_chat_group + custom_session_id: null + start_time: 1 + docs: Sent when an EVI chat is started. 
+ toolCall: + audiences: [] + method: POST + display-name: Tool Call + headers: {} + payload: root.WebhookEventToolCall + examples: + - payload: + chat_group_id: chat_group_id + chat_id: chat_id + config_id: null + caller_number: null + custom_session_id: null + timestamp: 1 + tool_call_message: + custom_session_id: null + name: name + parameters: parameters + response_required: true + tool_call_id: tool_call_id + type: tool_call + docs: Sent when EVI triggers a tool call diff --git a/.mock/definition/empathic-voice/chats.yml b/.mock/definition/empathic-voice/chats.yml new file mode 100644 index 00000000..7ceb5503 --- /dev/null +++ b/.mock/definition/empathic-voice/chats.yml @@ -0,0 +1,503 @@ +imports: + root: __package__.yml +service: + auth: false + base-path: '' + endpoints: + list-chats: + path: /v0/evi/chats + method: GET + docs: Fetches a paginated list of **Chats**. + pagination: + offset: $request.page_number + results: $response.chats_page + source: + openapi: evi-openapi.json + display-name: List chats + request: + name: ChatsListChatsRequest + query-parameters: + page_number: + type: optional + default: 0 + docs: >- + Specifies the page number to retrieve, enabling pagination. + + + This parameter uses zero-based indexing. For example, setting + `page_number` to 0 retrieves the first page of results (items 0-9 + if `page_size` is 10), setting `page_number` to 1 retrieves the + second page (items 10-19), and so on. Defaults to 0, which + retrieves the first page. + page_size: + type: optional + docs: >- + Specifies the maximum number of results to include per page, + enabling pagination. The value must be between 1 and 100, + inclusive. + + + For example, if `page_size` is set to 10, each page will include + up to 10 items. Defaults to 10. + ascending_order: + type: optional + docs: >- + Specifies the sorting order of the results based on their creation + date. 
Set to true for ascending order (chronological, with the + oldest records first) and false for descending order + (reverse-chronological, with the newest records first). Defaults + to true. + config_id: + type: optional + docs: Filter to only include chats that used this config. + validation: + format: uuid + response: + docs: Success + type: root.ReturnPagedChats + status-code: 200 + errors: + - root.BadRequestError + examples: + - query-parameters: + page_number: 0 + page_size: 1 + ascending_order: true + response: + body: + page_number: 0 + page_size: 1 + total_pages: 1 + pagination_direction: ASC + chats_page: + - id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + chat_group_id: 9fc18597-3567-42d5-94d6-935bde84bf2f + status: USER_ENDED + start_timestamp: 1716244940648 + end_timestamp: 1716244958546 + event_count: 3 + metadata: '' + config: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 0 + list-chat-events: + path: /v0/evi/chats/{id} + method: GET + docs: Fetches a paginated list of **Chat** events. + pagination: + offset: $request.page_number + results: $response.events_page + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Chat. Formatted as a UUID. + display-name: List chat events + request: + name: ChatsListChatEventsRequest + query-parameters: + page_size: + type: optional + docs: >- + Specifies the maximum number of results to include per page, + enabling pagination. The value must be between 1 and 100, + inclusive. + + + For example, if `page_size` is set to 10, each page will include + up to 10 items. Defaults to 10. + page_number: + type: optional + default: 0 + docs: >- + Specifies the page number to retrieve, enabling pagination. + + + This parameter uses zero-based indexing. For example, setting + `page_number` to 0 retrieves the first page of results (items 0-9 + if `page_size` is 10), setting `page_number` to 1 retrieves the + second page (items 10-19), and so on. 
Defaults to 0, which + retrieves the first page. + ascending_order: + type: optional + docs: >- + Specifies the sorting order of the results based on their creation + date. Set to true for ascending order (chronological, with the + oldest records first) and false for descending order + (reverse-chronological, with the newest records first). Defaults + to true. + response: + docs: Success + type: root.ReturnChatPagedEvents + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + query-parameters: + page_number: 0 + page_size: 3 + ascending_order: true + response: + body: + id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + chat_group_id: 9fc18597-3567-42d5-94d6-935bde84bf2f + status: USER_ENDED + start_timestamp: 1716244940648 + pagination_direction: ASC + events_page: + - id: 5d44bdbb-49a3-40fb-871d-32bf7e76efe7 + chat_id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + timestamp: 1716244940762 + role: SYSTEM + type: SYSTEM_PROMPT + message_text: >- + You are an AI weather assistant providing users with + accurate and up-to-date weather information. Respond to user + queries concisely and clearly. Use simple language and avoid + technical jargon. Provide temperature, precipitation, wind + conditions, and any weather alerts. Include helpful tips if + severe weather is expected. + emotion_features: '' + metadata: '' + - id: 5976ddf6-d093-4bb9-ba60-8f6c25832dde + chat_id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + timestamp: 1716244956278 + role: USER + type: USER_MESSAGE + message_text: Hello. 
+ emotion_features: >- + {"Admiration": 0.09906005859375, "Adoration": + 0.12213134765625, "Aesthetic Appreciation": + 0.05035400390625, "Amusement": 0.16552734375, "Anger": + 0.0037384033203125, "Anxiety": 0.010101318359375, "Awe": + 0.058197021484375, "Awkwardness": 0.10552978515625, + "Boredom": 0.1141357421875, "Calmness": 0.115234375, + "Concentration": 0.00444793701171875, "Confusion": + 0.0343017578125, "Contemplation": 0.00812530517578125, + "Contempt": 0.009002685546875, "Contentment": + 0.087158203125, "Craving": 0.00818634033203125, "Desire": + 0.018310546875, "Determination": 0.003238677978515625, + "Disappointment": 0.024169921875, "Disgust": + 0.00702667236328125, "Distress": 0.00936126708984375, + "Doubt": 0.00632476806640625, "Ecstasy": 0.0293731689453125, + "Embarrassment": 0.01800537109375, "Empathic Pain": + 0.0088348388671875, "Entrancement": 0.013397216796875, + "Envy": 0.02557373046875, "Excitement": 0.12109375, "Fear": + 0.004413604736328125, "Guilt": 0.016571044921875, "Horror": + 0.00274658203125, "Interest": 0.2142333984375, "Joy": + 0.29638671875, "Love": 0.16015625, "Nostalgia": + 0.007843017578125, "Pain": 0.007160186767578125, "Pride": + 0.00508880615234375, "Realization": 0.054229736328125, + "Relief": 0.048736572265625, "Romance": 0.026397705078125, + "Sadness": 0.0265350341796875, "Satisfaction": + 0.051361083984375, "Shame": 0.00974273681640625, "Surprise + (negative)": 0.0218963623046875, "Surprise (positive)": + 0.216064453125, "Sympathy": 0.021728515625, "Tiredness": + 0.0173797607421875, "Triumph": 0.004520416259765625} + metadata: >- + {"segments": [{"content": "Hello.", "embedding": + [0.6181640625, 0.1763916015625, -30.921875, 1.2705078125, + 0.927734375, 0.63720703125, 2.865234375, 0.1080322265625, + 0.2978515625, 1.0107421875, 1.34375, 0.74560546875, + 0.416259765625, 0.99462890625, -0.333740234375, + 0.361083984375, -1.388671875, 1.0107421875, 1.3173828125, + 0.55615234375, 0.541015625, -0.1837158203125, 1.697265625, + 
0.228515625, 2.087890625, -0.311767578125, + 0.053680419921875, 1.3349609375, 0.95068359375, + 0.00441741943359375, 0.705078125, 1.8916015625, + -0.939453125, 0.93701171875, -0.28955078125, 1.513671875, + 0.5595703125, 1.0126953125, -0.1624755859375, 1.4072265625, + -0.28857421875, -0.4560546875, -0.1500244140625, + -0.1102294921875, -0.222412109375, 0.8779296875, + 1.275390625, 1.6689453125, 0.80712890625, -0.34814453125, + -0.325439453125, 0.412841796875, 0.81689453125, + 0.55126953125, 1.671875, 0.6611328125, 0.7451171875, + 1.50390625, 1.0224609375, -1.671875, 0.7373046875, + 2.1328125, 2.166015625, 0.41015625, -0.127685546875, + 1.9345703125, -4.2734375, 0.332275390625, 0.26171875, + 0.76708984375, 0.2685546875, 0.468017578125, 1.208984375, + -1.517578125, 1.083984375, 0.84814453125, 1.0244140625, + -0.0072174072265625, 1.34375, 1.0712890625, 1.517578125, + -0.52001953125, 0.59228515625, 0.8154296875, -0.951171875, + -0.07757568359375, 1.3330078125, 1.125, 0.61181640625, + 1.494140625, 0.357421875, 1.1796875, 1.482421875, 0.8046875, + 0.1536865234375, 1.8076171875, 0.68115234375, -15.171875, + 1.2294921875, 0.319091796875, 0.499755859375, 1.5771484375, + 0.94677734375, -0.2490234375, 0.88525390625, 3.47265625, + 0.75927734375, 0.71044921875, 1.2333984375, 1.4169921875, + -0.56640625, -1.8095703125, 1.37109375, 0.428955078125, + 1.89453125, -0.39013671875, 0.1734619140625, 1.5595703125, + -1.2294921875, 2.552734375, 0.58349609375, 0.2156982421875, + -0.00984954833984375, -0.6865234375, -0.0272979736328125, + -0.2264404296875, 2.853515625, 1.3896484375, 0.52978515625, + 0.783203125, 3.0390625, 0.75537109375, 0.219970703125, + 0.384521484375, 0.385986328125, 2.0546875, + -0.10443115234375, 1.5146484375, 1.4296875, 1.9716796875, + 1.1318359375, 0.31591796875, 0.338623046875, 1.654296875, + -0.88037109375, -0.21484375, 1.45703125, 1.0380859375, + -0.52294921875, -0.47802734375, 0.1650390625, 1.2392578125, + -1.138671875, 0.56787109375, 1.318359375, 0.4287109375, + 
0.1981201171875, 2.4375, 0.281005859375, 0.89404296875, + -0.1552734375, 0.6474609375, -0.08331298828125, + 0.00740814208984375, -0.045501708984375, -0.578125, + 2.02734375, 0.59228515625, 0.35693359375, 1.2919921875, + 1.22265625, 1.0537109375, 0.145263671875, 1.05859375, + -0.369140625, 0.207275390625, 0.78857421875, 0.599609375, + 0.99072265625, 0.24462890625, 1.26953125, 0.08404541015625, + 1.349609375, 0.73291015625, 1.3212890625, 0.388916015625, + 1.0869140625, 0.9931640625, -1.5673828125, 0.0462646484375, + 0.650390625, 0.253662109375, 0.58251953125, 1.8134765625, + 0.8642578125, 2.591796875, 0.7314453125, 0.85986328125, + 0.5615234375, 0.9296875, 0.04144287109375, 1.66015625, + 1.99609375, 1.171875, 1.181640625, 1.5126953125, + 0.0224456787109375, 0.58349609375, -1.4931640625, + 0.81884765625, 0.732421875, -0.6455078125, -0.62451171875, + 1.7802734375, 0.01526641845703125, -0.423095703125, + 0.461669921875, 4.87890625, 1.2392578125, -0.6953125, + 0.6689453125, 0.62451171875, -1.521484375, 1.7685546875, + 0.810546875, 0.65478515625, 0.26123046875, 1.6396484375, + 0.87548828125, 1.7353515625, 2.046875, 1.5634765625, + 0.69384765625, 1.375, 0.8916015625, 1.0107421875, + 0.1304931640625, 2.009765625, 0.06402587890625, + -0.08428955078125, 0.04351806640625, -1.7529296875, + 2.02734375, 3.521484375, 0.404541015625, 1.6337890625, + -0.276611328125, 0.8837890625, -0.1287841796875, + 0.91064453125, 0.8193359375, 0.701171875, 0.036529541015625, + 1.26171875, 1.0478515625, -0.1422119140625, 1.0634765625, + 0.61083984375, 1.3505859375, 1.208984375, 0.57275390625, + 1.3623046875, 2.267578125, 0.484375, 0.9150390625, + 0.56787109375, -0.70068359375, 0.27587890625, + -0.70654296875, 0.8466796875, 0.57568359375, 1.6162109375, + 0.87939453125, 2.248046875, -0.5458984375, 1.7744140625, + 1.328125, 1.232421875, 0.6806640625, 0.9365234375, + 1.052734375, -1.08984375, 1.8330078125, -0.4033203125, + 1.0673828125, 0.297607421875, 1.5703125, 1.67578125, + 1.34765625, 2.8203125, 
2.025390625, -0.48583984375, + 0.7626953125, 0.01007843017578125, 1.435546875, + 0.007205963134765625, 0.05157470703125, -0.9853515625, + 0.26708984375, 1.16796875, 1.2041015625, 1.99609375, + -0.07916259765625, 1.244140625, -0.32080078125, + 0.6748046875, 0.419921875, 1.3212890625, 1.291015625, + 0.599609375, 0.0550537109375, 0.9599609375, 0.93505859375, + 0.111083984375, 1.302734375, 0.0833740234375, 2.244140625, + 1.25390625, 1.6015625, 0.58349609375, 1.7568359375, + -0.263427734375, -0.019866943359375, -0.24658203125, + -0.1871337890625, 0.927734375, 0.62255859375, + 0.275146484375, 0.79541015625, 1.1796875, 1.1767578125, + -0.26123046875, -0.268310546875, 1.8994140625, 1.318359375, + 2.1875, 0.2469482421875, 1.41015625, 0.03973388671875, + 1.2685546875, 1.1025390625, 0.9560546875, 0.865234375, + -1.92578125, 1.154296875, 0.389892578125, 1.130859375, + 0.95947265625, 0.72314453125, 2.244140625, + 0.048553466796875, 0.626953125, 0.42919921875, + 0.82275390625, 0.311767578125, -0.320556640625, + 0.01041412353515625, 0.1483154296875, 0.10809326171875, + -0.3173828125, 1.1337890625, -0.8642578125, 1.4033203125, + 0.048828125, 1.1787109375, 0.98779296875, 1.818359375, + 1.1552734375, 0.6015625, 1.2392578125, -1.2685546875, + 0.39208984375, 0.83251953125, 0.224365234375, + 0.0019989013671875, 0.87548828125, 1.6572265625, + 1.107421875, 0.434814453125, 1.8251953125, 0.442626953125, + 1.2587890625, 0.09320068359375, -0.896484375, 1.8017578125, + 1.451171875, -0.0755615234375, 0.6083984375, 2.06640625, + 0.673828125, -0.33740234375, 0.192138671875, 0.21435546875, + 0.80224609375, -1.490234375, 0.9501953125, 0.86083984375, + -0.40283203125, 4.109375, 2.533203125, 1.2529296875, + 0.8271484375, 0.225830078125, 1.0478515625, -1.9755859375, + 0.841796875, 0.392822265625, 0.525390625, 0.33935546875, + -0.79443359375, 0.71630859375, 0.97998046875, + -0.175537109375, 0.97705078125, 1.705078125, 0.29638671875, + 0.68359375, 0.54150390625, 0.435791015625, 0.99755859375, + 
-0.369140625, 1.009765625, -0.140380859375, 0.426513671875, + 0.189697265625, 1.8193359375, 1.1201171875, -0.5009765625, + -0.331298828125, 0.759765625, -0.09442138671875, 0.74609375, + -1.947265625, 1.3544921875, -3.935546875, 2.544921875, + 1.359375, 0.1363525390625, 0.79296875, 0.79931640625, + -0.3466796875, 1.1396484375, -0.33447265625, 2.0078125, + -0.241455078125, 0.6318359375, 0.365234375, 0.296142578125, + 0.830078125, 1.0458984375, 0.5830078125, 0.61572265625, + 14.0703125, -2.0078125, -0.381591796875, 1.228515625, + 0.08282470703125, -0.67822265625, -0.04339599609375, + 0.397216796875, 0.1656494140625, 0.137451171875, + 0.244873046875, 1.1611328125, -1.3818359375, 0.8447265625, + 1.171875, 0.36328125, 0.252685546875, 0.1197509765625, + 0.232177734375, -0.020172119140625, 0.64404296875, + -0.01100921630859375, -1.9267578125, 0.222412109375, + 0.56005859375, 1.3046875, 1.1630859375, 1.197265625, + 1.02734375, 1.6806640625, -0.043731689453125, 1.4697265625, + 0.81201171875, 1.5390625, 1.240234375, -0.7353515625, + 1.828125, 1.115234375, 1.931640625, -0.517578125, + 0.77880859375, 1.0546875, 0.95361328125, 3.42578125, + 0.0160369873046875, 0.875, 0.56005859375, 1.2421875, + 1.986328125, 1.4814453125, 0.0948486328125, 1.115234375, + 0.00665283203125, 2.09375, 0.3544921875, -0.52783203125, + 1.2099609375, 0.45068359375, 0.65625, 0.1112060546875, + 1.0751953125, -0.9521484375, -0.30029296875, 1.4462890625, + 2.046875, 3.212890625, 1.68359375, 1.07421875, + -0.5263671875, 0.74560546875, 1.37890625, 0.15283203125, + 0.2440185546875, 0.62646484375, -0.1280517578125, + 0.7646484375, -0.515625, -0.35693359375, 1.2958984375, + 0.96923828125, 0.58935546875, 1.3701171875, 1.0673828125, + 0.2337646484375, 0.93115234375, 0.66357421875, 6.0, + 1.1025390625, -0.51708984375, -0.38330078125, 0.7197265625, + 0.246826171875, -0.45166015625, 1.9521484375, 0.5546875, + 0.08807373046875, 0.18505859375, 0.8857421875, + -0.57177734375, 0.251708984375, 0.234375, 2.57421875, + 
0.9599609375, 0.5029296875, 0.10382080078125, + 0.08331298828125, 0.66748046875, -0.349609375, 1.287109375, + 0.259765625, 2.015625, 2.828125, -0.3095703125, + -0.164306640625, -0.3408203125, 0.486572265625, + 0.8466796875, 1.9130859375, 0.09088134765625, 0.66552734375, + 0.00972747802734375, -0.83154296875, 1.755859375, + 0.654296875, 0.173828125, 0.27587890625, -0.47607421875, + -0.264404296875, 0.7529296875, 0.6533203125, 0.7275390625, + 0.499755859375, 0.833984375, -0.44775390625, -0.05078125, + -0.454833984375, 0.75439453125, 0.68505859375, + 0.210693359375, -0.283935546875, -0.53564453125, + 0.96826171875, 0.861328125, -3.33984375, -0.26171875, + 0.77734375, 0.26513671875, -0.14111328125, -0.042236328125, + -0.84814453125, 0.2137451171875, 0.94921875, 0.65185546875, + -0.5380859375, 0.1529541015625, -0.360595703125, + -0.0333251953125, -0.69189453125, 0.8974609375, 0.7109375, + 0.81494140625, -0.259521484375, 1.1904296875, 0.62158203125, + 1.345703125, 0.89404296875, 0.70556640625, 1.0673828125, + 1.392578125, 0.5068359375, 0.962890625, 0.736328125, + 1.55078125, 0.50390625, -0.398681640625, 2.361328125, + 0.345947265625, -0.61962890625, 0.330078125, 0.75439453125, + -0.673828125, -0.2379150390625, 1.5673828125, 1.369140625, + 0.1119384765625, -0.1834716796875, 1.4599609375, + -0.77587890625, 0.5556640625, 0.09954833984375, + 0.0285186767578125, 0.58935546875, -0.501953125, + 0.212890625, 0.02679443359375, 0.1715087890625, + 0.03466796875, -0.564453125, 2.029296875, 2.45703125, + -0.72216796875, 2.138671875, 0.50830078125, + -0.09356689453125, 0.230224609375, 1.6943359375, + 1.5126953125, 0.39453125, 0.411376953125, 1.07421875, + -0.8046875, 0.51416015625, 0.2271728515625, -0.283447265625, + 0.38427734375, 0.73388671875, 0.6962890625, 1.4990234375, + 0.02813720703125, 0.40478515625, 1.2451171875, 1.1162109375, + -5.5703125, 0.76171875, 0.322021484375, 1.0361328125, + 1.197265625, 0.1163330078125, 0.2425537109375, 1.5595703125, + 1.5791015625, 
-0.0921630859375, 0.484619140625, + 1.9052734375, 5.31640625, 1.6337890625, 0.95947265625, + -0.1751708984375, 0.466552734375, 0.8330078125, 1.03125, + 0.2044677734375, 0.31298828125, -1.1220703125, 0.5517578125, + 0.93505859375, 0.45166015625, 1.951171875, 0.65478515625, + 1.30859375, 1.0859375, 0.56494140625, 2.322265625, + 0.242919921875, 1.81640625, -0.469970703125, -0.841796875, + 0.90869140625, 1.5361328125, 0.923828125, 1.0595703125, + 0.356689453125, -0.46142578125, 2.134765625, 1.3037109375, + -0.32373046875, -9.2265625, 0.4521484375, 0.88037109375, + -0.53955078125, 0.96484375, 0.7705078125, 0.84521484375, + 1.580078125, -0.1448974609375, 0.7607421875, 1.0166015625, + -0.086669921875, 1.611328125, 0.05938720703125, 0.5078125, + 0.8427734375, 2.431640625, 0.66357421875, 3.203125, + 0.132080078125, 0.461181640625, 0.779296875, 1.9482421875, + 1.8720703125, 0.845703125, -1.3837890625, -0.138916015625, + 0.35546875, 0.2457275390625, 0.75341796875, 1.828125, + 1.4169921875, 0.60791015625, 1.0068359375, 1.109375, + 0.484130859375, -0.302001953125, 0.4951171875, 0.802734375, + 1.9482421875, 0.916015625, 0.1646728515625, 2.599609375, + 1.7177734375, -0.2374267578125, 0.98046875, 0.39306640625, + -1.1396484375, 1.6533203125, 0.375244140625], "scores": + [0.09906005859375, 0.12213134765625, 0.05035400390625, + 0.16552734375, 0.0037384033203125, 0.010101318359375, + 0.058197021484375, 0.10552978515625, 0.1141357421875, + 0.115234375, 0.00444793701171875, 0.00812530517578125, + 0.0343017578125, 0.009002685546875, 0.087158203125, + 0.00818634033203125, 0.003238677978515625, 0.024169921875, + 0.00702667236328125, 0.00936126708984375, + 0.00632476806640625, 0.0293731689453125, 0.01800537109375, + 0.0088348388671875, 0.013397216796875, 0.02557373046875, + 0.12109375, 0.004413604736328125, 0.016571044921875, + 0.00274658203125, 0.2142333984375, 0.29638671875, + 0.16015625, 0.007843017578125, 0.007160186767578125, + 0.00508880615234375, 0.054229736328125, 
0.048736572265625, + 0.026397705078125, 0.0265350341796875, 0.051361083984375, + 0.018310546875, 0.00974273681640625, 0.0218963623046875, + 0.216064453125, 0.021728515625, 0.0173797607421875, + 0.004520416259765625], "stoks": [52, 52, 52, 52, 52, 41, 41, + 374, 303, 303, 303, 427], "time": {"begin_ms": 640, + "end_ms": 1140}}]} + - id: 7645a0d1-2e64-410d-83a8-b96040432e9a + chat_id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + timestamp: 1716244957031 + role: AGENT + type: AGENT_MESSAGE + message_text: Hello! + emotion_features: >- + {"Admiration": 0.044921875, "Adoration": 0.0253753662109375, + "Aesthetic Appreciation": 0.03265380859375, "Amusement": + 0.118408203125, "Anger": 0.06719970703125, "Anxiety": + 0.0411376953125, "Awe": 0.03802490234375, "Awkwardness": + 0.056549072265625, "Boredom": 0.04217529296875, "Calmness": + 0.08709716796875, "Concentration": 0.070556640625, + "Confusion": 0.06964111328125, "Contemplation": + 0.0343017578125, "Contempt": 0.037689208984375, + "Contentment": 0.059417724609375, "Craving": + 0.01132965087890625, "Desire": 0.01406097412109375, + "Determination": 0.1143798828125, "Disappointment": + 0.051177978515625, "Disgust": 0.028594970703125, "Distress": + 0.054901123046875, "Doubt": 0.04638671875, "Ecstasy": + 0.0258026123046875, "Embarrassment": 0.0222015380859375, + "Empathic Pain": 0.015777587890625, "Entrancement": + 0.0160980224609375, "Envy": 0.0163421630859375, + "Excitement": 0.129638671875, "Fear": 0.03125, "Guilt": + 0.01483917236328125, "Horror": 0.0194549560546875, + "Interest": 0.1341552734375, "Joy": 0.0738525390625, "Love": + 0.0216522216796875, "Nostalgia": 0.0210418701171875, "Pain": + 0.020721435546875, "Pride": 0.05499267578125, "Realization": + 0.0728759765625, "Relief": 0.04052734375, "Romance": + 0.0129241943359375, "Sadness": 0.0254669189453125, + "Satisfaction": 0.07159423828125, "Shame": 0.01495361328125, + "Surprise (negative)": 0.05560302734375, "Surprise + (positive)": 0.07965087890625, "Sympathy": + 
0.022247314453125, "Tiredness": 0.0194549560546875, + "Triumph": 0.04107666015625} + metadata: '' + page_number: 0 + page_size: 3 + total_pages: 1 + end_timestamp: 1716244958546 + metadata: '' + config: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 0 + get-audio: + path: /v0/evi/chats/{id}/audio + method: GET + docs: >- + Fetches the audio of a previous **Chat**. For more details, see our + guide on audio reconstruction + [here](/docs/speech-to-speech-evi/faq#can-i-access-the-audio-of-previous-conversations-with-evi). + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a chat. Formatted as a UUID. + display-name: Get chat audio + response: + docs: Success + type: root.ReturnChatAudioReconstruction + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + response: + body: + id: 470a49f6-1dec-4afe-8b61-035d3b2d63b0 + user_id: e6235940-cfda-3988-9147-ff531627cf42 + status: COMPLETE + filename: >- + e6235940-cfda-3988-9147-ff531627cf42/470a49f6-1dec-4afe-8b61-035d3b2d63b0/reconstructed_audio.mp4 + modified_at: 1729875432555 + signed_audio_url: https://storage.googleapis.com/...etc. + signed_url_expiration_timestamp_millis: 1730232816964 + source: + openapi: evi-openapi.json diff --git a/.mock/definition/empathic-voice/configs.yml b/.mock/definition/empathic-voice/configs.yml new file mode 100644 index 00000000..a74be639 --- /dev/null +++ b/.mock/definition/empathic-voice/configs.yml @@ -0,0 +1,835 @@ +imports: + root: __package__.yml +service: + auth: false + base-path: '' + endpoints: + list-configs: + path: /v0/evi/configs + method: GET + docs: >- + Fetches a paginated list of **Configs**. + + + For more details on configuration options and how to configure EVI, see + our [configuration guide](/docs/speech-to-speech-evi/configuration). 
+ pagination: + offset: $request.page_number + results: $response.configs_page + source: + openapi: evi-openapi.json + display-name: List configs + request: + name: ConfigsListConfigsRequest + query-parameters: + page_number: + type: optional + default: 0 + docs: >- + Specifies the page number to retrieve, enabling pagination. + + + This parameter uses zero-based indexing. For example, setting + `page_number` to 0 retrieves the first page of results (items 0-9 + if `page_size` is 10), setting `page_number` to 1 retrieves the + second page (items 10-19), and so on. Defaults to 0, which + retrieves the first page. + page_size: + type: optional + docs: >- + Specifies the maximum number of results to include per page, + enabling pagination. The value must be between 1 and 100, + inclusive. + + + For example, if `page_size` is set to 10, each page will include + up to 10 items. Defaults to 10. + restrict_to_most_recent: + type: optional + docs: >- + By default, `restrict_to_most_recent` is set to true, returning + only the latest version of each config. To include all versions of + each config in the list, set `restrict_to_most_recent` to false. + name: + type: optional + docs: Filter to only include configs with this name. 
+ response: + docs: Success + type: root.ReturnPagedConfigs + status-code: 200 + errors: + - root.BadRequestError + examples: + - query-parameters: + page_number: 0 + page_size: 1 + response: + body: + page_number: 0 + page_size: 1 + total_pages: 1 + configs_page: + - id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 0 + version_description: '' + name: Weather Assistant Config + created_on: 1715267200693 + modified_on: 1715267200693 + evi_version: '3' + prompt: + id: af699d45-2985-42cc-91b9-af9e5da3bac5 + version: 0 + version_type: FIXED + version_description: '' + name: Weather Assistant Prompt + created_on: 1715267200693 + modified_on: 1715267200693 + text: >- + You are an AI weather assistant providing users with + accurate and up-to-date weather information. Respond to + user queries concisely and clearly. Use simple language + and avoid technical jargon. Provide temperature, + precipitation, wind conditions, and any weather alerts. + Include helpful tips if severe weather is expected. + voice: + provider: HUME_AI + name: Ava Song + id: 5bb7de05-c8fe-426a-8fcc-ba4fc4ce9f9c + language_model: + model_provider: ANTHROPIC + model_resource: claude-3-7-sonnet-latest + temperature: 1 + ellm_model: + allow_short_responses: false + tools: [] + builtin_tools: [] + event_messages: + on_new_chat: + enabled: false + text: '' + on_inactivity_timeout: + enabled: false + text: '' + on_max_duration_timeout: + enabled: false + text: '' + timeouts: + inactivity: + enabled: true + duration_secs: 600 + max_duration: + enabled: true + duration_secs: 1800 + create-config: + path: /v0/evi/configs + method: POST + docs: >- + Creates a **Config** which can be applied to EVI. + + + For more details on configuration options and how to configure EVI, see + our [configuration guide](/docs/speech-to-speech-evi/configuration). 
+ source: + openapi: evi-openapi.json + display-name: Create config + request: + name: PostedConfig + body: + properties: + evi_version: + type: string + docs: >- + EVI version to use. Only versions `3` and `4-mini` are + supported. + name: + type: string + docs: Name applied to all versions of a particular Config. + version_description: + type: optional + docs: An optional description of the Config version. + prompt: optional + voice: + type: optional + docs: A voice specification associated with this Config. + language_model: + type: optional + docs: >- + The supplemental language model associated with this Config. + + + This model is used to generate longer, more detailed responses + from EVI. Choosing an appropriate supplemental language model + for your use case is crucial for generating fast, high-quality + responses from EVI. + ellm_model: + type: optional + docs: >- + The eLLM setup associated with this Config. + + + Hume's eLLM (empathic Large Language Model) is a multimodal + language model that takes into account both expression measures + and language. The eLLM generates short, empathic language + responses and guides text-to-speech (TTS) prosody. + tools: + type: optional>> + docs: List of user-defined tools associated with this Config. + builtin_tools: + type: optional>> + docs: List of built-in tools associated with this Config. + event_messages: optional + nudges: + type: optional + docs: >- + Configures nudges, brief audio prompts that can guide + conversations when users pause or need encouragement to continue + speaking. Nudges help create more natural, flowing interactions + by providing gentle conversational cues. + timeouts: optional + webhooks: + type: optional>> + docs: Webhook config specifications for each subscriber. 
+ content-type: application/json + response: + docs: Created + type: root.ReturnConfig + status-code: 201 + errors: + - root.BadRequestError + examples: + - request: + name: Weather Assistant Config + prompt: + id: af699d45-2985-42cc-91b9-af9e5da3bac5 + version: 0 + evi_version: '3' + voice: + provider: HUME_AI + name: Ava Song + language_model: + model_provider: ANTHROPIC + model_resource: claude-3-7-sonnet-latest + temperature: 1 + event_messages: + on_new_chat: + enabled: false + text: '' + on_inactivity_timeout: + enabled: false + text: '' + on_max_duration_timeout: + enabled: false + text: '' + response: + body: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 0 + version_description: '' + name: Weather Assistant Config + created_on: 1715275452390 + modified_on: 1715275452390 + evi_version: '3' + prompt: + id: af699d45-2985-42cc-91b9-af9e5da3bac5 + version: 0 + version_type: FIXED + version_description: '' + name: Weather Assistant Prompt + created_on: 1715267200693 + modified_on: 1715267200693 + text: >- + You are an AI weather assistant providing users with + accurate and up-to-date weather information. Respond to user + queries concisely and clearly. Use simple language and avoid + technical jargon. Provide temperature, precipitation, wind + conditions, and any weather alerts. Include helpful tips if + severe weather is expected. 
+ voice: + provider: HUME_AI + name: Ava Song + id: 5bb7de05-c8fe-426a-8fcc-ba4fc4ce9f9c + language_model: + model_provider: ANTHROPIC + model_resource: claude-3-7-sonnet-latest + temperature: 1 + ellm_model: + allow_short_responses: false + tools: [] + builtin_tools: [] + event_messages: + on_new_chat: + enabled: false + text: '' + on_inactivity_timeout: + enabled: false + text: '' + on_max_duration_timeout: + enabled: false + text: '' + timeouts: + inactivity: + enabled: true + duration_secs: 600 + max_duration: + enabled: true + duration_secs: 1800 + list-config-versions: + path: /v0/evi/configs/{id} + method: GET + docs: >- + Fetches a list of a **Config's** versions. + + + For more details on configuration options and how to configure EVI, see + our [configuration guide](/docs/speech-to-speech-evi/configuration). + pagination: + offset: $request.page_number + results: $response.configs_page + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Config. Formatted as a UUID. + display-name: List config versions + request: + name: ConfigsListConfigVersionsRequest + query-parameters: + page_number: + type: optional + default: 0 + docs: >- + Specifies the page number to retrieve, enabling pagination. + + + This parameter uses zero-based indexing. For example, setting + `page_number` to 0 retrieves the first page of results (items 0-9 + if `page_size` is 10), setting `page_number` to 1 retrieves the + second page (items 10-19), and so on. Defaults to 0, which + retrieves the first page. + page_size: + type: optional + docs: >- + Specifies the maximum number of results to include per page, + enabling pagination. The value must be between 1 and 100, + inclusive. + + + For example, if `page_size` is set to 10, each page will include + up to 10 items. Defaults to 10. 
+ restrict_to_most_recent: + type: optional + docs: >- + By default, `restrict_to_most_recent` is set to true, returning + only the latest version of each config. To include all versions of + each config in the list, set `restrict_to_most_recent` to false. + response: + docs: Success + type: root.ReturnPagedConfigs + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + response: + body: + page_number: 0 + page_size: 10 + total_pages: 1 + configs_page: + - id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 0 + version_description: '' + name: Weather Assistant Config + created_on: 1715275452390 + modified_on: 1715275452390 + evi_version: '3' + prompt: + id: af699d45-2985-42cc-91b9-af9e5da3bac5 + version: 0 + version_type: FIXED + version_description: '' + name: Weather Assistant Prompt + created_on: 1715267200693 + modified_on: 1715267200693 + text: >- + You are an AI weather assistant providing users with + accurate and up-to-date weather information. Respond to + user queries concisely and clearly. Use simple language + and avoid technical jargon. Provide temperature, + precipitation, wind conditions, and any weather alerts. + Include helpful tips if severe weather is expected. + voice: + provider: HUME_AI + name: Ava Song + id: 5bb7de05-c8fe-426a-8fcc-ba4fc4ce9f9c + language_model: + model_provider: ANTHROPIC + model_resource: claude-3-7-sonnet-latest + temperature: 1 + ellm_model: + allow_short_responses: false + tools: [] + builtin_tools: [] + event_messages: + on_new_chat: + enabled: false + text: '' + on_inactivity_timeout: + enabled: false + text: '' + on_max_duration_timeout: + enabled: false + text: '' + timeouts: + inactivity: + enabled: true + duration_secs: 600 + max_duration: + enabled: true + duration_secs: 1800 + create-config-version: + path: /v0/evi/configs/{id} + method: POST + docs: >- + Updates a **Config** by creating a new version of the **Config**. 
+ + + For more details on configuration options and how to configure EVI, see + our [configuration guide](/docs/speech-to-speech-evi/configuration). + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Config. Formatted as a UUID. + display-name: Create config version + request: + name: PostedConfigVersion + body: + properties: + evi_version: + type: string + docs: The version of the EVI used with this config. + version_description: + type: optional + docs: An optional description of the Config version. + prompt: optional + voice: + type: optional + docs: A voice specification associated with this Config version. + language_model: + type: optional + docs: >- + The supplemental language model associated with this Config + version. + + + This model is used to generate longer, more detailed responses + from EVI. Choosing an appropriate supplemental language model + for your use case is crucial for generating fast, high-quality + responses from EVI. + ellm_model: + type: optional + docs: >- + The eLLM setup associated with this Config version. + + + Hume's eLLM (empathic Large Language Model) is a multimodal + language model that takes into account both expression measures + and language. The eLLM generates short, empathic language + responses and guides text-to-speech (TTS) prosody. + tools: + type: optional>> + docs: List of user-defined tools associated with this Config version. + builtin_tools: + type: optional>> + docs: List of built-in tools associated with this Config version. + event_messages: optional + timeouts: optional + nudges: optional + webhooks: + type: optional>> + docs: Webhook config specifications for each subscriber. 
+ content-type: application/json + response: + docs: Created + type: root.ReturnConfig + status-code: 201 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + request: + version_description: This is an updated version of the Weather Assistant Config. + evi_version: '3' + prompt: + id: af699d45-2985-42cc-91b9-af9e5da3bac5 + version: 0 + voice: + provider: HUME_AI + name: Ava Song + language_model: + model_provider: ANTHROPIC + model_resource: claude-3-7-sonnet-latest + temperature: 1 + ellm_model: + allow_short_responses: true + event_messages: + on_new_chat: + enabled: false + text: '' + on_inactivity_timeout: + enabled: false + text: '' + on_max_duration_timeout: + enabled: false + text: '' + response: + body: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 1 + version_description: This is an updated version of the Weather Assistant Config. + name: Weather Assistant Config + created_on: 1715275452390 + modified_on: 1722642242998 + evi_version: '3' + prompt: + id: af699d45-2985-42cc-91b9-af9e5da3bac5 + version: 0 + version_type: FIXED + version_description: '' + name: Weather Assistant Prompt + created_on: 1715267200693 + modified_on: 1715267200693 + text: >- + You are an AI weather assistant providing users with + accurate and up-to-date weather information. Respond to user + queries concisely and clearly. Use simple language and avoid + technical jargon. Provide temperature, precipitation, wind + conditions, and any weather alerts. Include helpful tips if + severe weather is expected. 
+ voice: + provider: HUME_AI + name: Ava Song + id: 5bb7de05-c8fe-426a-8fcc-ba4fc4ce9f9c + language_model: + model_provider: ANTHROPIC + model_resource: claude-3-7-sonnet-latest + temperature: 1 + ellm_model: + allow_short_responses: true + tools: [] + builtin_tools: [] + event_messages: + on_new_chat: + enabled: false + text: '' + on_inactivity_timeout: + enabled: false + text: '' + on_max_duration_timeout: + enabled: false + text: '' + timeouts: + inactivity: + enabled: true + duration_secs: 600 + max_duration: + enabled: true + duration_secs: 1800 + delete-config: + path: /v0/evi/configs/{id} + method: DELETE + docs: >- + Deletes a **Config** and its versions. + + + For more details on configuration options and how to configure EVI, see + our [configuration guide](/docs/speech-to-speech-evi/configuration). + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Config. Formatted as a UUID. + display-name: Delete config + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + update-config-name: + path: /v0/evi/configs/{id} + method: PATCH + docs: >- + Updates the name of a **Config**. + + + For more details on configuration options and how to configure EVI, see + our [configuration guide](/docs/speech-to-speech-evi/configuration). + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Config. Formatted as a UUID. + display-name: Update config name + request: + name: PostedConfigName + body: + properties: + name: + type: string + docs: Name applied to all versions of a particular Config. 
+ content-type: application/json + response: + docs: Success + type: text + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + request: + name: Updated Weather Assistant Config Name + get-config-version: + path: /v0/evi/configs/{id}/version/{version} + method: GET + docs: >- + Fetches a specified version of a **Config**. + + + For more details on configuration options and how to configure EVI, see + our [configuration guide](/docs/speech-to-speech-evi/configuration). + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Config. Formatted as a UUID. + version: + type: integer + docs: >- + Version number for a Config. + + + Configs, Prompts, Custom Voices, and Tools are versioned. This + versioning system supports iterative development, allowing you to + progressively refine configurations and revert to previous versions + if needed. + + + Version numbers are integer values representing different iterations + of the Config. Each update to the Config increments its version + number. + display-name: Get config version + response: + docs: Success + type: root.ReturnConfig + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 1 + response: + body: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 1 + version_description: '' + name: Weather Assistant Config + created_on: 1715275452390 + modified_on: 1715275452390 + evi_version: '3' + prompt: + id: af699d45-2985-42cc-91b9-af9e5da3bac5 + version: 0 + version_type: FIXED + version_description: '' + name: Weather Assistant Prompt + created_on: 1715267200693 + modified_on: 1715267200693 + text: >- + You are an AI weather assistant providing users with + accurate and up-to-date weather information. Respond to user + queries concisely and clearly. Use simple language and avoid + technical jargon. 
Provide temperature, precipitation, wind + conditions, and any weather alerts. Include helpful tips if + severe weather is expected. + voice: + provider: HUME_AI + name: Ava Song + id: 5bb7de05-c8fe-426a-8fcc-ba4fc4ce9f9c + language_model: + model_provider: ANTHROPIC + model_resource: claude-3-7-sonnet-latest + temperature: 1 + ellm_model: + allow_short_responses: false + tools: [] + builtin_tools: [] + event_messages: + on_new_chat: + enabled: false + text: '' + on_inactivity_timeout: + enabled: false + text: '' + on_max_duration_timeout: + enabled: false + text: '' + timeouts: + inactivity: + enabled: true + duration_secs: 600 + max_duration: + enabled: true + duration_secs: 1800 + delete-config-version: + path: /v0/evi/configs/{id}/version/{version} + method: DELETE + docs: >- + Deletes a specified version of a **Config**. + + + For more details on configuration options and how to configure EVI, see + our [configuration guide](/docs/speech-to-speech-evi/configuration). + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Config. Formatted as a UUID. + version: + type: integer + docs: >- + Version number for a Config. + + + Configs, Prompts, Custom Voices, and Tools are versioned. This + versioning system supports iterative development, allowing you to + progressively refine configurations and revert to previous versions + if needed. + + + Version numbers are integer values representing different iterations + of the Config. Each update to the Config increments its version + number. + display-name: Delete config version + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 1 + update-config-description: + path: /v0/evi/configs/{id}/version/{version} + method: PATCH + docs: >- + Updates the description of a **Config**. 
+ + + For more details on configuration options and how to configure EVI, see + our [configuration guide](/docs/speech-to-speech-evi/configuration). + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Config. Formatted as a UUID. + version: + type: integer + docs: >- + Version number for a Config. + + + Configs, Prompts, Custom Voices, and Tools are versioned. This + versioning system supports iterative development, allowing you to + progressively refine configurations and revert to previous versions + if needed. + + + Version numbers are integer values representing different iterations + of the Config. Each update to the Config increments its version + number. + display-name: Update config description + request: + name: PostedConfigVersionDescription + body: + properties: + version_description: + type: optional + docs: An optional description of the Config version. + content-type: application/json + response: + docs: Success + type: root.ReturnConfig + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 1 + request: + version_description: This is an updated version_description. + response: + body: + id: 1b60e1a0-cc59-424a-8d2c-189d354db3f3 + version: 1 + version_description: This is an updated version_description. + name: Weather Assistant Config + created_on: 1715275452390 + modified_on: 1715275452390 + evi_version: '3' + prompt: + id: af699d45-2985-42cc-91b9-af9e5da3bac5 + version: 0 + version_type: FIXED + version_description: '' + name: Weather Assistant Prompt + created_on: 1715267200693 + modified_on: 1715267200693 + text: >- + You are an AI weather assistant providing users with + accurate and up-to-date weather information. Respond to user + queries concisely and clearly. Use simple language and avoid + technical jargon. Provide temperature, precipitation, wind + conditions, and any weather alerts. 
Include helpful tips if + severe weather is expected. + voice: + provider: HUME_AI + name: Ava Song + id: 5bb7de05-c8fe-426a-8fcc-ba4fc4ce9f9c + language_model: + model_provider: ANTHROPIC + model_resource: claude-3-7-sonnet-latest + temperature: 1 + ellm_model: + allow_short_responses: false + tools: [] + builtin_tools: [] + event_messages: + on_new_chat: + enabled: false + text: '' + on_inactivity_timeout: + enabled: false + text: '' + on_max_duration_timeout: + enabled: false + text: '' + timeouts: + inactivity: + enabled: true + duration_secs: 600 + max_duration: + enabled: true + duration_secs: 1800 + source: + openapi: evi-openapi.json diff --git a/.mock/definition/empathic-voice/controlPlane.yml b/.mock/definition/empathic-voice/controlPlane.yml new file mode 100644 index 00000000..83d760a1 --- /dev/null +++ b/.mock/definition/empathic-voice/controlPlane.yml @@ -0,0 +1,72 @@ +imports: + root: __package__.yml +service: + auth: false + base-path: '' + endpoints: + send: + path: /v0/evi/chat/{chat_id}/send + method: POST + docs: Send a message to a specific chat. + source: + openapi: evi-openapi.json + path-parameters: + chat_id: string + display-name: Send Message + request: + body: root.ControlPlanePublishEvent + content-type: application/json + errors: + - root.UnprocessableEntityError + examples: + - path-parameters: + chat_id: chat_id + request: + type: session_settings + source: + openapi: evi-openapi.json +channel: + path: /chat/{chat_id}/connect + url: evi + auth: false + display-name: Control Plane + docs: >- + Connects to an in-progress EVI chat session. The original chat must have + been started with `allow_connection=true`. The connection can be used to + send and receive the same messages as the original chat, with the exception + that `audio_input` messages are not allowed. + path-parameters: + chat_id: + type: string + docs: The ID of the chat to connect to. 
+ query-parameters:
+ access_token:
+ type: optional
+ default: ''
+ docs: >-
+ Access token used for authenticating the client. If not provided, an
+ `api_key` must be provided to authenticate.
+
+
+ The access token is generated using both an API key and a Secret key,
+ which provides an additional layer of security compared to using just an
+ API key.
+
+
+ For more details, refer to the [Authentication Strategies
+ Guide](/docs/introduction/api-key#authentication-strategies).
+ messages:
+ publish:
+ origin: client
+ body: root.ControlPlanePublishEvent
+ subscribe:
+ origin: server
+ body: root.SubscribeEvent
+ examples:
+ - messages:
+ - type: publish
+ body:
+ type: session_settings
+ - type: subscribe
+ body:
+ type: assistant_end
diff --git a/.mock/definition/empathic-voice/prompts.yml b/.mock/definition/empathic-voice/prompts.yml
new file mode 100644
index 00000000..e6d805b7
--- /dev/null
+++ b/.mock/definition/empathic-voice/prompts.yml
@@ -0,0 +1,549 @@
+imports:
+ root: __package__.yml
+service:
+ auth: false
+ base-path: ''
+ endpoints:
+ list-prompts:
+ path: /v0/evi/prompts
+ method: GET
+ docs: >-
+ Fetches a paginated list of **Prompts**.
+
+
+ See our [prompting
+ guide](/docs/speech-to-speech-evi/guides/prompting) for tips on
+ crafting your system prompt.
+ pagination:
+ offset: $request.page_number
+ results: $response.prompts_page
+ source:
+ openapi: evi-openapi.json
+ display-name: List prompts
+ request:
+ name: PromptsListPromptsRequest
+ query-parameters:
+ page_number:
+ type: optional
+ default: 0
+ docs: >-
+ Specifies the page number to retrieve, enabling pagination.
+
+
+ This parameter uses zero-based indexing. For example, setting
+ `page_number` to 0 retrieves the first page of results (items 0-9
+ if `page_size` is 10), setting `page_number` to 1 retrieves the
+ second page (items 10-19), and so on. Defaults to 0, which
+ retrieves the first page.
+ page_size:
+ type: optional
+ docs: >-
+ Specifies the maximum number of results to include per page,
+ enabling pagination. The value must be between 1 and 100,
+ inclusive.
+
+
+ For example, if `page_size` is set to 10, each page will include
+ up to 10 items. Defaults to 10.
+ restrict_to_most_recent:
+ type: optional
+ docs: Only include the most recent version of each prompt in the list.
+ name:
+ type: optional
+ docs: Filter to only include prompts with the given name.
+ response:
+ docs: Success
+ type: root.ReturnPagedPrompts
+ status-code: 200
+ errors:
+ - root.BadRequestError
+ examples:
+ - query-parameters:
+ page_number: 0
+ page_size: 2
+ response:
+ body:
+ page_number: 0
+ page_size: 2
+ total_pages: 1
+ prompts_page:
+ - id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ version: 0
+ version_type: FIXED
+ version_description: ''
+ name: Weather Assistant Prompt
+ created_on: 1715267200693
+ modified_on: 1715267200693
+ text: >-
+ You are an AI weather assistant providing users with
+ accurate and up-to-date weather information. Respond to user
+ queries concisely and clearly. Use simple language and avoid
+ technical jargon. Provide temperature, precipitation, wind
+ conditions, and any weather alerts. Include helpful tips if
+ severe weather is expected.
+ - id: 616b2b4c-a096-4445-9c23-64058b564fc2
+ version: 0
+ version_type: FIXED
+ version_description: ''
+ name: Web Search Assistant Prompt
+ created_on: 1715267200693
+ modified_on: 1715267200693
+ text: >-
+ You are an AI web search assistant designed to help
+ users find accurate and relevant information on the web.
+ Respond to user queries promptly, using the built-in web
+ search tool to retrieve up-to-date results. Present
+ information clearly and concisely, summarizing key points
+ where necessary. Use simple language and avoid technical
+ jargon. If needed, provide helpful tips for refining search
+ queries to obtain better results.
+ create-prompt:
+ path: /v0/evi/prompts
+ method: POST
+ docs: >-
+ Creates a **Prompt** that can be added to an [EVI
+ configuration](/reference/speech-to-speech-evi/configs/create-config).
+
+
+ See our [prompting
+ guide](/docs/speech-to-speech-evi/guides/prompting) for tips on
+ crafting your system prompt.
+ source:
+ openapi: evi-openapi.json
+ display-name: Create prompt
+ request:
+ name: PostedPrompt
+ body:
+ properties:
+ name:
+ type: string
+ docs: Name applied to all versions of a particular Prompt.
+ version_description:
+ type: optional
+ docs: An optional description of the Prompt version.
+ text:
+ type: string
+ docs: >-
+ Instructions used to shape EVI's behavior, responses, and style.
+
+
+ You can use the Prompt to define a specific goal or role for
+ EVI, specifying how it should act or what it should focus on
+ during the conversation. For example, EVI can be instructed to
+ act as a customer support representative, a fitness coach, or a
+ travel advisor, each with its own set of behaviors and response
+ styles.
+
+
+ For help writing a system prompt, see our [Prompting
+ Guide](/docs/speech-to-speech-evi/guides/prompting).
+ content-type: application/json
+ response:
+ docs: Created
+ type: optional
+ status-code: 201
+ errors:
+ - root.BadRequestError
+ examples:
+ - request:
+ name: Weather Assistant Prompt
+ text: >-
+ You are an AI weather assistant providing users with
+ accurate and up-to-date weather information. Respond to user
+ queries concisely and clearly. Use simple language and avoid
+ technical jargon. Provide temperature, precipitation, wind
+ conditions, and any weather alerts. Include helpful tips if severe
+ weather is expected.
+ response:
+ body:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ version: 0
+ version_type: FIXED
+ version_description: null
+ name: Weather Assistant Prompt
+ created_on: 1722633247488
+ modified_on: 1722633247488
+ text: >-
+ You are an AI weather assistant providing users with
+ accurate and up-to-date weather information. Respond to user
+ queries concisely and clearly. Use simple language and avoid
+ technical jargon. Provide temperature, precipitation, wind
+ conditions, and any weather alerts. Include helpful tips if
+ severe weather is expected.
+ list-prompt-versions:
+ path: /v0/evi/prompts/{id}
+ method: GET
+ docs: >-
+ Fetches a list of a **Prompt's** versions.
+
+
+ See our [prompting
+ guide](/docs/speech-to-speech-evi/guides/prompting) for tips on
+ crafting your system prompt.
+ source:
+ openapi: evi-openapi.json
+ path-parameters:
+ id:
+ type: string
+ docs: Identifier for a Prompt. Formatted as a UUID.
+ display-name: List prompt versions
+ request:
+ name: PromptsListPromptVersionsRequest
+ query-parameters:
+ page_number:
+ type: optional
+ default: 0
+ docs: >-
+ Specifies the page number to retrieve, enabling pagination.
+
+
+ This parameter uses zero-based indexing. For example, setting
+ `page_number` to 0 retrieves the first page of results (items 0-9
+ if `page_size` is 10), setting `page_number` to 1 retrieves the
+ second page (items 10-19), and so on. Defaults to 0, which
+ retrieves the first page.
+ page_size:
+ type: optional
+ docs: >-
+ Specifies the maximum number of results to include per page,
+ enabling pagination. The value must be between 1 and 100,
+ inclusive.
+
+
+ For example, if `page_size` is set to 10, each page will include
+ up to 10 items. Defaults to 10.
+ restrict_to_most_recent:
+ type: optional
+ docs: >-
+ By default, `restrict_to_most_recent` is set to true, returning
+ only the latest version of each prompt. To include all versions of
+ each prompt in the list, set `restrict_to_most_recent` to false.
+ response:
+ docs: Success
+ type: root.ReturnPagedPrompts
+ status-code: 200
+ errors:
+ - root.BadRequestError
+ examples:
+ - path-parameters:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ response:
+ body:
+ page_number: 0
+ page_size: 10
+ total_pages: 1
+ prompts_page:
+ - id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ version: 0
+ version_type: FIXED
+ version_description: ''
+ name: Weather Assistant Prompt
+ created_on: 1722633247488
+ modified_on: 1722633247488
+ text: >-
+ You are an AI weather assistant providing users with
+ accurate and up-to-date weather information. Respond to user
+ queries concisely and clearly. Use simple language and avoid
+ technical jargon. Provide temperature, precipitation, wind
+ conditions, and any weather alerts. Include helpful tips if
+ severe weather is expected.
+ create-prompt-version:
+ path: /v0/evi/prompts/{id}
+ method: POST
+ docs: >-
+ Updates a **Prompt** by creating a new version of the **Prompt**.
+
+
+ See our [prompting
+ guide](/docs/speech-to-speech-evi/guides/prompting) for tips on
+ crafting your system prompt.
+ source:
+ openapi: evi-openapi.json
+ path-parameters:
+ id:
+ type: string
+ docs: Identifier for a Prompt. Formatted as a UUID.
+ display-name: Create prompt version
+ request:
+ name: PostedPromptVersion
+ body:
+ properties:
+ version_description:
+ type: optional
+ docs: An optional description of the Prompt version.
+ text:
+ type: string
+ docs: >-
+ Instructions used to shape EVI's behavior, responses, and style
+ for this version of the Prompt.
+
+
+ You can use the Prompt to define a specific goal or role for
+ EVI, specifying how it should act or what it should focus on
+ during the conversation. For example, EVI can be instructed to
+ act as a customer support representative, a fitness coach, or a
+ travel advisor, each with its own set of behaviors and response
+ styles.
+
+
+ For help writing a system prompt, see our [Prompting
+ Guide](/docs/speech-to-speech-evi/guides/prompting).
+ content-type: application/json
+ response:
+ docs: Created
+ type: optional
+ status-code: 201
+ errors:
+ - root.BadRequestError
+ examples:
+ - path-parameters:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ request:
+ text: >-
+ You are an updated version of an AI weather assistant
+ providing users with accurate and up-to-date weather information.
+ Respond to user queries concisely and clearly. Use simple language
+ and avoid technical jargon. Provide temperature, precipitation,
+ wind conditions, and any weather alerts. Include helpful tips if
+ severe weather is expected.
+ version_description: This is an updated version of the Weather Assistant Prompt.
+ response:
+ body:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ version: 1
+ version_type: FIXED
+ version_description: This is an updated version of the Weather Assistant Prompt.
+ name: Weather Assistant Prompt
+ created_on: 1722633247488
+ modified_on: 1722635140150
+ text: >-
+ You are an updated version of an AI weather assistant
+ providing users with accurate and up-to-date weather
+ information. Respond to user queries concisely and clearly. Use
+ simple language and avoid technical jargon. Provide temperature,
+ precipitation, wind conditions, and any weather alerts. Include
+ helpful tips if severe weather is expected.
+ delete-prompt:
+ path: /v0/evi/prompts/{id}
+ method: DELETE
+ docs: >-
+ Deletes a **Prompt** and its versions.
+
+
+ See our [prompting
+ guide](/docs/speech-to-speech-evi/guides/prompting) for tips on
+ crafting your system prompt.
+ source:
+ openapi: evi-openapi.json
+ path-parameters:
+ id:
+ type: string
+ docs: Identifier for a Prompt. Formatted as a UUID.
+ display-name: Delete prompt
+ errors:
+ - root.BadRequestError
+ examples:
+ - path-parameters:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ update-prompt-name:
+ path: /v0/evi/prompts/{id}
+ method: PATCH
+ docs: >-
+ Updates the name of a **Prompt**.
+
+
+ See our [prompting
+ guide](/docs/speech-to-speech-evi/guides/prompting) for tips on
+ crafting your system prompt.
+ source:
+ openapi: evi-openapi.json
+ path-parameters:
+ id:
+ type: string
+ docs: Identifier for a Prompt. Formatted as a UUID.
+ display-name: Update prompt name
+ request:
+ name: PostedPromptName
+ body:
+ properties:
+ name:
+ type: string
+ docs: Name applied to all versions of a particular Prompt.
+ content-type: application/json
+ response:
+ docs: Success
+ type: text
+ status-code: 200
+ errors:
+ - root.BadRequestError
+ examples:
+ - path-parameters:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ request:
+ name: Updated Weather Assistant Prompt Name
+ get-prompt-version:
+ path: /v0/evi/prompts/{id}/version/{version}
+ method: GET
+ docs: >-
+ Fetches a specified version of a **Prompt**.
+
+
+ See our [prompting
+ guide](/docs/speech-to-speech-evi/guides/prompting) for tips on
+ crafting your system prompt.
+ source:
+ openapi: evi-openapi.json
+ path-parameters:
+ id:
+ type: string
+ docs: Identifier for a Prompt. Formatted as a UUID.
+ version:
+ type: integer
+ docs: >-
+ Version number for a Prompt.
+
+
+ Prompts, Configs, Custom Voices, and Tools are versioned. This
+ versioning system supports iterative development, allowing you to
+ progressively refine prompts and revert to previous versions if
+ needed.
+
+
+ Version numbers are integer values representing different iterations
+ of the Prompt. Each update to the Prompt increments its version
+ number.
+ display-name: Get prompt version
+ response:
+ docs: Success
+ type: optional
+ status-code: 200
+ errors:
+ - root.BadRequestError
+ examples:
+ - path-parameters:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ version: 0
+ response:
+ body:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ version: 0
+ version_type: FIXED
+ version_description: ''
+ name: Weather Assistant Prompt
+ created_on: 1722633247488
+ modified_on: 1722633247488
+ text: >-
+ You are an AI weather assistant providing users with
+ accurate and up-to-date weather information. Respond to user
+ queries concisely and clearly. Use simple language and avoid
+ technical jargon. Provide temperature, precipitation, wind
+ conditions, and any weather alerts. Include helpful tips if
+ severe weather is expected.
+ delete-prompt-version:
+ path: /v0/evi/prompts/{id}/version/{version}
+ method: DELETE
+ docs: >-
+ Deletes a specified version of a **Prompt**.
+
+
+ See our [prompting
+ guide](/docs/speech-to-speech-evi/guides/prompting) for tips on
+ crafting your system prompt.
+ source:
+ openapi: evi-openapi.json
+ path-parameters:
+ id:
+ type: string
+ docs: Identifier for a Prompt. Formatted as a UUID.
+ version:
+ type: integer
+ docs: >-
+ Version number for a Prompt.
+
+
+ Prompts, Configs, Custom Voices, and Tools are versioned. This
+ versioning system supports iterative development, allowing you to
+ progressively refine prompts and revert to previous versions if
+ needed.
+
+
+ Version numbers are integer values representing different iterations
+ of the Prompt. Each update to the Prompt increments its version
+ number.
+ display-name: Delete prompt version
+ errors:
+ - root.BadRequestError
+ examples:
+ - path-parameters:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ version: 1
+ update-prompt-description:
+ path: /v0/evi/prompts/{id}/version/{version}
+ method: PATCH
+ docs: >-
+ Updates the description of a **Prompt**.
+
+
+ See our [prompting
+ guide](/docs/speech-to-speech-evi/guides/prompting) for tips on
+ crafting your system prompt.
+ source:
+ openapi: evi-openapi.json
+ path-parameters:
+ id:
+ type: string
+ docs: Identifier for a Prompt. Formatted as a UUID.
+ version:
+ type: integer
+ docs: >-
+ Version number for a Prompt.
+
+
+ Prompts, Configs, Custom Voices, and Tools are versioned. This
+ versioning system supports iterative development, allowing you to
+ progressively refine prompts and revert to previous versions if
+ needed.
+
+
+ Version numbers are integer values representing different iterations
+ of the Prompt. Each update to the Prompt increments its version
+ number.
+ display-name: Update prompt description
+ request:
+ name: PostedPromptVersionDescription
+ body:
+ properties:
+ version_description:
+ type: optional
+ docs: An optional description of the Prompt version.
+ content-type: application/json
+ response:
+ docs: Success
+ type: optional
+ status-code: 200
+ errors:
+ - root.BadRequestError
+ examples:
+ - path-parameters:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ version: 1
+ request:
+ version_description: This is an updated version_description.
+ response:
+ body:
+ id: af699d45-2985-42cc-91b9-af9e5da3bac5
+ version: 1
+ version_type: FIXED
+ version_description: This is an updated version_description.
+ name: Weather Assistant Prompt
+ created_on: 1722633247488
+ modified_on: 1722634770585
+ text: >-
+ You are an AI weather assistant providing users with
+ accurate and up-to-date weather information. Respond to user
+ queries concisely and clearly. Use simple language and avoid
+ technical jargon. Provide temperature, precipitation, wind
+ conditions, and any weather alerts. Include helpful tips if
+ severe weather is expected.
+ source:
+ openapi: evi-openapi.json
diff --git a/.mock/definition/empathic-voice/tools.yml b/.mock/definition/empathic-voice/tools.yml
new file mode 100644
index 00000000..b5dd6787
--- /dev/null
+++ b/.mock/definition/empathic-voice/tools.yml
@@ -0,0 +1,617 @@
+imports:
+ root: __package__.yml
+service:
+ auth: false
+ base-path: ''
+ endpoints:
+ list-tools:
+ path: /v0/evi/tools
+ method: GET
+ docs: >-
+ Fetches a paginated list of **Tools**.
+
+
+ Refer to our [tool
+ use](/docs/speech-to-speech-evi/features/tool-use#function-calling)
+ guide for comprehensive instructions on defining and integrating tools
+ into EVI.
+ pagination:
+ offset: $request.page_number
+ results: $response.tools_page
+ source:
+ openapi: evi-openapi.json
+ display-name: List tools
+ request:
+ name: ToolsListToolsRequest
+ query-parameters:
+ page_number:
+ type: optional
+ default: 0
+ docs: >-
+ Specifies the page number to retrieve, enabling pagination.
+
+
+ This parameter uses zero-based indexing. For example, setting
+ `page_number` to 0 retrieves the first page of results (items 0-9
+ if `page_size` is 10), setting `page_number` to 1 retrieves the
+ second page (items 10-19), and so on. Defaults to 0, which
+ retrieves the first page.
+ page_size:
+ type: optional
+ docs: >-
+ Specifies the maximum number of results to include per page,
+ enabling pagination. The value must be between 1 and 100,
+ inclusive.
+
+
+ For example, if `page_size` is set to 10, each page will include
+ up to 10 items. Defaults to 10.
+ restrict_to_most_recent:
+ type: optional
+ docs: >-
+ By default, `restrict_to_most_recent` is set to true, returning
+ only the latest version of each tool. To include all versions of
+ each tool in the list, set `restrict_to_most_recent` to false.
+ name:
+ type: optional
+ docs: Filter to only include tools with the given name.
+ response: + docs: Success + type: root.ReturnPagedUserDefinedTools + status-code: 200 + errors: + - root.BadRequestError + examples: + - query-parameters: + page_number: 0 + page_size: 2 + response: + body: + page_number: 0 + page_size: 2 + total_pages: 1 + tools_page: + - tool_type: FUNCTION + id: d20827af-5d8d-4f66-b6b9-ce2e3e1ea2b2 + version: 0 + version_type: FIXED + version_description: Fetches user's current location. + name: get_current_location + created_on: 1715267200693 + modified_on: 1715267200693 + fallback_content: Unable to fetch location. + description: Fetches user's current location. + parameters: >- + { "type": "object", "properties": { "location": { "type": + "string", "description": "The city and state, e.g. San + Francisco, CA" }}, "required": ["location"] } + - tool_type: FUNCTION + id: 4442f3ea-9038-40e3-a2ce-1522b7de770f + version: 0 + version_type: FIXED + version_description: >- + Fetches current weather and uses celsius or fahrenheit based + on location of user. + name: get_current_weather + created_on: 1715266126705 + modified_on: 1715266126705 + fallback_content: Unable to fetch location. + description: >- + Fetches current weather and uses celsius or fahrenheit based + on location of user. + parameters: >- + { "type": "object", "properties": { "location": { "type": + "string", "description": "The city and state, e.g. San + Francisco, CA" }, "format": { "type": "string", "enum": + ["celsius", "fahrenheit"], "description": "The temperature + unit to use. Infer this from the users location." } }, + "required": ["location", "format"] } + create-tool: + path: /v0/evi/tools + method: POST + docs: >- + Creates a **Tool** that can be added to an [EVI + configuration](/reference/speech-to-speech-evi/configs/create-config). + + + Refer to our [tool + use](/docs/speech-to-speech-evi/features/tool-use#function-calling) + guide for comprehensive instructions on defining and integrating tools + into EVI. 
+ source: + openapi: evi-openapi.json + display-name: Create tool + request: + name: PostedUserDefinedTool + body: + properties: + name: + type: string + docs: Name applied to all versions of a particular Tool. + version_description: + type: optional + docs: An optional description of the Tool version. + description: + type: optional + docs: >- + An optional description of what the Tool does, used by the + supplemental LLM to choose when and how to call the function. + parameters: + type: string + docs: >- + Stringified JSON defining the parameters used by this version of + the Tool. + + + These parameters define the inputs needed for the Tool's + execution, including the expected data type and description for + each input field. Structured as a stringified JSON schema, this + format ensures the Tool receives data in the expected format. + fallback_content: + type: optional + docs: >- + Optional text passed to the supplemental LLM in place of the + tool call result. The LLM then uses this text to generate a + response back to the user, ensuring continuity in the + conversation if the Tool errors. + content-type: application/json + response: + docs: Created + type: optional + status-code: 201 + errors: + - root.BadRequestError + examples: + - request: + name: get_current_weather + parameters: >- + { "type": "object", "properties": { "location": { "type": + "string", "description": "The city and state, e.g. San Francisco, + CA" }, "format": { "type": "string", "enum": ["celsius", + "fahrenheit"], "description": "The temperature unit to use. Infer + this from the users location." } }, "required": ["location", + "format"] } + version_description: >- + Fetches current weather and uses celsius or fahrenheit based on + location of user. + description: This tool is for getting the current weather. + fallback_content: Unable to fetch current weather. 
+ response: + body: + tool_type: FUNCTION + id: aa9b71c4-723c-47ff-9f83-1a1829e74376 + version: 0 + version_type: FIXED + version_description: >- + Fetches current weather and uses celsius or fahrenheit based on + location of user. + name: get_current_weather + created_on: 1715275452390 + modified_on: 1715275452390 + fallback_content: Unable to fetch current weather. + description: This tool is for getting the current weather. + parameters: >- + { "type": "object", "properties": { "location": { "type": + "string", "description": "The city and state, e.g. San + Francisco, CA" }, "format": { "type": "string", "enum": + ["celsius", "fahrenheit"], "description": "The temperature unit + to use. Infer this from the users location." } }, "required": + ["location", "format"] } + list-tool-versions: + path: /v0/evi/tools/{id} + method: GET + docs: >- + Fetches a list of a **Tool's** versions. + + + Refer to our [tool + use](/docs/speech-to-speech-evi/features/tool-use#function-calling) + guide for comprehensive instructions on defining and integrating tools + into EVI. + pagination: + offset: $request.page_number + results: $response.tools_page + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Tool. Formatted as a UUID. + display-name: List tool versions + request: + name: ToolsListToolVersionsRequest + query-parameters: + page_number: + type: optional + default: 0 + docs: >- + Specifies the page number to retrieve, enabling pagination. + + + This parameter uses zero-based indexing. For example, setting + `page_number` to 0 retrieves the first page of results (items 0-9 + if `page_size` is 10), setting `page_number` to 1 retrieves the + second page (items 10-19), and so on. Defaults to 0, which + retrieves the first page. + page_size: + type: optional + docs: >- + Specifies the maximum number of results to include per page, + enabling pagination. The value must be between 1 and 100, + inclusive. 
+ + + For example, if `page_size` is set to 10, each page will include + up to 10 items. Defaults to 10. + restrict_to_most_recent: + type: optional + docs: >- + By default, `restrict_to_most_recent` is set to true, returning + only the latest version of each tool. To include all versions of + each tool in the list, set `restrict_to_most_recent` to false. + response: + docs: Success + type: root.ReturnPagedUserDefinedTools + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 00183a3f-79ba-413d-9f3b-609864268bea + response: + body: + page_number: 0 + page_size: 10 + total_pages: 1 + tools_page: + - tool_type: FUNCTION + id: 00183a3f-79ba-413d-9f3b-609864268bea + version: 1 + version_type: FIXED + version_description: >- + Fetches current weather and uses celsius, fahrenheit, or + kelvin based on location of user. + name: get_current_weather + created_on: 1715277014228 + modified_on: 1715277602313 + fallback_content: Unable to fetch current weather. + description: This tool is for getting the current weather. + parameters: >- + { "type": "object", "properties": { "location": { "type": + "string", "description": "The city and state, e.g. San + Francisco, CA" }, "format": { "type": "string", "enum": + ["celsius", "fahrenheit", "kelvin"], "description": "The + temperature unit to use. Infer this from the users + location." } }, "required": ["location", "format"] } + create-tool-version: + path: /v0/evi/tools/{id} + method: POST + docs: >- + Updates a **Tool** by creating a new version of the **Tool**. + + + Refer to our [tool + use](/docs/speech-to-speech-evi/features/tool-use#function-calling) + guide for comprehensive instructions on defining and integrating tools + into EVI. + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Tool. Formatted as a UUID. 
+ display-name: Create tool version + request: + name: PostedUserDefinedToolVersion + body: + properties: + version_description: + type: optional + docs: An optional description of the Tool version. + description: + type: optional + docs: >- + An optional description of what the Tool does, used by the + supplemental LLM to choose when and how to call the function. + parameters: + type: string + docs: >- + Stringified JSON defining the parameters used by this version of + the Tool. + + + These parameters define the inputs needed for the Tool's + execution, including the expected data type and description for + each input field. Structured as a stringified JSON schema, this + format ensures the Tool receives data in the expected format. + fallback_content: + type: optional + docs: >- + Optional text passed to the supplemental LLM in place of the + tool call result. The LLM then uses this text to generate a + response back to the user, ensuring continuity in the + conversation if the Tool errors. + content-type: application/json + response: + docs: Created + type: optional + status-code: 201 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 00183a3f-79ba-413d-9f3b-609864268bea + request: + parameters: >- + { "type": "object", "properties": { "location": { "type": + "string", "description": "The city and state, e.g. San Francisco, + CA" }, "format": { "type": "string", "enum": ["celsius", + "fahrenheit", "kelvin"], "description": "The temperature unit to + use. Infer this from the users location." } }, "required": + ["location", "format"] } + version_description: >- + Fetches current weather and uses celsius, fahrenheit, or kelvin + based on location of user. + fallback_content: Unable to fetch current weather. + description: This tool is for getting the current weather. 
+ response: + body: + tool_type: FUNCTION + id: 00183a3f-79ba-413d-9f3b-609864268bea + version: 1 + version_type: FIXED + version_description: >- + Fetches current weather and uses celsius, fahrenheit, or kelvin + based on location of user. + name: get_current_weather + created_on: 1715277014228 + modified_on: 1715277602313 + fallback_content: Unable to fetch current weather. + description: This tool is for getting the current weather. + parameters: >- + { "type": "object", "properties": { "location": { "type": + "string", "description": "The city and state, e.g. San + Francisco, CA" }, "format": { "type": "string", "enum": + ["celsius", "fahrenheit", "kelvin"], "description": "The + temperature unit to use. Infer this from the users location." } + }, "required": ["location", "format"] } + delete-tool: + path: /v0/evi/tools/{id} + method: DELETE + docs: >- + Deletes a **Tool** and its versions. + + + Refer to our [tool + use](/docs/speech-to-speech-evi/features/tool-use#function-calling) + guide for comprehensive instructions on defining and integrating tools + into EVI. + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Tool. Formatted as a UUID. + display-name: Delete tool + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 00183a3f-79ba-413d-9f3b-609864268bea + update-tool-name: + path: /v0/evi/tools/{id} + method: PATCH + docs: >- + Updates the name of a **Tool**. + + + Refer to our [tool + use](/docs/speech-to-speech-evi/features/tool-use#function-calling) + guide for comprehensive instructions on defining and integrating tools + into EVI. + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Tool. Formatted as a UUID. + display-name: Update tool name + request: + name: PostedUserDefinedToolName + body: + properties: + name: + type: string + docs: Name applied to all versions of a particular Tool. 
+ content-type: application/json + response: + docs: Success + type: text + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 00183a3f-79ba-413d-9f3b-609864268bea + request: + name: get_current_temperature + get-tool-version: + path: /v0/evi/tools/{id}/version/{version} + method: GET + docs: >- + Fetches a specified version of a **Tool**. + + + Refer to our [tool + use](/docs/speech-to-speech-evi/features/tool-use#function-calling) + guide for comprehensive instructions on defining and integrating tools + into EVI. + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Tool. Formatted as a UUID. + version: + type: integer + docs: >- + Version number for a Tool. + + + Tools, Configs, Custom Voices, and Prompts are versioned. This + versioning system supports iterative development, allowing you to + progressively refine tools and revert to previous versions if + needed. + + + Version numbers are integer values representing different iterations + of the Tool. Each update to the Tool increments its version number. + display-name: Get tool version + response: + docs: Success + type: optional + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 00183a3f-79ba-413d-9f3b-609864268bea + version: 1 + response: + body: + tool_type: FUNCTION + id: 00183a3f-79ba-413d-9f3b-609864268bea + version: 1 + version_type: FIXED + version_description: >- + Fetches current weather and uses celsius, fahrenheit, or kelvin + based on location of user. + name: string + created_on: 1715277014228 + modified_on: 1715277602313 + fallback_content: Unable to fetch current weather. + description: This tool is for getting the current weather. + parameters: >- + { "type": "object", "properties": { "location": { "type": + "string", "description": "The city and state, e.g. 
San + Francisco, CA" }, "format": { "type": "string", "enum": + ["celsius", "fahrenheit", "kelvin"], "description": "The + temperature unit to use. Infer this from the users location." } + }, "required": ["location", "format"] } + delete-tool-version: + path: /v0/evi/tools/{id}/version/{version} + method: DELETE + docs: >- + Deletes a specified version of a **Tool**. + + + Refer to our [tool + use](/docs/speech-to-speech-evi/features/tool-use#function-calling) + guide for comprehensive instructions on defining and integrating tools + into EVI. + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Tool. Formatted as a UUID. + version: + type: integer + docs: >- + Version number for a Tool. + + + Tools, Configs, Custom Voices, and Prompts are versioned. This + versioning system supports iterative development, allowing you to + progressively refine tools and revert to previous versions if + needed. + + + Version numbers are integer values representing different iterations + of the Tool. Each update to the Tool increments its version number. + display-name: Delete tool version + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 00183a3f-79ba-413d-9f3b-609864268bea + version: 1 + update-tool-description: + path: /v0/evi/tools/{id}/version/{version} + method: PATCH + docs: >- + Updates the description of a specified **Tool** version. + + + Refer to our [tool + use](/docs/speech-to-speech-evi/features/tool-use#function-calling) + guide for comprehensive instructions on defining and integrating tools + into EVI. + source: + openapi: evi-openapi.json + path-parameters: + id: + type: string + docs: Identifier for a Tool. Formatted as a UUID. + version: + type: integer + docs: >- + Version number for a Tool. + + + Tools, Configs, Custom Voices, and Prompts are versioned. 
This + versioning system supports iterative development, allowing you to + progressively refine tools and revert to previous versions if + needed. + + + Version numbers are integer values representing different iterations + of the Tool. Each update to the Tool increments its version number. + display-name: Update tool description + request: + name: PostedUserDefinedToolVersionDescription + body: + properties: + version_description: + type: optional + docs: An optional description of the Tool version. + content-type: application/json + response: + docs: Success + type: optional + status-code: 200 + errors: + - root.BadRequestError + examples: + - path-parameters: + id: 00183a3f-79ba-413d-9f3b-609864268bea + version: 1 + request: + version_description: >- + Fetches current temperature, precipitation, wind speed, AQI, and + other weather conditions. Uses Celsius, Fahrenheit, or kelvin + depending on user's region. + response: + body: + tool_type: FUNCTION + id: 00183a3f-79ba-413d-9f3b-609864268bea + version: 1 + version_type: FIXED + version_description: >- + Fetches current temperature, precipitation, wind speed, AQI, and + other weather conditions. Uses Celsius, Fahrenheit, or kelvin + depending on user's region. + name: string + created_on: 1715277014228 + modified_on: 1715277602313 + fallback_content: Unable to fetch current weather. + description: This tool is for getting the current weather. + parameters: >- + { "type": "object", "properties": { "location": { "type": + "string", "description": "The city and state, e.g. San + Francisco, CA" }, "format": { "type": "string", "enum": + ["celsius", "fahrenheit", "kelvin"], "description": "The + temperature unit to use. Infer this from the users location." 
} + }, "required": ["location", "format"] } + source: + openapi: evi-openapi.json diff --git a/.mock/definition/expression-measurement/__package__.yml b/.mock/definition/expression-measurement/__package__.yml new file mode 100644 index 00000000..0967ef42 --- /dev/null +++ b/.mock/definition/expression-measurement/__package__.yml @@ -0,0 +1 @@ +{} diff --git a/.mock/definition/expression-measurement/batch/__package__.yml b/.mock/definition/expression-measurement/batch/__package__.yml new file mode 100644 index 00000000..2d224469 --- /dev/null +++ b/.mock/definition/expression-measurement/batch/__package__.yml @@ -0,0 +1,1799 @@ +service: + auth: false + base-path: '' + endpoints: + list-jobs: + path: /v0/batch/jobs + method: GET + docs: Sort and filter jobs. + source: + openapi: batch-openapi.json + display-name: List jobs + request: + name: BatchListJobsRequest + query-parameters: + limit: + type: optional + default: 50 + docs: The maximum number of jobs to include in the response. + status: + type: optional + allow-multiple: true + docs: >- + Include only jobs of this status in the response. There are four + possible statuses: + + + - `QUEUED`: The job has been received and is waiting to be + processed. + + + - `IN_PROGRESS`: The job is currently being processed. + + + - `COMPLETED`: The job has finished processing. + + + - `FAILED`: The job encountered an error and could not be + completed successfully. + when: + type: optional + docs: >- + Specify whether to include jobs created before or after a given + `timestamp_ms`. + timestamp_ms: + type: optional + default: 1704319392247 + docs: |- + Provide a timestamp in milliseconds to filter jobs. + + When combined with the `when` parameter, you can filter jobs before or after the given timestamp. Defaults to the current Unix timestamp if one is not provided. + sort_by: + type: optional + docs: >- + Specify which timestamp to sort the jobs by. 
+ + + - `created`: Sort jobs by the time of creation, indicated by + `created_timestamp_ms`. + + + - `started`: Sort jobs by the time processing started, indicated + by `started_timestamp_ms`. + + + - `ended`: Sort jobs by the time processing ended, indicated by + `ended_timestamp_ms`. + direction: + type: optional + docs: >- + Specify the order in which to sort the jobs. Defaults to + descending order. + + + - `asc`: Sort in ascending order (chronological, with the oldest + records first). + + + - `desc`: Sort in descending order (reverse-chronological, with + the newest records first). + response: + docs: '' + type: list + status-code: 200 + examples: + - response: + body: + - job_id: job_id + request: + callback_url: null + files: + - filename: filename + md5sum: md5sum + content_type: content_type + models: + burst: {} + face: + descriptions: null + facs: null + fps_pred: 3 + identify_faces: false + min_face_size: 60 + prob_threshold: 0.99 + save_faces: false + facemesh: {} + language: + granularity: word + identify_speakers: false + sentiment: null + toxicity: null + ner: + identify_speakers: false + prosody: + granularity: utterance + identify_speakers: false + window: null + notify: true + text: [] + urls: + - https://hume-tutorials.s3.amazonaws.com/faces.zip + state: + created_timestamp_ms: 1712587158717 + ended_timestamp_ms: 1712587159274 + num_errors: 0 + num_predictions: 10 + started_timestamp_ms: 1712587158800 + status: COMPLETED + type: INFERENCE + start-inference-job: + path: /v0/batch/jobs + method: POST + docs: Start a new measurement inference job. 
+ source: + openapi: batch-openapi.json + display-name: Start inference job + request: + body: InferenceBaseRequest + content-type: application/json + response: + docs: '' + type: JobId + status-code: 200 + property: job_id + examples: + - request: + urls: + - https://hume-tutorials.s3.amazonaws.com/faces.zip + notify: true + response: + body: + job_id: job_id + get-job-details: + path: /v0/batch/jobs/{id} + method: GET + docs: Get the request details and state of a given job. + source: + openapi: batch-openapi.json + path-parameters: + id: + type: string + docs: The unique identifier for the job. + display-name: Get job details + response: + docs: '' + type: UnionJob + status-code: 200 + examples: + - name: Inference + path-parameters: + id: job_id + response: + body: + type: INFERENCE + job_id: job_id + request: + callback_url: null + files: [] + models: + burst: {} + face: + descriptions: null + facs: null + fps_pred: 3 + identify_faces: false + min_face_size: 60 + prob_threshold: 0.99 + save_faces: false + facemesh: {} + language: + granularity: word + identify_speakers: false + sentiment: null + toxicity: null + ner: + identify_speakers: false + prosody: + granularity: utterance + identify_speakers: false + window: null + notify: true + text: [] + urls: + - https://hume-tutorials.s3.amazonaws.com/faces.zip + state: + created_timestamp_ms: 1712590457884 + ended_timestamp_ms: 1712590462252 + num_errors: 0 + num_predictions: 10 + started_timestamp_ms: 1712590457995 + status: COMPLETED + get-job-predictions: + path: /v0/batch/jobs/{id}/predictions + method: GET + docs: Get the JSON predictions of a completed inference job. + source: + openapi: batch-openapi.json + path-parameters: + id: + type: string + docs: The unique identifier for the job. 
+ display-name: Get job predictions + response: + docs: '' + type: list + status-code: 200 + examples: + - path-parameters: + id: job_id + response: + body: + - source: + type: url + url: https://hume-tutorials.s3.amazonaws.com/faces.zip + results: + predictions: + - file: faces/100.jpg + models: + face: + metadata: null + grouped_predictions: + - id: unknown + predictions: + - frame: 0 + time: 0 + prob: 0.9994111061096191 + box: + x: 1187.885986328125 + 'y': 1397.697509765625 + w: 1401.668701171875 + h: 1961.424560546875 + emotions: + - name: Admiration + score: 0.10722749680280685 + - name: Adoration + score: 0.06395940482616425 + - name: Aesthetic Appreciation + score: 0.05811462551355362 + - name: Amusement + score: 0.14187128841876984 + - name: Anger + score: 0.02804684266448021 + - name: Anxiety + score: 0.2713485360145569 + - name: Awe + score: 0.33812594413757324 + - name: Awkwardness + score: 0.1745193600654602 + - name: Boredom + score: 0.23600080609321594 + - name: Calmness + score: 0.18988418579101562 + - name: Concentration + score: 0.44288986921310425 + - name: Confusion + score: 0.39346569776535034 + - name: Contemplation + score: 0.31002455949783325 + - name: Contempt + score: 0.048870109021663666 + - name: Contentment + score: 0.0579497292637825 + - name: Craving + score: 0.06544201076030731 + - name: Desire + score: 0.05526508390903473 + - name: Determination + score: 0.08590991795063019 + - name: Disappointment + score: 0.19508258998394012 + - name: Disgust + score: 0.031529419124126434 + - name: Distress + score: 0.23210826516151428 + - name: Doubt + score: 0.3284550905227661 + - name: Ecstasy + score: 0.040716782212257385 + - name: Embarrassment + score: 0.1467227339744568 + - name: Empathic Pain + score: 0.07633581757545471 + - name: Entrancement + score: 0.16245244443416595 + - name: Envy + score: 0.03267110139131546 + - name: Excitement + score: 0.10656816512346268 + - name: Fear + score: 0.3115977346897125 + - name: Guilt + score: 
0.11615975946187973 + - name: Horror + score: 0.19795553386211395 + - name: Interest + score: 0.3136432468891144 + - name: Joy + score: 0.06285581737756729 + - name: Love + score: 0.06339752674102783 + - name: Nostalgia + score: 0.05866732448339462 + - name: Pain + score: 0.07684041559696198 + - name: Pride + score: 0.026822954416275024 + - name: Realization + score: 0.30000734329223633 + - name: Relief + score: 0.04414166510105133 + - name: Romance + score: 0.042728863656520844 + - name: Sadness + score: 0.14773206412792206 + - name: Satisfaction + score: 0.05902980640530586 + - name: Shame + score: 0.08103451132774353 + - name: Surprise (negative) + score: 0.25518184900283813 + - name: Surprise (positive) + score: 0.28845661878585815 + - name: Sympathy + score: 0.062488824129104614 + - name: Tiredness + score: 0.1559651643037796 + - name: Triumph + score: 0.01955239288508892 + facs: null + descriptions: null + errors: [] + get-job-artifacts: + path: /v0/batch/jobs/{id}/artifacts + method: GET + docs: Get the artifacts ZIP of a completed inference job. + source: + openapi: batch-openapi.json + path-parameters: + id: + type: string + docs: The unique identifier for the job. + display-name: Get job artifacts + response: + docs: '' + type: file + status-code: 200 + start-inference-job-from-local-file: + path: /v0/batch/jobs + method: POST + auth: + - BearerAuth: [] + docs: Start a new batch inference job. + source: + openapi: batch-files-openapi.yml + display-name: Start inference job from local file + request: + name: BatchStartInferenceJobFromLocalFileRequest + body: + properties: + json: + type: optional + docs: >- + Stringified JSON object containing the inference job + configuration. + file: + type: list + docs: >- + Local media files (see recommended input filetypes) to be + processed. + + + If you wish to supply more than 100 files, consider providing + them as an archive (`.zip`, `.tar.gz`, `.tar.bz2`, `.tar.xz`). 
+ content-type: multipart/form-data + response: + docs: '' + type: JobId + status-code: 200 + property: job_id + examples: + - request: {} + response: + body: + job_id: job_id + source: + openapi: batch-files-openapi.yml +types: + Alternative: literal<"language_only"> + Bcp47Tag: + enum: + - zh + - da + - nl + - en + - value: en-AU + name: EnAu + - value: en-IN + name: EnIn + - value: en-NZ + name: EnNz + - value: en-GB + name: EnGb + - fr + - value: fr-CA + name: FrCa + - de + - hi + - value: hi-Latn + name: HiLatn + - id + - it + - ja + - ko + - 'no' + - pl + - pt + - value: pt-BR + name: PtBr + - value: pt-PT + name: PtPt + - ru + - es + - value: es-419 + name: Es419 + - sv + - ta + - tr + - uk + source: + openapi: batch-files-openapi.yml + BoundingBox: + docs: A bounding box around a face. + properties: + x: + type: double + docs: x-coordinate of bounding box top left corner. + 'y': + type: double + docs: y-coordinate of bounding box top left corner. + w: + type: double + docs: Bounding box width. + h: + type: double + docs: Bounding box height. + source: + openapi: batch-openapi.json + BurstPrediction: + properties: + time: TimeInterval + emotions: + docs: A high-dimensional embedding in emotion space. + type: list + descriptions: + docs: Modality-specific descriptive features and their scores. + type: list + source: + openapi: batch-openapi.json + Classification: map + CompletedEmbeddingGeneration: + properties: + created_timestamp_ms: + type: long + docs: When this job was created (Unix timestamp in milliseconds). + started_timestamp_ms: + type: long + docs: When this job started (Unix timestamp in milliseconds). + ended_timestamp_ms: + type: long + docs: When this job ended (Unix timestamp in milliseconds). + source: + openapi: batch-openapi.json + CompletedInference: + properties: + created_timestamp_ms: + type: long + docs: When this job was created (Unix timestamp in milliseconds). 
+ started_timestamp_ms: + type: long + docs: When this job started (Unix timestamp in milliseconds). + ended_timestamp_ms: + type: long + docs: When this job ended (Unix timestamp in milliseconds). + num_predictions: + type: uint64 + docs: The number of predictions that were generated by this job. + num_errors: + type: uint64 + docs: The number of errors that occurred while running this job. + source: + openapi: batch-openapi.json + CompletedTlInference: + properties: + created_timestamp_ms: + type: long + docs: When this job was created (Unix timestamp in milliseconds). + started_timestamp_ms: + type: long + docs: When this job started (Unix timestamp in milliseconds). + ended_timestamp_ms: + type: long + docs: When this job ended (Unix timestamp in milliseconds). + num_predictions: + type: uint64 + docs: The number of predictions that were generated by this job. + num_errors: + type: uint64 + docs: The number of errors that occurred while running this job. + source: + openapi: batch-openapi.json + CompletedTraining: + properties: + created_timestamp_ms: + type: long + docs: When this job was created (Unix timestamp in milliseconds). + started_timestamp_ms: + type: long + docs: When this job started (Unix timestamp in milliseconds). + ended_timestamp_ms: + type: long + docs: When this job ended (Unix timestamp in milliseconds). 
+ custom_model: TrainingCustomModel + alternatives: optional> + source: + openapi: batch-openapi.json + CustomModelPrediction: + properties: + output: map + error: string + task_type: string + source: + openapi: batch-openapi.json + CustomModelRequest: + properties: + name: string + description: optional + tags: optional> + source: + openapi: batch-openapi.json + Dataset: + discriminated: false + union: + - DatasetId + - DatasetVersionId + source: + openapi: batch-openapi.json + DatasetId: + properties: + id: + type: string + validation: + format: uuid + source: + openapi: batch-openapi.json + DatasetVersionId: + properties: + version_id: + type: string + validation: + format: uuid + source: + openapi: batch-openapi.json + DescriptionsScore: + properties: + name: + type: string + docs: Name of the descriptive feature being expressed. + score: + type: float + docs: Embedding value for the descriptive feature being expressed. + source: + openapi: batch-openapi.json + Direction: + enum: + - asc + - desc + source: + openapi: batch-openapi.json + EmbeddingGenerationBaseRequest: + properties: + registry_file_details: + type: optional> + docs: File ID and File URL pairs for an asset registry file + source: + openapi: batch-openapi.json + EmotionScore: + properties: + name: + type: string + docs: Name of the emotion being expressed. + score: + type: float + docs: Embedding value for the emotion being expressed. + source: + openapi: batch-openapi.json + Error: + properties: + message: + type: string + docs: An error message. + file: + type: string + docs: A file path relative to the top level source URL or file. + source: + openapi: batch-openapi.json + EvaluationArgs: + properties: + validation: optional + source: + openapi: batch-openapi.json + Face: + docs: >- + The Facial Emotional Expression model analyzes human facial expressions in + images and videos. Results will be provided per frame for video files. 
+ + + Recommended input file types: `.png`, `.jpeg`, `.mp4` + properties: + fps_pred: + type: optional + docs: >- + Number of frames per second to process. Other frames will be omitted + from the response. Set to `0` to process every frame. + default: 3 + prob_threshold: + type: optional + docs: >- + Face detection probability threshold. Faces detected with a + probability less than this threshold will be omitted from the + response. + default: 0.99 + validation: + min: 0 + max: 1 + identify_faces: + type: optional + docs: >- + Whether to return identifiers for faces across frames. If `true`, + unique identifiers will be assigned to face bounding boxes to + differentiate different faces. If `false`, all faces will be tagged + with an `unknown` ID. + default: false + min_face_size: + type: optional + docs: >- + Minimum bounding box side length in pixels to treat as a face. Faces + detected with a bounding box side length in pixels less than this + threshold will be omitted from the response. + facs: optional + descriptions: optional + save_faces: + type: optional + docs: >- + Whether to extract and save the detected faces in the artifacts zip + created by each job. + default: false + source: + openapi: batch-files-openapi.yml + FacePrediction: + properties: + frame: + type: uint64 + docs: Frame number + time: + type: double + docs: Time in seconds when face detection occurred. + prob: + type: double + docs: The predicted probability that a detected face was actually a face. + box: BoundingBox + emotions: + docs: A high-dimensional embedding in emotion space. + type: list + facs: + type: optional> + docs: FACS 2.0 features and their scores. + descriptions: + type: optional> + docs: Modality-specific descriptive features and their scores. + source: + openapi: batch-openapi.json + FacemeshPrediction: + properties: + emotions: + docs: A high-dimensional embedding in emotion space. 
+ type: list + source: + openapi: batch-openapi.json + FacsScore: + properties: + name: + type: string + docs: Name of the FACS 2.0 feature being expressed. + score: + type: float + docs: Embedding value for the FACS 2.0 feature being expressed. + source: + openapi: batch-openapi.json + Failed: + properties: + created_timestamp_ms: + type: long + docs: When this job was created (Unix timestamp in milliseconds). + started_timestamp_ms: + type: long + docs: When this job started (Unix timestamp in milliseconds). + ended_timestamp_ms: + type: long + docs: When this job ended (Unix timestamp in milliseconds). + message: + type: string + docs: An error message. + source: + openapi: batch-openapi.json + File: + docs: The list of files submitted for analysis. + properties: + filename: + type: optional + docs: The name of the file. + content_type: + type: optional + docs: The content type of the file. + md5sum: + type: string + docs: The MD5 checksum of the file. + source: + openapi: batch-openapi.json + Granularity: + enum: + - word + - sentence + - utterance + - conversational_turn + docs: >- + The granularity at which to generate predictions. The `granularity` field + is ignored if transcription is not enabled or if the `window` field has + been set. + + + - `word`: At the word level, our model provides a separate output for each + word, offering the most granular insight into emotional expression during + speech. + + + - `sentence`: At the sentence level of granularity, we annotate the + emotional tone of each spoken sentence with our Prosody and Emotional + Language models. + + + - `utterance`: Utterance-level granularity is between word- and + sentence-level. It takes into account natural pauses or breaks in speech, + providing more rapidly updated measures of emotional expression within a + flowing conversation. For text inputs, utterance-level granularity will + produce results identical to sentence-level granularity. 
+ + + - `conversational_turn`: Conversational turn-level granularity provides a + distinct output for each change in speaker. It captures the full sequence + of words and sentences spoken uninterrupted by each person. This approach + provides a higher-level view of the emotional dynamics in a + multi-participant dialogue. For text inputs, specifying conversational + turn-level granularity for our Emotional Language model will produce + results for the entire passage. + source: + openapi: batch-files-openapi.yml + GroupedPredictionsBurstPrediction: + properties: + id: + type: string + docs: >- + An automatically generated label to identify individuals in your media + file. Will be `unknown` if you have chosen to disable identification, + or if the model is unable to distinguish between individuals. + predictions: list + source: + openapi: batch-openapi.json + GroupedPredictionsFacePrediction: + properties: + id: + type: string + docs: >- + An automatically generated label to identify individuals in your media + file. Will be `unknown` if you have chosen to disable identification, + or if the model is unable to distinguish between individuals. + predictions: list + source: + openapi: batch-openapi.json + GroupedPredictionsFacemeshPrediction: + properties: + id: + type: string + docs: >- + An automatically generated label to identify individuals in your media + file. Will be `unknown` if you have chosen to disable identification, + or if the model is unable to distinguish between individuals. + predictions: list + source: + openapi: batch-openapi.json + GroupedPredictionsLanguagePrediction: + properties: + id: + type: string + docs: >- + An automatically generated label to identify individuals in your media + file. Will be `unknown` if you have chosen to disable identification, + or if the model is unable to distinguish between individuals. 
+ predictions: list + source: + openapi: batch-openapi.json + GroupedPredictionsNerPrediction: + properties: + id: + type: string + docs: >- + An automatically generated label to identify individuals in your media + file. Will be `unknown` if you have chosen to disable identification, + or if the model is unable to distinguish between individuals. + predictions: list + source: + openapi: batch-openapi.json + GroupedPredictionsProsodyPrediction: + properties: + id: + type: string + docs: >- + An automatically generated label to identify individuals in your media + file. Will be `unknown` if you have chosen to disable identification, + or if the model is unable to distinguish between individuals. + predictions: list + source: + openapi: batch-openapi.json + InProgress: + properties: + created_timestamp_ms: + type: long + docs: When this job was created (Unix timestamp in milliseconds). + started_timestamp_ms: + type: long + docs: When this job started (Unix timestamp in milliseconds). + source: + openapi: batch-openapi.json + InferenceBaseRequest: + properties: + models: + type: optional + docs: >- + Specify the models to use for inference. + + + If this field is not explicitly set, then all models will run by + default. + transcription: optional + urls: + type: optional> + docs: >- + URLs to the media files to be processed. Each must be a valid public + URL to a media file (see recommended input filetypes) or an archive + (`.zip`, `.tar.gz`, `.tar.bz2`, `.tar.xz`) of media files. + + + If you wish to supply more than 100 URLs, consider providing them as + an archive (`.zip`, `.tar.gz`, `.tar.bz2`, `.tar.xz`). + text: + type: optional> + docs: >- + Text supplied directly to our Emotional Language and NER models for + analysis. + callback_url: + type: optional + docs: >- + If provided, a `POST` request will be made to the URL with the + generated predictions on completion or the error message on failure. 
+ notify: + type: optional + docs: >- + Whether to send an email notification to the user upon job + completion/failure. + default: false + source: + openapi: batch-files-openapi.yml + InferencePrediction: + properties: + file: + type: string + docs: A file path relative to the top level source URL or file. + models: ModelsPredictions + source: + openapi: batch-openapi.json + InferenceRequest: + properties: + models: optional + transcription: optional + urls: + type: optional> + docs: >- + URLs to the media files to be processed. Each must be a valid public + URL to a media file (see recommended input filetypes) or an archive + (`.zip`, `.tar.gz`, `.tar.bz2`, `.tar.xz`) of media files. + + + If you wish to supply more than 100 URLs, consider providing them as + an archive (`.zip`, `.tar.gz`, `.tar.bz2`, `.tar.xz`). + text: + type: optional> + docs: Text to supply directly to our language and NER models. + callback_url: + type: optional + docs: >- + If provided, a `POST` request will be made to the URL with the + generated predictions on completion or the error message on failure. + notify: + type: optional + docs: >- + Whether to send an email notification to the user upon job + completion/failure. + default: false + files: list + source: + openapi: batch-openapi.json + InferenceResults: + properties: + predictions: list + errors: list + source: + openapi: batch-openapi.json + InferenceSourcePredictResult: + properties: + source: Source + results: optional + error: + type: optional + docs: An error message. + source: + openapi: batch-openapi.json + JobEmbeddingGeneration: + properties: + job_id: + type: string + docs: The ID associated with this job. + validation: + format: uuid + user_id: + type: string + validation: + format: uuid + request: EmbeddingGenerationBaseRequest + state: StateEmbeddingGeneration + source: + openapi: batch-openapi.json + JobInference: + properties: + job_id: + type: string + docs: The ID associated with this job. 
+ validation: + format: uuid + request: + type: InferenceRequest + docs: The request that initiated the job. + state: + type: StateInference + docs: The current state of the job. + source: + openapi: batch-openapi.json + JobTlInference: + properties: + job_id: + type: string + docs: The ID associated with this job. + validation: + format: uuid + user_id: + type: string + validation: + format: uuid + request: TlInferenceBaseRequest + state: StateTlInference + source: + openapi: batch-openapi.json + JobTraining: + properties: + job_id: + type: string + docs: The ID associated with this job. + validation: + format: uuid + user_id: + type: string + validation: + format: uuid + request: TrainingBaseRequest + state: StateTraining + source: + openapi: batch-openapi.json + JobId: + properties: + job_id: + type: string + docs: The ID of the started job. + validation: + format: uuid + source: + openapi: batch-files-openapi.yml + Language: + docs: >- + The Emotional Language model analyzes passages of text. This also supports + audio and video files by transcribing and then directly analyzing the + transcribed text. + + + Recommended input filetypes: `.txt`, `.mp3`, `.wav`, `.mp4` + properties: + granularity: optional + sentiment: optional + toxicity: optional + identify_speakers: + type: optional + docs: >- + Whether to return identifiers for speakers over time. If `true`, + unique identifiers will be assigned to spoken words to differentiate + different speakers. If `false`, all speakers will be tagged with an + `unknown` ID. + default: false + source: + openapi: batch-files-openapi.yml + LanguagePrediction: + properties: + text: + type: string + docs: A segment of text (like a word or a sentence). + position: PositionInterval + time: optional + confidence: + type: optional + docs: >- + Value between `0.0` and `1.0` that indicates our transcription model's + relative confidence in this text. 
+ speaker_confidence: + type: optional + docs: >- + Value between `0.0` and `1.0` that indicates our transcription model's + relative confidence that this text was spoken by this speaker. + emotions: + docs: A high-dimensional embedding in emotion space. + type: list + sentiment: + type: optional> + docs: >- + Sentiment predictions returned as a distribution. This model predicts + the probability that a given text could be interpreted as having each + sentiment level from `1` (negative) to `9` (positive). + + + Compared to returning one estimate of sentiment, this enables a more + nuanced analysis of a text's meaning. For example, a text with very + neutral sentiment would have an average rating of `5`. But also a text + that could be interpreted as having very positive sentiment or very + negative sentiment would also have an average rating of `5`. The + average sentiment is less informative than the distribution over + sentiment, so this API returns a value for each sentiment level. + toxicity: + type: optional> + docs: >- + Toxicity predictions returned as probabilities that the text can be + classified into the following categories: `toxic`, `severe_toxic`, + `obscene`, `threat`, `insult`, and `identity_hate`. + source: + openapi: batch-openapi.json + Models: + docs: The models used for inference. + properties: + face: optional + burst: optional + prosody: optional + language: optional + ner: optional + facemesh: optional + source: + openapi: batch-files-openapi.yml + ModelsPredictions: + properties: + face: optional + burst: optional + prosody: optional + language: optional + ner: optional + facemesh: optional + source: + openapi: batch-openapi.json + Ner: + docs: >- + The NER (Named-entity Recognition) model identifies real-world objects and + concepts in passages of text. This also supports audio and video files by + transcribing and then directly analyzing the transcribed text. 
+ + + Recommended input filetypes: `.txt`, `.mp3`, `.wav`, `.mp4` + properties: + identify_speakers: + type: optional + docs: >- + Whether to return identifiers for speakers over time. If `true`, + unique identifiers will be assigned to spoken words to differentiate + different speakers. If `false`, all speakers will be tagged with an + `unknown` ID. + default: false + source: + openapi: batch-files-openapi.yml + NerPrediction: + properties: + entity: + type: string + docs: The recognized topic or entity. + position: PositionInterval + entity_confidence: + type: double + docs: Our NER model's relative confidence in the recognized topic or entity. + support: + type: double + docs: A measure of how often the entity is linked to by other entities. + uri: + type: string + docs: >- + A URL which provides more information about the recognized topic or + entity. + link_word: + type: string + docs: The specific word to which the emotion predictions are linked. + time: optional + confidence: + type: optional + docs: >- + Value between `0.0` and `1.0` that indicates our transcription model's + relative confidence in this text. + speaker_confidence: + type: optional + docs: >- + Value between `0.0` and `1.0` that indicates our transcription model's + relative confidence that this text was spoken by this speaker. + emotions: + docs: A high-dimensional embedding in emotion space. + type: list + source: + openapi: batch-openapi.json + 'Null': + type: map + docs: No associated metadata for this model. Value will be `null`. + PositionInterval: + docs: >- + Position of a segment of text within a larger document, measured in + characters. Uses zero-based indexing. The beginning index is inclusive and + the end index is exclusive. + properties: + begin: + type: uint64 + docs: The index of the first character in the text segment, inclusive. + end: + type: uint64 + docs: The index of the last character in the text segment, exclusive. 
+ source: + openapi: batch-openapi.json + PredictionsOptionalNullBurstPrediction: + properties: + metadata: optional + grouped_predictions: list + source: + openapi: batch-openapi.json + PredictionsOptionalNullFacePrediction: + properties: + metadata: optional + grouped_predictions: list + source: + openapi: batch-openapi.json + PredictionsOptionalNullFacemeshPrediction: + properties: + metadata: optional + grouped_predictions: list + source: + openapi: batch-openapi.json + PredictionsOptionalTranscriptionMetadataLanguagePrediction: + properties: + metadata: optional + grouped_predictions: list + source: + openapi: batch-openapi.json + PredictionsOptionalTranscriptionMetadataNerPrediction: + properties: + metadata: optional + grouped_predictions: list + source: + openapi: batch-openapi.json + PredictionsOptionalTranscriptionMetadataProsodyPrediction: + properties: + metadata: optional + grouped_predictions: list + source: + openapi: batch-openapi.json + Prosody: + docs: >- + The Speech Prosody model analyzes the intonation, stress, and rhythm of + spoken word. + + + Recommended input file types: `.wav`, `.mp3`, `.mp4` + properties: + granularity: optional + window: optional + identify_speakers: + type: optional + docs: >- + Whether to return identifiers for speakers over time. If `true`, + unique identifiers will be assigned to spoken words to differentiate + different speakers. If `false`, all speakers will be tagged with an + `unknown` ID. + default: false + source: + openapi: batch-files-openapi.yml + ProsodyPrediction: + properties: + text: + type: optional + docs: A segment of text (like a word or a sentence). + time: TimeInterval + confidence: + type: optional + docs: >- + Value between `0.0` and `1.0` that indicates our transcription model's + relative confidence in this text. 
+ speaker_confidence: + type: optional + docs: >- + Value between `0.0` and `1.0` that indicates our transcription model's + relative confidence that this text was spoken by this speaker. + emotions: + docs: A high-dimensional embedding in emotion space. + type: list + source: + openapi: batch-openapi.json + Queued: + properties: + created_timestamp_ms: + type: long + docs: When this job was created (Unix timestamp in milliseconds). + source: + openapi: batch-openapi.json + RegistryFileDetail: + properties: + file_id: + type: string + docs: File ID in the Asset Registry + file_url: + type: string + docs: URL to the file in the Asset Registry + source: + openapi: batch-openapi.json + Regression: map + SentimentScore: + properties: + name: + type: string + docs: Level of sentiment, ranging from `1` (negative) to `9` (positive) + score: + type: float + docs: Prediction for this level of sentiment + source: + openapi: batch-openapi.json + SortBy: + enum: + - created + - started + - ended + source: + openapi: batch-openapi.json + Source: + discriminant: type + base-properties: {} + union: + url: SourceUrl + file: SourceFile + text: SourceTextSource + source: + openapi: batch-openapi.json + SourceFile: + properties: {} + extends: + - File + source: + openapi: batch-openapi.json + SourceTextSource: + properties: {} + source: + openapi: batch-openapi.json + SourceUrl: + properties: {} + extends: + - Url + source: + openapi: batch-openapi.json + Url: + properties: + url: + type: string + docs: The URL of the source media file. 
+ source: + openapi: batch-openapi.json + StateEmbeddingGeneration: + discriminant: status + base-properties: {} + union: + QUEUED: StateEmbeddingGenerationQueued + IN_PROGRESS: StateEmbeddingGenerationInProgress + COMPLETED: StateEmbeddingGenerationCompletedEmbeddingGeneration + FAILED: StateEmbeddingGenerationFailed + source: + openapi: batch-openapi.json + StateEmbeddingGenerationCompletedEmbeddingGeneration: + properties: {} + extends: + - CompletedEmbeddingGeneration + source: + openapi: batch-openapi.json + StateEmbeddingGenerationFailed: + properties: {} + extends: + - Failed + source: + openapi: batch-openapi.json + StateEmbeddingGenerationInProgress: + properties: {} + extends: + - InProgress + source: + openapi: batch-openapi.json + StateEmbeddingGenerationQueued: + properties: {} + extends: + - Queued + source: + openapi: batch-openapi.json + StateInference: + discriminant: status + base-properties: {} + union: + QUEUED: QueuedState + IN_PROGRESS: InProgressState + COMPLETED: CompletedState + FAILED: FailedState + source: + openapi: batch-openapi.json + CompletedState: + properties: {} + extends: + - CompletedInference + source: + openapi: batch-openapi.json + FailedState: + properties: {} + extends: + - Failed + source: + openapi: batch-openapi.json + InProgressState: + properties: {} + extends: + - InProgress + source: + openapi: batch-openapi.json + QueuedState: + properties: {} + extends: + - Queued + source: + openapi: batch-openapi.json + StateTlInference: + discriminant: status + base-properties: {} + union: + QUEUED: StateTlInferenceQueued + IN_PROGRESS: StateTlInferenceInProgress + COMPLETED: StateTlInferenceCompletedTlInference + FAILED: StateTlInferenceFailed + source: + openapi: batch-openapi.json + StateTlInferenceCompletedTlInference: + properties: {} + extends: + - CompletedTlInference + source: + openapi: batch-openapi.json + StateTlInferenceFailed: + properties: {} + extends: + - Failed + source: + openapi: batch-openapi.json + 
StateTlInferenceInProgress: + properties: {} + extends: + - InProgress + source: + openapi: batch-openapi.json + StateTlInferenceQueued: + properties: {} + extends: + - Queued + source: + openapi: batch-openapi.json + StateTraining: + discriminant: status + base-properties: {} + union: + QUEUED: StateTrainingQueued + IN_PROGRESS: StateTrainingInProgress + COMPLETED: StateTrainingCompletedTraining + FAILED: StateTrainingFailed + source: + openapi: batch-openapi.json + StateTrainingCompletedTraining: + properties: {} + extends: + - CompletedTraining + source: + openapi: batch-openapi.json + StateTrainingFailed: + properties: {} + extends: + - Failed + source: + openapi: batch-openapi.json + StateTrainingInProgress: + properties: {} + extends: + - InProgress + source: + openapi: batch-openapi.json + StateTrainingQueued: + properties: {} + extends: + - Queued + source: + openapi: batch-openapi.json + Status: + enum: + - QUEUED + - IN_PROGRESS + - COMPLETED + - FAILED + source: + openapi: batch-openapi.json + TlInferencePrediction: + properties: + file: + type: string + docs: A file path relative to the top level source URL or file. + file_type: string + custom_models: map + source: + openapi: batch-openapi.json + TlInferenceResults: + properties: + predictions: list + errors: list + source: + openapi: batch-openapi.json + TlInferenceSourcePredictResult: + properties: + source: Source + results: optional + error: + type: optional + docs: An error message. 
+ source: + openapi: batch-openapi.json + Tag: + properties: + key: string + value: string + source: + openapi: batch-openapi.json + Target: + discriminated: false + union: + - long + - double + - string + source: + openapi: batch-openapi.json + Task: + discriminant: type + base-properties: {} + union: + classification: TaskClassification + regression: TaskRegression + source: + openapi: batch-openapi.json + TaskClassification: + properties: {} + source: + openapi: batch-openapi.json + TaskRegression: + properties: {} + source: + openapi: batch-openapi.json + TextSource: map + TimeInterval: + docs: A time range with a beginning and end, measured in seconds. + properties: + begin: + type: double + docs: Beginning of time range in seconds. + end: + type: double + docs: End of time range in seconds. + source: + openapi: batch-openapi.json + TlInferenceBaseRequest: + properties: + custom_model: CustomModel + urls: + type: optional> + docs: >- + URLs to the media files to be processed. Each must be a valid public + URL to a media file (see recommended input filetypes) or an archive + (`.zip`, `.tar.gz`, `.tar.bz2`, `.tar.xz`) of media files. + + + If you wish to supply more than 100 URLs, consider providing them as + an archive (`.zip`, `.tar.gz`, `.tar.bz2`, `.tar.xz`). + callback_url: + type: optional + docs: >- + If provided, a `POST` request will be made to the URL with the + generated predictions on completion or the error message on failure. + notify: + type: optional + docs: >- + Whether to send an email notification to the user upon job + completion/failure. 
+ default: false + source: + openapi: batch-openapi.json + CustomModel: + discriminated: false + union: + - CustomModelId + - CustomModelVersionId + source: + openapi: batch-openapi.json + CustomModelId: + properties: + id: string + source: + openapi: batch-openapi.json + CustomModelVersionId: + properties: + version_id: string + source: + openapi: batch-openapi.json + ToxicityScore: + properties: + name: + type: string + docs: Category of toxicity. + score: + type: float + docs: Prediction for this category of toxicity + source: + openapi: batch-openapi.json + TrainingBaseRequest: + properties: + custom_model: CustomModelRequest + dataset: Dataset + target_feature: + type: optional + default: label + task: optional + evaluation: optional + alternatives: optional> + callback_url: optional + notify: + type: optional + default: false + source: + openapi: batch-openapi.json + TrainingCustomModel: + properties: + id: string + version_id: optional + source: + openapi: batch-openapi.json + Transcription: + docs: |- + Transcription-related configuration options. + + To disable transcription, explicitly set this field to `null`. + properties: + language: + type: optional + docs: >- + By default, we use an automated language detection method for our + Speech Prosody, Language, and NER models. However, if you know what + language is being spoken in your media samples, you can specify it via + its BCP-47 tag and potentially obtain more accurate results. 
+ + + You can specify any of the following languages: + + - Chinese: `zh` + + - Danish: `da` + + - Dutch: `nl` + + - English: `en` + + - English (Australia): `en-AU` + + - English (India): `en-IN` + + - English (New Zealand): `en-NZ` + + - English (United Kingdom): `en-GB` + + - French: `fr` + + - French (Canada): `fr-CA` + + - German: `de` + + - Hindi: `hi` + + - Hindi (Roman Script): `hi-Latn` + + - Indonesian: `id` + + - Italian: `it` + + - Japanese: `ja` + + - Korean: `ko` + + - Norwegian: `no` + + - Polish: `pl` + + - Portuguese: `pt` + + - Portuguese (Brazil): `pt-BR` + + - Portuguese (Portugal): `pt-PT` + + - Russian: `ru` + + - Spanish: `es` + + - Spanish (Latin America): `es-419` + + - Swedish: `sv` + + - Tamil: `ta` + + - Turkish: `tr` + + - Ukrainian: `uk` + identify_speakers: + type: optional + docs: >- + Whether to return identifiers for speakers over time. If `true`, + unique identifiers will be assigned to spoken words to differentiate + different speakers. If `false`, all speakers will be tagged with an + `unknown` ID. + default: false + confidence_threshold: + type: optional + docs: >- + Transcript confidence threshold. Transcripts generated with a + confidence less than this threshold will be considered invalid and not + used as an input for model inference. + default: 0.5 + validation: + min: 0 + max: 1 + source: + openapi: batch-files-openapi.yml + TranscriptionMetadata: + docs: Transcription metadata for your media file. + properties: + confidence: + type: double + docs: >- + Value between `0.0` and `1.0` indicating our transcription model's + relative confidence in the transcription of your media file. + detected_language: optional + source: + openapi: batch-openapi.json + Type: + enum: + - EMBEDDING_GENERATION + - INFERENCE + - TL_INFERENCE + - TRAINING + source: + openapi: batch-openapi.json + Unconfigurable: + type: map + docs: >- + To include predictions for this model type, set this field to `{}`. 
It is + currently not configurable further. + UnionJob: InferenceJob + EmbeddingGenerationJob: + properties: + type: string + extends: + - JobEmbeddingGeneration + source: + openapi: batch-openapi.json + InferenceJob: + properties: + type: + type: string + docs: >- + Denotes the job type. + + + Jobs created with the Expression Measurement API will have this field + set to `INFERENCE`. + extends: + - JobInference + source: + openapi: batch-openapi.json + CustomModelsInferenceJob: + properties: + type: string + extends: + - JobTlInference + source: + openapi: batch-openapi.json + CustomModelsTrainingJob: + properties: + type: string + extends: + - JobTraining + source: + openapi: batch-openapi.json + UnionPredictResult: InferenceSourcePredictResult + ValidationArgs: + properties: + positive_label: optional + source: + openapi: batch-openapi.json + When: + enum: + - created_before + - created_after + source: + openapi: batch-openapi.json + Window: + docs: >- + Generate predictions based on time. + + + Setting the `window` field allows for a 'sliding window' approach, where a + fixed-size window moves across the audio or video file in defined steps. + This enables continuous analysis of prosody within subsets of the file, + providing dynamic and localized insights into emotional expression. + properties: + length: + type: optional + docs: The length of the sliding window. + default: 4 + validation: + min: 0.5 + step: + type: optional + docs: The step size of the sliding window. + default: 1 + validation: + min: 0.5 + source: + openapi: batch-files-openapi.yml diff --git a/.mock/definition/expression-measurement/stream/__package__.yml b/.mock/definition/expression-measurement/stream/__package__.yml new file mode 100644 index 00000000..94df9784 --- /dev/null +++ b/.mock/definition/expression-measurement/stream/__package__.yml @@ -0,0 +1,113 @@ +types: + EmotionEmbeddingItem: + properties: + name: + type: optional + docs: Name of the emotion being expressed. 
+ score: + type: optional + docs: Embedding value for the emotion being expressed. + source: + openapi: streaming-asyncapi.yml + EmotionEmbedding: + docs: A high-dimensional embedding in emotion space. + type: list + StreamBoundingBox: + docs: A bounding box around a face. + properties: + x: + type: optional + docs: x-coordinate of bounding box top left corner. + validation: + min: 0 + 'y': + type: optional + docs: y-coordinate of bounding box top left corner. + validation: + min: 0 + w: + type: optional + docs: Bounding box width. + validation: + min: 0 + h: + type: optional + docs: Bounding box height. + validation: + min: 0 + source: + openapi: streaming-asyncapi.yml + TimeRange: + docs: A time range with a beginning and end, measured in seconds. + properties: + begin: + type: optional + docs: Beginning of time range in seconds. + validation: + min: 0 + end: + type: optional + docs: End of time range in seconds. + validation: + min: 0 + source: + openapi: streaming-asyncapi.yml + TextPosition: + docs: > + Position of a segment of text within a larger document, measured in + characters. Uses zero-based indexing. The beginning index is inclusive and + the end index is exclusive. + properties: + begin: + type: optional + docs: The index of the first character in the text segment, inclusive. + validation: + min: 0 + end: + type: optional + docs: The index of the last character in the text segment, exclusive. + validation: + min: 0 + source: + openapi: streaming-asyncapi.yml + SentimentItem: + properties: + name: + type: optional + docs: Level of sentiment, ranging from 1 (negative) to 9 (positive) + score: + type: optional + docs: Prediction for this level of sentiment + source: + openapi: streaming-asyncapi.yml + Sentiment: + docs: >- + Sentiment predictions returned as a distribution. This model predicts the + probability that a given text could be interpreted as having each + sentiment level from 1 (negative) to 9 (positive). 
+ + + Compared to returning one estimate of sentiment, this enables a more + nuanced analysis of a text's meaning. For example, a text with very + neutral sentiment would have an average rating of 5. But also a text that + could be interpreted as having very positive sentiment or very negative + sentiment would also have an average rating of 5. The average sentiment is + less informative than the distribution over sentiment, so this API returns + a value for each sentiment level. + type: list + ToxicityItem: + properties: + name: + type: optional + docs: Category of toxicity. + score: + type: optional + docs: Prediction for this category of toxicity + source: + openapi: streaming-asyncapi.yml + Toxicity: + docs: >- + Toxicity predictions returned as probabilities that the text can be + classified into the following categories: toxic, severe_toxic, obscene, + threat, insult, and identity_hate. + type: list diff --git a/.mock/definition/expression-measurement/stream/stream.yml b/.mock/definition/expression-measurement/stream/stream.yml new file mode 100644 index 00000000..d9c46dc8 --- /dev/null +++ b/.mock/definition/expression-measurement/stream/stream.yml @@ -0,0 +1,437 @@ +channel: + path: /models + url: stream + auth: false + headers: + X-Hume-Api-Key: + type: string + name: humeApiKey + messages: + publish: + origin: client + body: + type: StreamModelsEndpointPayload + docs: Models endpoint payload + subscribe: + origin: server + body: SubscribeEvent + examples: + - messages: + - type: publish + body: {} + - type: subscribe + body: {} +types: + StreamFace: + docs: > + Configuration for the facial expression emotion model. + + + Note: Using the `reset_stream` parameter does not have any effect on face + identification. A single face identifier cache is maintained over a full + session whether `reset_stream` is used or not. + properties: + facs: + type: optional> + docs: >- + Configuration for FACS predictions. 
If missing or null, no FACS + predictions will be generated. + descriptions: + type: optional> + docs: >- + Configuration for Descriptions predictions. If missing or null, no + Descriptions predictions will be generated. + identify_faces: + type: optional + docs: > + Whether to return identifiers for faces across frames. If true, unique + identifiers will be assigned to face bounding boxes to differentiate + different faces. If false, all faces will be tagged with an "unknown" + ID. + default: false + fps_pred: + type: optional + docs: > + Number of frames per second to process. Other frames will be omitted + from the response. + default: 3 + prob_threshold: + type: optional + docs: > + Face detection probability threshold. Faces detected with a + probability less than this threshold will be omitted from the + response. + default: 3 + min_face_size: + type: optional + docs: > + Minimum bounding box side length in pixels to treat as a face. Faces + detected with a bounding box side length in pixels less than this + threshold will be omitted from the response. + default: 3 + source: + openapi: streaming-asyncapi.yml + inline: true + StreamLanguage: + docs: Configuration for the language emotion model. + properties: + sentiment: + type: optional> + docs: >- + Configuration for sentiment predictions. If missing or null, no + sentiment predictions will be generated. + toxicity: + type: optional> + docs: >- + Configuration for toxicity predictions. If missing or null, no + toxicity predictions will be generated. + granularity: + type: optional + docs: >- + The granularity at which to generate predictions. Values are `word`, + `sentence`, `utterance`, or `passage`. To get a single prediction for + the entire text of your streaming payload use `passage`. Default value + is `word`. + source: + openapi: streaming-asyncapi.yml + inline: true + Config: + docs: > + Configuration used to specify which models should be used and with what + settings. 
+ properties: + burst: + type: optional> + docs: | + Configuration for the vocal burst emotion model. + + Note: Model configuration is not currently available in streaming. + + Please use the default configuration by passing an empty object `{}`. + face: + type: optional + docs: > + Configuration for the facial expression emotion model. + + + Note: Using the `reset_stream` parameter does not have any effect on + face identification. A single face identifier cache is maintained over + a full session whether `reset_stream` is used or not. + facemesh: + type: optional> + docs: | + Configuration for the facemesh emotion model. + + Note: Model configuration is not currently available in streaming. + + Please use the default configuration by passing an empty object `{}`. + language: + type: optional + docs: Configuration for the language emotion model. + prosody: + type: optional> + docs: | + Configuration for the speech prosody emotion model. + + Note: Model configuration is not currently available in streaming. + + Please use the default configuration by passing an empty object `{}`. + source: + openapi: streaming-asyncapi.yml + inline: true + StreamModelsEndpointPayload: + docs: Models endpoint payload + properties: + data: optional + models: + type: optional + docs: > + Configuration used to specify which models should be used and with + what settings. + stream_window_ms: + type: optional + docs: > + Length in milliseconds of streaming sliding window. + + + Extending the length of this window will prepend media context from + past payloads into the current payload. + + + For example, if on the first payload you send 500ms of data and on the + second payload you send an additional 500ms of data, a window of at + least 1000ms will allow the model to process all 1000ms of stream + data. + + + A window of 600ms would append the full 500ms of the second payload to + the last 100ms of the first payload. 
+ + + Note: This feature is currently only supported for audio data and + audio models. For other file types and models this parameter will be + ignored. + default: 5000 + validation: + min: 500 + max: 10000 + reset_stream: + type: optional + docs: > + Whether to reset the streaming sliding window before processing the + current payload. + + + If this parameter is set to `true` then past context will be deleted + before processing the current payload. + + + Use reset_stream when one audio file is done being processed and you + do not want context to leak across files. + default: false + raw_text: + type: optional + docs: > + Set to `true` to enable the data parameter to be parsed as raw text + rather than base64 encoded bytes. + + This parameter is useful if you want to send text to be processed by + the language model, but it cannot be used with other file types like + audio, image, or video. + default: false + job_details: + type: optional + docs: > + Set to `true` to get details about the job. + + + This parameter can be set in the same payload as data or it can be set + without data and models configuration to get the job details between + payloads. + + + This parameter is useful to get the unique job ID. + default: false + payload_id: + type: optional + docs: > + Pass an arbitrary string as the payload ID and get it back at the top + level of the socket response. + + + This can be useful if you have multiple requests running + asynchronously and want to disambiguate responses as they are + received. + source: + openapi: streaming-asyncapi.yml + StreamModelPredictionsJobDetails: + docs: > + If the job_details flag was set in the request, details about the current + streaming job will be returned in the response body. + properties: + job_id: + type: optional + docs: ID of the current streaming job. 
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamModelPredictionsBurstPredictionsItem:
+    properties:
+      time: optional
+      emotions: optional
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamModelPredictionsBurst:
+    docs: Response for the vocal burst emotion model.
+    properties:
+      predictions: optional>
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamModelPredictionsFacePredictionsItem:
+    properties:
+      frame:
+        type: optional
+        docs: Frame number.
+      time:
+        type: optional
+        docs: Time in seconds when face detection occurred.
+      bbox: optional
+      prob:
+        type: optional
+        docs: The predicted probability that a detected face was actually a face.
+      face_id:
+        type: optional
+        docs: >-
+          Identifier for a face. Note that this defaults to `unknown` unless
+          face identification is enabled in the face model configuration.
+      emotions: optional
+      facs: optional
+      descriptions: optional
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamModelPredictionsFace:
+    docs: Response for the facial expression emotion model.
+    properties:
+      predictions: optional>
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamModelPredictionsFacemeshPredictionsItem:
+    properties:
+      emotions: optional
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamModelPredictionsFacemesh:
+    docs: Response for the facemesh emotion model.
+    properties:
+      predictions: optional>
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamModelPredictionsLanguagePredictionsItem:
+    properties:
+      text:
+        type: optional
+        docs: A segment of text (like a word or a sentence).
+      position: optional
+      emotions: optional
+      sentiment: optional
+      toxicity: optional
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamModelPredictionsLanguage:
+    docs: Response for the language emotion model.
+ properties: + predictions: optional> + source: + openapi: streaming-asyncapi.yml + inline: true + StreamModelPredictionsProsodyPredictionsItem: + properties: + time: optional + emotions: optional + source: + openapi: streaming-asyncapi.yml + inline: true + StreamModelPredictionsProsody: + docs: Response for the speech prosody emotion model. + properties: + predictions: optional> + source: + openapi: streaming-asyncapi.yml + inline: true + StreamModelPredictions: + docs: Model predictions + properties: + payload_id: + type: optional + docs: > + If a payload ID was passed in the request, the same payload ID will be + sent back in the response body. + job_details: + type: optional + docs: > + If the job_details flag was set in the request, details about the + current streaming job will be returned in the response body. + burst: + type: optional + docs: Response for the vocal burst emotion model. + face: + type: optional + docs: Response for the facial expression emotion model. + facemesh: + type: optional + docs: Response for the facemesh emotion model. + language: + type: optional + docs: Response for the language emotion model. + prosody: + type: optional + docs: Response for the speech prosody emotion model. + source: + openapi: streaming-asyncapi.yml + inline: true + JobDetails: + docs: > + If the job_details flag was set in the request, details about the current + streaming job will be returned in the response body. + properties: + job_id: + type: optional + docs: ID of the current streaming job. + source: + openapi: streaming-asyncapi.yml + inline: true + StreamErrorMessage: + docs: Error message + properties: + error: + type: optional + docs: Error message text. + code: + type: optional + docs: Unique identifier for the error. + payload_id: + type: optional + docs: > + If a payload ID was passed in the request, the same payload ID will be + sent back in the response body. 
+      job_details:
+        type: optional
+        docs: >
+          If the job_details flag was set in the request, details about the
+          current streaming job will be returned in the response body.
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamWarningMessageJobDetails:
+    docs: >
+      If the job_details flag was set in the request, details about the current
+      streaming job will be returned in the response body.
+    properties:
+      job_id:
+        type: optional
+        docs: ID of the current streaming job.
+    source:
+      openapi: streaming-asyncapi.yml
+      inline: true
+  StreamWarningMessage:
+    docs: Warning message
+    properties:
+      warning:
+        type: optional
+        docs: Warning message text.
+      code:
+        type: optional
+        docs: Unique identifier for the warning.
+      payload_id:
+        type: optional
+        docs: >
+          If a payload ID was passed in the request, the same payload ID will be
+          sent back in the response body.
+      job_details:
+        type: optional
+        docs: >
+          If the job_details flag was set in the request, details about the
+          current streaming job will be returned in the response body.
+ source: + openapi: streaming-asyncapi.yml + inline: true + SubscribeEvent: + discriminated: false + union: + - type: StreamModelPredictions + docs: Model predictions + - type: StreamErrorMessage + docs: Error message + - type: StreamWarningMessage + docs: Warning message + source: + openapi: streaming-asyncapi.yml +imports: + streamRoot: __package__.yml diff --git a/.mock/definition/tts/__package__.yml b/.mock/definition/tts/__package__.yml new file mode 100644 index 00000000..51d358af --- /dev/null +++ b/.mock/definition/tts/__package__.yml @@ -0,0 +1,921 @@ +errors: + UnprocessableEntityError: + status-code: 422 + type: HTTPValidationError + docs: Validation Error + examples: + - value: {} + BadRequestError: + status-code: 400 + type: ErrorResponse + docs: Bad Request + examples: + - value: {} +service: + auth: false + base-path: '' + endpoints: + synthesize-json: + path: /v0/tts + method: POST + docs: >- + Synthesizes one or more input texts into speech using the specified + voice. If no voice is provided, a novel voice will be generated + dynamically. Optionally, additional context can be included to influence + the speech's style and prosody. + + + The response includes the base64-encoded audio and metadata in JSON + format. + source: + openapi: tts-openapi.json + display-name: Text-to-Speech (Json) + request: + body: + type: PostedTts + content-type: application/json + response: + docs: Successful Response + type: ReturnTts + status-code: 200 + errors: + - UnprocessableEntityError + examples: + - request: + context: + utterances: + - text: How can people see beauty so differently? + description: >- + A curious student with a clear and respectful tone, seeking + clarification on Hume's ideas with a straightforward + question. + format: + type: mp3 + num_generations: 1 + utterances: + - text: >- + Beauty is no quality in things themselves: It exists merely in + the mind which contemplates them. 
+ description: >- + Middle-aged masculine voice with a clear, rhythmic Scots lilt, + rounded vowels, and a warm, steady tone with an articulate, + academic quality. + response: + body: + generations: + - audio: //PExAA0DDYRvkpNfhv3JI5JZ...etc. + duration: 7.44225 + encoding: + format: mp3 + sample_rate: 48000 + file_size: 120192 + generation_id: 795c949a-1510-4a80-9646-7d0863b023ab + snippets: + - - audio: //PExAA0DDYRvkpNfhv3JI5JZ...etc. + generation_id: 795c949a-1510-4a80-9646-7d0863b023ab + id: 37b1b1b1-1b1b-1b1b-1b1b-1b1b1b1b1b1b + text: >- + Beauty is no quality in things themselves: It exists + merely in the mind which contemplates them. + utterance_index: 0 + timestamps: [] + request_id: 66e01f90-4501-4aa0-bbaf-74f45dc15aa725906 + synthesize-file: + path: /v0/tts/file + method: POST + docs: >- + Synthesizes one or more input texts into speech using the specified + voice. If no voice is provided, a novel voice will be generated + dynamically. Optionally, additional context can be included to influence + the speech's style and prosody. + + + The response contains the generated audio file in the requested format. + source: + openapi: tts-openapi.json + display-name: Text-to-Speech (File) + request: + body: + type: PostedTts + content-type: application/json + response: + docs: OK + type: file + status-code: 200 + errors: + - UnprocessableEntityError + examples: + - request: + context: + generation_id: 09ad914d-8e7f-40f8-a279-e34f07f7dab2 + format: + type: mp3 + num_generations: 1 + utterances: + - text: >- + Beauty is no quality in things themselves: It exists merely in + the mind which contemplates them. + description: >- + Middle-aged masculine voice with a clear, rhythmic Scots lilt, + rounded vowels, and a warm, steady tone with an articulate, + academic quality. + synthesize-file-streaming: + path: /v0/tts/stream/file + method: POST + docs: >- + Streams synthesized speech using the specified voice. 
If no voice is + provided, a novel voice will be generated dynamically. Optionally, + additional context can be included to influence the speech's style and + prosody. + source: + openapi: tts-openapi.json + display-name: Text-to-Speech (Streamed File) + request: + body: + type: PostedTts + content-type: application/json + response: + docs: OK + type: file + status-code: 200 + errors: + - UnprocessableEntityError + examples: + - request: + utterances: + - text: >- + Beauty is no quality in things themselves: It exists merely in + the mind which contemplates them. + voice: + name: Male English Actor + provider: HUME_AI + synthesize-json-streaming: + path: /v0/tts/stream/json + method: POST + docs: >- + Streams synthesized speech using the specified voice. If no voice is + provided, a novel voice will be generated dynamically. Optionally, + additional context can be included to influence the speech's style and + prosody. + + + The response is a stream of JSON objects including audio encoded in + base64. + source: + openapi: tts-openapi.json + display-name: Text-to-Speech (Streamed JSON) + request: + body: + type: PostedTts + content-type: application/json + response-stream: + docs: Successful Response + type: TtsOutput + format: json + errors: + - UnprocessableEntityError + examples: + - request: + utterances: + - text: >- + Beauty is no quality in things themselves: It exists merely in + the mind which contemplates them. + voice: + name: Male English Actor + provider: HUME_AI + convertVoiceFile: + path: /v0/tts/voice_conversion/file + method: POST + source: + openapi: tts-openapi.json + display-name: Voice Conversion Stream File + request: + name: ConvertVoiceFileRequest + body: + properties: + strip_headers: + type: optional + docs: >- + If enabled, the audio for all the chunks of a generation, once + concatenated together, will constitute a single audio file. 
+ Otherwise, if disabled, each chunk's audio will be its own audio + file, each with its own headers (if applicable). + audio: file + context: + type: optional + docs: >- + Utterances to use as context for generating consistent speech + style and prosody across multiple requests. These will not be + converted to speech output. + voice: optional + format: + type: optional + docs: Specifies the output audio file format. + include_timestamp_types: + type: optional> + docs: The set of timestamp types to include in the response. + content-type: multipart/form-data + response: + docs: Successful Response + type: file + status-code: 200 + errors: + - UnprocessableEntityError + convertVoiceJson: + path: /v0/tts/voice_conversion/json + method: POST + source: + openapi: tts-openapi.json + display-name: Voice Conversion Stream Json + request: + name: ConvertVoiceJsonRequest + body: + properties: + strip_headers: + type: optional + docs: >- + If enabled, the audio for all the chunks of a generation, once + concatenated together, will constitute a single audio file. + Otherwise, if disabled, each chunk's audio will be its own audio + file, each with its own headers (if applicable). + audio: optional + context: + type: optional + docs: >- + Utterances to use as context for generating consistent speech + style and prosody across multiple requests. These will not be + converted to speech output. + voice: optional + format: + type: optional + docs: Specifies the output audio file format. + include_timestamp_types: + type: optional> + docs: The set of timestamp types to include in the response. 
+ content-type: multipart/form-data + response-stream: + docs: Successful Response + type: TtsOutput + format: json + errors: + - UnprocessableEntityError + examples: + - request: {} + response: + stream: + - audio: audio + audio_format: mp3 + chunk_index: 1 + generation_id: generation_id + is_last_chunk: true + request_id: request_id + snippet: + audio: audio + generation_id: generation_id + id: id + text: text + timestamps: + - text: text + time: + begin: 1 + end: 1 + type: word + transcribed_text: transcribed_text + utterance_index: 1 + snippet_id: snippet_id + text: text + transcribed_text: transcribed_text + type: audio + utterance_index: 1 + source: + openapi: tts-openapi.json +types: + PostedContext: + discriminated: false + docs: >- + Utterances to use as context for generating consistent speech style and + prosody across multiple requests. These will not be converted to speech + output. + union: + - type: PostedContextWithGenerationId + - type: PostedContextWithUtterances + source: + openapi: tts-openapi.json + inline: true + Format: + discriminated: false + docs: Specifies the output audio file format. + union: + - type: FormatMp3 + - type: FormatPcm + - type: FormatWav + source: + openapi: tts-openapi.json + inline: true + AudioFormatType: + enum: + - mp3 + - pcm + - wav + source: + openapi: tts-openapi.json + PublishTts: + docs: Input message type for the TTS stream. + properties: + close: + type: optional + docs: Force the generation of audio and close the stream. + default: false + description: + type: optional + docs: >- + Natural language instructions describing how the text should be spoken + by the model (e.g., `"a soft, gentle voice with a strong British + accent"`). + validation: + maxLength: 1000 + flush: + type: optional + docs: >- + Force the generation of audio regardless of how much text has been + supplied. + default: false + speed: + type: optional + docs: A relative measure of how fast this utterance should be spoken. 
+ default: 1 + validation: + min: 0.25 + max: 3 + text: + type: optional + docs: The input text to be converted to speech output. + default: '' + validation: + maxLength: 5000 + trailing_silence: + type: optional + docs: Duration of trailing silence (in seconds) to add to this utterance + default: 0 + validation: + min: 0 + max: 5 + voice: + type: optional + docs: >- + The name or ID of the voice from the `Voice Library` to be used as the + speaker for this and all subsequent utterances, until the `"voice"` + field is updated again. + source: + openapi: tts-asyncapi.json + MillisecondInterval: + properties: + begin: + type: integer + docs: Start time of the interval in milliseconds. + end: + type: integer + docs: End time of the interval in milliseconds. + source: + openapi: tts-openapi.json + TimestampMessage: + docs: A word or phoneme level timestamp for the generated audio. + properties: + generation_id: + type: string + docs: >- + The generation ID of the parent snippet that this chunk corresponds + to. + request_id: + type: string + docs: ID of the initiating request. + snippet_id: + type: string + docs: The ID of the parent snippet that this chunk corresponds to. + timestamp: + type: Timestamp + docs: A word or phoneme level timestamp for the generated audio. + type: literal<"timestamp"> + source: + openapi: tts-openapi.json + SnippetAudioChunk: + docs: Metadata for a chunk of generated audio. + properties: + audio: + type: string + docs: The generated audio output chunk in the requested format. + audio_format: + type: AudioFormatType + docs: The generated audio output format. + chunk_index: + type: integer + docs: The index of the audio chunk in the snippet. + generation_id: + type: string + docs: >- + The generation ID of the parent snippet that this chunk corresponds + to. + is_last_chunk: + type: boolean + docs: >- + Whether or not this is the last chunk streamed back from the decoder + for one input snippet. 
+ request_id: + type: string + docs: ID of the initiating request. + snippet: optional + snippet_id: + type: string + docs: The ID of the parent snippet that this chunk corresponds to. + text: + type: string + docs: The text of the parent snippet that this chunk corresponds to. + transcribed_text: + type: optional + docs: >- + The transcribed text of the generated audio of the parent snippet that + this chunk corresponds to. It is only present if `instant_mode` is set + to `false`. + type: literal<"audio"> + utterance_index: + type: optional + docs: >- + The index of the utterance in the request that the parent snippet of + this chunk corresponds to. + source: + openapi: tts-openapi.json + Timestamp: + properties: + text: string + time: + type: MillisecondInterval + type: + type: TimestampType + source: + openapi: tts-openapi.json + TimestampType: + enum: + - word + - phoneme + source: + openapi: tts-openapi.json + PostedUtteranceVoiceWithId: + properties: + id: + type: string + docs: The unique ID associated with the **Voice**. + provider: + type: optional + docs: >- + Specifies the source provider associated with the chosen voice. + + + - **`HUME_AI`**: Select voices from Hume's [Voice + Library](https://platform.hume.ai/tts/voice-library), containing a + variety of preset, shared voices. + + - **`CUSTOM_VOICE`**: Select from voices you've personally generated + and saved in your account. + + + If no provider is explicitly set, the default provider is + `CUSTOM_VOICE`. When using voices from Hume's **Voice Library**, you + must explicitly set the provider to `HUME_AI`. + + + Preset voices from Hume's **Voice Library** are accessible by all + users. In contrast, your custom voices are private and accessible only + via requests authenticated with your API key. + source: + openapi: tts-openapi.json + PostedUtteranceVoiceWithName: + properties: + name: + type: string + docs: The name of a **Voice**. 
+ provider: + type: optional + docs: >- + Specifies the source provider associated with the chosen voice. + + + - **`HUME_AI`**: Select voices from Hume's [Voice + Library](https://platform.hume.ai/tts/voice-library), containing a + variety of preset, shared voices. + + - **`CUSTOM_VOICE`**: Select from voices you've personally generated + and saved in your account. + + + If no provider is explicitly set, the default provider is + `CUSTOM_VOICE`. When using voices from Hume's **Voice Library**, you + must explicitly set the provider to `HUME_AI`. + + + Preset voices from Hume's **Voice Library** are accessible by all + users. In contrast, your custom voices are private and accessible only + via requests authenticated with your API key. + source: + openapi: tts-openapi.json + VoiceProvider: + enum: + - HUME_AI + - CUSTOM_VOICE + source: + openapi: tts-openapi.json + PostedUtteranceVoice: + discriminated: false + union: + - type: PostedUtteranceVoiceWithId + - type: PostedUtteranceVoiceWithName + source: + openapi: tts-openapi.json + OctaveVersion: + enum: + - value: '1' + name: One + - value: '2' + name: Two + docs: >- + Selects the Octave model version used to synthesize speech for this + request. If you omit this field, Hume automatically routes the request to + the most appropriate model. Setting a specific version ensures stable and + repeatable behavior across requests. + + + Use `2` to opt into the latest Octave capabilities. When you specify + version `2`, you must also provide a `voice`. Requests that set `version: + 2` without a voice will be rejected. + + + For a comparison of Octave versions, see the [Octave + versions](/docs/text-to-speech-tts/overview#octave-versions) section in + the TTS overview. 
+ source: + openapi: tts-openapi.json + TtsOutput: + discriminated: false + union: + - type: SnippetAudioChunk + - type: TimestampMessage + source: + openapi: tts-openapi.json + Snippet: + properties: + audio: + type: string + docs: >- + The segmented audio output in the requested format, encoded as a + base64 string. + generation_id: + type: string + docs: The generation ID this snippet corresponds to. + id: + type: string + docs: A unique ID associated with this **Snippet**. + text: + type: string + docs: The text for this **Snippet**. + timestamps: + docs: A list of word or phoneme level timestamps for the generated audio. + type: list + transcribed_text: + type: optional + docs: >- + The transcribed text of the generated audio. It is only present if + `instant_mode` is set to `false`. + utterance_index: + type: optional + docs: The index of the utterance in the request this snippet corresponds to. + source: + openapi: tts-openapi.json + PostedContextWithGenerationId: + properties: + generation_id: + type: string + docs: >- + The ID of a prior TTS generation to use as context for generating + consistent speech style and prosody across multiple requests. + Including context may increase audio generation times. + source: + openapi: tts-openapi.json + PostedContextWithUtterances: + properties: + utterances: + type: list + source: + openapi: tts-openapi.json + AudioEncoding: + docs: >- + Encoding information about the generated audio, including the `format` and + `sample_rate`. + properties: + format: + type: AudioFormatType + docs: Format for the output audio. + sample_rate: + type: integer + docs: >- + The sample rate (`Hz`) of the generated audio. The default sample rate + is `48000 Hz`. + source: + openapi: tts-openapi.json + ReturnGeneration: + properties: + audio: + type: string + docs: >- + The generated audio output in the requested format, encoded as a + base64 string. + duration: + type: double + docs: Duration of the generated audio in seconds. 
+ encoding: + type: AudioEncoding + file_size: + type: integer + docs: Size of the generated audio in bytes. + generation_id: + type: string + docs: >- + A unique ID associated with this TTS generation that can be used as + context for generating consistent speech style and prosody across + multiple requests. + snippets: + docs: >- + A list of snippet groups where each group corresponds to an utterance + in the request. Each group contains segmented snippets that represent + the original utterance divided into more natural-sounding units + optimized for speech delivery. + type: list> + source: + openapi: tts-openapi.json + HTTPValidationError: + properties: + detail: + type: optional> + source: + openapi: tts-openapi.json + FormatMp3: + properties: + type: literal<"mp3"> + source: + openapi: tts-openapi.json + PostedTts: + properties: + context: + type: optional + docs: >- + Utterances to use as context for generating consistent speech style + and prosody across multiple requests. These will not be converted to + speech output. + format: + type: optional + docs: Specifies the output audio file format. + include_timestamp_types: + type: optional> + docs: The set of timestamp types to include in the response. + num_generations: + type: optional + docs: >- + Number of audio generations to produce from the input utterances. + + + Using `num_generations` enables faster processing than issuing + multiple sequential requests. Additionally, specifying + `num_generations` allows prosody continuation across all generations + without repeating context, ensuring each generation sounds slightly + different while maintaining contextual consistency. + default: 1 + validation: + min: 1 + max: 5 + split_utterances: + type: optional + docs: >- + Controls how audio output is segmented in the response. + + + - When **enabled** (`true`), input utterances are automatically split + into natural-sounding speech segments. 
+ + + - When **disabled** (`false`), the response maintains a strict + one-to-one mapping between input utterances and output snippets. + + + This setting affects how the `snippets` array is structured in the + response, which may be important for applications that need to track + the relationship between input text and generated audio segments. When + setting to `false`, avoid including utterances with long `text`, as + this can result in distorted output. + default: true + strip_headers: + type: optional + docs: >- + If enabled, the audio for all the chunks of a generation, once + concatenated together, will constitute a single audio file. Otherwise, + if disabled, each chunk's audio will be its own audio file, each with + its own headers (if applicable). + default: false + utterances: + docs: >- + A list of **Utterances** to be converted to speech output. + + + An **Utterance** is a unit of input for + [Octave](/docs/text-to-speech-tts/overview), and includes input + `text`, an optional `description` to serve as the prompt for how the + speech should be delivered, an optional `voice` specification, and + additional controls to guide delivery for `speed` and + `trailing_silence`. + type: list + version: + type: optional + docs: >- + Selects the Octave model version used to synthesize speech for this + request. If you omit this field, Hume automatically routes the request + to the most appropriate model. Setting a specific version ensures + stable and repeatable behavior across requests. + + + Use `2` to opt into the latest Octave capabilities. When you specify + version `2`, you must also provide a `voice`. Requests that set + `version: 2` without a voice will be rejected. + + + For a comparison of Octave versions, see the [Octave + versions](/docs/text-to-speech-tts/overview#octave-versions) section + in the TTS overview. 
+ instant_mode: + type: optional + docs: >- + Enables ultra-low latency streaming, significantly reducing the time + until the first audio chunk is received. Recommended for real-time + applications requiring immediate audio playback. For further details, + see our documentation on [instant + mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). + + - A + [voice](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.utterances.voice) + must be specified when instant mode is enabled. Dynamic voice + generation is not supported with this mode. + + - Instant mode is only supported for streaming endpoints (e.g., + [/v0/tts/stream/json](/reference/text-to-speech-tts/synthesize-json-streaming), + [/v0/tts/stream/file](/reference/text-to-speech-tts/synthesize-file-streaming)). + + - Ensure only a single generation is requested + ([num_generations](/reference/text-to-speech-tts/synthesize-json-streaming#request.body.num_generations) + must be `1` or omitted). + default: true + source: + openapi: tts-openapi.json + ReturnTts: + properties: + generations: + type: list + request_id: + type: optional + docs: >- + A unique ID associated with this request for tracking and + troubleshooting. Use this ID when contacting [support](/support) for + troubleshooting assistance. + source: + openapi: tts-openapi.json + ReturnVoice: + docs: An Octave voice available for text-to-speech + properties: + id: + type: optional + docs: ID of the voice in the `Voice Library`. + name: + type: optional + docs: Name of the voice in the `Voice Library`. + provider: + type: optional + docs: >- + The provider associated with the created voice. + + + Voices created through this endpoint will always have the provider set + to `CUSTOM_VOICE`, indicating a custom voice stored in your account. 
+      compatible_octave_models: optional>
+    source:
+      openapi: tts-openapi.json
+  FormatPcm:
+    properties:
+      type: literal<"pcm">
+    source:
+      openapi: tts-openapi.json
+  PostedUtterance:
+    properties:
+      description:
+        type: optional
+        docs: >-
+          Natural language instructions describing how the synthesized speech
+          should sound, including but not limited to tone, intonation, pacing,
+          and accent.
+
+
+          **This field behaves differently depending on whether a voice is
+          specified**:
+
+          - **Voice specified**: the description will serve as acting directions
+          for delivery. Keep directions concise—100 characters or fewer—for best
+          results. See our guide on [acting
+          instructions](/docs/text-to-speech-tts/acting-instructions).
+
+          - **Voice not specified**: the description will serve as a voice
+          prompt for generating a voice. See our [prompting
+          guide](/docs/text-to-speech-tts/prompting) for design tips.
+        validation:
+          maxLength: 1000
+      speed:
+        type: optional
+        docs: >-
+          Speed multiplier for the synthesized speech. Extreme values below 0.75
+          and above 1.5 may sometimes cause instability in the generated output.
+        default: 1
+        validation:
+          min: 0.5
+          max: 2
+      text:
+        type: string
+        docs: The input text to be synthesized into speech.
+        validation:
+          maxLength: 5000
+      trailing_silence:
+        type: optional
+        docs: Duration of trailing silence (in seconds) to add to this utterance
+        default: 0
+        validation:
+          min: 0
+          max: 5
+      voice:
+        type: optional
+        docs: >-
+          The `name` or `id` associated with a **Voice** from the **Voice
+          Library** to be used as the speaker for this and all subsequent
+          `utterances`, until the `voice` field is updated again.
+
+          See our [voices guide](/docs/text-to-speech-tts/voices) for more details on generating and specifying **Voices**.
+    source:
+      openapi: tts-openapi.json
+  ValidationErrorLocItem:
+    discriminated: false
+    union:
+      - string
+      - integer
+    source:
+      openapi: tts-openapi.json
+    inline: true
+  ValidationError:
+    properties:
+      loc:
+        type: list
+      msg: string
+      type: string
+    source:
+      openapi: tts-openapi.json
+  FormatWav:
+    properties:
+      type: literal<"wav">
+    source:
+      openapi: tts-openapi.json
+  ErrorResponse:
+    properties:
+      error: optional
+      message: optional
+      code: optional
+    source:
+      openapi: tts-openapi.json
+  ReturnPagedVoices:
+    docs: A paginated list of Octave voices available for text-to-speech
+    properties:
+      page_number:
+        type: optional
+        docs: >-
+          The page number of the returned list.
+
+
+          This value corresponds to the `page_number` parameter specified in the
+          request. Pagination uses zero-based indexing.
+      page_size:
+        type: optional
+        docs: >-
+          The maximum number of items returned per page.
+
+
+          This value corresponds to the `page_size` parameter specified in the
+          request.
+      total_pages:
+        type: optional
+        docs: The total number of pages in the collection.
+      voices_page:
+        type: optional>
+        docs: >-
+          List of voices returned for the specified `page_number` and
+          `page_size`.
+    source:
+      openapi: tts-openapi.json
diff --git a/.mock/definition/tts/streamInput.yml b/.mock/definition/tts/streamInput.yml
new file mode 100644
index 00000000..807536e2
--- /dev/null
+++ b/.mock/definition/tts/streamInput.yml
@@ -0,0 +1,96 @@
+imports:
+  root: __package__.yml
+channel:
+  path: /stream/input
+  url: tts
+  auth: false
+  docs: Generate emotionally expressive speech.
+  query-parameters:
+    access_token:
+      type: optional
+      default: ''
+      docs: >-
+        Access token used for authenticating the client. If not provided, an
+        `api_key` must be provided to authenticate.
+
+
+        The access token is generated using both an API key and a Secret key,
+        which provides an additional layer of security compared to using just an
+        API key.
+ + + For more details, refer to the [Authentication Strategies + Guide](/docs/introduction/api-key#authentication-strategies). + context_generation_id: + type: optional + docs: >- + The ID of a prior TTS generation to use as context for generating + consistent speech style and prosody across multiple requests. Including + context may increase audio generation times. + format_type: + type: optional + docs: The format to be used for audio generation. + include_timestamp_types: + type: optional + allow-multiple: true + docs: The set of timestamp types to include in the response. + instant_mode: + type: optional + default: true + docs: >- + Enables ultra-low latency streaming, significantly reducing the time + until the first audio chunk is received. Recommended for real-time + applications requiring immediate audio playback. For further details, + see our documentation on [instant + mode](/docs/text-to-speech-tts/overview#ultra-low-latency-streaming-instant-mode). + no_binary: + type: optional + default: false + docs: If enabled, no binary websocket messages will be sent to the client. + strip_headers: + type: optional + default: false + docs: >- + If enabled, the audio for all the chunks of a generation, once + concatenated together, will constitute a single audio file. Otherwise, + if disabled, each chunk's audio will be its own audio file, each with + its own headers (if applicable). + version: + type: optional + docs: >- + The version of the Octave Model to use. 1 for the legacy model, 2 for + the new model. + api_key: + type: optional + default: '' + docs: >- + API key used for authenticating the client. If not provided, an + `access_token` must be provided to authenticate. + + + For more details, refer to the [Authentication Strategies + Guide](/docs/introduction/api-key#authentication-strategies). 
+  messages:
+    publish:
+      origin: client
+      body:
+        type: root.PublishTts
+    subscribe:
+      origin: server
+      body:
+        type: root.TtsOutput
+  examples:
+    - messages:
+        - type: publish
+          body: {}
+        - type: subscribe
+          body:
+            audio: audio
+            audio_format: mp3
+            chunk_index: 1
+            generation_id: generation_id
+            is_last_chunk: true
+            request_id: request_id
+            snippet_id: snippet_id
+            text: text
+            type: audio
diff --git a/.mock/definition/tts/voices.yml b/.mock/definition/tts/voices.yml
new file mode 100644
index 00000000..198a700d
--- /dev/null
+++ b/.mock/definition/tts/voices.yml
@@ -0,0 +1,140 @@
+imports:
+  root: __package__.yml
+service:
+  auth: false
+  base-path: ''
+  endpoints:
+    list:
+      path: /v0/tts/voices
+      method: GET
+      docs: >-
+        Lists voices you have saved in your account, or voices from the [Voice
+        Library](https://platform.hume.ai/tts/voice-library).
+      pagination:
+        offset: $request.page_number
+        results: $response.voices_page
+      source:
+        openapi: tts-openapi.json
+      display-name: List voices
+      request:
+        name: VoicesListRequest
+        query-parameters:
+          provider:
+            type: root.VoiceProvider
+            docs: >-
+              Specify the voice provider to filter voices returned by the
+              endpoint:
+
+
+              - **`HUME_AI`**: Lists preset, shared voices from Hume's [Voice
+              Library](https://platform.hume.ai/tts/voice-library).
+
+              - **`CUSTOM_VOICE`**: Lists custom voices created and saved to
+              your account.
+          page_number:
+            type: optional<integer>
+            default: 0
+            docs: >-
+              Specifies the page number to retrieve, enabling pagination.
+
+
+              This parameter uses zero-based indexing. For example, setting
+              `page_number` to 0 retrieves the first page of results (items 0-9
+              if `page_size` is 10), setting `page_number` to 1 retrieves the
+              second page (items 10-19), and so on. Defaults to 0, which
+              retrieves the first page.
+          page_size:
+            type: optional<integer>
+            docs: >-
+              Specifies the maximum number of results to include per page,
+              enabling pagination. The value must be between 1 and 100,
+              inclusive.
+
+
+              For example, if `page_size` is set to 10, each page will include
+              up to 10 items. Defaults to 10.
+          ascending_order: optional<boolean>
+      response:
+        docs: Success
+        type: root.ReturnPagedVoices
+        status-code: 200
+      errors:
+        - root.BadRequestError
+      examples:
+        - query-parameters:
+            provider: CUSTOM_VOICE
+          response:
+            body:
+              page_number: 0
+              page_size: 10
+              total_pages: 1
+              voices_page:
+                - id: c42352c0-4566-455d-b180-0f654b65b525
+                  name: David Hume
+                  provider: CUSTOM_VOICE
+                - id: d87352b0-26a3-4b11-081b-d157a5674d19
+                  name: Goliath Hume
+                  provider: CUSTOM_VOICE
+    create:
+      path: /v0/tts/voices
+      method: POST
+      docs: >-
+        Saves a new custom voice to your account using the specified TTS
+        generation ID.
+
+
+        Once saved, this voice can be reused in subsequent TTS requests,
+        ensuring consistent speech style and prosody. For more details on voice
+        creation, see the [Voices Guide](/docs/text-to-speech-tts/voices).
+      source:
+        openapi: tts-openapi.json
+      display-name: Create voice
+      request:
+        name: PostedVoice
+        body:
+          properties:
+            generation_id:
+              type: string
+              docs: >-
+                A unique ID associated with this TTS generation that can be used
+                as context for generating consistent speech style and prosody
+                across multiple requests.
+            name:
+              type: string
+              docs: Name of the voice in the `Voice Library`.
+        content-type: application/json
+      response:
+        docs: Successful Response
+        type: root.ReturnVoice
+        status-code: 200
+      errors:
+        - root.UnprocessableEntityError
+      examples:
+        - request:
+            generation_id: 795c949a-1510-4a80-9646-7d0863b023ab
+            name: David Hume
+          response:
+            body:
+              id: c42352c0-4566-455d-b180-0f654b65b525
+              name: David Hume
+              provider: CUSTOM_VOICE
+    delete:
+      path: /v0/tts/voices
+      method: DELETE
+      docs: Deletes a previously generated custom voice.
+ source: + openapi: tts-openapi.json + display-name: Delete voice + request: + name: VoicesDeleteRequest + query-parameters: + name: + type: string + docs: Name of the voice to delete + errors: + - root.BadRequestError + examples: + - query-parameters: + name: David Hume + source: + openapi: tts-openapi.json diff --git a/.mock/fern.config.json b/.mock/fern.config.json new file mode 100644 index 00000000..188e89b3 --- /dev/null +++ b/.mock/fern.config.json @@ -0,0 +1,4 @@ +{ + "organization": "hume", + "version": "0.108.0" +} diff --git a/package.json b/package.json index 7c9caf53..051b50a9 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "hume", - "version": "0.15.3", + "version": "0.15.4", "private": false, "repository": "github:humeai/hume-typescript-sdk", "type": "commonjs", diff --git a/pnpm b/pnpm deleted file mode 100644 index e69de29b..00000000 diff --git a/src/Client.ts b/src/Client.ts index de83e9d0..5d942071 100644 --- a/src/Client.ts +++ b/src/Client.ts @@ -27,8 +27,8 @@ export class HumeClient { { "X-Fern-Language": "JavaScript", "X-Fern-SDK-Name": "hume", - "X-Fern-SDK-Version": "0.15.3", - "User-Agent": "hume/0.15.3", + "X-Fern-SDK-Version": "0.15.4", + "User-Agent": "hume/0.15.4", "X-Fern-Runtime": core.RUNTIME.type, "X-Fern-Runtime-Version": core.RUNTIME.version, }, diff --git a/src/api/resources/empathicVoice/types/ToolCallMessage.ts b/src/api/resources/empathicVoice/types/ToolCallMessage.ts index 4c7205b6..0e8f9dd2 100644 --- a/src/api/resources/empathicVoice/types/ToolCallMessage.ts +++ b/src/api/resources/empathicVoice/types/ToolCallMessage.ts @@ -10,9 +10,13 @@ export interface ToolCallMessage { customSessionId?: string; /** Name of the tool called. */ name: string; - /** Parameters of the tool call. Is a stringified JSON schema. */ + /** + * Parameters of the tool. + * + * These parameters define the inputs needed for the tool's execution, including the expected data type and description for each input field. 
Structured as a stringified JSON schema, this format ensures the tool receives data in the expected format. + */ parameters: string; - /** Indicates whether a response to the tool call is required from the developer, either in the form of a [Tool Response message](/reference/empathic-voice-interface-evi/chat/chat#send.Tool%20Response%20Message.type) or a [Tool Error message](/reference/empathic-voice-interface-evi/chat/chat#send.Tool%20Error%20Message.type). */ + /** Indicates whether a response to the tool call is required from the developer, either in the form of a [Tool Response message](/reference/speech-to-speech-evi/chat#send.ToolResponseMessage) or a [Tool Error message](/reference/speech-to-speech-evi/chat#send.ToolErrorMessage). */ responseRequired: boolean; /** * The unique identifier for a specific tool call instance. @@ -21,11 +25,11 @@ export interface ToolCallMessage { */ toolCallId: string; /** Type of tool called. Either `builtin` for natively implemented tools, like web search, or `function` for user-defined tools. */ - toolType: Hume.empathicVoice.ToolType; + toolType?: Hume.empathicVoice.ToolType; /** * The type of message sent through the socket; for a Tool Call message, this must be `tool_call`. * * This message indicates that the supplemental LLM has detected a need to invoke the specified tool. 
*/ - type?: "tool_call"; + type: "tool_call"; } diff --git a/src/serialization/resources/empathicVoice/types/ToolCallMessage.ts b/src/serialization/resources/empathicVoice/types/ToolCallMessage.ts index 0c0220eb..9cf62d44 100644 --- a/src/serialization/resources/empathicVoice/types/ToolCallMessage.ts +++ b/src/serialization/resources/empathicVoice/types/ToolCallMessage.ts @@ -14,8 +14,8 @@ export const ToolCallMessage: core.serialization.ObjectSchema< parameters: core.serialization.string(), responseRequired: core.serialization.property("response_required", core.serialization.boolean()), toolCallId: core.serialization.property("tool_call_id", core.serialization.string()), - toolType: core.serialization.property("tool_type", ToolType), - type: core.serialization.stringLiteral("tool_call").optional(), + toolType: core.serialization.property("tool_type", ToolType.optional()), + type: core.serialization.stringLiteral("tool_call"), }); export declare namespace ToolCallMessage { @@ -25,7 +25,7 @@ export declare namespace ToolCallMessage { parameters: string; response_required: boolean; tool_call_id: string; - tool_type: ToolType.Raw; - type?: "tool_call" | null; + tool_type?: ToolType.Raw | null; + type: "tool_call"; } } diff --git a/src/version.ts b/src/version.ts index 4d012803..ed5f15c2 100644 --- a/src/version.ts +++ b/src/version.ts @@ -1 +1 @@ -export const SDK_VERSION = "0.15.3"; +export const SDK_VERSION = "0.15.4"; diff --git a/tsc_output.txt b/tsc_output.txt deleted file mode 100644 index 9d547802..00000000 --- a/tsc_output.txt +++ /dev/null @@ -1,50 +0,0 @@ -src/api/resources/empathicVoice/resources/chat/client/Client.ts(145,36): error TS9013: Expression type can't be inferred with --isolatedDeclarations. -src/api/resources/empathicVoice/resources/chat/client/Client.ts(145,64): error TS9013: Expression type can't be inferred with --isolatedDeclarations. 
-src/api/resources/empathicVoice/resources/chat/client/Socket.ts(79,12): error TS9008: Method must have an explicit return type annotation with --isolatedDeclarations. -src/core/websocket/ws.ts(173,9): error TS9009: At least one accessor must have an explicit type annotation with --isolatedDeclarations. -src/core/websocket/ws.ts(176,9): error TS9009: At least one accessor must have an explicit type annotation with --isolatedDeclarations. -src/core/websocket/ws.ts(179,9): error TS9009: At least one accessor must have an explicit type annotation with --isolatedDeclarations. -src/core/websocket/ws.ts(182,9): error TS9009: At least one accessor must have an explicit type annotation with --isolatedDeclarations. -src/core/websocket/ws.ts(283,12): error TS9008: Method must have an explicit return type annotation with --isolatedDeclarations. -src/core/websocket/ws.ts(302,12): error TS9008: Method must have an explicit return type annotation with --isolatedDeclarations. -src/core/websocket/ws.ts(317,12): error TS9008: Method must have an explicit return type annotation with --isolatedDeclarations. -src/index.ts(4,27): error TS1205: Re-exporting a type when 'isolatedModules' is enabled requires using 'export type'. -src/wrapper/checkForAudioTracks.ts(8,36): error TS9007: Function must have an explicit return type annotation with --isolatedDeclarations. -src/wrapper/EVIWebAudioPlayer.ts(350,5): error TS9008: Method must have an explicit return type annotation with --isolatedDeclarations. -src/wrapper/EVIWebAudioPlayer.ts(381,5): error TS9008: Method must have an explicit return type annotation with --isolatedDeclarations. -src/wrapper/EVIWebAudioPlayer.ts(393,5): error TS9008: Method must have an explicit return type annotation with --isolatedDeclarations. -src/wrapper/EVIWebAudioPlayer.ts(402,5): error TS9008: Method must have an explicit return type annotation with --isolatedDeclarations. 
-src/wrapper/EVIWebAudioPlayer.ts(411,5): error TS9008: Method must have an explicit return type annotation with --isolatedDeclarations. -src/wrapper/index.ts(12,29): error TS1205: Re-exporting a type when 'isolatedModules' is enabled requires using 'export type'. -src/wrapper/index.ts(12,58): error TS1205: Re-exporting a type when 'isolatedModules' is enabled requires using 'export type'. -src/wrapper/index.ts(16,36): error TS9007: Function must have an explicit return type annotation with --isolatedDeclarations. -Files: 1116 -Lines of Library: 39676 -Lines of Definitions: 96830 -Lines of TypeScript: 26782 -Lines of JavaScript: 0 -Lines of JSON: 0 -Lines of Other: 0 -Identifiers: 181916 -Symbols: 321629 -Types: 182763 -Instantiations: 1106408 -Memory used: 546430K -Assignability cache size: 96818 -Identity cache size: 1699 -Subtype cache size: 680 -Strict subtype cache size: 512 -I/O Read time: 0.06s -Parse time: 0.19s -ResolveModule time: 0.03s -ResolveTypeReference time: 0.00s -ResolveLibrary time: 0.00s -Program time: 0.34s -Bind time: 0.11s -Check time: 1.19s -transformTime time: 0.13s -commentTime time: 0.03s -I/O Write time: 0.11s -printTime time: 0.45s -Emit time: 0.45s -Total time: 2.09s
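Note on the `ToolCallMessage` change above: making `type` a required discriminant while relaxing `toolType` to optional affects how consumers narrow incoming socket payloads. A minimal sketch of the pattern under the new shape; the interface below is a local mirror for illustration only, not an import of the SDK's generated type:

```typescript
// Local mirror of the updated ToolCallMessage shape (illustrative, not the SDK type).
interface ToolCallMessage {
    name: string;
    parameters: string; // stringified JSON schema describing the tool's inputs
    responseRequired: boolean;
    toolCallId: string;
    toolType?: "builtin" | "function"; // now optional on incoming messages
    type: "tool_call"; // now a required discriminant
}

// Narrow an unknown payload using the required `type` discriminant.
function isToolCallMessage(msg: unknown): msg is ToolCallMessage {
    return (
        typeof msg === "object" &&
        msg !== null &&
        (msg as { type?: unknown }).type === "tool_call" &&
        typeof (msg as { toolCallId?: unknown }).toolCallId === "string"
    );
}

const payload: unknown = JSON.parse(
    '{"type":"tool_call","name":"get_weather","parameters":"{}","responseRequired":true,"toolCallId":"call_1"}',
);

if (isToolCallMessage(payload)) {
    // toolType may be absent on the wire, so default before branching on it.
    const kind = payload.toolType ?? "function";
    console.log(kind); // prints "function"
}
```

Because `type` is now always present, a `switch (msg.type)` over the full union of socket message types also narrows automatically; the runtime guard above is only needed at the raw JSON boundary.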