First of all, my goal is to implement a chatbot that bills by token usage, the same way the original APIs do.
It's known that OpenAI's and Anthropic's models report input and output token usage at the end of a streamed response.
Rather than implementing my own low-level communication with the LLM providers, I want to integrate through Spring AI's abstractions.
Unfortunately, I've tried both the high-level and the low-level approaches the framework provides and failed to obtain the usage metadata either way. So my question is: how do I get this metadata properly?
Here are some snippets:
```java
var messages = request.getMessages().stream()
        .filter(m -> !m.role().equals("system"))
        .map(m -> new AnthropicApi.RequestMessage(
                List.of(new AnthropicApi.MediaContent((String) m.content())),
                AnthropicApi.Role.valueOf(m.role().toUpperCase())))
        .toList();

// Streaming request
Flux<AnthropicApi.StreamResponse> streamResponse = anthropicApi.chatCompletionStream(
        new AnthropicApi.ChatCompletionRequest(
                AnthropicApi.ChatModel.CLAUDE_3_OPUS.getValue(),
                messages,
                null,
                request.getMaxTokens(),
                0.8f,
                true));
```
```java
var messages = request.getMessages().stream()
        .map(m -> new OpenAiApi.ChatCompletionMessage(
                m.content(),
                OpenAiApi.ChatCompletionMessage.Role.valueOf(m.role().toUpperCase())))
        .toList();

// Streaming request
Flux<ChatCompletionChunk> streamResponse = openAiApi.chatCompletionStream(
        new OpenAiApi.ChatCompletionRequest(messages, request.getModel(),
                (float) request.getTemperature(), true));
return streamResponse;
```
```java
var openAiChatOptions = OpenAiChatOptions.builder()
        .withModel(request.getModel())
        .withTemperature((float) request.getTemperature())
        .withMaxTokens(request.getMaxTokens())
        .build();
var chatModel = new OpenAiChatModel(openAiApi, openAiChatOptions);

var messages = request.getMessages().stream()
        .map(m -> (Message) switch (MessageType.fromValue(m.role())) {
            case USER -> new UserMessage((String) m.content());
            case ASSISTANT -> new AssistantMessage((String) m.content());
            case SYSTEM -> new SystemMessage((String) m.content());
            case FUNCTION -> new FunctionMessage((String) m.content());
            default -> throw new IllegalStateException(
                    "Unexpected value: " + MessageType.fromValue(m.role()));
        })
        .toList();

Flux<ChatResponse> response = chatModel.stream(new Prompt(messages));
return response; // .map(ChatResponse::getResult);
```
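Whichever integration path ends up exposing the metadata, the billing side only needs to pick the usage out of the finished stream and fold it into a cost. Here is a minimal, framework-free sketch of that step; all type and method names below are hypothetical stand-ins, not Spring AI or provider API, and the per-1K-token rates are placeholders:

```java
import java.util.List;
import java.util.Optional;

public class UsageBillingSketch {

    // Hypothetical stand-in for the usage block the provider attaches
    // to the final streamed chunk (null on all earlier chunks).
    record Usage(long inputTokens, long outputTokens) {}

    record StreamChunk(String contentDelta, Usage usage) {}

    // Scan the completed stream for the chunk that carries usage,
    // keeping the last one seen.
    static Optional<Usage> extractUsage(List<StreamChunk> chunks) {
        return chunks.stream()
                .map(StreamChunk::usage)
                .filter(u -> u != null)
                .reduce((first, last) -> last);
    }

    // Turn usage into a billable amount, given prices per 1K
    // input and output tokens.
    static double cost(Usage usage, double inPer1k, double outPer1k) {
        return usage.inputTokens() / 1000.0 * inPer1k
             + usage.outputTokens() / 1000.0 * outPer1k;
    }

    public static void main(String[] args) {
        List<StreamChunk> stream = List.of(
                new StreamChunk("Hel", null),
                new StreamChunk("lo", null),
                new StreamChunk("", new Usage(12, 34)));

        Usage usage = extractUsage(stream).orElseThrow();
        System.out.println(usage.inputTokens() + " " + usage.outputTokens());
        System.out.printf("%.5f%n", cost(usage, 0.01, 0.03));
    }
}
```

So the open part is really just how to surface the provider's final-chunk usage through the `Flux` returned by Spring AI; once it is visible, the accumulation above is trivial.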