All the major LLM platforms stream text piece by piece, so you can see the response as it's being generated. Right now, ours waits until the chatbot has fully generated an answer before displaying it to the user.
This might be harder than it looks, and might not be worth it.
- The frontend will need to be updated to show a streamed response (many libraries are built for this; I think even antd or shadcn has some components for it)
- Backend: needs to stream from the Ollama server (or OpenAI, technically) to our Chatbot server to our HelpMe server to our frontend. This might be somewhat easy to do or a huge pain in the ass.
- We need to save the chatbot response to the database. We could maybe stream it into the database (maybe???), or just save it once the stream has concluded.
- Users are typically able to stop an LLM part-way through its response. Good luck catching all the edge cases with this one.
- Some LLMs (such as DeepSeek R1) emit `<think>` blocks. Right now, our frontend just parses out the `<think>...</think>` thinking text (usually many words), but it will need to be modified to show "thinking..." while there's no closing `</think>` yet
- Probably more issues with this that I haven't yet thought about
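The `<think>` handling could be sketched as a small pure function the frontend re-runs as tokens accumulate. Everything here is hypothetical (the function name, and the assumption that the model emits literal `<think>`/`</think>` tags); the real frontend might instead want to keep the thinking text in a collapsible panel:

```typescript
// Turn the raw text accumulated so far from the stream into display text.
// Completed <think>...</think> blocks are stripped entirely; if a <think>
// tag is still unclosed, everything from it onward is replaced with a
// "thinking..." placeholder. (Hypothetical sketch -- doesn't handle a
// partially-streamed tag like "<thi" at the very end of the buffer.)
function renderStreamed(accumulated: string): string {
  // Drop any completed <think>...</think> blocks first.
  const stripped = accumulated.replace(/<think>[\s\S]*?<\/think>/g, "");
  // If a <think> tag remains, the block is still open: show a placeholder.
  const open = stripped.indexOf("<think>");
  if (open !== -1) {
    return stripped.slice(0, open) + "thinking...";
  }
  return stripped;
}
```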
Overall, I don't think it would be too hard to pull off, but it's probably just a lot more work than it's worth. Faster-feeling chatbot responses would be nice, though.
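For a sense of how the frontend pieces above might fit together, here's a sketch of a reader that displays chunks as they arrive, supports a user-initiated stop via `AbortSignal`, and resolves with the full text so it can be saved to the database once at the end. The function name and the assumption of a plain chunked-text endpoint are mine, not how our servers actually work:

```typescript
// Read a streamed response chunk by chunk, calling onChunk for live
// display. Resolves with the full accumulated text so the caller can
// persist it to the database once the stream concludes. If the signal
// aborts (user pressed "stop"), whatever arrived so far is still returned.
async function consumeStream(
  stream: ReadableStream<Uint8Array>,
  onChunk: (text: string) => void,
  signal?: AbortSignal,
): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let full = "";
  try {
    while (true) {
      if (signal?.aborted) break; // user stopped generation part-way
      const { done, value } = await reader.read();
      if (done) break;
      const text = decoder.decode(value, { stream: true });
      full += text;
      onChunk(text); // append to the on-screen message
    }
  } finally {
    reader.releaseLock();
  }
  return full; // save this to the DB once, after the stream ends
}
```

Usage would look something like `consumeStream((await fetch("/chat", { signal })).body!, appendToMessage, signal)`, with the same `AbortSignal` wired to both the fetch and the reader so a stop button cancels everything.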