Hi folks,
I recently came across this project and love what I'm seeing here.
I've also been reading about an approach to improving attention so that, in multi-round chat interactions, models don't lose fluency in their responses while still working within the context windows they were pre-trained with.
I don't have the experience right now to do the integration myself, so this is really just a pointer for future reference.
The referenced work adapts attention sinks to the HuggingFace transformers library, so multiple models can be evaluated on long-running chat performance with relative ease.
https://github.com/tomaarsen/attention_sinks
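As a rough illustration of why this could integrate easily: the repo describes drop-in replacements for the transformers `Auto` classes, so adopting it would mostly be an import change. The sketch below is only my reading of that idea, not tested against privateGPT; the model name and the `attention_sink_size` / `attention_sink_window_size` values are illustrative assumptions.

```python
# Minimal sketch, assuming attention_sinks exposes drop-in Auto classes
# as its README describes. Model name and parameter values are
# illustrative, not recommendations.
from attention_sinks import AutoModelForCausalLM
from transformers import AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # any supported causal LM

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    attention_sink_size=4,            # initial "sink" tokens always kept in cache
    attention_sink_window_size=1020,  # sliding window over the most recent tokens
)

# Generation then works as with plain transformers; the sink cache keeps
# memory bounded across long multi-round chats instead of growing forever.
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```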
For privateGPT, this could really help organizations run more effective multi-round chat-style assistants with fewer compute resources.