Hi folks,
I recently came across this project and love what I'm seeing here.
I've also been reading about an approach to improving attention so that, in multi-round chat interactions, models don't lose fluency in their responses while still working within the context windows they were pre-trained with.
I don't have the experience right now to do the integration myself, so this is really just a pointer for future reference.
The referenced work adapts attention sinks to the HuggingFace transformers library, so multiple models can be evaluated on long-running chat performance with relative ease.
https://github.com/tomaarsen/attention_sinks
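As a rough illustration of why this could integrate easily: the repo describes drop-in replacements for the transformers `Auto` classes, so adopting it would mostly be an import change. The sketch below is only my reading of that idea, not tested against privateGPT; the model name and the `attention_sink_size` / `attention_sink_window_size` values are illustrative assumptions.

```python
# Minimal sketch, assuming attention_sinks exposes drop-in Auto classes
# as its README describes. Model name and parameter values are
# illustrative, not recommendations.
from attention_sinks import AutoModelForCausalLM
from transformers import AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # any supported causal LM

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    attention_sink_size=4,            # initial "sink" tokens always kept in cache
    attention_sink_window_size=1020,  # sliding window over the most recent tokens
)

# Generation then works as with plain transformers; the sink cache keeps
# memory bounded across long multi-round chats instead of growing forever.
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```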
For privateGPT, this could really help organizations run more effective multi-round chat-style assistants with fewer compute resources.