feat(core): support configuring stop words in model config #3209

Merged
merged 7 commits into from
Oct 5, 2024

@zwpaper zwpaper commented Sep 26, 2024

fix #3188

Member

@wsxiaoys wsxiaoys left a comment

Please also test the behavior locally (e.g., with a screen recording). Otherwise LGTM.

@@ -252,6 +254,9 @@ pub struct HttpModelConfig {
/// Used by Completion API to construct a chat model.
#[builder(default)]
pub chat_template: Option<String>,

#[builder(default)]
pub stop_words: Vec<String>,
Member

Suggested change
- pub stop_words: Vec<String>,
+ pub additional_stop_words: Vec<String>,

@@ -4,6 +4,7 @@ use trie_rs::{Trie, TrieBuilder};

pub struct StopConditionFactory {
stop_trie_cache: DashMap<String, Trie<u8>>,
model_stop_words: Vec<String>,
Member

Suggested change
- model_stop_words: Vec<String>,
+ stop_words_from_model_config: Vec<String>,
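
To make the role of this field concrete, here is a minimal sketch of a factory that keeps the stop words from the model config and merges them with per-language stop words. It is an illustration only: the `stop_words_for` method, the `HashMap` cache of plain string lists, and the de-duplication step are assumptions, whereas the code in this PR caches a `Trie<u8>` in a `DashMap`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the factory in the diff.
pub struct StopConditionFactory {
    // Assumed cache keyed by language name (the real code caches a Trie<u8>).
    cache: HashMap<String, Vec<String>>,
    // Stop words supplied through the model config.
    stop_words_from_model_config: Vec<String>,
}

impl StopConditionFactory {
    pub fn with_stop_words(stop_words: Vec<String>) -> Self {
        Self {
            cache: HashMap::new(),
            stop_words_from_model_config: stop_words,
        }
    }

    /// Returns the merged stop-word list for a language, building it on first use.
    /// `stop_words_for` is a hypothetical method name used only in this sketch.
    pub fn stop_words_for(&mut self, language: &str, language_stop_words: &[&str]) -> &[String] {
        let from_config = &self.stop_words_from_model_config;
        let merged = self.cache.entry(language.to_string()).or_insert_with(|| {
            // Start with the language defaults, then append config entries
            // that are not already present.
            let mut merged: Vec<String> =
                language_stop_words.iter().map(|s| s.to_string()).collect();
            for word in from_config {
                if !merged.contains(word) {
                    merged.push(word.clone());
                }
            }
            merged
        });
        merged.as_slice()
    }
}
```

With this shape the config-supplied words are fixed when the factory is constructed, so every language sees the same additional stop words.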

@zwpaper
Member Author

zwpaper commented Sep 27, 2024

When using the following config, I can reproduce the error:

[model.completion.http]
kind = "ollama/completion"
model_name = "qwen2.5-coder:7b-base"
api_endpoint = "http://localhost:11434"
prompt_template = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
stop_words = []

CleanShot 2024-09-28 at 01 11 27@2x

But some of the stop words do not seem to be working, even with the following config:

[model.completion.http]
kind = "ollama/completion"
model_name = "qwen2.5-coder:7b-base"
api_endpoint = "http://localhost:11434"
prompt_template = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
stop_words = ["<|endoftext|>", "<|file_sep|>"]

CleanShot 2024-09-28 at 01 15 44@2x


Still working on it and trying to figure out why.
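
For checking what the completion should look like once the configured stop words apply, a small reference function can help. The sketch below cuts a finished completion at the earliest stop word. `truncate_at_stop_word` is a hypothetical helper, not part of Tabby, and it works on a complete string rather than on the streamed output that the trie-based stop condition handles.

```rust
/// Cuts `text` at the earliest occurrence of any stop word.
/// Returns the text unchanged when no stop word is present.
/// Hypothetical helper for local checks; not the implementation in this PR.
pub fn truncate_at_stop_word<'a>(text: &'a str, stop_words: &[&str]) -> &'a str {
    let cut = stop_words
        .iter()
        // An empty stop word would match at index 0 and erase everything.
        .filter(|w| !w.is_empty())
        .filter_map(|w| text.find(*w))
        .min();
    match cut {
        Some(index) => &text[..index],
        None => text,
    }
}
```

For example, `truncate_at_stop_word("fn main() {}<|endoftext|>junk", &["<|endoftext|>", "<|file_sep|>"])` yields `fn main() {}`, which is the output one would expect to see in the editor if both stop words were honored.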

@zwpaper zwpaper force-pushed the feat/model-stop-words branch 4 times, most recently from 455b59e to 2e645bd Compare September 29, 2024 15:48
Comment on lines 35 to 41
let stop_condition_factory = match config {
Some(ModelConfig::Local(config)) => StopConditionFactory::with_stop_words(
config.additional_stop_words.unwrap_or_default(),
),
Some(ModelConfig::Http(config)) => StopConditionFactory::with_stop_words(
config.additional_stop_words.unwrap_or_default(),
),
_ => StopConditionFactory::default(),
};

Member

Suggested change
- let stop_condition_factory = match config {
-     Some(ModelConfig::Local(config)) => StopConditionFactory::with_stop_words(
-         config.additional_stop_words.unwrap_or_default(),
-     ),
-     Some(ModelConfig::Http(config)) => StopConditionFactory::with_stop_words(
-         config.additional_stop_words.unwrap_or_default(),
-     ),
-     _ => StopConditionFactory::default(),
- };
+ let additional_stop_words = match config {
+     Some(ModelConfig::Local(config)) => config.additional_stop_words.unwrap_or_default(),
+     ...
+ };

Member Author

The stop_condition_factory here covers all the stop words, not only the additional ones. Do we really want to rename stop_condition_factory to additional_stop_words?

Member

I'm asking for a restructure rather than a rename.

    let additional_stop_words = match config {
        Some(ModelConfig::Local(config)) => config.additional_stop_words.unwrap_or_default(),
        ...
    };

    let stop_condition_factory = StopConditionFactory::with_stop_words(additional_stop_words);

That way you don't have to write StopConditionFactory::with_stop_words multiple times.
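
A compilable sketch of the restructure described above, with trimmed-down stand-ins for the types in the diff. The struct definitions, the `build_factory` function name, and the `None => Vec::new()` fallback are assumptions made for illustration; the fallback relies on `with_stop_words(vec![])` being equivalent to `StopConditionFactory::default()`.

```rust
// Hypothetical, trimmed-down stand-ins for the config types in the diff.
#[derive(Clone, Default)]
pub struct LocalModelConfig {
    pub additional_stop_words: Option<Vec<String>>,
}

#[derive(Clone, Default)]
pub struct HttpModelConfig {
    pub additional_stop_words: Option<Vec<String>>,
}

#[derive(Clone)]
pub enum ModelConfig {
    Local(LocalModelConfig),
    Http(HttpModelConfig),
}

#[derive(Default)]
pub struct StopConditionFactory {
    pub stop_words_from_model_config: Vec<String>,
}

impl StopConditionFactory {
    pub fn with_stop_words(stop_words: Vec<String>) -> Self {
        Self { stop_words_from_model_config: stop_words }
    }
}

/// `build_factory` is a hypothetical wrapper used only in this sketch.
pub fn build_factory(config: Option<ModelConfig>) -> StopConditionFactory {
    // Step 1: pick the stop words out of whichever variant is configured.
    let additional_stop_words = match config {
        Some(ModelConfig::Local(config)) => config.additional_stop_words.unwrap_or_default(),
        Some(ModelConfig::Http(config)) => config.additional_stop_words.unwrap_or_default(),
        // Assumed fallback: no model config means no additional stop words.
        None => Vec::new(),
    };

    // Step 2: construct the factory exactly once.
    StopConditionFactory::with_stop_words(additional_stop_words)
}
```

The `match` now only selects data, so adding another `ModelConfig` variant means adding one arm rather than another constructor call.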

- completion_model: Option<ModelConfig>,
+ completion_model: &Option<ModelConfig>,
Member

why change this?

Member Author

Because the config passed in is used again later, this has to become a borrowed argument; otherwise the function takes ownership of the config and the caller can no longer use it.

Member

I recommend just cloning it in this particular case.
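
The two options discussed in this thread can be compared side by side. In the sketch below, `ModelConfig` is reduced to a single hypothetical field, and `stop_words_owned` and `stop_words_borrowed` are illustrative names rather than functions from this PR.

```rust
// Hypothetical, simplified config type for illustration.
#[derive(Clone, Debug, PartialEq)]
pub struct ModelConfig {
    pub additional_stop_words: Vec<String>,
}

/// Takes the config by value: the function owns whatever it is given,
/// so a caller that still needs the config passes a clone.
pub fn stop_words_owned(completion_model: Option<ModelConfig>) -> Vec<String> {
    completion_model
        .map(|config| config.additional_stop_words)
        .unwrap_or_default()
}

/// Takes the config by reference: the caller keeps ownership,
/// and only the stop words are cloned inside the function.
pub fn stop_words_borrowed(completion_model: &Option<ModelConfig>) -> Vec<String> {
    completion_model
        .as_ref()
        .map(|config| config.additional_stop_words.clone())
        .unwrap_or_default()
}
```

With the owned signature, the call site becomes `stop_words_owned(config.clone())` and `config` stays usable afterwards. That keeps the existing function signature unchanged at the cost of one clone of a small struct, which appears to be the trade-off the reviewer prefers here.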

@wsxiaoys wsxiaoys changed the title feat: add stop words to model config feat(core): support configuring stop words in model config Oct 5, 2024
Member

@wsxiaoys wsxiaoys left a comment

Please fix the unit test.

@wsxiaoys wsxiaoys enabled auto-merge (squash) October 5, 2024 04:41
@wsxiaoys wsxiaoys disabled auto-merge October 5, 2024 04:41
@wsxiaoys wsxiaoys merged commit 780b9eb into TabbyML:main Oct 5, 2024
3 of 5 checks passed
Successfully merging this pull request may close these issues.

Support customize additional stop words.
2 participants