Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(completion): replace CRLF with LF for code completion LLM requests #3303

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions crates/tabby/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ color-eyre = { version = "0.6.3" }
reqwest.workspace = true
async-openai.workspace = true
spinners = "4.1.1"
regex.workspace = true

[dependencies.openssl]
optional = true
Expand Down
29 changes: 1 addition & 28 deletions crates/tabby/src/routes/completions.rs
Original file line number Diff line number Diff line change
Expand Up @@ -36,38 +36,11 @@ pub async fn completions(

let user_agent = user_agent.map(|x| x.0.to_string());

let mut use_crlf = false;
if let Some(segments) = request.segments {
let mut new_segments = segments.clone();
if segments.prefix.contains("\r\n") {
use_crlf = true;
new_segments.prefix = segments.prefix.replace("\r\n", "\n");
}
if let Some(suffix) = segments.suffix {
if suffix.contains("\r\n") {
use_crlf = true;
new_segments.suffix = Some(suffix.replace("\r\n", "\n"));
}
}
request.segments = Some(new_segments);
}

match state
.generate(&request, &allowed_code_repository, user_agent.as_deref())
zwpaper marked this conversation as resolved.
Show resolved Hide resolved
.await
{
Ok(resp) => {
if use_crlf {
let mut response_crlf = resp.clone();
for (index, choice) in resp.choices.iter().enumerate() {
response_crlf.choices[index].text = choice.text.replace("\n", "\r\n");
}

return Ok(Json(response_crlf));
}

Ok(Json(resp))
}
Ok(resp) => Ok(Json(resp)),
Err(err) => {
warn!("{}", err);
Err(StatusCode::BAD_REQUEST)
Expand Down
27 changes: 22 additions & 5 deletions crates/tabby/src/services/completion.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
mod completion_prompt;

use regex::Regex;
use std::sync::Arc;

use serde::{Deserialize, Serialize};
Expand Down Expand Up @@ -321,35 +322,51 @@
self.config.max_decoding_tokens,
);

let mut use_crlf = false;
let (prompt, segments, snippets) = if let Some(prompt) = request.raw_prompt() {
(prompt, None, vec![])
} else if let Some(segments) = request.segments.as_ref() {
let mut new_segments = segments.clone();
if segments.prefix.contains("\r\n") {
wsxiaoys marked this conversation as resolved.
Show resolved Hide resolved
use_crlf = true;
new_segments.prefix = segments.prefix.replace("\r\n", "\n");

Check warning on line 332 in crates/tabby/src/services/completion.rs

View check run for this annotation

Codecov / codecov/patch

crates/tabby/src/services/completion.rs#L331-L332

Added lines #L331 - L332 were not covered by tests
zwpaper marked this conversation as resolved.
Show resolved Hide resolved
}
if let Some(suffix) = &segments.suffix {
if suffix.contains("\r\n") {
use_crlf = true;
new_segments.suffix = Some(suffix.replace("\r\n", "\n"));

Check warning on line 337 in crates/tabby/src/services/completion.rs

View check run for this annotation

Codecov / codecov/patch

crates/tabby/src/services/completion.rs#L336-L337

Added lines #L336 - L337 were not covered by tests
}
}

Check warning on line 339 in crates/tabby/src/services/completion.rs

View check run for this annotation

Codecov / codecov/patch

crates/tabby/src/services/completion.rs#L339

Added line #L339 was not covered by tests

let snippets = self
.build_snippets(
&language,
segments,
&new_segments,
allowed_code_repository,
request.disable_retrieval_augmented_code_completion(),
)
.await;
let prompt = self
.prompt_builder
.build(&language, segments.clone(), &snippets);
.build(&language, new_segments.clone(), &snippets);
(prompt, Some(segments), snippets)
} else {
return Err(CompletionError::EmptyPrompt);
};

let text = self.engine.generate(&prompt, options).await;
let segments = segments.cloned().map(|s| s.into());
let mut text = self.engine.generate(&prompt, options).await;
if use_crlf {
let re = Regex::new(r"([^\r])\n").unwrap(); // Match \n that is preceded by anything except \r
text = re.replace_all(&text, "$1\r\n").to_string() // Replace with captured character and \r\n

Check warning on line 360 in crates/tabby/src/services/completion.rs

View check run for this annotation

Codecov / codecov/patch

crates/tabby/src/services/completion.rs#L359-L360

Added lines #L359 - L360 were not covered by tests
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract a function override_text(text, use_crlf).

Please also document a bit why this regex is needed for replacement (instead of just \n -> \r\n)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

text seems to be too much general, I changed it to generated


self.logger.log(
request.user.clone(),
Event::Completion {
completion_id: completion_id.clone(),
language,
prompt: prompt.clone(),
segments,
segments: segments.cloned().map(|x| x.into()),
choices: vec![api::event::Choice {
index: 0,
text: text.clone(),
Expand Down
Loading