Speed up hook diff processing #27

Merged: 58 commits, Feb 8, 2025
Changes from 23 commits
7d9843b
test
oleander Feb 7, 2025
fe94188
Add profiling feature with macro for measuring execution time
oleander Feb 7, 2025
462b75d
Refactor diff processing, optimize token handling and storage
oleander Feb 7, 2025
2ea378d
Update test signatures and add profiling tests
oleander Feb 7, 2025
37e5b2e
Refactor GitFile signatures and commit messages
oleander Feb 7, 2025
c87b0d2
Update Cargo.lock dependencies and checksums
oleander Feb 7, 2025
7eb1eb7
Update dependencies in Cargo.toml and Cargo.lock
oleander Feb 7, 2025
e2df5f4
Add StringPool for efficient memory use in PatchDiff
oleander Feb 7, 2025
bf3110f
Update dependencies in Cargo.toml and Cargo.lock
oleander Feb 7, 2025
63a498c
Add `num_cpus` crate and parallelize file processing
oleander Feb 7, 2025
6a5051b
Refactor file processing to use parallel chunks and atomic tokens
oleander Feb 7, 2025
7ed087b
Remove redundant import of `bail` from anyhow
oleander Feb 7, 2025
51f9609
Sort files by token count in `PatchDiff` implementation.
oleander Feb 7, 2025
5e50a25
Delete test.txt file
oleander Feb 7, 2025
d49f534
Improve error handling and path management in config and style modules
oleander Feb 7, 2025
600e5fd
Add tests for StringPool functionality in hook.rs
oleander Feb 7, 2025
450381d
Update default model and add profiling to model and commit functions
oleander Feb 7, 2025
00faa02
Add profiling to filesystem module functions
oleander Feb 7, 2025
4a7d5d8
Implement token counting and generation for commit messages
oleander Feb 7, 2025
a5833c8
Add documentation for Filesystem, File, and Dir structs in filesystem.rs
oleander Feb 7, 2025
5384690
Refactor commit message generation methods and file handling logic
oleander Feb 7, 2025
fa47b5b
Implement configuration file management and update functions in App
oleander Feb 7, 2025
c7778e6
Implement parallel processing of diff data in PatchDiff trait
oleander Feb 7, 2025
da74cd4
```
Feb 8, 2025
1e56b0d
```
Feb 8, 2025
1198750
Remove unused import of `std::fs` from `commit.rs` file.
Feb 8, 2025
140f2df
Remove unused import and adjust available tokens calculation
Feb 8, 2025
e1f49e4
Update max commit length in prompt guidelines
Feb 8, 2025
0ad8074
```
Feb 8, 2025
4f71559
Add directory creation for hooks if it does not exist
Feb 8, 2025
96aedfa
Add dead code allowance in filesystem.rs
Feb 8, 2025
aa4d073
Revert "```"
Feb 8, 2025
6fd6ab8
```
Feb 8, 2025
10192f6
Delete stats.json file
Feb 8, 2025
1e231fb
```
Feb 8, 2025
42f27e2
Build inline
Feb 8, 2025
b75147b
Update default model name in Args implementation
Feb 8, 2025
a546bba
```
Feb 8, 2025
596f662
```
Feb 8, 2025
0f7af0c
Change file permission of comprehensive-tests.
Feb 8, 2025
5d5ce13
Update `comprehensive-tests` script to load environment variables fro…
Feb 8, 2025
62e75f0
Remove note about output being used as a git commit message from 'pro…
Feb 8, 2025
4e8bbc9
Update comprehensive-tests script and prompt.md documentation
Feb 8, 2025
a887149
Update scripts and source code according to visible changes in the diff
Feb 8, 2025
e852edd
Refactor `hook.rs` and ensure a minimum of 512 tokens
Feb 8, 2025
825150c
Update clean-up command in comprehensive-tests script
Feb 8, 2025
3c4c51b
Add attribute to suppress dead code warnings in hook.rs
Feb 8, 2025
8f4942e
Add initial boilerplate for hook.rs
Feb 8, 2025
b22a161
Add debug message when a commit message already exists in hook.rs
Feb 8, 2025
986fd02
Add `to_commit_diff` and `configure_commit_diff_options` methods to `…
Feb 8, 2025
d6efbfc
Optimize max_tokens_per_file calculation in hook.rs
Feb 8, 2025
207f2c3
Merge main into feature/speed, preserving performance improvements
Feb 8, 2025
b7cce70
Refactor method calls and condition checks in openai.rs and patch_tes…
Feb 8, 2025
9e568a3
Refine instructions and guidelines for generating git commit messages
Feb 8, 2025
25b56cb
Add error handling for raw SHA1 resolution in hook.rs
Feb 8, 2025
944577b
Merge remote-tracking branch 'origin/main' into feature/speed
Feb 8, 2025
2fa2562
Refactor function calls in patch_test.rs and simplify conditional log…
Feb 8, 2025
1bfd788
Refactor reference resolution in hook.rs
Feb 8, 2025
1,518 changes: 1,049 additions & 469 deletions Cargo.lock

Large diffs are not rendered by default.

49 changes: 26 additions & 23 deletions Cargo.toml
@@ -25,35 +25,38 @@ name = "git-ai-hook"
path = "src/bin/hook.rs"

[dependencies]
anyhow = { version = "1.0.86", default-features = false }
async-openai = { version = "0.18.3", default-features = false }
colored = "2.1.0"
config = { version = "0.13.4", default-features = false, features = ["ini"] }
console = { version = "0.15.8", default-features = false }
ctrlc = "3.4.4"
anyhow = { version = "1.0.95", default-features = false }
async-openai = { version = "0.27.2", default-features = false }
colored = "3.0.0"
config = { version = "0.15.7", default-features = false, features = ["ini"] }
console = { version = "0.15.10", default-features = false }
ctrlc = "3.4.5"
dotenv = "0.15.0"
env_logger = { version = "0.10.2", default-features = false }
git2 = { version = "0.18.3", default-features = false }
home = "0.5.9"
indicatif = { version = "0.17.8", default-features = false }
lazy_static = "1.4.0"
log = "0.4.21"
reqwest = { version = "0.11.27", default-features = true }
env_logger = { version = "0.11.6", default-features = false }
git2 = { version = "0.20.0", default-features = false }
home = "0.5.11"
indicatif = { version = "0.17.11", default-features = false }
lazy_static = "1.5.0"
log = "0.4.25"
reqwest = { version = "0.12.12", default-features = true }
serde = { version = "1", default-features = false }
serde_derive = "1.0.203"
serde_derive = "1.0.217"
serde_ini = "0.2.0"
serde_json = "1.0.117"
serde_json = "1.0.138"
structopt = "0.3.26"
thiserror = "1.0.61"
tokio = { version = "1.38.0", features = ["rt-multi-thread"] }
tiktoken-rs = { version = "0.5.9" }
openssl-sys = { version = "0.9.102", features = ["vendored"] }
thiserror = "2.0.11"
tokio = { version = "1.43.0", features = ["rt-multi-thread"] }
tiktoken-rs = { version = "0.6.0" }
openssl-sys = { version = "0.9.105", features = ["vendored"] }
rayon = "1.10.0"
parking_lot = "0.12.3"
num_cpus = "1.16.0"

[dev-dependencies]
tempfile = "3.10.1"
anyhow = { version = "1.0.86", default-features = false }
git2 = { version = "0.18.3", default-features = false }
rand = { version = "0.8.5", default-features = false }
tempfile = "3.16.0"
anyhow = { version = "1.0.95", default-features = false }
git2 = { version = "0.20.0", default-features = false }
rand = { version = "0.9.0", default-features = false }

[profile.release]
codegen-units = 1
4 changes: 2 additions & 2 deletions src/bin/hook.rs
@@ -105,7 +105,7 @@ impl Args {
bail!("No changes to commit");
}

let response = commit::generate(patch.to_string(), remaining_tokens, model).await?;
let response = commit::generate_commit_message(patch.to_string(), remaining_tokens, model).await?;
std::fs::write(&self.commit_msg_file, response.response.trim())?;
pb.finish_and_clear();

@@ -124,7 +124,7 @@ impl Args {
.clone()
.unwrap_or("gpt-4o".to_string())
.into();
let used_tokens = commit::token_used(&model)?;
let used_tokens = commit::get_instruction_token_count(&model)?;
let max_tokens = config::APP.max_tokens.unwrap_or(model.context_size());
let remaining_tokens = max_tokens.saturating_sub(used_tokens);

88 changes: 65 additions & 23 deletions src/commit.rs
@@ -1,42 +1,84 @@
use anyhow::{bail, Result};

use crate::{config, openai};
use crate::{config, openai, profile};
use crate::model::Model;

fn instruction() -> String {
format!("You are an AI assistant that generates concise and meaningful git commit messages based on provided diffs. Please adhere to the following guidelines:
const INSTRUCTION_TEMPLATE: &str = r#"You are an AI assistant that generates concise and meaningful git commit messages based on provided diffs. Please adhere to the following guidelines:

- Structure: Begin with a clear, present-tense summary.
- Content: Emphasize the changes and their rationale, excluding irrelevant details.
- Consistency: Maintain uniformity in tense, punctuation, and capitalization.
- Accuracy: Ensure the message accurately reflects the changes and their purpose.
- Present tense, imperative mood. (e.g., 'Add x to y' instead of 'Added x to y')
- Max {} chars in the output
- Structure: Begin with a clear, present-tense summary.
- Content: Emphasize the changes and their rationale, excluding irrelevant details.
- Consistency: Maintain uniformity in tense, punctuation, and capitalization.
- Accuracy: Ensure the message accurately reflects the changes and their purpose.
- Present tense, imperative mood. (e.g., 'Add x to y' instead of 'Added x to y')
- Max {} chars in the output

## Output:
## Output:

Your output should be a commit message generated from the input diff and nothing else.
Your output should be a commit message generated from the input diff and nothing else.

## Input:
## Input:

INPUT:", config::APP.max_commit_length.unwrap_or(72))
}
INPUT:"#;

pub fn token_used(model: &Model) -> Result<usize> {
model.count_tokens(&instruction())
/// Returns the instruction template for the AI model.
/// This template guides the model in generating appropriate commit messages.
fn get_instruction_template() -> String {
profile!("Generate instruction template");
INSTRUCTION_TEMPLATE.replace("{}", &config::APP.max_commit_length.unwrap_or(72).to_string())
}

pub async fn generate(diff: String, max_tokens: usize, model: Model) -> Result<openai::Response> {
if max_tokens == 0 {
bail!("Max can't be zero (2)")
}
/// Calculates the number of tokens used by the instruction template.
///
/// # Arguments
/// * `model` - The AI model to use for token counting
///
/// # Returns
/// * `Result<usize>` - The number of tokens used or an error
pub fn get_instruction_token_count(model: &Model) -> Result<usize> {
profile!("Calculate instruction tokens");
model.count_tokens(&get_instruction_template())
}

let request = openai::Request {
system: instruction(),
/// Creates an OpenAI request for commit message generation.
///
/// # Arguments
/// * `diff` - The git diff to generate a commit message for
/// * `max_tokens` - Maximum number of tokens allowed for the response
/// * `model` - The AI model to use for generation
///
/// # Returns
/// * `openai::Request` - The prepared request
fn create_commit_request(diff: String, max_tokens: usize, model: Model) -> openai::Request {
profile!("Prepare OpenAI request");
openai::Request {
system: get_instruction_template(),
prompt: diff,
max_tokens: max_tokens.try_into().unwrap_or(u16::MAX),
model
};
}
}

/// Generates a commit message using the AI model.
///
/// # Arguments
/// * `diff` - The git diff to generate a commit message for
/// * `max_tokens` - Maximum number of tokens allowed for the response
/// * `model` - The AI model to use for generation
///
/// # Returns
/// * `Result<openai::Response>` - The generated commit message or an error
///
/// # Errors
/// Returns an error if:
/// - max_tokens is 0
/// - OpenAI API call fails
pub async fn generate_commit_message(diff: String, max_tokens: usize, model: Model) -> Result<openai::Response> {
profile!("Generate commit message");

if max_tokens == 0 {
bail!("Maximum token count must be greater than zero")
}

let request = create_commit_request(diff, max_tokens, model);
openai::call(request).await
}
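The rewritten `commit.rs` above calls a `profile!` macro at the top of each function, but the macro's definition is not part of this diff. One common shape for such a scope-timing macro is an RAII guard that logs elapsed time on drop — a sketch under that assumption (the PR's actual macro may differ):

```rust
use std::time::Instant;

// Hypothetical stand-in for the `profile!` macro used in this PR:
// creates a guard that reports elapsed time when the scope ends.
macro_rules! profile {
    ($name:expr) => {
        let _guard = ProfileGuard::new($name);
    };
}

struct ProfileGuard {
    name: &'static str,
    start: Instant,
}

impl ProfileGuard {
    fn new(name: &'static str) -> Self {
        ProfileGuard { name, start: Instant::now() }
    }
}

impl Drop for ProfileGuard {
    fn drop(&mut self) {
        // Fires at scope exit, i.e. when the profiled function returns.
        eprintln!("{} took {:?}", self.name, self.start.elapsed());
    }
}

fn main() {
    profile!("sum demo");
    let total: u64 = (0..1_000).sum();
    assert_eq!(total, 499_500);
} // guard drops here and prints the timing
```

The guard pattern means a single line at the top of a function times the whole body, including early returns, without wrapping the code in closures.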
114 changes: 73 additions & 41 deletions src/config.rs
@@ -8,6 +8,13 @@ use anyhow::{Context, Result};
use lazy_static::lazy_static;
use console::Emoji;

// Constants
const DEFAULT_TIMEOUT: i64 = 30;
const DEFAULT_MAX_COMMIT_LENGTH: i64 = 72;
const DEFAULT_MAX_TOKENS: i64 = 2024;
const DEFAULT_MODEL: &str = "gpt-4o-mini";
const DEFAULT_API_KEY: &str = "<PLACE HOLDER FOR YOUR API KEY>";

#[derive(Debug, Default, Deserialize, PartialEq, Eq, Serialize)]
pub struct App {
pub openai_api_key: Option<String>,
@@ -17,40 +24,51 @@ pub struct App {
pub timeout: Option<usize>
}

impl App {
#[allow(dead_code)]
pub fn duration(&self) -> std::time::Duration {
std::time::Duration::from_secs(self.timeout.unwrap_or(30) as u64)
}
#[derive(Debug)]
pub struct ConfigPaths {
pub dir: PathBuf,
pub file: PathBuf
}

lazy_static! {
pub static ref CONFIG_DIR: PathBuf = home::home_dir().unwrap().join(".config/git-ai");
#[derive(Debug)]
pub static ref APP: App = App::new().expect("Failed to load config");
pub static ref CONFIG_PATH: PathBuf = CONFIG_DIR.join("config.ini");
static ref PATHS: ConfigPaths = ConfigPaths::new();
pub static ref APP: App = App::new().expect("Failed to load config");
}

impl ConfigPaths {
fn new() -> Self {
let dir = home::home_dir()
.expect("Failed to determine home directory")
.join(".config/git-ai");
let file = dir.join("config.ini");
Self { dir, file }
}

fn ensure_exists(&self) -> Result<()> {
if !self.dir.exists() {
std::fs::create_dir_all(&self.dir).with_context(|| format!("Failed to create config directory at {:?}", self.dir))?;
}
if !self.file.exists() {
File::create(&self.file).with_context(|| format!("Failed to create config file at {:?}", self.file))?;
}
Ok(())
}
}

impl App {
pub fn new() -> Result<Self> {
dotenv::dotenv().ok();

if !CONFIG_DIR.exists() {
std::fs::create_dir_all(CONFIG_DIR.to_str().unwrap()).context("Failed to create config directory")?;
File::create(CONFIG_PATH.to_str().unwrap()).context("Failed to create config file")?;
} else if !CONFIG_PATH.exists() {
File::create(CONFIG_PATH.to_str().unwrap()).context("Failed to create config file")?;
}
PATHS.ensure_exists()?;

let config = Config::builder()
.add_source(config::Environment::with_prefix("APP").try_parsing(true))
.add_source(config::File::new(CONFIG_PATH.to_str().unwrap(), FileFormat::Ini))
.add_source(config::File::new(PATHS.file.to_string_lossy().as_ref(), FileFormat::Ini))
.set_default("language", "en")?
.set_default("timeout", 30)?
.set_default("max_commit_length", 72)?
.set_default("max_tokens", 2024)?
.set_default("model", "gpt-4o")?
.set_default("openai_api_key", "<PLACE HOLDER FOR YOUR API KEY>")?
.set_default("timeout", DEFAULT_TIMEOUT)?
.set_default("max_commit_length", DEFAULT_MAX_COMMIT_LENGTH)?
.set_default("max_tokens", DEFAULT_MAX_TOKENS)?
.set_default("model", DEFAULT_MODEL)?
.set_default("openai_api_key", DEFAULT_API_KEY)?
.build()?;

config
@@ -60,48 +78,62 @@ impl App {

pub fn save(&self) -> Result<()> {
let contents = serde_ini::to_string(&self).context(format!("Failed to serialize config: {:?}", self))?;
let mut file = File::create(CONFIG_PATH.to_str().unwrap()).context("Failed to create config file")?;
let mut file = File::create(&PATHS.file).with_context(|| format!("Failed to create config file at {:?}", PATHS.file))?;
file
.write_all(contents.as_bytes())
.context("Failed to write config file")
}

pub fn update_model(&mut self, value: String) -> Result<()> {
self.model = Some(value);
self.save_with_message("model")
}

pub fn update_max_tokens(&mut self, value: usize) -> Result<()> {
self.max_tokens = Some(value);
self.save_with_message("max-tokens")
}

pub fn update_max_commit_length(&mut self, value: usize) -> Result<()> {
self.max_commit_length = Some(value);
self.save_with_message("max-commit-length")
}

pub fn update_openai_api_key(&mut self, value: String) -> Result<()> {
self.openai_api_key = Some(value);
self.save_with_message("openai-api-key")
}

fn save_with_message(&self, option: &str) -> Result<()> {
println!("{} Configuration option {} updated!", Emoji("✨", ":-)"), option);
self.save()
}
}

// Public interface functions
pub fn run_model(value: String) -> Result<()> {
let mut app = App::new()?;
app.model = value.into();
println!("{} Configuration option model updated!", Emoji("✨", ":-)"));
app.save()
App::new()?.update_model(value)
}

pub fn run_max_tokens(max_tokens: usize) -> Result<()> {
let mut app = App::new()?;
app.max_tokens = max_tokens.into();
println!("{} Configuration option max-tokens updated!", Emoji("✨", ":-)"));
app.save()
App::new()?.update_max_tokens(max_tokens)
}

pub fn run_max_commit_length(max_commit_length: usize) -> Result<()> {
let mut app = App::new()?;
app.max_commit_length = max_commit_length.into();
println!("{} Configuration option max-commit-length updated!", Emoji("✨", ":-)"));
app.save()
App::new()?.update_max_commit_length(max_commit_length)
}

pub fn run_openai_api_key(value: String) -> Result<()> {
let mut app = App::new()?;
app.openai_api_key = Some(value);
println!("{} Configuration option openai-api-key updated!", Emoji("✨", ":-)"));
app.save()
App::new()?.update_openai_api_key(value)
}

pub fn run_reset() -> Result<()> {
if !CONFIG_PATH.exists() {
if !PATHS.file.exists() {
eprintln!("{} Configuration file does not exist!", Emoji("🤷", ":-)"));
return Ok(());
}

std::fs::remove_file(CONFIG_PATH.to_str().unwrap()).context("Failed to remove config file")?;
std::fs::remove_file(PATHS.file.to_str().unwrap()).context("Failed to remove config file")?;
println!("{} Configuration reset!", Emoji("✨", ":-)"));
Ok(())
}