Fixes, optimizations and pretty up
shayonj committed Dec 3, 2023
1 parent ae79034 commit 14d442d
Showing 12 changed files with 61 additions and 73 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -15,4 +15,5 @@

node_modules

.db
*.db
.DS_Store
16 changes: 11 additions & 5 deletions README.md
Expand Up @@ -4,6 +4,8 @@

You can now easily run any kind of analytics on your Git directory using the SQLite database.

![plot](./internal/screenshot.png)

## Features ✨

- Synchronize Git repository data into a SQLite database.
@@ -18,6 +20,8 @@ After installation, you can use `branch_base` to sync a Git repository:
branch_base sync ~/src/rails
```

The first sync of a large Git directory might be slow; subsequent runs should be faster.

## Example SQL Queries 📊

Once your repository data is synchronized into a SQLite database, you can run various SQL queries to analyze the data. Here are some examples:
@@ -64,11 +68,13 @@ Once your repository data is synchronized into a SQLite database, you can run va
6. **Authors Who Have Worked on a Specific File**

```sql
SELECT DISTINCT commits.author
FROM commits
JOIN commit_files ON commits.commit_hash = commit_files.commit_hash
JOIN files ON commit_files.file_id = files.file_id
WHERE files.file_path like '%sqlite3%'
SELECT files.file_path, commits.author, COUNT(*) as times_contributed
FROM commits
JOIN commit_files ON commits.commit_hash = commit_files.commit_hash
JOIN files ON commit_files.file_id = files.file_id
WHERE files.file_path LIKE '%connection_adapters/sqlite%'
GROUP BY files.file_path, commits.author
ORDER BY times_contributed DESC;
```

## Installation 📥
Binary file added internal/screenshot.jpg
Binary file added internal/screenshot.png
16 changes: 16 additions & 0 deletions lib/branch_base.rb
@@ -1,9 +1,25 @@
# frozen_string_literal: true

require "logger"
require "branch_base/database"
require "branch_base/repository"
require "branch_base/sync"
require "branch_base/cli"

module BranchBase
def self.logger
@logger ||=
Logger
.new($stdout)
.tap do |log|
log.progname = "BranchBase"

log.level = ENV["DEBUG"] ? Logger::DEBUG : Logger::INFO

log.formatter =
proc do |severity, datetime, progname, msg|
"#{datetime}: #{severity} - #{progname}: #{msg}\n"
end
end
end
end
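
To illustrate how the level switch and formatter above behave, here is a minimal sketch. It assumes the same progname, level rule, and formatter; `build_logger` and its `debug:` keyword are hypothetical stand-ins for `BranchBase.logger` and the `DEBUG` environment variable, used so the output can be captured in a `StringIO` instead of `$stdout`.

```ruby
require "logger"
require "stringio"

# Hypothetical helper (not part of BranchBase): builds a logger configured like
# the one in the diff, but writing to any IO object so output can be inspected.
def build_logger(io, debug: false)
  Logger.new(io).tap do |log|
    log.progname = "BranchBase"
    # Stand-in for the ENV["DEBUG"] check in the real module.
    log.level = debug ? Logger::DEBUG : Logger::INFO
    log.formatter =
      proc do |severity, datetime, progname, msg|
        "#{datetime}: #{severity} - #{progname}: #{msg}\n"
      end
  end
end

quiet = StringIO.new
log = build_logger(quiet)
log.debug("hidden at INFO level") # dropped: below the INFO threshold
log.info("Starting sync")         # written

verbose = StringIO.new
build_logger(verbose, debug: true).debug("visible at DEBUG level")
```

With the default level, only the `info` line reaches the output; enabling debug lets the `debug` calls added to `sync.rb` through as well.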
18 changes: 13 additions & 5 deletions lib/branch_base/cli.rb
@@ -5,25 +5,33 @@

module BranchBase
class CLI < Thor
desc "sync REPO_PATH [BRANCH_OR_TAG]",
"Synchronize a specific branch or tag of the Git repository with the SQLite database"
desc "sync REPO_PATH", "Synchronize a Git directory to a SQLite database"
def sync(repo_path)
BranchBase.logger.info("Starting sync process for #{repo_path}...")

full_repo_path = File.expand_path(repo_path)

unless File.directory?(File.join(full_repo_path, ".git"))
puts "The specified path is not a valid Git repository: #{full_repo_path}"
BranchBase.logger.error(
"The specified path is not a valid Git repository: #{full_repo_path}",
)
exit(1)
end

repo_name = File.basename(full_repo_path)
db_filename = "#{repo_name}_git_data.db"
db_directory = full_repo_path
db_filename = File.join(db_directory, "#{repo_name}_git_data.db")

database = Database.new(db_filename)
repository = Repository.new(full_repo_path)
start_time = Time.now
sync = Sync.new(database, repository)

sync.run
puts "Repository data synced successfully for"
elapsed_time = Time.now - start_time
BranchBase.logger.info(
"Repository data synced successfully in #{db_filename} in #{elapsed_time.round(2)} seconds",
)
end
end
end
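
The path handling in `sync` can be tried in isolation. The sketch below assumes the same rules as the hunk above: expand the path, require a `.git` directory, and place `<repo_name>_git_data.db` inside the repository. `db_path_for` and `timed` are hypothetical helpers, not part of the gem, and `timed` uses the monotonic clock rather than the `Time.now` subtraction the commit uses.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: returns the database path the sync command would derive,
# or nil when the directory does not contain a .git directory.
def db_path_for(repo_path)
  full_repo_path = File.expand_path(repo_path)
  return nil unless File.directory?(File.join(full_repo_path, ".git"))

  repo_name = File.basename(full_repo_path)
  File.join(full_repo_path, "#{repo_name}_git_data.db")
end

# Hypothetical helper: runs the block and returns [result, elapsed_seconds].
# The monotonic clock is not affected by system clock adjustments.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

tmp = Dir.mktmpdir
repo = File.join(tmp, "rails")
FileUtils.mkdir_p(File.join(repo, ".git")) # fake repository layout

valid_path, elapsed = timed { db_path_for(repo) }
invalid_path = db_path_for(tmp) # no .git here, so this is nil

FileUtils.remove_entry(tmp)
```

Because the database file lives inside the repository directory, the `*.db` entry added to `.gitignore` in this commit keeps it out of version control when the tool is run against its own checkout.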
13 changes: 13 additions & 0 deletions lib/branch_base/sync.rb
@@ -2,6 +2,7 @@

module BranchBase
class Sync
# TODO: actually check whether bulk inserts are faster
BATCH_SIZE = 1000

def initialize(database, repository)
@@ -38,6 +39,10 @@ def sync_repository
end

def sync_branches(repo_id)
BranchBase.logger.debug(
"Syncing branches for repository: #{@repo.path}",
)

batched_branches = []

@repo.branches.each do |branch|
@@ -127,6 +132,10 @@ def insert_commit_files(commit, repo_id)
end

def insert_commit_parents(commit)
BranchBase.logger.debug(
"Inserting parent commits for repository: #{@repo.path}",
)

commit.parent_ids.each do |parent_id|
@db.execute(
"INSERT INTO commit_parents (commit_hash, parent_hash) VALUES (?, ?)",
@@ -147,6 +156,10 @@ def insert_branches(batched_branches)
end

def insert_commits(batched_commits)
BranchBase.logger.debug(
"Inserting commits for repository: #{@repo.path}",
)

@db.transaction do
batched_commits.each do |data|
@db.execute(
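
The batching idea behind `BATCH_SIZE` can be sketched without a real database. The example below assumes each batch is flushed inside a single transaction, as the `insert_commits` hunk suggests; the full flushing logic is not visible in this diff. `RecordingDb` and `insert_in_batches` are hypothetical names: the former only records calls so the sketch runs without the `sqlite3` gem.

```ruby
BATCH_SIZE = 1000

# Hypothetical stand-in for the database wrapper: it records calls instead of
# talking to SQLite.
class RecordingDb
  attr_reader :transactions, :statements

  def initialize
    @transactions = 0
    @statements = []
  end

  # Counts the transaction and runs the block (no real BEGIN/COMMIT).
  def transaction
    @transactions += 1
    yield
  end

  def execute(sql, params)
    @statements << [sql, params]
  end
end

# Hypothetical helper: writes rows one batch at a time, one transaction per
# batch, so a large repository does not pay a commit per row.
def insert_in_batches(db, rows, batch_size: BATCH_SIZE)
  rows.each_slice(batch_size) do |batch|
    db.transaction do
      batch.each do |commit_hash, parent_hash|
        db.execute(
          "INSERT INTO commit_parents (commit_hash, parent_hash) VALUES (?, ?)",
          [commit_hash, parent_hash],
        )
      end
    end
  end
end

db = RecordingDb.new
rows = (1..2500).map { |i| ["commit#{i}", "parent#{i}"] }
insert_in_batches(db, rows)
```

With 2,500 rows and a batch size of 1,000, the rows are grouped into three transactions. Whether this is faster than a single multi-row insert is the open question in the TODO above.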
52 changes: 0 additions & 52 deletions spec/branch_base/cli_spec.rb

This file was deleted.

3 changes: 1 addition & 2 deletions spec/branch_base/database_spec.rb
@@ -1,6 +1,5 @@
# frozen_string_literal: true
require "branch_base/database"
require "rspec"
require "spec_helper"

RSpec.describe(BranchBase::Database) do
let(:database) { BranchBase::Database.new(":memory:") }
6 changes: 2 additions & 4 deletions spec/branch_base/repository_spec.rb
Original file line number Diff line number Diff line change
@@ -1,7 +1,5 @@
# frozen_string_literal: true
require "branch_base/repository"
require "rugged"
require "rspec"
require "spec_helper"
require "test_helper"

RSpec.describe(BranchBase::Repository) do
@@ -24,7 +22,7 @@
repository.walk { |commit| commit_messages << commit.message.strip }
expect(commit_messages).to contain_exactly(
"Add contributing guidelines",
"Initial commit"
"Initial commit",
)
end
end
5 changes: 1 addition & 4 deletions spec/branch_base/sync_spec.rb
Original file line number Diff line number Diff line change
@@ -1,9 +1,6 @@
# frozen_string_literal: true
require "branch_base/sync"
require "branch_base/database"
require "branch_base/repository"
require "test_helper"
require "rspec"
require "spec_helper"

RSpec.describe(BranchBase::Sync) do
let(:db) { BranchBase::Database.new(":memory:") }
2 changes: 2 additions & 0 deletions spec/spec_helper.rb
@@ -13,6 +13,8 @@
# the additional setup, and require it from the spec files that actually need
# it.
#
require "branch_base"

# See https://rubydoc.info/gems/rspec-core/RSpec/Core/Configuration
RSpec.configure do |config|
# rspec-expectations config goes here. You can use an alternate