From eefda59866f45e18be8700b32d832e39cbd543f0 Mon Sep 17 00:00:00 2001
From: Shayon Mukherjee
Date: Sat, 2 Dec 2023 20:33:15 -0500
Subject: [PATCH] Add readme

---
 README.md | 136 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..03fef0b
--- /dev/null
+++ b/README.md
@@ -0,0 +1,136 @@
+# BranchBase
+
+## Introduction
+
+`branch_base` is a Ruby gem that synchronizes data from a Git repository into a SQLite database. It provides a CLI to build and store the data, including commits, branches, and file changes, in a SQLite database.
+
+You can then run any kind of analytics on your Git repository by querying the SQLite database.
+
+## Features
+
+- Synchronize Git repository data into a SQLite database.
+- Query commit history, branch details, and file changes using SQL.
+- Easy-to-use CLI for quick setup and execution.
+
+## Installation
+
+### Via RubyGems
+
+You can install `branch_base` directly using RubyGems:
+
+```bash
+gem install branch_base
+```
+
+### Via Docker
+
+`branch_base` is also available as a Docker image, which can be used to run the tool without setting up a Ruby environment:
+
+```bash
+docker pull shayonj/branch_base:latest
+```
+
+To use `branch_base` with Docker, mount your Git repository as a volume and pass the path it is mounted at inside the container:
+
+```bash
+docker run -v /repo/path:/repo shayonj/branch_base sync /repo
+```
+
+## Usage
+
+After installation, you can use `branch_base` to sync a Git repository:
+
+```bash
+branch_base sync /repo/path
+```
+
+This command creates a SQLite database containing the repository's data in the directory where the command is run.
+
+## Example SQL Queries
+
+Once your repository data is synchronized into a SQLite database, you can run various SQL queries to analyze the data. Here are some examples:
+
+1. **List all commits**:
+
+   ```sql
+   SELECT * FROM commits;
+   ```
+
+2. **Find commits by a specific author**:
+
+   ```sql
+   SELECT * FROM commits WHERE author = 'John Doe';
+   ```
+
+3. **Get the number of commits in each branch**:
+
+   ```sql
+   SELECT branches.name, COUNT(commits.commit_hash) as commit_count
+   FROM branches
+   JOIN commits ON branches.head_commit = commits.commit_hash
+   GROUP BY branches.name;
+   ```
+
+4. **List files changed in a specific commit**:
+
+   ```sql
+   SELECT files.file_path
+   FROM commit_files
+   JOIN files ON commit_files.file_id = files.file_id
+   WHERE commit_files.commit_hash = 'commit_hash_here';
+   ```
+
+5. **Count commits per author**:
+
+   ```sql
+   SELECT author, COUNT(*) as commit_count
+   FROM commits
+   GROUP BY author
+   ORDER BY commit_count DESC;
+   ```
+
+6. **Find authors who have worked on a specific file**:
+
+   ```sql
+   SELECT DISTINCT commits.author
+   FROM commits
+   JOIN commit_files ON commits.commit_hash = commit_files.commit_hash
+   JOIN files ON commit_files.file_id = files.file_id
+   WHERE files.file_path LIKE '%sqlite3%';
+   ```
+
+## Database Schema
+
+The SQLite database consists of the following tables:
+
+```
+repositories
+    │
+    └─ commits ───── commit_files ── files
+       │                  │              │
+       ├─ branches ───────┘              │
+       │                                 │
+       └─ commit_parents ────────────────┘
+```
+
+**Table Descriptions**:
+
+- `repositories`: Contains details about the repositories.
+- `commits`: Stores individual commits. Each commit is linked to a repository.
+- `branches`: Branch information linked to their latest commits.
+- `files`: Information about files changed in commits. File contents are not stored.
+- `commit_files`: Associates commits with files, including changes.
+- `commit_parents`: Tracks parent-child relationships between commits.
+
+## Contributing
+
+Contributions to `branch_base` are welcome!
+
+## License
+
+Distributed under the MIT License. See `LICENSE` for more information.
+
+## Development
+
+- Install Ruby `3.1.4` using RVM ([instructions](https://rvm.io/rvm/install#any-other-system))
+- Run the specs with `bundle exec rspec`