`gix` is a command-line interface (CLI) to access git repositories in various ways. It is best described as low-level, for use by experts or those validating functionality in real-world scenarios. Performance and efficiency are staples of the implementation.
`ein` is reserved for one-off tools that are useful to many, and will one day implement a truly unique workflow with the potential to become the preferred way to interact with git repositories.
Please note that all functionality comes from the `gitoxide-core` library, which mirrors these capabilities and itself relies on all `gix-*` crates. It's not meant for consumption; for application development, please use `gix`.
- the `ein` program - convenient and for humans
  - init - initialize a new non-bare repository with a `main` branch
  - clone - initialize a local copy of a remote repository
  - tools
    - organize - find all git repositories and place them in directories according to their remote paths
    - find - find all git repositories in a given directory - useful for tools like skim
    - estimate-hours - estimate the time invested into a repository by evaluating commit dates.
      - Based on the git-hours algorithm.
      - See the discussion for some performance data.
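The git-hours heuristic behind `estimate-hours` can be sketched as follows: commits that lie close together in time count as one working session, and each new session adds a fixed ramp-up credit. A standalone illustration; the gap and baseline constants below are assumptions for the example, not gitoxide's actual defaults.

```rust
// Simplified sketch of the git-hours heuristic: commits closer together than
// `max_gap` count as continuous work, each new session adds a `baseline`.
// The constants are illustrative only.
fn estimate_hours(mut commit_times: Vec<u64>) -> f64 {
    let max_gap = 2 * 60 * 60; // 2h between commits -> same session
    let baseline = 30 * 60; // 30min credited when a session starts
    commit_times.sort_unstable();
    let mut seconds = 0u64;
    let mut prev: Option<u64> = None;
    for t in commit_times {
        match prev {
            Some(p) if t - p <= max_gap => seconds += t - p,
            _ => seconds += baseline, // first commit or gap too large: new session
        }
        prev = Some(t);
    }
    seconds as f64 / 3600.0
}

fn main() {
    // three commits 30min apart, then one a day later
    println!("{:.2}h", estimate_hours(vec![0, 1800, 3600, 90_000]));
}
```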
- the `gix` program (plumbing) - lower level commands for use in automation
  - progress - provide an overview of what works and what doesn't from the perspective of the git configuration. This is likely to change a lot over time depending on actual needs, but may be useful for you to see if a particular git configuration is picked up and where it deviates.
  - config - list the complete git configuration in human-readable form and optionally filter sections by name.
  - exclude
    - query - check if path specs are excluded via git's exclusion rules like `.gitignore`.
- verify - validate a whole repository, for now only the object database.
- commit
- describe - identify a commit by its closest tag in its past
- tree
- entries - list tree entries for a single tree or recursively
- info - display tree statistics
- odb
- info - display odb statistics
- entries - display all object ids in the object database
- mailmap
- entries - display all entries of the aggregated mailmap git would use for substitution
- revision
  - list - list plain revision hashes from a starting point, similar to a very simple version of `git rev-list`.
  - explain - show what would be done while parsing a revision specification like `HEAD~1`
  - resolve - show which objects a revspec resolves to, similar to `git rev-parse` but faster and with much better error handling
  - previous-branches - list all previously checked out branches, powered by the ref-log.
- remote
- refs - list all references available on the remote based on the current remote configuration.
- ref-map - show how remote references relate to their local tracking branches as mapped by refspecs.
- fetch - fetch the current remote or the given one, optionally just as dry-run.
- clone
- initialize a new bare repository and fetch all objects.
- initialize a new repository, fetch all objects and checkout the main worktree.
- credential
  - fill/approve/reject - the same as `git credential`, but implemented in Rust, calling helpers only when from trusted configuration.
- free - no git repository necessary
- pack
- verify
- index verify including each object sha1 and statistics
- explode, useful for transforming packs into loose objects for inspection or restoration
- verify written objects (by reading them back from disk)
- receive - receive a whole pack produced by pack-send or git-upload-pack, useful for `clone`-like operations.
- create - create a pack from given objects or tips of the commit graph.
- send - create a pack and send it using the pack protocol to stdout, similar to 'git-upload-pack', for consumption by pack-receive or git-receive-pack
- multi-index
- info - print information about the file
- create - create a multi-index from pack indices
- verify - check the file for consistency
- entries - list all entries of the file
- index
- create - create an index file by streaming a pack file as done during clone
- support for thin packs (as needed for fetch/pull)
- create - create an index file by streaming a pack file as done during clone
- commit-graph
- verify - assure that a commit-graph is consistent
- mailmap
- verify - check entries of a mailmap file for parse errors and display them
- index
- entries - show detailed entry information for human or machine consumption (via JSON)
- verify - check the index for consistency
- info - display general information about the index itself, with detailed extension information by default
- detailed information about the TREE extension
- …other extensions details aren't implemented yet
- checkout-exclusive - a predecessor of `git worktree`, providing flexible options to evaluate checkout performance from an index and/or an object database.
- read and write a signature that uniquely identifies an actor within a git repository
- a way to parse `name <email>` tuples (instead of full signatures) to facilitate parsing commit trailers.
- a way to write only actors, useful for commit trailers.
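Splitting such a `name <email>` tuple boils down to locating the angle brackets. A minimal standalone sketch of the idea, not the gix-actor API:

```rust
// Split a `name <email>` tuple as found in commit trailers.
// Illustration only; real parsing must also deal with malformed input
// tolerated by git in the wild.
fn parse_actor(input: &str) -> Option<(&str, &str)> {
    let open = input.find('<')?;
    let close = input[open..].find('>')? + open;
    let name = input[..open].trim();
    let email = &input[open + 1..close];
    Some((name, email))
}

fn main() {
    let (name, email) = parse_actor("Sebastian Thiel <byron@example.com>").unwrap();
    println!("{name} / {email}");
}
```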
- types to represent hash digests to identify git objects.
- used to abstract over different kinds of hashes, like SHA1 and the upcoming SHA256
- API documentation
- Some examples
- decode the chunk file table of contents and provide convenient API
- write the table of contents
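The chunk-file layout shared by commit-graph and multi-pack-index files stores a table of contents of 12-byte entries: a 4-byte chunk identifier followed by a big-endian u64 file offset, terminated by an entry with an all-zero id. A standalone sketch of decoding that table, not the gix-chunk API:

```rust
// Decode a chunk-file table of contents: 4-byte id + big-endian u64 offset
// per entry, terminated by an all-zero id. Illustration only.
fn decode_toc(data: &[u8]) -> Vec<([u8; 4], u64)> {
    let mut out = Vec::new();
    for entry in data.chunks_exact(12) {
        let id: [u8; 4] = entry[..4].try_into().unwrap();
        if id == [0; 4] {
            break; // terminator entry
        }
        let offset = u64::from_be_bytes(entry[4..].try_into().unwrap());
        out.push((id, offset));
    }
    out
}

fn main() {
    let mut toc = Vec::new();
    toc.extend_from_slice(b"OIDF");
    toc.extend_from_slice(&8u64.to_be_bytes());
    toc.extend_from_slice(&[0u8; 12]); // terminator
    println!("{:?}", decode_toc(&toc));
}
```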
- hashmap
- hashset
- filesystem
- probe capabilities
- symlink creation and removal
- file snapshots
- stack abstraction
- decode (zero-copy) borrowed objects
- commit
- parse trailers
- tree
- commit
- encode owned objects
- commit
- tree
- tag
- transform borrowed to owned objects
- API documentation
- Some examples
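Encoding owned tree objects comes down to the on-disk entry layout `<mode> <name>\0<raw object id>`. A sketch of a single-entry encode under that layout; this is an illustration of the format, not the gix-object API (note that directory entries use mode `40000` without a leading zero):

```rust
// Encode one git tree entry: `<mode> <name>\0<20-byte raw sha1>`.
// Illustration of the on-disk format only.
fn encode_tree_entry(mode: &str, name: &str, sha1: &[u8; 20]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(mode.as_bytes());
    out.push(b' ');
    out.extend_from_slice(name.as_bytes());
    out.push(0);
    out.extend_from_slice(sha1);
    out
}

fn main() {
    let entry = encode_tree_entry("100644", "README.md", &[0u8; 20]);
    println!("{} bytes", entry.len()); // 6 + 1 + 9 + 1 + 20 = 37 bytes
}
```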
- packs
- traverse pack index
- 'object' abstraction
- decode (zero copy)
- verify checksum
- simple and fast pack traversal
- decode
- full objects
- deltified objects
- decode
- decode a pack from `Read` input
  - Add support for zlib-ng for 20% faster decompression performance
  - `Read` to `Iterator` of entries
    - read as is, verify hash, and restore partial packs
- create index from pack alone (much faster than git)
  - resolve 'thin' packs
- encode
- Add support for zlib-ng for 2.5x compression performance
- objects to entries iterator
- input objects as-is
- pack only changed objects as derived from input
- base object compression
- delta compression
- respect the `delta=false` attribute
- create 'thin' pack, i.e. deltas that are based on objects the other side has.
- parallel implementation that scales perfectly
- entries to pack data iterator
- write index along with the new pack
- verify pack with statistics
- brute force - less memory
- indexed - optimal speed, but more memory
- advanced
- Multi-Pack index file (MIDX)
- read
- write
- verify
- 'bitmap' file
- special handling for networked packs
- detect and retry packed object reading
- API documentation
- Some examples
- loose object store
- traverse
- read
- into memory
- streaming
- verify checksum
- streaming write for blobs
- buffer write for small in-memory objects/non-blobs to bring IO down to open-read-close == 3 syscalls
- read object header (size + kind) without full decompression
- dynamic store
- auto-refresh of on-disk state
- handles alternates
- multi-pack indices
- perfect scaling with cores
- support for pack caches, object caches and MRU for best per-thread performance.
- prefix/short-id lookup, with optional listing of ambiguous objects.
- object replacements (`git replace`)
- high-speed packed object traversal without wasted CPU time
- user defined filters
- read object header (size + kind) without full decompression
- sink
- write objects and obtain id
- alternates
- resolve links between object databases
- safe with cycles and recursive configurations
- multi-line with comments and quotes
- promisor
- It's vague, but these seem to be index files that allow fetching objects from a server on demand.
- API documentation
- Some examples
Check out the performance discussion as well.
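Reading the object header without full decompression works because a loose object, once inflated, begins with an ASCII header `<kind> <size>\0` before the payload, so only the first few bytes need inflating to answer size/kind queries. The sketch below covers just the header parse and assumes already-inflated bytes; the zlib inflation step is omitted:

```rust
// Parse the header of an (inflated) loose object: `<kind> <size>\0<payload>`.
// Illustration only; a real implementation inflates just enough bytes to
// reach the NUL without decompressing the whole object.
fn parse_loose_header(data: &[u8]) -> Option<(&str, usize)> {
    let nul = data.iter().position(|&b| b == 0)?;
    let header = std::str::from_utf8(&data[..nul]).ok()?;
    let (kind, size) = header.split_once(' ')?;
    Some((kind, size.parse().ok()?))
}

fn main() {
    let obj = b"blob 11\0hello world";
    println!("{:?}", parse_loose_header(obj));
}
```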
- tree
- changes needed to obtain other tree
- patches
- There are various ways to generate a patch from two blobs.
- any
- lines
- Simple line-by-line diffs powered by the `imara-diff` crate.
- diffing, merging, working with hunks of data
- find differences between various states, i.e. index, working tree, commit-tree
- API documentation
- Examples
Check out the performance discussion as well.
- trees
- nested traversal
- commits
- ancestor graph traversal similar to `git revlog`
  - `commitgraph` support
- API documentation
- Examples
- As documented here: https://www.git-scm.com/docs/git-clone#_git_urls
- parse
- ssh URLs and SCP like syntax
- file, git, and SSH
- paths (OS paths, without need for UTF-8)
- username expansion for ssh and git urls
- convert URL to string
- API documentation
- Some examples
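The SCP-like syntax mentioned above is the `user@host:path` form that has no scheme, as opposed to a real URL like `ssh://user@host/path`. A minimal sketch of telling the two apart and splitting the SCP-like form; real parsing (gix-url) handles many more edge cases such as ports, IPv6 hosts, and non-UTF-8 paths:

```rust
// Split SCP-like syntax `user@host:path` into its parts; real URLs
// (containing "://") are left to a proper URL parser. Illustration only.
fn split_scp_like(input: &str) -> Option<(Option<&str>, &str, &str)> {
    if input.contains("://") {
        return None; // a real URL, not SCP-like syntax
    }
    let (user_host, path) = input.split_once(':')?;
    let (user, host) = match user_host.split_once('@') {
        Some((u, h)) => (Some(u), h),
        None => (None, user_host),
    };
    Some((user, host, path))
}

fn main() {
    println!("{:?}", split_scp_like("git@github.com:GitoxideLabs/gitoxide.git"));
}
```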
- PKT-Line
- encode
- decode (zero-copy)
- error line
- V2 additions
- side-band mode
- `Read` from packet line with (optional) progress support via sidebands
- `Write` with built-in packet line encoding
- `async` support
- API documentation
- Some examples
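The PKT-LINE framing itself is simple: each packet starts with a 4-digit lowercase-hex length that includes the 4 length bytes themselves, and `0000` is the flush packet. A standalone sketch of encode/decode, illustrating the wire format rather than the gix-packetline API:

```rust
// PKT-LINE framing: 4 hex digits of total length (payload + 4), then payload.
// "0000" is the flush packet. Illustration of the format only.
fn encode_pkt_line(data: &[u8]) -> Vec<u8> {
    let mut out = format!("{:04x}", data.len() + 4).into_bytes();
    out.extend_from_slice(data);
    out
}

fn decode_pkt_line(data: &[u8]) -> Option<&[u8]> {
    let len = usize::from_str_radix(std::str::from_utf8(data.get(..4)?).ok()?, 16).ok()?;
    if len == 0 {
        return Some(&[]); // flush packet carries no payload
    }
    data.get(4..len)
}

fn main() {
    let line = encode_pkt_line(b"hello\n");
    println!("{:?}", String::from_utf8_lossy(&line));
}
```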
- No matter what we do here, timeouts must be supported to prevent hanging forever and to make interrupts destructor-safe.
- client
- general purpose `connect(…)` for clients
  - file:// launches service application
  - ssh:// launches service application in a remote shell using ssh
  - git:// establishes a tcp connection to a git daemon
  - http(s):// establishes connections to web server
    - via `curl` (blocking only)
    - via `reqwest` (blocking only)
  - pass context for scheme specific configuration, like timeouts
- git://
- V1 handshake
- send values + receive data with sidebands
- support for receiving 'shallow' refs in case the remote repository is shallow itself (I presume)
  - Since V2 doesn't seem to support that, let's skip this until there is an actual need. No completionist :D
- http(s)://
- set identity for basic authentication
- V1 handshake
- send values + receive data with sidebands
- V2 handshake
- send command request, receive response with sideband support
- 'dumb' - we opt out of using this protocol; it seems too slow to be useful, unless it downloads entire packs for clones?
- authentication failures are communicated by io::ErrorKind::PermissionDenied, allowing other layers to retry with authentication
- `async` support
- server
  - general purpose `accept(…)` for servers
- API documentation
- Some examples
| feature | curl | reqwest |
|---------|------|---------|
| 01      |      |         |
| 02      | X    |         |
| 03      | X    |         |
| 04      | X    |         |
| 05      |      |         |
- 01 -> async
- 02 -> proxy support
- 03 -> custom request configuration via fn(request)
- 04 -> proxy authentication
- 05 -> reauthentication after redirect
- abstract over protocol versions to allow delegates to deal only with a single way of doing things
- credentials
- via gix-credentials
- via pure Rust implementation if no git is installed
- handshake
- parse initial response of V1 and V2 servers
- ls-refs
- parse V1 refs as provided during handshake
- parse V2 refs
- handle empty refs, AKA PKT-LINE(zero-id SP "capabilities^{}" NUL capability-list)
- fetch
- detailed progress
- control credentials provider to fill, approve and reject
- initialize and validate command arguments and features sanely
- abort early for ls-remote capabilities
- packfile negotiation
- delegate can support all fetch features, including shallow, deepen, etc.
- receive parsed shallow refs
- push
- API documentation
- Some examples
- parse `.gitattributes` files
- an attributes stack for matching paths to their attributes, with support for the built-in `binary` macro for `-text -diff -merge`
- parse `.gitignore` files
- an attributes stack for checking if paths are excluded
- ansi-c
- quote
- unquote
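ANSI-C style quoting is how git renders unusual paths, e.g. `"a\tb\303\244"` for a name containing a tab and a UTF-8 umlaut. A sketch of unquoting that handles only the common escapes (`\n`, `\t`, `\\`, `\"`, and up to three octal digits per byte); an illustration, not the gix-quote API:

```rust
// Unquote an ANSI-C quoted string as git produces it. Handles common
// escapes and octal byte escapes only; illustration, not exhaustive.
fn ansi_c_unquote(quoted: &str) -> Option<Vec<u8>> {
    let inner = quoted.strip_prefix('"')?.strip_suffix('"')?.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < inner.len() {
        if inner[i] != b'\\' {
            out.push(inner[i]);
            i += 1;
            continue;
        }
        i += 1;
        match *inner.get(i)? {
            b'n' => { out.push(b'\n'); i += 1; }
            b't' => { out.push(b'\t'); i += 1; }
            b'\\' => { out.push(b'\\'); i += 1; }
            b'"' => { out.push(b'"'); i += 1; }
            b'0'..=b'7' => {
                // up to three octal digits encode one raw byte
                let mut value: u32 = 0;
                let mut digits = 0;
                while digits < 3 && i < inner.len() && (b'0'..=b'7').contains(&inner[i]) {
                    value = value * 8 + (inner[i] - b'0') as u32;
                    i += 1;
                    digits += 1;
                }
                out.push(value as u8);
            }
            _ => return None, // unsupported escape in this sketch
        }
    }
    Some(out)
}

fn main() {
    let bytes = ansi_c_unquote("\"gr\\303\\274n\"").unwrap();
    println!("{}", String::from_utf8_lossy(&bytes));
}
```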
- parsing
- lookup and mapping of author names
- transformations to and from bytes
- conversions between different platforms
- virtual canonicalization for more concise paths via `absolutize()`
- more flexible canonicalization with symlink resolution for paths which are partially virtual via `realpath()`
- spec
- parse
- check for match
- parse
- matching of paths
- parse
- matching of references and object names
- for fetch
- for push
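The core of refspec matching for fetch is mapping a source ref through a glob, e.g. `refs/heads/*:refs/remotes/origin/*` rewrites `refs/heads/main` to `refs/remotes/origin/main`. A minimal sketch of that single-glob mapping; real refspecs (gix-refspec) additionally support forcing `+`, negation, object names, and globless specs:

```rust
// Map a source ref through a single-glob refspec `src-pattern:dst-pattern`.
// Minimal illustration; not the gix-refspec API.
fn map_refspec(spec: &str, source_ref: &str) -> Option<String> {
    let (src, dst) = spec.split_once(':')?;
    let (prefix, suffix) = src.split_once('*')?;
    let matched = source_ref.strip_prefix(prefix)?.strip_suffix(suffix)?;
    Some(dst.replacen('*', matched, 1))
}

fn main() {
    let mapped = map_refspec("refs/heads/*:refs/remotes/origin/*", "refs/heads/main");
    println!("{:?}", mapped);
}
```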
- execute commands directly
- execute commands with `sh`
- support for `GIT_EXEC_PATH` environment variable with `gix-sec` filter
- open prompts, for example for usernames
- secure prompts for passwords
- use `askpass` program if available
- signal handling (resetting and restoring terminal settings)
- windows prompts for `cmd.exe` and mingw terminals
A mechanism to associate metadata with any object, and keep revisions of it using git itself.
- CRUD for git notes
- algorithms
  - `noop`
  - `consecutive`
  - `skipping`
- parse `FETCH_HEAD` information back entirely
- write typical fetch-head lines
- check if a git directory is a git repository
- find a git repository by searching upward
- define ceilings that should not be surpassed
- prevent crossing file-systems (non-windows only)
- handle linked worktrees
- a way to handle `safe.directory`
  - note that it's less critical to support it, as `gitoxide` allows access but prevents untrusted configuration from becoming effective.
- parse git dates
- serialize `Time`
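Git's internal "raw" date format is `<seconds-since-epoch> <±HHMM>`, i.e. a unix timestamp followed by a signed timezone offset in hours and minutes. A sketch of serializing that form; illustration only, not the gix-date API:

```rust
// Serialize a timestamp in git's "raw" format: `<unix-seconds> <±HHMM>`.
// Illustration of the format only.
fn serialize_raw(seconds: i64, offset_minutes: i32) -> String {
    let sign = if offset_minutes < 0 { '-' } else { '+' };
    let offset = offset_minutes.abs();
    format!("{seconds} {sign}{:02}{:02}", offset / 60, offset % 60)
}

fn main() {
    println!("{}", serialize_raw(1660874655, 480)); // UTC+8
}
```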
- launch git credentials helpers with a given action
  - built-in `git credential` program
  - as scripts
  - as absolute paths to programs with optional arguments
  - program name with optional arguments, transformed into `git credential-<name>`
- `helper::main()` for easy custom credential helper programs written in Rust
Provide base-implementations for dealing with smudge and clean filters as well as filter processes, facilitating their development.
- clean filter base
- smudge filter base
- filter process base
Provides a trust model shared across gitoxide crates. It helps configure how to interact with external processes, among other things.
- integrations
- gix-config
- gix
- obtain rebase status
- drive a rebase operation
Handle human-aided operations which cannot be completed in one command invocation.
Implement git large file support using the process protocol and make it flexible enough to handle a variety of cases. Make it the best-performing implementation and the most convenient one.
- parse pattern
- a type for pattern matching of paths and non-paths, optionally case-insensitively.
- handle the working tree/checkout
- checkout an index of files, executables and symlinks just as fast as git
- forbid symlinks in directories
- handle submodules
- handle sparse directories
- handle sparse index
- linear scaling with multi-threading up to IO saturation
- supported attributes to affect working tree and index contents
- eol
- working-tree-encoding
- …more
- filtering
  - `text`
  - `ident`
  - filter processes
  - single-invocation clean/smudge filters
- manage multiple worktrees
- access to per-path information, like `.gitignore` and `.gitattributes`, in a manner well suited for efficient lookups
  - exclude information
  - attributes
- `describe()` (similar to `git name-rev`)
) - parse specifications
- parsing and navigation
- revision ranges
- full date parsing support (depends on `gix-date`)
- primitives to help with graph traversal, along with commit-graph acceleration.
- CRUD for submodules
- try to handle all the nifty interactions and be a little more comfortable than what git offers, laying a foundation for smarter git submodules.
A plumbing crate with shared functionality regarding EWAH compressed bitmaps, as well as other kinds of bitmap implementations.
- EWAH
  - `Array` type to read and write bits
    - execute closure for each `true` bit
  - decode on-disk representation
  - encode on-disk representation
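The "execute closure for each `true` bit" operation boils down to scanning words and visiting the positions of set bits. A sketch over plain `u64` words, without the EWAH run-length compression itself; illustration only, not the gix-bitmap API:

```rust
// Visit the index of every set bit in a word-backed bitmap.
// Clearing the lowest set bit with `w &= w - 1` makes each word O(popcount).
fn for_each_true_bit(words: &[u64], mut f: impl FnMut(usize)) {
    for (word_index, &word) in words.iter().enumerate() {
        let mut w = word;
        while w != 0 {
            let bit = w.trailing_zeros() as usize;
            f(word_index * 64 + bit);
            w &= w - 1; // clear lowest set bit
        }
    }
}

fn main() {
    for_each_true_bit(&[0b1010, 1], |i| println!("bit {i}"));
}
```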
The git staging area.
- read
- V2 - the default, including long-paths support
- V3 - extended flags
- V4 - delta-compression for paths
- optional threading
- concurrent loading of index extensions
- threaded entry reading
- extensions
- TREE for speeding up tree generation
- REUC resolving undo
- UNTR untracked cache
- FSMN file system monitor cache V1 and V2
- EOIE end of index entry
- IEOT index entry offset table
- 'link' base indices to take information from, split index
- 'sdir' sparse directory entries - marker
- verification of entries and extensions as well as checksum
- write
- V2
- V3 - extension bits
- V4
- extensions
- TREE
- REUC
- UNTR
- FSMN
- EOIE
- 'sdir'
- 'link'
  - note that we currently dissolve any shared index we read, so this extension is removed when writing.
- `stat` update
  - optional threaded `stat` based on thread_cost (aka preload)
- handling of `.gitignore` and system file exclude configuration
- handle potential races
- maintain extensions when altering the cache
- TREE for speeding up tree generation
- REUC resolving undo
- UNTR untracked cache
- FSMN file system monitor cache V1 and V2
- EOIE end of index entry
- IEOT index entry offset table
- 'link' base indices to take information from, split index
- 'sdir' sparse directory entries
- add and remove entries
- API documentation
- Some examples
- read-only access
- Graph lookup of commit information to obtain timestamps, generation and parents, and extra edges
- Corrected generation dates
- Bloom filter index
- Bloom filter data
- create and update graphs and graph files
- API documentation
- Some examples
See its README.md.
See its README.md.
- parse
- boolean
- integer
- color
- ANSI code output for terminal colors
- path (incl. resolution)
- date
- [permission](https://github.com/git/git/blob/71a8fab31b70c417e8f5b5f716581f89955a7082/setup.c#L1526:L1526)
- read
- zero-copy parsing with event emission
- all config values as per the `gix-config-value` crate
- includeIf
  - `gitdir`, `gitdir/i`, and `onbranch`
  - `hasconfig`
- access values and sections by name and sub-section
- edit configuration in memory, non-destructively
- cross-platform newline handling
- write files back for lossless round-trips.
- keep comments and whitespace, and only change lines that are affected by actual changes, to allow truly non-destructive editing
- cascaded loading of various configuration files into one
- load from environment variables
- load from well-known sources for global configuration
- load repository configuration with all known sources
- API documentation
- Some examples
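Parsing config booleans is a good example of the value rules involved: git accepts several case-insensitive spellings for true and false. The sketch below covers the common spellings only; git additionally treats any non-zero integer as true, which a full implementation (like gix-config-value) must handle:

```rust
// Parse a git-config boolean value. Covers common spellings only;
// git also accepts arbitrary integers (non-zero == true).
fn parse_git_boolean(value: &str) -> Option<bool> {
    match value.to_ascii_lowercase().as_str() {
        "yes" | "on" | "true" | "1" => Some(true),
        "no" | "off" | "false" | "0" | "" => Some(false),
        _ => None,
    }
}

fn main() {
    println!("{:?}", parse_git_boolean("Yes"));
}
```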
- utilities for applications to make long running operations interruptible gracefully and to support timeouts in servers.
- handle `core.repositoryFormatVersion` and extensions
- support for unicode-precomposition of command-line arguments (needs explicit use in parent application)
- Repository
- discovery
- option to not cross file systems (default)
- handle git-common-dir
- support for `GIT_CEILING_DIRECTORIES` environment variable
- handle other non-discovery modes and provide control over environment variable usage required in applications
- rev-parse
- rev-walk
- include tips
- exclude commits
- instantiation
- access to refs and objects
- credentials
- run `git credential` directly
- use credential helper configuration to obtain credentials with `gix_credentials::helper::Cascade`
- config
- facilities to apply the url-match algorithm and to normalize urls before comparison.
- traverse
- commit graphs
- make git-notes accessible
- tree entries
- diffs/changes
- tree with other tree
- respect case-sensitivity of host filesystem.
- a way to access various diff related settings or use them
- respect `diff.*.textconv`, `diff.*.cachetextconv` and external diff viewers with `diff.*.command`, along with support for reading `diff` gitattributes.
- rewrite tracking
- deviation - git keeps up to four candidates whereas we use the first-found candidate that matches the similarity percentage. This can lead to different sources being found. As such, we also don't consider the filename at all.
- handle binary files correctly, and apply filters for that matter
- computation limit with observable reduction of precision when it is hit, for copies and renames separately
- by identity
- renames (sym-links are only ever compared by identity)
- copies
- by similarity - similarity factor controllable separately from renames
- renames
- copies
- 'find-copies-harder' - find copies with the source being the entire tree.
- tree or index with working tree
- diffs between modified blobs with various algorithms
- tree with index
- tree with other tree
- initialize
- Proper configuration depending on platform (e.g. ignorecase, filemode, …)
- Id
- short hashes with detection of ambiguity.
- Commit
  - `git describe` like functionality, with optional commit-graph acceleration
  - create new commit from tree
- Objects
- lookup
- peel to object kind
- create signed commits and tags
- trees
- lookup path
- references
- peel to end
- ref-log access
- remote name
- find remote itself
  - respect `branch.<name>.merge` in the returned remote.
- remotes
- clone
- shallow
- include-tags when shallow is used (needs separate fetch)
- prune non-existing shallow commits
- bundles
- shallow
- fetch
- shallow (remains shallow, options to adjust shallow boundary)
- a way to auto-explode small packs to avoid them to pile up
- 'ref-in-want'
- 'wanted-ref'
- standard negotiation algorithms `consecutive`, `skipping` and `noop`.
- push
- ls-refs
- ls-refs with ref-spec filter
- list, find by name
- create in memory
- groups
- remote and branch files
- clone
- execute hooks
- refs
- run transaction hooks and handle special repository states like quarantine
- support for different backends like `files` and `reftable`
- main or linked worktree
  - add files with `.gitignore` handling
  - checkout with conversions like clean + smudge as in `.gitattributes`
  - diff index with working tree
  - sparse checkout support
  - read per-worktree config if `extensions.worktreeConfig` is enabled.
  - index
    - tree from index
    - index from tree
- worktrees
- open a repository with worktrees
- read locked state
- obtain 'prunable' information
- proper handling of worktree related refs
- create a byte stream and create archives for such a stream, including worktree filters and conversions
- create, move, remove, and repair
- access exclude information
- access attribute information
- respect `core.worktree` configuration
  - deviation
    - The delicate interplay between `GIT_COMMON_DIR` and `GIT_WORK_TREE` isn't implemented.
- config
  - read the primitive types `boolean`, `integer`, `string`
  - read and interpolate trusted paths
  - low-level API for more elaborate access to all details of `gix-config` files
  - a way to make changes to individual configuration files
- mailmap
- object replacements (`git replace`)
- configuration
- merging
- stashing
- Use Commit Graph to speed up certain queries
- subtree
- interactive rebase status/manipulation
- submodules
- refs
- discovery
- API documentation
- Some examples
- encode git-tree as stream of bytes (with large file support and actual streaming)
- produce a stream of entries
- add custom entries to the stream
- respect `export-ignore` git attribute
- apply standard worktree conversion to simulate an actual checkout
- support for submodule inclusion
- API documentation
- Some examples
- `write_to()` for creating an archive with various container formats
  - `tar` and `tar.gz`
  - `zip`
- add prefix and modification date
- API documentation
- Some examples
- create a bundle from an archive
- respect `export-ignore` and `export-subst`
- extract a branch from a bundle into a repository
- API documentation
- Some examples
- validate ref names
- validate tag names
- Prepare code for arrival of longer hashes like Sha256. It's part of the V2 proposal but should work for loose refs as well.
- Stores
- disable transactions during quarantine
- namespaces
- a server-side feature to transparently isolate refs in a single shared repository, allowing all forks to live in the same condensed repository.
- loose file
- ref validation
- find single ref by name
- special handling of `FETCH_HEAD` and `MERGE_HEAD`
- iterate refs with optional prefix
- worktree support
- support multiple bases and classify refs
- support for ref iteration merging common and private refs seamlessly.
- avoid packing refs which are worktree private
- symbolic ref support, using symbolic links
  - This is a legacy feature which is not in use anymore.
- transactions
- delete, create or update single ref or multiple refs while handling the reflog
- set any valid ref value (not just object ids)
- reflog changes can be entirely disabled (i.e. for bare repos)
- rename or copy references
- transparent handling of packed-refs during deletion
- writing loose refs into packed-refs and optionally delete them
- initial transaction optimization (a faster way to create clones with a lot of refs)
- log
- forward iteration
- backward iteration
- expire
- ref
- peel to id
- packed
- find single ref by name
- iterate refs with optional prefix
- handle unsorted packed refs and those without a header
- reftable
- API documentation
- Some examples
- io-pipe feature toggle
- a unix like pipeline for bytes
- parallel feature toggle
  - When on…
    - `in_parallel`
    - `join`
  - When off, all functions execute serially
- fast-sha1
- provides a faster SHA1 implementation using CPU intrinsics
- API documentation
- a terminal user interface seeking to replace and improve on `tig`
- Can display complex history in novel ways to make them graspable. Maybe this post can be an inspiration.

A re-implementation of a minimal `tig`-like UI that aims to be fast and to the point.
Definitely optimize for performance and see how we fare compared to oxen.
Right now, `git lfs` is 40x slower, due to sequential uploads and lack of fast compression. It seems this can be greatly improved to get close to 6min for 200k images (1.4GB). GitHub seems to cap upload speeds to 100kB/s, one major reason it's so slow, and it can only do it sequentially as `git-lfs` doesn't use the new `filter-process` protocol, which would allow parallelization.

Oxen uses XXH3 (30GB/s), which greatly outperforms SHA1 - however, it doesn't look like the hash is necessarily the bottleneck in typical benchmarks.