- test-first development
- protect against regression and make implementing features easy.
- use containers to test more elaborate user interactions
- keep it practical, knowing the Rust compiler already has your back for the mundane things, like unhappy code paths.
- use git itself as reference implementation, and use their test-cases and fixtures where
appropriate. At the very least, try to learn from them.
- Run the same test against git whenever feasible to ensure git agrees with our implementation. See `gix-glob` for examples.
- use libgit2 test fixtures and cases where appropriate, or learn from them.
- safety first
- handle all errors, never `unwrap()`. If needed, `expect("why")`.
- provide an error chain and make it easy to understand what went wrong.
- We use `thiserror` generally.
- Adhere to the stability guide
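
The error-handling rules above could look roughly like the following sketch. `ReadError`, `read_first_line()` and `render_chain()` are hypothetical names made up for illustration, and the hand-written `Display` and `Error` implementations only stand in for what `thiserror` would derive.

```rust
use std::{error::Error, fmt, fs, io, path::{Path, PathBuf}};

/// Hypothetical error type; with `thiserror`, both impls below would be derived.
#[derive(Debug)]
pub enum ReadError {
    Io { path: PathBuf, source: io::Error },
    Empty { path: PathBuf },
}

impl fmt::Display for ReadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ReadError::Io { path, .. } => write!(f, "could not read '{}'", path.display()),
            ReadError::Empty { path } => write!(f, "'{}' is empty", path.display()),
        }
    }
}

impl Error for ReadError {
    // Expose the underlying cause so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ReadError::Io { source, .. } => Some(source),
            ReadError::Empty { .. } => None,
        }
    }
}

/// Every failure becomes an error with context instead of a panic.
pub fn read_first_line(path: &Path) -> Result<String, ReadError> {
    let content = fs::read_to_string(path).map_err(|source| ReadError::Io {
        path: path.to_owned(),
        source,
    })?;
    match content.lines().next() {
        Some(line) => Ok(line.to_owned()),
        None => Err(ReadError::Empty { path: path.to_owned() }),
    }
}

/// Render an error followed by all of its causes, outermost first.
pub fn render_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut cause = err.source();
    while let Some(inner) = cause {
        out.push_str(": ");
        out.push_str(&inner.to_string());
        cause = inner.source();
    }
    out
}
```

Where a failure is truly impossible, `expect("why")` documents the reasoning in place; everywhere else the caller receives the full chain and decides how to present it.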
We use a style I'd call 'purposeful conventional commits', and instead of classifying every commit using conventional commit messaging, we do so only if the message should show up in the changelog.
The subject usually informs about the what and the body provides details and explains the why.
Commit messages must show up in the changelog in case of breaking changes. Examples for that are:
- change!: rename `Foo` to `Bar`. (#123)

  And this is why we do it in the body.
- remove!: `Repository::obsolete()`.

  Nobody used this method.
Features or other changes that are visible and people should know about look like this:
- feat: add `Repository::foo()` to do great things. (#234)

  And here is how it's used and some more details.
- fix: don't panic when calling `foo()` in a bare repository. (#456)
Everything else, particularly refactors or chores, doesn't use conventional commits as these don't affect users of the API. Examples could be:
- make test module structure similar to the modules they are testing for consistency
- `make fmt`
- thanks clippy
Please refrain from using `chore:` or `refactor:` prefixes as, for the most part, users of the API don't care about those. When a refactor changes the API in some way, prefer to use `feat`, `change`, `rename` or `remove` instead. Most of the time the ones that are not `feat` are breaking, so they would be seen with their exclamation mark suffix, like `change!`.
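
To make the convention concrete, here is a small sketch of how a commit subject could be classified. It is not how `cargo smart-release` parses messages; `Subject`, `KINDS` and `classify()` are hypothetical, and the list of recognized prefixes is an assumption drawn only from the prefixes named above.

```rust
/// How a commit subject is treated under the convention described above.
#[derive(Debug, PartialEq, Eq)]
pub enum Subject<'a> {
    /// Carries a recognized prefix such as `feat` or `change!` and shows up in the changelog.
    Conventional { kind: &'a str, breaking: bool, summary: &'a str },
    /// Everything else, like `thanks clippy`.
    Plain(&'a str),
}

/// Assumption: only the prefixes named in the text count as changelog-worthy.
const KINDS: &[&str] = &["feat", "fix", "change", "rename", "remove"];

pub fn classify(subject: &str) -> Subject<'_> {
    if let Some((prefix, summary)) = subject.split_once(':') {
        // A trailing `!` marks a breaking change.
        let (kind, breaking) = match prefix.strip_suffix('!') {
            Some(kind) => (kind, true),
            None => (prefix, false),
        };
        if KINDS.iter().any(|known| *known == kind) {
            return Subject::Conventional {
                kind,
                breaking,
                summary: summary.trim_start(),
            };
        }
    }
    Subject::Plain(subject)
}
```

With this sketch, `chore:` and `refactor:` subjects deliberately fall into `Plain`, matching the request not to use them.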
Commit messages are used for guiding `cargo smart-release` to do most of the release work for us. This includes changelog generation as well as picking the right version bump for each crate.
Knowing that `cargo smart-release` is driven by commit messages and affects crate versions with per-crate granularity, it becomes important to split edits into multiple commits to clearly indicate which crate is actually broken.
Typical patterns include making a breaking change in one crate and then fixing all others to work with it. For changelogs to look proper and version bumps to be correct, the first commit would contain only the breaking changes themselves, like "rename: `foo()` to `bar()`", and the second commit would contain all changes to adapt to that and look like "adapt to changes in `<crate name>`".
We generally follow a 'track everything' approach and there is a lot of freedom leading to more commits rather than fewer. There is no obligation to squash commits or to otherwise tune the history.
We use feature branches and PRs most of the time to be able to take advantage of CI and GitHub review tools, and merge with merge commits to make individual efforts stand out. There is no need for linearizing history or tuning it in any other way. However, each commit must follow the guidelines laid out in the Commit Messages paragraph.
There is value in organizing commits by topic and Stacked Git is hereby endorsed to do that.
As a general rule, respect and implement all applicable git-config by default, but allow the caller to set overrides. How overrides work depends on the goals of the particular API so it can be done on the main call path, forcing a choice, or more typically, as a side-lane where overrides can be done on demand.
Note that it should be possible to obtain the current configuration for modification by the user for selective overrides, either by calling methods or by obtaining a data structure that can be set as a whole using a `get -> modify -> set` cycle.
Note that without any of that, one should document that with `config_snapshot_mut()` any of the relevant configuration can be changed in memory before invoking a method in order to affect it.
Parameters which are not available in git or specific to `gitoxide` or the needs of the caller can be passed as parameters or via `Options` or `Context` structures as needed.
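
A minimal sketch of the `get -> modify -> set` cycle follows. All types and methods here (`Repo`, `CheckoutConfig`, `checkout_config()`, `set_checkout_config()`) are hypothetical and only illustrate the side-lane for overrides; they are not part of any existing API.

```rust
/// Hypothetical subset of the configuration relevant to one operation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CheckoutConfig {
    pub autocrlf: bool,
    pub thread_limit: usize,
}

/// Hypothetical repository whose values were read from git-config when it was opened.
pub struct Repo {
    checkout: CheckoutConfig,
}

impl Repo {
    pub fn with_config(checkout: CheckoutConfig) -> Self {
        Repo { checkout }
    }

    /// get: hand out the current configuration as a whole.
    pub fn checkout_config(&self) -> CheckoutConfig {
        self.checkout.clone()
    }

    /// set: replace the configuration after the caller modified it.
    pub fn set_checkout_config(&mut self, config: CheckoutConfig) {
        self.checkout = config;
    }

    /// The operation respects the configuration by default, no parameter forces a choice.
    pub fn describe_checkout(&self) -> String {
        format!(
            "autocrlf={} threads={}",
            self.checkout.autocrlf, self.checkout.thread_limit
        )
    }
}
```

A caller who is happy with the git configuration never touches the side-lane, while one who needs a single-threaded checkout obtains the structure, sets `thread_limit = 1` and writes it back before invoking the operation.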
- async
  - library client-side
    - Don't use it client side, as operations there are usually bound by the CPU and ultra-fast access to memory mapped files. It's no problem to saturate either CPU or the IO system.
      - Provide `async` clients as opt-in using feature toggles to help integrating into an existing async codebase.
  - User Interfaces
    - User interfaces can greatly benefit from using async as it's much easier to maintain a responsive UI thread that way thanks to the wonderful future combinators.
    - `blocking` can be used to make `Read` and `Iterator` async, or move any operation onto a thread which blends it into the async world.
      - Most operations are fast and 'interrupting' them is as easy as ignoring their result by cancelling their task.
      - Long-running operations can be roughly interacted with using the `gix_features::interrupt::trigger()` function, and after a moment of waiting the flag can be unset with the `…::uninterrupt()` function to allow new long-running operations to work. Every long-running operation supports this.
  - server-side
    - Building a pack is CPU and, at some point, IO bound, and it makes no sense to use async to handle more connections - git needs a lot of resources and threads will do just fine.
      - Support async out of the box without locking it into particular traits using conditional compilation. This will make integrating into an async codebase easier, which we assume is given on the server side these days.
  - usage of `maybe_async`
    - Right now we intentionally only use it in tests to allow one set of test cases to test both blocking and async implementations. This is the only way to prevent drift of otherwise distinct implementations.
    - Why not use it to generate blocking versions of traits automatically?
      - This would require `maybe_async` and its dependencies to always be present, increasing compile times. For now we chose a little more code to handle over increasing compile times for everyone. This stance may change later once compile times don't matter that much anymore to allow the removal of code.
- `Default` trait implementations
  - These can change only if the effect is contained within the caller's process. This means changing the default of a file version is a breaking change.
- Using the `Progress` trait
  - When receiving a `Progress` implementation
    - without calling `add_child(…)` then use it directly to communicate progress, leaving control of the name to the caller. However, call `.init(…)` to configure the iteration.
    - and when calling `add_child(…)` don't use the parent progress instance for anything else.
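
The two `Progress` rules can be sketched as follows. The trait below is a hypothetical, heavily reduced stand-in written for this example (the real trait has more methods and different signatures), and `Recorder` is merely a test double.

```rust
use std::{cell::RefCell, rc::Rc};

/// Hypothetical, reduced stand-in for a progress trait.
pub trait Progress {
    type Child: Progress;
    fn add_child(&mut self, name: &str) -> Self::Child;
    fn init(&mut self, max: Option<usize>);
    fn inc(&mut self);
}

/// One task only: use the received instance directly, but configure it with `init`.
pub fn count_objects<P: Progress>(objects: &[u32], mut progress: P) -> usize {
    progress.init(Some(objects.len()));
    for _ in objects {
        progress.inc();
    }
    objects.len()
}

/// Several phases: report only through children, never through the parent itself.
pub fn index_then_verify<P: Progress>(objects: &[u32], mut progress: P) {
    count_objects(objects, progress.add_child("indexing"));
    count_objects(objects, progress.add_child("verifying"));
}

/// Test double recording every call along with the name of the reporter.
#[derive(Clone, Default)]
pub struct Recorder {
    pub name: String,
    pub events: Rc<RefCell<Vec<String>>>,
}

impl Progress for Recorder {
    type Child = Recorder;

    fn add_child(&mut self, name: &str) -> Recorder {
        Recorder { name: name.to_owned(), events: self.events.clone() }
    }

    fn init(&mut self, max: Option<usize>) {
        self.events.borrow_mut().push(format!("{}: init {:?}", self.name, max));
    }

    fn inc(&mut self) {
        self.events.borrow_mut().push(format!("{}: inc", self.name));
    }
}
```

In `count_objects()` the name stays whatever the caller chose, whereas `index_then_verify()` names its own phases and leaves the parent untouched.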
- interruption of long-running operations
  - Use `gix_features::interrupt::*` for building support for interruptions of long-running operations only.
    - It's up to the author to decide how to best integrate it; generally we use a poll-based mechanism to check whether an interrupt flag is set.
    - this is a must if…
      - …temporary resources like files might otherwise be leaked.
    - this is optional but desirable if…
      - …there is no leakage otherwise to support user interfaces. They background long-running operations and need them to be cancellable.
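
A poll-based check could look like the sketch below. It uses a plain `AtomicBool` passed in by the caller as a stand-in for the flag managed by `gix_features::interrupt`; `sum_chunks()` and `Interrupted` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Returned when the operation stopped early because the flag was set.
#[derive(Debug, PartialEq, Eq)]
pub struct Interrupted {
    /// Number of chunks that were fully processed before stopping.
    pub processed: usize,
}

/// Hypothetical long-running operation that polls the flag once per unit of work.
pub fn sum_chunks(chunks: &[&[u64]], should_interrupt: &AtomicBool) -> Result<u64, Interrupted> {
    let mut total = 0;
    for (processed, chunk) in chunks.iter().enumerate() {
        if should_interrupt.load(Ordering::Relaxed) {
            // This is the place to remove temporary files or other resources
            // before returning, so that nothing leaks.
            return Err(Interrupted { processed });
        }
        total += chunk.iter().sum::<u64>();
    }
    Ok(total)
}
```

Polling once per chunk keeps the check cheap while still bounding how long a user interface waits for cancellation to take effect.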
- prepare for SHA256 support by using `gix_hash::ObjectId` and `gix_hash::oid`
  - eventually there will be the need to support both Sha1 and Sha256. We anticipate it by using the `Id` type instead of slices or arrays of 20 bytes. This way, eventually we can support multiple hash digest sizes.
  - Right now it's unclear how Sha256 is going to work in git, so we only support Sha1 for now. It might be an avenue to proactively implement it ahead of time once there is a specification to follow.
  - It looks like Git prepares to support it at compile time; we can support it at runtime though, with minimal cost. If needed, we can later remove support using a cargo feature toggle.
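
To illustrate why a dedicated id type helps, here is a sketch of an id that supports more than one digest size at runtime. It is an assumption-laden illustration and not the definition used by `gix_hash`; the `Kind` and `ObjectId` shown here are hypothetical.

```rust
/// The hash function an id was produced with.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Kind {
    Sha1,
    Sha256,
}

/// Hypothetical owned object id that knows its digest size at runtime.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ObjectId {
    Sha1([u8; 20]),
    Sha256([u8; 32]),
}

impl ObjectId {
    /// Accept only the digest sizes we know about.
    pub fn from_bytes(bytes: &[u8]) -> Option<ObjectId> {
        match bytes.len() {
            20 => {
                let mut id = [0u8; 20];
                id.copy_from_slice(bytes);
                Some(ObjectId::Sha1(id))
            }
            32 => {
                let mut id = [0u8; 32];
                id.copy_from_slice(bytes);
                Some(ObjectId::Sha256(id))
            }
            _ => None,
        }
    }

    pub fn kind(&self) -> Kind {
        match self {
            ObjectId::Sha1(_) => Kind::Sha1,
            ObjectId::Sha256(_) => Kind::Sha256,
        }
    }

    pub fn as_bytes(&self) -> &[u8] {
        match self {
            ObjectId::Sha1(bytes) => &bytes[..],
            ObjectId::Sha256(bytes) => &bytes[..],
        }
    }

    pub fn to_hex(&self) -> String {
        self.as_bytes().iter().map(|byte| format!("{:02x}", byte)).collect()
    }
}
```

Code written against such a type never hard-codes `[u8; 20]`, so adding or removing a hash kind stays a local change.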
- symbolic links do not exist as far as we are concerned
  - in older, probably linux only, git versions symbolic links were used for symbolic references for example. This required special handling in some places. We don't implement that and assume more modern repositories.
- when to use interior mutability
  - in plumbing, do not use it at all but instead provide the mutable part (like caches, buffers) as arguments, pushing their handling entirely to the caller.
  - Layer an optional abstraction on top that manages the above for you, using interior mutability only if part of the mutable state has to be returned as borrow or if otherwise it wouldn't be possible to borrowcheck. Or in other words: start without interior mutability and try to do it the standard way, but switch when needed.
  - When using primitives to support interior mutability, use the provided ones and utility functions in `gix_features::threading::*` exclusively to allow switching between thread-safe and non-threadsafe versions at compile time.
    - The preferred way of using it is to start out as upgradable reader, and upgrading to write if needed, keeping contention to a minimum.
  - If shared ownership is involved, one always needs interior mutability, but may still decide to use an API that requires `&mut self` if locally stored caches are involved.
  - Types that are not thread-local must be `Sync`, but only if the `gix-features/parallel` feature is enabled due to the usage of `gix_features::threading::…` primitives which won't be thread-safe without the feature.
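
The layering described above might be sketched like this: a plumbing function that takes its buffer as an argument, and an optional wrapper on top that hides the buffer. `uppercase_into()` and `Uppercaser` are hypothetical, and `RefCell` is used here only to keep the sketch self-contained where real code would use the primitives from `gix_features::threading`.

```rust
use std::cell::RefCell;

/// Plumbing: the scratch buffer is supplied by the caller, no interior mutability.
pub fn uppercase_into<'a>(input: &[u8], buf: &'a mut Vec<u8>) -> &'a [u8] {
    buf.clear();
    buf.extend(input.iter().map(|byte| byte.to_ascii_uppercase()));
    &buf[..]
}

/// Optional convenience layer that owns the buffer so that `&self` suffices.
#[derive(Default)]
pub struct Uppercaser {
    buf: RefCell<Vec<u8>>,
}

impl Uppercaser {
    /// Reuses the internal buffer across calls and hands out an owned copy.
    pub fn uppercase(&self, input: &[u8]) -> Vec<u8> {
        let mut buf = self.buf.borrow_mut();
        uppercase_into(input, &mut buf).to_vec()
    }
}
```

Callers who care about allocations use the plumbing function with their own buffer, while everyone else takes the wrapper.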
- when to use shared ownership
  - Use `gix_features::threading::OwnShared` particularly when shared resources are supposed to be used by thread-local handles. Going through a wrapper for shared ownership is fast and won't be the bottleneck, as it's only about 16% slower than going through a shared reference on a single core.
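
As a sketch of the idea, shared ownership that switches between `Rc` and `Arc` at compile time can be expressed with a type alias. The alias below only mirrors the concept behind `OwnShared` under the assumption of a `parallel` cargo feature; it is not that crate's source, and `Store` and `Handle` are hypothetical.

```rust
// Thread-safe when the (assumed) `parallel` feature is enabled, cheaper otherwise.
#[cfg(feature = "parallel")]
pub type OwnShared<T> = std::sync::Arc<T>;
#[cfg(not(feature = "parallel"))]
pub type OwnShared<T> = std::rc::Rc<T>;

/// Hypothetical shared, read-only resource.
pub struct Store {
    pub packs: Vec<String>,
}

/// Hypothetical thread-local handle: shares the store, owns its own state.
pub struct Handle {
    store: OwnShared<Store>,
    pub lookups: usize,
}

impl Handle {
    pub fn new(store: &OwnShared<Store>) -> Self {
        Handle { store: store.clone(), lookups: 0 }
    }

    /// `&mut self` is fine here as the counter is local to this handle.
    pub fn contains(&mut self, pack: &str) -> bool {
        self.lookups += 1;
        self.store.packs.iter().any(|name| name.as_str() == pack)
    }
}
```

Each handle clones only the pointer, so creating one per thread is cheap and the store itself is never duplicated.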
- Path encoding
  - For `git`, paths are just bytes no matter on which platform. We assume that on windows its path handling goes through some abstraction layer like `MSYS2` which keeps it from seeing UTF-16 encoded paths (and writing them). Thus it should be safe to assume `git`'s path encoding is byte oriented.
  - Assuming UTF8-ish bytes in paths produced by `git` even on windows due to `MSYS2`, we use `os_str_bytes` to convert these back into `OsStr` and derivatives like `Path` as needed even though it might not always be the case depending on the actual encoding used by `MSYS2` or other abstraction layers, or avoiding to use std types altogether using our own instead.
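
The UTF-8 assumption can be illustrated with the standard library alone. This sketch is not the `os_str_bytes` API; `try_path_from_bytes()` and `path_to_bytes()` are hypothetical helpers that fail explicitly where the assumption does not hold.

```rust
use std::path::Path;
use std::str::Utf8Error;

/// Interpret bytes as produced by git as a path, assuming they are valid UTF-8.
pub fn try_path_from_bytes(bytes: &[u8]) -> Result<&Path, Utf8Error> {
    std::str::from_utf8(bytes).map(|text| Path::new(text))
}

/// The reverse direction, which fails if the path is not representable as UTF-8.
pub fn path_to_bytes(path: &Path) -> Option<&[u8]> {
    path.to_str().map(|text| text.as_bytes())
}
```

Returning an error rather than converting lossily keeps the decision about ill-formed paths with the caller.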
A bunch of notes collected to keep track of what's needed to eventually support Sha256
- read `hash-function-transition.txt`
- support `gpgsig-sha256` field - we won't break, but also don't do anything with it (e.g. `extra_headers`)
- support index V3
- Pack file PSRC field
- don't use unwrap, not even in tests. Instead use `quick_error!()` or `Box<dyn std::error::Error>`.
- Use `expect(…)` as assertion on Options, providing context on why the expectations should hold. Or in other words, answer "This should work because…<expect(…)>"
- Use `Options` whenever there is something to configure in terms of branching behaviour. It can be defaulted, and if it can't these fields should be parameters of the method that takes these `Options`.
- Use `Context` when data is required to perform an operation at all. See `gix_config::path::Context` as reference. It can't be defaulted and the fields could also be parameters.
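
The difference between the two structures can be sketched as follows. The fields and the `interpolate()` function are hypothetical and much simpler than anything in `gix_config`; they only show that `Options` implements `Default` while `Context` cannot.

```rust
use std::path::{Path, PathBuf};

/// Branching behaviour only, hence it can be defaulted.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Options {
    /// If set, reject paths that are neither absolute nor start with `~/`.
    pub strict: bool,
}

/// Data without which the operation cannot run at all, hence no `Default`.
#[derive(Debug, Clone, Copy)]
pub struct Context<'a> {
    pub home_dir: &'a Path,
}

/// Hypothetical operation: expand a leading `~/` using the home directory.
pub fn interpolate(path: &str, context: Context<'_>, options: Options) -> Option<PathBuf> {
    if let Some(rest) = path.strip_prefix("~/") {
        return Some(context.home_dir.join(rest));
    }
    if options.strict && !path.starts_with('/') {
        return None;
    }
    Some(PathBuf::from(path))
}
```

Most callers pass `Options::default()`, whereas the `Context` always has to be assembled from data the caller actually has.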
In plumbing crates, prefer to default to keeping references if this is feasible to avoid typically expensive clones.
In porcelain crates, like `gix`, we have `Platforms` which are typically cheap enough to create on demand as they configure one or more method calls. These should keep a reference to the `Repository` instance that created them as the user is expected to clone the `Repository` if there is the need.
However, if these structures are more expensive, call them `Cache` or `<NotPlatform>` and prefer to clone the `Repository` into them or otherwise keep them free of lifetimes to allow the user to keep this structure around for repeated calls. References for this paragraph are this PR and this discussion.
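
The two flavours could be sketched like this. `Repository`, `RefsPlatform` and `RefsCache` are simplified, hypothetical types; the point is only that the platform borrows the repository while the cache owns a clone and is free of lifetimes.

```rust
/// Hypothetical, simplified repository.
#[derive(Debug, Clone)]
pub struct Repository {
    pub refs: Vec<String>,
}

/// Cheap platform: borrows the repository and configures a single call.
pub struct RefsPlatform<'repo> {
    repo: &'repo Repository,
    prefix: String,
}

impl Repository {
    pub fn references(&self) -> RefsPlatform<'_> {
        RefsPlatform { repo: self, prefix: String::new() }
    }
}

impl<'repo> RefsPlatform<'repo> {
    pub fn prefixed(mut self, prefix: &str) -> Self {
        self.prefix = prefix.to_owned();
        self
    }

    /// The result borrows from the repository, not from the short-lived platform.
    pub fn names(&self) -> Vec<&'repo str> {
        let repo: &'repo Repository = self.repo;
        repo.refs
            .iter()
            .map(|name| name.as_str())
            .filter(|name| name.starts_with(self.prefix.as_str()))
            .collect()
    }
}

/// More expensive structure: owns a clone so it can be kept around for repeated calls.
pub struct RefsCache {
    repo: Repository,
    pub lookups: usize,
}

impl RefsCache {
    pub fn new(repo: &Repository) -> Self {
        RefsCache { repo: repo.clone(), lookups: 0 }
    }

    pub fn contains(&mut self, name: &str) -> bool {
        self.lookups += 1;
        self.repo.refs.iter().any(|known| known.as_str() == name)
    }
}
```

The cache outlives the repository it was created from, which is what allows storing it in long-lived application state.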
Both terms are coming from the `git` implementation itself, even though it won't necessarily point out which commands are plumbing and which are porcelain.
The term plumbing refers to lower-level, more rarely used commands that complement porcelain by being invoked by it or by hand for certain use cases.
The term porcelain refers to those with a decent user experience; they are primarily intended for use by humans.
In any case, both types of programs must self-document their capabilities through the `--help` flag.
From there, we can derive a few rules to adhere to unless there are good reasons not to:
Plumbing
- does not show any progress or logging output by default
- if supported and logging is enabled, it will show timestamps in UTC
- it does not need a git repository, but instead takes all required information via the command-line

Porcelain
- Provides output to stderr by default to provide progress information. There is no need to allow disabling it, but it shouldn't show up unless the operation takes some time.
- If timestamps are shown, they are in localtime.
- Non-progress information goes to stdout.
Here is the hierarchy of programs - each level requires more polish and generally work to be done. Experiments are the quickest ways to obtain some insights. Examples are materialized ideas that others can learn from but that don't quite have the polish (or the potential) to move up to plumbing or porcelain. Plumbing is programs for use in scripts, whereas porcelain is for use by humans.
- Experiments (out of tree due to `git2` builds sometimes failing CI)
  - quick, potentially one-off programs to learn about an aspect of gitoxide potentially in comparison to other implementations like `libgit2`.
  - No need for tests of any kind, but it must compile and be idiomatic Rust and `gitoxide`.
  - Manual command-line parsing is OK
  - no polish
  - make it compile quickly, so no extras
- Examples
  - An implementation of ideas for actual occasional use and the first step towards getting integrated into Porcelain or Plumbing CLIs.
  - Proper command-line parsing with Clap.
  - No tests or progress.
  - High quality Rust code along with idiomatic `gitoxide` usage so people can learn from it.
- Plumbing CLI
  - Use Clap AND Argh for command-line parsing via feature toggles to allow for tiny builds as plumbing is mostly for scripts.
  - Journey tests
  - Progress can be turned on using the `--verbose` flag, quiet by default.
  - Examples can be turned into plumbing by adding journey tests and `argh` command-line parsing, as well as progress.
- Porcelain CLI
  - Use Clap for command-line parsing for the best quality CLI experience - it's for the user.
  - Journey tests.
  - Support for `--quiet` and `--progress`.
  - Verbose by default.
  - Examples can be turned into porcelain by adding journey tests and progress.
Utilities to aid in keeping the project fresh and in sync can be found in the Maintenance section of the `makefile`. Run `make` to get an overview.
- Be sure to clone locally and run tests with `GIX_TEST_IGNORE_ARCHIVES=1` to ensure new fixture scripts (if there are any) are validated on MacOS and Windows. Linux doesn't need to be tested locally that way, as CI on Linux includes it.
Run `make publish-all` to publish all crates in leaf-first order using `cargo release` based on the currently set version.
For this to work, you have to run `cargo release minor|major` each time you break the API of a crate but abort it during package verification.
That way, `cargo release` updates all the dependents for you with the new version, without actually publishing to crates.io.
Generally, we take the git version installed on ubuntu-latest as the one we stay compatible with (while maintaining backwards compatibility). Certain tests only run on CI, designed to validate certain assumptions still hold against possibly changed git program versions.
This also means that CI may fail despite everything being alright locally, and the fix depends on the problem at hand.
Fixtures are created by using a line like this, which produces a header line we drop via `tail -n +2`, followed by the un-prettified object payload trailed by a newline.
echo c56a8e7aa92c86c41a923bc760d2dc39e8a31cf7 | git cat-file --batch | tail -n +2 > fixture
Thus one has to post-process the file by reducing its size by one using `truncate -s -1 fixture`, removing the newline byte.
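
Where `truncate` is not available, the same post-processing step could be done with a few lines of Rust. This is a sketch using only the standard library; `drop_trailing_newline()` is a hypothetical helper which, unlike the shell command, only shrinks the file if it really ends in a newline.

```rust
use std::{fs, fs::OpenOptions, io, path::Path};

/// Remove a single trailing newline byte from the file at `path`.
/// Returns `Ok(true)` if the file was shortened by one byte.
pub fn drop_trailing_newline(path: &Path) -> io::Result<bool> {
    let data = fs::read(path)?;
    if data.last() != Some(&b'\n') {
        return Ok(false);
    }
    let file = OpenOptions::new().write(true).open(path)?;
    file.set_len(data.len() as u64 - 1)?;
    Ok(true)
}
```

Running it twice is harmless unless the payload itself ends in a newline, in which case it must be applied exactly once.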
GIT_TRACE=true \
GIT_TRACE_PACK_ACCESS=true \
GIT_TRACE_PACKET=true \
GIT_TRACE_PACKFILE=true \
GIT_TRACE_PERFORMANCE=true \
GIT_TRACE_SHALLOW=true \
GIT_TRACE_SETUP=true \
GIT_CURL_VERBOSE=true \
GIT_SSH_COMMAND="ssh -vvv" \
git <command>
Consider adding `GIT_TRACE2_PERF=1` (possibly add `GIT_TRACE2_PERF_BRIEF=1` for brevity) as well for statistics and variables (see their source for more).