Performance improvement & cleanup in loose ref iteration#1594
Performance improvement & cleanup in loose ref iteration#1594vdye wants to merge 4 commits intogitgitgadget:masterfrom vdye:vdye/ref-iteration-cleanup
Conversation
|
/submit |
|
Submitted as pull.1594.git.1696615769.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
|
On the Git mailing list, Junio C Hamano wrote (reply to this): "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> While investigating ref iteration performance in builtins like
> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
>
> The first patch tweaks the logic around prefix matching in
> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> actually match a given prefix. The unnecessary iteration doesn't seem to be
> causing any bugs in the ref iteration commands that I've tested, but it
> doesn't hurt to be more precise (and it helps with some other patches I'm
> working on ;) ).
>
> The next three patches update how 'loose_fill_ref_dir' determines the type
> of ref cache entry to create (directory or regular). On platforms that
> include d_type information in 'struct dirent' (as far as I can tell, all
> except NonStop & certain versions of Cygwin), this allows us to skip calling
> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> by about 20%.
Yay. That is a very obvious one, once it is pointed out. Thanks
for noticing and improving. Looking forward to reading the patches
themselves.
|
| @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator) | |||
|
|
|||
There was a problem hiding this comment.
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Victoria Dye <vdye@github.com>
>
> Update 'cache_ref_iterator_advance' to skip over refs that are not matched
> by the given prefix.
>
> Currently, a ref entry is considered "matched" if the entry name is fully
> contained within the prefix:
>
> * prefix: "refs/heads/v1"
> * entry: "refs/heads/v1.0"
>
> OR if the prefix is fully contained in the entry name:
>
> * prefix: "refs/heads/v1.0"
> * entry: "refs/heads/v1"
>
> The first case is always correct, but the second is only correct if the ref
> cache entry is a directory, for example:
>
> * prefix: "refs/heads/example"
> * entry: "refs/heads/"
>
> Modify the logic in 'cache_ref_iterator_advance' to reflect these
> expectations:
>
> 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
> ref cache entry do not overlap at all. Skip this entry.
> 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
> inside this entry if it is a directory. Skip if the entry is not a
> directory, otherwise iterate over it.
> 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
> that the cache entry (directory or not) is fully contained by or equal to
> the prefix. Iterate over this entry.
>
> Note that condition 2 relies on the names of directory entries having the
> appropriate trailing slash. The existing function documentation of
> 'create_dir_entry' explicitly calls out the trailing slash requirement, so
> this is a safe assumption to make.
Thanks for explaining it very well and clearly.
Allowing prefix="refs/heads/v1.0" to yield entry="refs/heads/v1"
(case #2 above that this patch fixes the behaviour for) would cause
ref_iterator_advance() to return a ref outside the hierarhcy,
wouldn't it? So it appears to me that either one of the two would
be true:
* the code is structured in such a way that such a condition does
not actually happen (in which case this patch would be a no-op),
or
* there is a bug in the current code that is fixed by this patch,
whose externally observable behaviour can be verified with a
test.
It is not quite clear to me which is the case here. The code with
the patch looks more logical than the original, but I am not sure
how to demonstrate the existing breakage (if any).
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
> refs/ref-cache.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
> index 2294c4564fb..6e3b725245c 100644
> --- a/refs/ref-cache.c
> +++ b/refs/ref-cache.c
> @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
>
> if (level->prefix_state == PREFIX_WITHIN_DIR) {
> entry_prefix_state = overlaps_prefix(entry->name, iter->prefix);
> - if (entry_prefix_state == PREFIX_EXCLUDES_DIR)
> + if (entry_prefix_state == PREFIX_EXCLUDES_DIR ||
> + (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR)))
> continue;
> } else {
> entry_prefix_state = level->prefix_state;There was a problem hiding this comment.
On the Git mailing list, Patrick Steinhardt wrote (reply to this):
On Fri, Oct 06, 2023 at 02:51:24PM -0700, Junio C Hamano wrote:
> "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Victoria Dye <vdye@github.com>
> >
> > Update 'cache_ref_iterator_advance' to skip over refs that are not matched
> > by the given prefix.
> >
> > Currently, a ref entry is considered "matched" if the entry name is fully
> > contained within the prefix:
> >
> > * prefix: "refs/heads/v1"
> > * entry: "refs/heads/v1.0"
> >
> > OR if the prefix is fully contained in the entry name:
> >
> > * prefix: "refs/heads/v1.0"
> > * entry: "refs/heads/v1"
> >
> > The first case is always correct, but the second is only correct if the ref
> > cache entry is a directory, for example:
> >
> > * prefix: "refs/heads/example"
> > * entry: "refs/heads/"
> >
> > Modify the logic in 'cache_ref_iterator_advance' to reflect these
> > expectations:
> >
> > 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
> > ref cache entry do not overlap at all. Skip this entry.
> > 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
> > inside this entry if it is a directory. Skip if the entry is not a
> > directory, otherwise iterate over it.
> > 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
> > that the cache entry (directory or not) is fully contained by or equal to
> > the prefix. Iterate over this entry.
> >
> > Note that condition 2 relies on the names of directory entries having the
> > appropriate trailing slash. The existing function documentation of
> > 'create_dir_entry' explicitly calls out the trailing slash requirement, so
> > this is a safe assumption to make.
>
> Thanks for explaining it very well and clearly.
>
> Allowing prefix="refs/heads/v1.0" to yield entry="refs/heads/v1"
> (case #2 above that this patch fixes the behaviour for) would cause
> ref_iterator_advance() to return a ref outside the hierarhcy,
> wouldn't it? So it appears to me that either one of the two would
> be true:
>
> * the code is structured in such a way that such a condition does
> not actually happen (in which case this patch would be a no-op),
> or
>
> * there is a bug in the current code that is fixed by this patch,
> whose externally observable behaviour can be verified with a
> test.
>
> It is not quite clear to me which is the case here. The code with
> the patch looks more logical than the original, but I am not sure
> how to demonstrate the existing breakage (if any).
Agreed, I also had a bit of a hard time to figure out whether this is an
actual bug fix, a performance improvement or merely a refactoring.
Patrick
> > Signed-off-by: Victoria Dye <vdye@github.com>
> > ---
> > refs/ref-cache.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/refs/ref-cache.c b/refs/ref-cache.c
> > index 2294c4564fb..6e3b725245c 100644
> > --- a/refs/ref-cache.c
> > +++ b/refs/ref-cache.c
> > @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
> >
> > if (level->prefix_state == PREFIX_WITHIN_DIR) {
> > entry_prefix_state = overlaps_prefix(entry->name, iter->prefix);
> > - if (entry_prefix_state == PREFIX_EXCLUDES_DIR)
> > + if (entry_prefix_state == PREFIX_EXCLUDES_DIR ||
> > + (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR)))
> > continue;
> > } else {
> > entry_prefix_state = level->prefix_state;There was a problem hiding this comment.
On the Git mailing list, Victoria Dye wrote (reply to this):
Patrick Steinhardt wrote:
>> Allowing prefix="refs/heads/v1.0" to yield entry="refs/heads/v1"
>> (case #2 above that this patch fixes the behaviour for) would cause
>> ref_iterator_advance() to return a ref outside the hierarhcy,
>> wouldn't it? So it appears to me that either one of the two would
>> be true:
>>
>> * the code is structured in such a way that such a condition does
>> not actually happen (in which case this patch would be a no-op),
>> or
>>
>> * there is a bug in the current code that is fixed by this patch,
>> whose externally observable behaviour can be verified with a
>> test.
>>
>> It is not quite clear to me which is the case here. The code with
>> the patch looks more logical than the original, but I am not sure
>> how to demonstrate the existing breakage (if any).
>
> Agreed, I also had a bit of a hard time to figure out whether this is an
> actual bug fix, a performance improvement or merely a refactoring.
>
I originally operated on the assumption that it was the first case, which is
why I didn't include a test in this patch. Commands like 'for-each-ref',
'show-ref', etc. either use an empty prefix or a directory prefix with a
trailing slash, which won't trigger this issue. I encountered the problem
while working on a builtin that filtered refs by a user-specified prefix -
the results included refs that should not have been matched, which led me to
this fix.
Scanning through the codebase again, though, I do see a way to replicate the
issue:
$ git update-ref refs/bisect/b HEAD
$ git rev-parse --abbrev-ref --bisect
refs/bisect/b
Because 'rev-parse --bisect' uses the "refs/bisect/bad" prefix (no trailing
slash) and does no additional filtering in its 'for_each_fullref_in'
callback, refs like "refs/bisect/b" and "refs/bisect/ba" are (incorrectly)
matched. I'll re-roll with the added test.
There was a problem hiding this comment.
On the Git mailing list, Junio C Hamano wrote (reply to this):
Victoria Dye <vdye@github.com> writes:
> I originally operated on the assumption that it was the first case, which is
> why I didn't include a test in this patch. Commands like 'for-each-ref',
> 'show-ref', etc. either use an empty prefix or a directory prefix with a
> trailing slash, which won't trigger this issue.
Ah, yes, I didn't mention it but I suspected as such (i.e. the code
is structured in such a way that this broken implementation does not
matter to the current callers).
> I encountered the problem
> while working on a builtin that filtered refs by a user-specified prefix -
> the results included refs that should not have been matched, which led me to
> this fix.
OK, perfectly understandable.
> Scanning through the codebase again, though, I do see a way to replicate the
> issue:
>
> $ git update-ref refs/bisect/b HEAD
> $ git rev-parse --abbrev-ref --bisect
> refs/bisect/b
>
> Because 'rev-parse --bisect' uses the "refs/bisect/bad" prefix (no trailing
> slash) and does no additional filtering in its 'for_each_fullref_in'
> callback, refs like "refs/bisect/b" and "refs/bisect/ba" are (incorrectly)
> matched. I'll re-roll with the added test.
Good find. Thanks!| @@ -71,42 +71,6 @@ static int dir_file_stats(struct object_directory *object_dir, void *data) | |||
| return 0; | |||
There was a problem hiding this comment.
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Victoria Dye <vdye@github.com>
>
> Move 'get_dtype()' from 'diagnose.c' to 'dir.c' and add its declaration to
> 'dir.h' so that it is accessible to callers in other files. The function and
> its documentation are moved verbatim except for a small addition to the
> description clarifying what the 'path' arg represents.
>
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
> diagnose.c | 36 ------------------------------------
> dir.c | 28 ++++++++++++++++++++++++++++
> dir.h | 11 +++++++++++
> 3 files changed, 39 insertions(+), 36 deletions(-)
OK. diagnose.c should still have access to the function as it
includes <dir.h>, and to anybody that includes <dir.h> and sees the
declaration of get_dtype(), DT_FOO should be visible because <dir.h>
includes <statinfo.h> that has fallback definition of DT_FOO.
Looking simple and straight-forward.
> diff --git a/diagnose.c b/diagnose.c
> index 8430064000b..fc4d344bd63 100644
> --- a/diagnose.c
> +++ b/diagnose.c
> @@ -71,42 +71,6 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
> return 0;
> }
>
> -/*
> - * Get the d_type of a dirent. If the d_type is unknown, derive it from
> - * stat.st_mode.
> - *
> - * Note that 'path' is assumed to have a trailing slash. It is also modified
> - * in-place during the execution of the function, but is then reverted to its
> - * original value before returning.
> - */
> -static unsigned char get_dtype(struct dirent *e, struct strbuf *path)
> -{
> - struct stat st;
> - unsigned char dtype = DTYPE(e);
> - size_t base_path_len;
> -
> - if (dtype != DT_UNKNOWN)
> - return dtype;
> -
> - /* d_type unknown in dirent, try to fall back on lstat results */
> - base_path_len = path->len;
> - strbuf_addstr(path, e->d_name);
> - if (lstat(path->buf, &st))
> - goto cleanup;
> -
> - /* determine d_type from st_mode */
> - if (S_ISREG(st.st_mode))
> - dtype = DT_REG;
> - else if (S_ISDIR(st.st_mode))
> - dtype = DT_DIR;
> - else if (S_ISLNK(st.st_mode))
> - dtype = DT_LNK;
> -
> -cleanup:
> - strbuf_setlen(path, base_path_len);
> - return dtype;
> -}
> -
> static int count_files(struct strbuf *path)
> {
> DIR *dir = opendir(path->buf);
> diff --git a/dir.c b/dir.c
> index 8486e4d56ff..5e01af3a25e 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -2235,6 +2235,34 @@ static int get_index_dtype(struct index_state *istate,
> return DT_UNKNOWN;
> }
>
> +unsigned char get_dtype(struct dirent *e, struct strbuf *path)
> +{
> + struct stat st;
> + unsigned char dtype = DTYPE(e);
> + size_t base_path_len;
> +
> + if (dtype != DT_UNKNOWN)
> + return dtype;
> +
> + /* d_type unknown in dirent, try to fall back on lstat results */
> + base_path_len = path->len;
> + strbuf_addstr(path, e->d_name);
> + if (lstat(path->buf, &st))
> + goto cleanup;
> +
> + /* determine d_type from st_mode */
> + if (S_ISREG(st.st_mode))
> + dtype = DT_REG;
> + else if (S_ISDIR(st.st_mode))
> + dtype = DT_DIR;
> + else if (S_ISLNK(st.st_mode))
> + dtype = DT_LNK;
> +
> +cleanup:
> + strbuf_setlen(path, base_path_len);
> + return dtype;
> +}
> +
> static int resolve_dtype(int dtype, struct index_state *istate,
> const char *path, int len)
> {
> diff --git a/dir.h b/dir.h
> index ad06682fd54..28c630ce806 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -363,6 +363,17 @@ struct dir_struct {
>
> struct dirent *readdir_skip_dot_and_dotdot(DIR *dirp);
>
> +/*
> + * Get the d_type of a dirent. If the d_type is unknown, derive it from
> + * stat.st_mode using the path to the dirent's containing directory (path) and
> + * the name of the dirent itself.
> + *
> + * Note that 'path' is assumed to have a trailing slash. It is also modified
> + * in-place during the execution of the function, but is then reverted to its
> + * original value before returning.
> + */
> +unsigned char get_dtype(struct dirent *e, struct strbuf *path);
> +
> /*Count the number of slashes for string s*/
> int count_slashes(const char *s);| @@ -246,10 +246,8 @@ static void loose_fill_ref_dir(struct ref_store *ref_store, | |||
| int dirnamelen = strlen(dirname); | |||
There was a problem hiding this comment.
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Unlike other existing usage of 'get_dtype', the 'follow_symlinks' arg is set
> to 1 to replicate the existing handling of symlink dirents. This
> unfortunately requires calling 'stat' on the associated entry regardless of
> platform, but symlinks in the loose ref store are highly unlikely since
> they'd need to be created manually by a user.
Yeek. I wonder what breaks if we do not do this follow_symlinks()
part, i.e., either just replace stat() with lstat() in the original
without any of these four patches (which would be simple to figure
out what breaks), or omit [3/4] and let get_dtype() yield DT_LNK.
It seems that it comes from a7e66ae3 ([PATCH] Make do_each_ref()
follow symlinks., 2005-08-16), and just like I commented on there in
its log message back then, I still doubt that following a symbolic
link is a great idea here in this codepath.
But optimization without behaviour change is a good way to ensure
that optimization does not introduce new bugs, and because keeping
the historical behaviour like the patches [3/4] and this patch does
is more work (meaning: if it proves unnecessary to dereference
symbolic links, we can remove code instead of having to write new
code to support the new behaviour), let's take the series as-is, and
defer it to future developers to further clean-up the semantics.
> Note that this patch also changes the condition for skipping creation of a
> ref entry from "when 'stat' fails" to "when the d_type is anything other
> than DT_REG or DT_DIR". If a dirent's d_type is DT_UNKNOWN (either because
> the platform doesn't support d_type in dirents or some other reason) or
> DT_LNK, 'get_dtype' will try to derive the underlying type with 'stat'. If
> the 'stat' fails, the d_type will remain 'DT_UNKNOWN' and dirent will be
> skipped. However, it will also be skipped if it is any other valid d_type
> (e.g. DT_FIFO for named pipes, DT_LNK for a nested symlink). Git does not
> handle these properly anyway, so we can safely constrain accepted types to
> directories and regular files.
Sounds good.
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
> refs/files-backend.c | 14 +++++---------
> 1 file changed, 5 insertions(+), 9 deletions(-)
Thanks.
|
This branch is now known as |
|
This patch series was integrated into seen via git@39859ad. |
|
There was a status update in the "New Topics" section about the branch The code to iterate over loose references have been optimized to reduce the number of lstat() system calls. source: <pull.1594.git.1696615769.gitgitgadget@gmail.com> |
|
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
> While investigating ref iteration performance in builtins like
> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
>
> The first patch tweaks the logic around prefix matching in
> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> actually match a given prefix. The unnecessary iteration doesn't seem to be
> causing any bugs in the ref iteration commands that I've tested, but it
> doesn't hurt to be more precise (and it helps with some other patches I'm
> working on ;) ).
>
> The next three patches update how 'loose_fill_ref_dir' determines the type
> of ref cache entry to create (directory or regular). On platforms that
> include d_type information in 'struct dirent' (as far as I can tell, all
> except NonStop & certain versions of Cygwin), this allows us to skip calling
> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> by about 20%.
I've done a small set of benchmarks with my usual test repositories,
which is linux.git with a bunch of references added. The repository
comes in four sizes:
- small: 50k references
- medium: 500k references
- high: 1.1m references
- huge: 12m references
Unfortunately, I couldn't really reproduce the performance improvements.
In fact, the new version runs consistently a tiny bit slower than the
old version:
# Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).
Benchmark 1: git for-each-ref (revision=old,refcount=small)
Time (mean ± σ): 135.5 ms ± 1.2 ms [User: 76.4 ms, System: 59.0 ms]
Range (min … max): 134.8 ms … 136.9 ms 3 runs
Benchmark 2: git for-each-ref (revision=old,refcount=medium)
Time (mean ± σ): 822.7 ms ± 2.2 ms [User: 697.4 ms, System: 125.1 ms]
Range (min … max): 821.1 ms … 825.2 ms 3 runs
Benchmark 3: git for-each-ref (revision=old,refcount=high)
Time (mean ± σ): 1.960 s ± 0.015 s [User: 1.702 s, System: 0.257 s]
Range (min … max): 1.944 s … 1.973 s 3 runs
# New version, which is your tip.
Benchmark 4: git for-each-ref (revision=old,refcount=huge)
Time (mean ± σ): 16.815 s ± 0.054 s [User: 15.091 s, System: 1.722 s]
Range (min … max): 16.760 s … 16.869 s 3 runs
Benchmark 5: git for-each-ref (revision=new,refcount=small)
Time (mean ± σ): 136.0 ms ± 0.2 ms [User: 78.8 ms, System: 57.1 ms]
Range (min … max): 135.8 ms … 136.2 ms 3 runs
Benchmark 6: git for-each-ref (revision=new,refcount=medium)
Time (mean ± σ): 830.4 ms ± 21.2 ms [User: 691.3 ms, System: 138.7 ms]
Range (min … max): 814.2 ms … 854.5 ms 3 runs
Benchmark 7: git for-each-ref (revision=new,refcount=high)
Time (mean ± σ): 1.966 s ± 0.013 s [User: 1.717 s, System: 0.249 s]
Range (min … max): 1.952 s … 1.978 s 3 runs
Benchmark 8: git for-each-ref (revision=new,refcount=huge)
Time (mean ± σ): 16.945 s ± 0.037 s [User: 15.182 s, System: 1.760 s]
Range (min … max): 16.910 s … 16.983 s 3 runs
Summary
git for-each-ref (revision=old,refcount=small) ran
1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)
The performance regression isn't all that concerning, but it makes me
wonder why I see things becoming slower rather than faster. My guess is
that this is because all my test repositories are well-packed and don't
have a lot of loose references. But I just wanted to confirm how you
benchmarked your change and what the underlying shape of your test repo
was.
Patrick
> Thanks!
>
> * Victoria
>
> Victoria Dye (4):
> ref-cache.c: fix prefix matching in ref iteration
> dir.[ch]: expose 'get_dtype'
> dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
> files-backend.c: avoid stat in 'loose_fill_ref_dir'
>
> diagnose.c | 42 +++---------------------------------------
> dir.c | 33 +++++++++++++++++++++++++++++++++
> dir.h | 16 ++++++++++++++++
> refs/files-backend.c | 14 +++++---------
> refs/ref-cache.c | 3 ++-
> 5 files changed, 59 insertions(+), 49 deletions(-)
>
>
> base-commit: 3a06386e314565108ad56a9bdb8f7b80ac52fb69
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1594%2Fvdye%2Fvdye%2Fref-iteration-cleanup-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1594/vdye/vdye/ref-iteration-cleanup-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1594
> --
> gitgitgadget |
|
User |
Update 'cache_ref_iterator_advance' to skip over refs that are not matched by the given prefix. Currently, a ref entry is considered "matched" if the entry name is fully contained within the prefix: * prefix: "refs/heads/v1" * entry: "refs/heads/v1.0" OR if the prefix is fully contained in the entry name: * prefix: "refs/heads/v1.0" * entry: "refs/heads/v1" The first case is always correct, but the second is only correct if the ref cache entry is a directory, for example: * prefix: "refs/heads/example" * entry: "refs/heads/" Modify the logic in 'cache_ref_iterator_advance' to reflect these expectations: 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and ref cache entry do not overlap at all. Skip this entry. 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches inside this entry if it is a directory. Skip if the entry is not a directory, otherwise iterate over it. 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating that the cache entry (directory or not) is fully contained by or equal to the prefix. Iterate over this entry. Note that condition 2 relies on the names of directory entries having the appropriate trailing slash. The existing function documentation of 'create_dir_entry' explicitly calls out the trailing slash requirement, so this is a safe assumption to make. This bug generally doesn't have any user-facing impact, since it requires: 1. using a non-empty prefix without a trailing slash in an iteration like 'for_each_fullref_in', 2. the callback to said iteration not reapplying the original filter (as for-each-ref does) to ensure unmatched refs are skipped, and 3. the repository having one or more refs that match part of, but not all of, the prefix. However, there are some niche scenarios that meet those criteria (specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add tests covering those cases to demonstrate the fix in this patch. Signed-off-by: Victoria Dye <vdye@github.com>
Move 'get_dtype()' from 'diagnose.c' to 'dir.c' and add its declaration to 'dir.h' so that it is accessible to callers in other files. The function and its documentation are moved verbatim except for a small addition to the description clarifying what the 'path' arg represents. Signed-off-by: Victoria Dye <vdye@github.com>
Add a 'follow_symlink' boolean option to 'get_type()'. If 'follow_symlink' is enabled, DT_LNK (in addition to DT_UNKNOWN) d_types triggers the stat-based d_type resolution, using 'stat' instead of 'lstat' to get the type of the followed symlink. Note that symlinks are not followed recursively, so a symlink pointing to another symlink will still resolve to DT_LNK. Update callers in 'diagnose.c' to specify 'follow_symlink = 0' to preserve current behavior. Signed-off-by: Victoria Dye <vdye@github.com>
Modify the 'readdir' loop in 'loose_fill_ref_dir' to, rather than 'stat' a file to determine whether it is a directory or not, use 'get_dtype'. Currently, the loop uses 'stat' to determine whether each dirent is a directory itself or not in order to construct the appropriate ref cache entry. If 'stat' fails (returning a negative value), the dirent is silently skipped; otherwise, 'S_ISDIR(st.st_mode)' is used to check whether the entry is a directory. On platforms that include an entry's d_type in in the 'dirent' struct, this extra 'stat' check is redundant. We can use the 'get_dtype' method to extract this information on platforms that support it (i.e. where NO_D_TYPE_IN_DIRENT is unset), and derive it with 'stat' on platforms that don't. Because 'stat' is an expensive call, this confers a modest-but-noticeable performance improvement when iterating over large numbers of refs (approximately 20% speedup in 'git for-each-ref' in a 30k ref repo). Unlike other existing usage of 'get_dtype', the 'follow_symlinks' arg is set to 1 to replicate the existing handling of symlink dirents. This unfortunately requires calling 'stat' on the associated entry regardless of platform, but symlinks in the loose ref store are highly unlikely since they'd need to be created manually by a user. Note that this patch also changes the condition for skipping creation of a ref entry from "when 'stat' fails" to "when the d_type is anything other than DT_REG or DT_DIR". If a dirent's d_type is DT_UNKNOWN (either because the platform doesn't support d_type in dirents or some other reason) or DT_LNK, 'get_dtype' will try to derive the underlying type with 'stat'. If the 'stat' fails, the d_type will remain 'DT_UNKNOWN' and dirent will be skipped. However, it will also be skipped if it is any other valid d_type (e.g. DT_FIFO for named pipes, DT_LNK for a nested symlink). Git does not handle these properly anyway, so we can safely constrain accepted types to directories and regular files. Signed-off-by: Victoria Dye <vdye@github.com>
|
On the Git mailing list, Victoria Dye wrote (reply to this): Patrick Steinhardt wrote:
> On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
>> While investigating ref iteration performance in builtins like
>> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
>>
>> The first patch tweaks the logic around prefix matching in
>> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
>> actually match a given prefix. The unnecessary iteration doesn't seem to be
>> causing any bugs in the ref iteration commands that I've tested, but it
>> doesn't hurt to be more precise (and it helps with some other patches I'm
>> working on ;) ).
>>
>> The next three patches update how 'loose_fill_ref_dir' determines the type
>> of ref cache entry to create (directory or regular). On platforms that
>> include d_type information in 'struct dirent' (as far as I can tell, all
>> except NonStop & certain versions of Cygwin), this allows us to skip calling
>> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
>> by about 20%.
>
> I've done a small set of benchmarks with my usual test repositories,
> which is linux.git with a bunch of references added. The repository
> comes in four sizes:
>
> - small: 50k references
> - medium: 500k references
> - high: 1.1m references
> - huge: 12m references
>
> Unfortunately, I couldn't really reproduce the performance improvements.
> In fact, the new version runs consistently a tiny bit slower than the
> old version:
>
> # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).
>
> Benchmark 1: git for-each-ref (revision=old,refcount=small)
> Time (mean ± σ): 135.5 ms ± 1.2 ms [User: 76.4 ms, System: 59.0 ms]
> Range (min … max): 134.8 ms … 136.9 ms 3 runs
>
> Benchmark 2: git for-each-ref (revision=old,refcount=medium)
> Time (mean ± σ): 822.7 ms ± 2.2 ms [User: 697.4 ms, System: 125.1 ms]
> Range (min … max): 821.1 ms … 825.2 ms 3 runs
>
> Benchmark 3: git for-each-ref (revision=old,refcount=high)
> Time (mean ± σ): 1.960 s ± 0.015 s [User: 1.702 s, System: 0.257 s]
> Range (min … max): 1.944 s … 1.973 s 3 runs
>
> # New version, which is your tip.
>
> Benchmark 4: git for-each-ref (revision=old,refcount=huge)
> Time (mean ± σ): 16.815 s ± 0.054 s [User: 15.091 s, System: 1.722 s]
> Range (min … max): 16.760 s … 16.869 s 3 runs
>
> Benchmark 5: git for-each-ref (revision=new,refcount=small)
> Time (mean ± σ): 136.0 ms ± 0.2 ms [User: 78.8 ms, System: 57.1 ms]
> Range (min … max): 135.8 ms … 136.2 ms 3 runs
>
> Benchmark 6: git for-each-ref (revision=new,refcount=medium)
> Time (mean ± σ): 830.4 ms ± 21.2 ms [User: 691.3 ms, System: 138.7 ms]
> Range (min … max): 814.2 ms … 854.5 ms 3 runs
>
> Benchmark 7: git for-each-ref (revision=new,refcount=high)
> Time (mean ± σ): 1.966 s ± 0.013 s [User: 1.717 s, System: 0.249 s]
> Range (min … max): 1.952 s … 1.978 s 3 runs
>
> Benchmark 8: git for-each-ref (revision=new,refcount=huge)
> Time (mean ± σ): 16.945 s ± 0.037 s [User: 15.182 s, System: 1.760 s]
> Range (min … max): 16.910 s … 16.983 s 3 runs
>
> Summary
> git for-each-ref (revision=old,refcount=small) ran
> 1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
> 6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
> 6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
> 14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
> 14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
> 124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
> 125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)
>
> The performance regression isn't all that concerning, but it makes me
> wonder why I see things becoming slower rather than faster. My guess is
> that this is because all my test repositories are well-packed and don't
> have a lot of loose references. But I just wanted to confirm how you
> benchmarked your change and what the underlying shape of your test repo
> was.
I ran my benchmark on my (Intel) Mac with a test repository (single commit,
one file) containing:
- 10k refs/heads/ references
- 10k refs/tags/ references
- 10k refs/special/ references
All refs in the repository are loose. My Mac has historically been somewhat
slow and inconsistent when it comes to perf testing, though, so I re-ran the
benchmark a bit more formally on an Ubuntu VM (3 warmup iterations followed
by at least 10 iterations per test):
---
Benchmark 1: git for-each-ref (revision=old,refcount=3k)
Time (mean ± σ): 40.6 ms ± 3.9 ms [User: 13.2 ms, System: 27.1 ms]
Range (min … max): 37.2 ms … 59.1 ms 76 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 2: git for-each-ref (revision=new,refcount=3k)
Time (mean ± σ): 38.7 ms ± 4.4 ms [User: 13.8 ms, System: 24.5 ms]
Range (min … max): 35.1 ms … 57.2 ms 71 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 3: git for-each-ref (revision=old,refcount=30k)
Time (mean ± σ): 419.4 ms ± 43.9 ms [User: 136.4 ms, System: 274.1 ms]
Range (min … max): 385.1 ms … 528.7 ms 10 runs
Benchmark 4: git for-each-ref (revision=new,refcount=30k)
Time (mean ± σ): 390.4 ms ± 27.2 ms [User: 133.1 ms, System: 251.6 ms]
Range (min … max): 360.3 ms … 447.6 ms 10 runs
Benchmark 5: git for-each-ref (revision=old,refcount=300k)
Time (mean ± σ): 4.171 s ± 0.052 s [User: 1.400 s, System: 2.715 s]
Range (min … max): 4.118 s … 4.283 s 10 runs
Benchmark 6: git for-each-ref (revision=new,refcount=300k)
Time (mean ± σ): 3.939 s ± 0.054 s [User: 1.403 s, System: 2.466 s]
Range (min … max): 3.858 s … 4.026 s 10 runs
Summary
'git for-each-ref (revision=new,refcount=3k)' ran
1.05 ± 0.16 times faster than 'git for-each-ref (revision=old,refcount=3k)'
10.08 ± 1.34 times faster than 'git for-each-ref (revision=new,refcount=30k)'
10.83 ± 1.67 times faster than 'git for-each-ref (revision=old,refcount=30k)'
101.68 ± 11.63 times faster than 'git for-each-ref (revision=new,refcount=300k)'
107.67 ± 12.30 times faster than 'git for-each-ref (revision=old,refcount=300k)'
---
So it's not the 20% speedup I saw on my local test repo (it's more like
5-8%), but there does appear to be a consistent improvement. As for your
results, the changes in this series shouldn't affect packed ref operations,
and the difference between old & new doesn't seem to indicate a regression.
|
|
/submit |
|
Submitted as pull.1594.v2.git.1696888736.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
|
This patch series was integrated into seen via git@a91bb38. |
|
This patch series was integrated into seen via git@5c93f1f. |
| @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator) | |||
|
|
|||
There was a problem hiding this comment.
On the Git mailing list, Patrick Steinhardt wrote (reply to this):
On Mon, Oct 09, 2023 at 09:58:53PM +0000, Victoria Dye via GitGitGadget wrote:
> From: Victoria Dye <vdye@github.com>
>
> Update 'cache_ref_iterator_advance' to skip over refs that are not matched
> by the given prefix.
>
> Currently, a ref entry is considered "matched" if the entry name is fully
> contained within the prefix:
>
> * prefix: "refs/heads/v1"
> * entry: "refs/heads/v1.0"
>
> OR if the prefix is fully contained in the entry name:
>
> * prefix: "refs/heads/v1.0"
> * entry: "refs/heads/v1"
>
> The first case is always correct, but the second is only correct if the ref
> cache entry is a directory, for example:
>
> * prefix: "refs/heads/example"
> * entry: "refs/heads/"
>
> Modify the logic in 'cache_ref_iterator_advance' to reflect these
> expectations:
>
> 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
> ref cache entry do not overlap at all. Skip this entry.
> 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
> inside this entry if it is a directory. Skip if the entry is not a
> directory, otherwise iterate over it.
> 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
> that the cache entry (directory or not) is fully contained by or equal to
> the prefix. Iterate over this entry.
>
> Note that condition 2 relies on the names of directory entries having the
> appropriate trailing slash. The existing function documentation of
> 'create_dir_entry' explicitly calls out the trailing slash requirement, so
> this is a safe assumption to make.
>
> This bug generally doesn't have any user-facing impact, since it requires:
>
> 1. using a non-empty prefix without a trailing slash in an iteration like
> 'for_each_fullref_in',
> 2. the callback to said iteration not reapplying the original filter (as
> for-each-ref does) to ensure unmatched refs are skipped, and
> 3. the repository having one or more refs that match part of, but not all
> of, the prefix.
>
> However, there are some niche scenarios that meet those criteria
> (specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add
> tests covering those cases to demonstrate the fix in this patch.
>
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
> refs/ref-cache.c | 3 ++-
> t/t1500-rev-parse.sh | 23 +++++++++++++++++++++++
> t/t4205-log-pretty-formats.sh | 30 ++++++++++++++++++++++++++++++
> 3 files changed, 55 insertions(+), 1 deletion(-)
>
> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
> index 2294c4564fb..6e3b725245c 100644
> --- a/refs/ref-cache.c
> +++ b/refs/ref-cache.c
> @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
>
> if (level->prefix_state == PREFIX_WITHIN_DIR) {
> entry_prefix_state = overlaps_prefix(entry->name, iter->prefix);
> - if (entry_prefix_state == PREFIX_EXCLUDES_DIR)
> + if (entry_prefix_state == PREFIX_EXCLUDES_DIR ||
> + (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR)))
> continue;
> } else {
> entry_prefix_state = level->prefix_state;
> diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
> index 37ee5091b5c..3f9e7f62e45 100755
> --- a/t/t1500-rev-parse.sh
> +++ b/t/t1500-rev-parse.sh
> @@ -264,4 +264,27 @@ test_expect_success 'rev-parse --since= unsqueezed ordering' '
> test_cmp expect actual
> '
>
> +test_expect_success 'rev-parse --bisect includes bad, excludes good' '
> + test_commit_bulk 6 &&
> +
> + git update-ref refs/bisect/bad-1 HEAD~1 &&
> + git update-ref refs/bisect/b HEAD~2 &&
> + git update-ref refs/bisect/bad-3 HEAD~3 &&
> + git update-ref refs/bisect/good-3 HEAD~3 &&
> + git update-ref refs/bisect/bad-4 HEAD~4 &&
> + git update-ref refs/bisect/go HEAD~4 &&
> +
> + # Note: refs/bisect/b and refs/bisect/go should be ignored because they
> + # do not match the refs/bisect/bad or refs/bisect/good prefixes.
> + cat >expect <<-EOF &&
> + refs/bisect/bad-1
> + refs/bisect/bad-3
> + refs/bisect/bad-4
> + ^refs/bisect/good-3
> + EOF
> +
> + git rev-parse --symbolic-full-name --bisect >actual &&
> + test_cmp expect actual
> +'
> +
> test_done
> diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
> index 16626e4fe96..62c7bfed5d7 100755
> --- a/t/t4205-log-pretty-formats.sh
> +++ b/t/t4205-log-pretty-formats.sh
> @@ -956,6 +956,36 @@ test_expect_success '%S in git log --format works with other placeholders (part
> test_cmp expect actual
> '
>
> +test_expect_success 'setup more commits for %S with --bisect' '
> + test_commit four &&
> + test_commit five &&
> +
> + head1=$(git rev-parse --verify HEAD~0) &&
> + head2=$(git rev-parse --verify HEAD~1) &&
> + head3=$(git rev-parse --verify HEAD~2) &&
> + head4=$(git rev-parse --verify HEAD~3)
> +'
> +
> +test_expect_success '%S with --bisect labels commits with refs/bisect/bad ref' '
> + git update-ref refs/bisect/bad-$head1 $head1 &&
> + git update-ref refs/bisect/go $head1 &&
> + git update-ref refs/bisect/bad-$head2 $head2 &&
> + git update-ref refs/bisect/b $head3 &&
> + git update-ref refs/bisect/bad-$head4 $head4 &&
> + git update-ref refs/bisect/good-$head4 $head4 &&
> +
> + # We expect to see the range of commits betwee refs/bisect/good-$head4
Nit: s/betwee/between. Probably not worth rerolling this series only
because of this typo though.
Patrick
> + # and refs/bisect/bad-$head1. The "source" ref is the nearest bisect ref
> + # from which the commit is reachable.
> + cat >expect <<-EOF &&
> + $head1 refs/bisect/bad-$head1
> + $head2 refs/bisect/bad-$head2
> + $head3 refs/bisect/bad-$head2
> + EOF
> + git log --bisect --format="%H %S" >actual &&
> + test_cmp expect actual
> +'
> +
> test_expect_success 'log --pretty=reference' '
> git log --pretty="tformat:%h (%s, %as)" >expect &&
> git log --pretty=reference >actual &&
> --
> gitgitgadget
> |
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Mon, Oct 09, 2023 at 02:49:14PM -0700, Victoria Dye wrote:
> Patrick Steinhardt wrote:
> > On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
> >> While investigating ref iteration performance in builtins like
> >> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
> >>
> >> The first patch tweaks the logic around prefix matching in
> >> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> >> actually match a given prefix. The unnecessary iteration doesn't seem to be
> >> causing any bugs in the ref iteration commands that I've tested, but it
> >> doesn't hurt to be more precise (and it helps with some other patches I'm
> >> working on ;) ).
> >>
> >> The next three patches update how 'loose_fill_ref_dir' determines the type
> >> of ref cache entry to create (directory or regular). On platforms that
> >> include d_type information in 'struct dirent' (as far as I can tell, all
> >> except NonStop & certain versions of Cygwin), this allows us to skip calling
> >> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> >> by about 20%.
> >
> > I've done a small set of benchmarks with my usual test repositories,
> > which is linux.git with a bunch of references added. The repository
> > comes in four sizes:
> >
> > - small: 50k references
> > - medium: 500k references
> > - high: 1.1m references
> > - huge: 12m references
> >
> > Unfortunately, I couldn't really reproduce the performance improvements.
> > In fact, the new version runs consistently a tiny bit slower than the
> > old version:
> >
> > # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).
> >
> > Benchmark 1: git for-each-ref (revision=old,refcount=small)
> > Time (mean ± σ): 135.5 ms ± 1.2 ms [User: 76.4 ms, System: 59.0 ms]
> > Range (min … max): 134.8 ms … 136.9 ms 3 runs
> >
> > Benchmark 2: git for-each-ref (revision=old,refcount=medium)
> > Time (mean ± σ): 822.7 ms ± 2.2 ms [User: 697.4 ms, System: 125.1 ms]
> > Range (min … max): 821.1 ms … 825.2 ms 3 runs
> >
> > Benchmark 3: git for-each-ref (revision=old,refcount=high)
> > Time (mean ± σ): 1.960 s ± 0.015 s [User: 1.702 s, System: 0.257 s]
> > Range (min … max): 1.944 s … 1.973 s 3 runs
> >
> > # New version, which is your tip.
> >
> > Benchmark 4: git for-each-ref (revision=old,refcount=huge)
> > Time (mean ± σ): 16.815 s ± 0.054 s [User: 15.091 s, System: 1.722 s]
> > Range (min … max): 16.760 s … 16.869 s 3 runs
> >
> > Benchmark 5: git for-each-ref (revision=new,refcount=small)
> > Time (mean ± σ): 136.0 ms ± 0.2 ms [User: 78.8 ms, System: 57.1 ms]
> > Range (min … max): 135.8 ms … 136.2 ms 3 runs
> >
> > Benchmark 6: git for-each-ref (revision=new,refcount=medium)
> > Time (mean ± σ): 830.4 ms ± 21.2 ms [User: 691.3 ms, System: 138.7 ms]
> > Range (min … max): 814.2 ms … 854.5 ms 3 runs
> >
> > Benchmark 7: git for-each-ref (revision=new,refcount=high)
> > Time (mean ± σ): 1.966 s ± 0.013 s [User: 1.717 s, System: 0.249 s]
> > Range (min … max): 1.952 s … 1.978 s 3 runs
> >
> > Benchmark 8: git for-each-ref (revision=new,refcount=huge)
> > Time (mean ± σ): 16.945 s ± 0.037 s [User: 15.182 s, System: 1.760 s]
> > Range (min … max): 16.910 s … 16.983 s 3 runs
> >
> > Summary
> > git for-each-ref (revision=old,refcount=small) ran
> > 1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
> > 6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
> > 6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
> > 14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
> > 14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
> > 124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
> > 125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)
> >
> > The performance regression isn't all that concerning, but it makes me
> > wonder why I see things becoming slower rather than faster. My guess is
> > that this is because all my test repositories are well-packed and don't
> > have a lot of loose references. But I just wanted to confirm how you
> > benchmarked your change and what the underlying shape of your test repo
> > was.
>
> I ran my benchmark on my (Intel) Mac with a test repository (single commit,
> one file) containing:
>
> - 10k refs/heads/ references
> - 10k refs/tags/ references
> - 10k refs/special/ references
>
> All refs in the repository are loose. My Mac has historically been somewhat
> slow and inconsistent when it comes to perf testing, though, so I re-ran the
> benchmark a bit more formally on an Ubuntu VM (3 warmup iterations followed
> by at least 10 iterations per test):
>
> ---
>
> Benchmark 1: git for-each-ref (revision=old,refcount=3k)
> Time (mean ± σ): 40.6 ms ± 3.9 ms [User: 13.2 ms, System: 27.1 ms]
> Range (min … max): 37.2 ms … 59.1 ms 76 runs
>
> Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
>
> Benchmark 2: git for-each-ref (revision=new,refcount=3k)
> Time (mean ± σ): 38.7 ms ± 4.4 ms [User: 13.8 ms, System: 24.5 ms]
> Range (min … max): 35.1 ms … 57.2 ms 71 runs
>
> Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
>
> Benchmark 3: git for-each-ref (revision=old,refcount=30k)
> Time (mean ± σ): 419.4 ms ± 43.9 ms [User: 136.4 ms, System: 274.1 ms]
> Range (min … max): 385.1 ms … 528.7 ms 10 runs
>
> Benchmark 4: git for-each-ref (revision=new,refcount=30k)
> Time (mean ± σ): 390.4 ms ± 27.2 ms [User: 133.1 ms, System: 251.6 ms]
> Range (min … max): 360.3 ms … 447.6 ms 10 runs
>
> Benchmark 5: git for-each-ref (revision=old,refcount=300k)
> Time (mean ± σ): 4.171 s ± 0.052 s [User: 1.400 s, System: 2.715 s]
> Range (min … max): 4.118 s … 4.283 s 10 runs
>
> Benchmark 6: git for-each-ref (revision=new,refcount=300k)
> Time (mean ± σ): 3.939 s ± 0.054 s [User: 1.403 s, System: 2.466 s]
> Range (min … max): 3.858 s … 4.026 s 10 runs
>
> Summary
> 'git for-each-ref (revision=new,refcount=3k)' ran
> 1.05 ± 0.16 times faster than 'git for-each-ref (revision=old,refcount=3k)'
> 10.08 ± 1.34 times faster than 'git for-each-ref (revision=new,refcount=30k)'
> 10.83 ± 1.67 times faster than 'git for-each-ref (revision=old,refcount=30k)'
> 101.68 ± 11.63 times faster than 'git for-each-ref (revision=new,refcount=300k)'
> 107.67 ± 12.30 times faster than 'git for-each-ref (revision=old,refcount=300k)'
>
> ---
>
> So it's not the 20% speedup I saw on my local test repo (it's more like
> 5-8%), but there does appear to be a consistent improvement.
Thanks a bunch for re-doing the benchmark with a documented setup.
> As for your results, the changes in this series shouldn't affect
> packed ref operations, and the difference between old & new doesn't
> seem to indicate a regression.
Yeah, I've also been surprised to see the performance regression for
packed-refs files. The regression is persistent and reproducable on my
machine though, so even though I tend to agree that the patches
shouldn't negatively impact packed-refs performance they somehow do. It
could just as well be something like different optimization choices by
the compler due to the added patches, or hitting different cache lines.
I dunno.
Anyway, I agree with your assessment. The regression I see is less than
1% for packed-refs, while the improvements for loose refs are a lot more
significant and conceptually make a lot of sense. So I didn't intend to
say that we shouldn't do these optimizations because of the miniscule
peformance regression with packed-refs.
Or in other words: this series looks good to me.
Thanks!
Patrick |
|
This patch series was integrated into seen via git@62a5770. |
|
This patch series was integrated into seen via git@29a5bc1. |
|
There was a status update in the "Cooking" section about the branch The code to iterate over loose references have been optimized to reduce the number of lstat() system calls. Will merge to 'next'? source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@095f2de. |
|
This patch series was integrated into seen via git@d9924f9. |
|
This patch series was integrated into next via git@99e2f83. |
|
There was a status update in the "Cooking" section about the branch The code to iterate over loose references have been optimized to reduce the number of lstat() system calls. Will merge to 'master'. source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@5bbb845. |
|
There was a status update in the "Cooking" section about the branch The code to iterate over loose references have been optimized to reduce the number of lstat() system calls. Will merge to 'master'. source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@d6258bf. |
|
This patch series was integrated into seen via git@6f0940a. |
|
This patch series was integrated into seen via git@c064020. |
|
There was a status update in the "Cooking" section about the branch The code to iterate over loose references have been optimized to reduce the number of lstat() system calls. Will merge to 'master'. source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@8622bdf. |
|
This patch series was integrated into seen via git@6b1e225. |
|
This patch series was integrated into master via git@6b1e225. |
|
This patch series was integrated into next via git@6b1e225. |
|
Closed via 6b1e225. |
While investigating ref iteration performance in builtins like 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
The first patch tweaks the logic around prefix matching in 'cache_ref_iterator_advance' so that we correctly skip refs that do not actually match a given prefix. The unnecessary iteration doesn't seem to be causing any bugs in the ref iteration commands that I've tested, but it doesn't hurt to be more precise (and it helps with some other patches I'm working on ;) ).
The next three patches update how 'loose_fill_ref_dir' determines the type of ref cache entry to create (directory or regular). On platforms that include d_type information in 'struct dirent' (as far as I can tell, all except NonStop & certain versions of Cygwin), this allows us to skip calling 'stat'. Benchmarking against repos with various quantities of loose refs indicates a 5-8% speedup from these changes [1].
Changes since V1
Thanks!
[1] https://lore.kernel.org/git/28ae03f5-7091-d3f3-8a70-56aba6639640@github.com/
cc: Patrick Steinhardt ps@pks.im