Performance improvement & cleanup in loose ref iteration by vdye · Pull Request #1594 · gitgitgadget/git

vdye · 2023-10-06T17:33:04Z

While investigating ref iteration performance in builtins like 'for-each-ref' and 'show-ref', I found two small improvement opportunities.

The first patch tweaks the logic around prefix matching in 'cache_ref_iterator_advance' so that we correctly skip refs that do not actually match a given prefix. The unnecessary iteration doesn't seem to be causing any bugs in the ref iteration commands that I've tested, but it doesn't hurt to be more precise (and it helps with some other patches I'm working on ;) ).

The next three patches update how 'loose_fill_ref_dir' determines the type of ref cache entry to create (directory or regular). On platforms that include d_type information in 'struct dirent' (as far as I can tell, all except NonStop & certain versions of Cygwin), this allows us to skip calling 'stat'. Benchmarking against repos with various quantities of loose refs indicates a 5-8% speedup from these changes [1].

Changes since V1

Added tests in patch 1 to demonstrate the bugfix

Thanks!

Victoria

[1] https://lore.kernel.org/git/28ae03f5-7091-d3f3-8a70-56aba6639640@github.com/

cc: Patrick Steinhardt ps@pks.im

vdye · 2023-10-06T18:08:25Z

/submit

gitgitgadget · 2023-10-06T18:09:41Z

Submitted as pull.1594.git.1696615769.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1594/vdye/vdye/ref-iteration-cleanup-v1

To fetch this version to local tag pr-1594/vdye/vdye/ref-iteration-cleanup-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1594/vdye/vdye/ref-iteration-cleanup-v1

gitgitgadget · 2023-10-06T19:12:16Z

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> While investigating ref iteration performance in builtins like
> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
>
> The first patch tweaks the logic around prefix matching in
> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> actually match a given prefix. The unnecessary iteration doesn't seem to be
> causing any bugs in the ref iteration commands that I've tested, but it
> doesn't hurt to be more precise (and it helps with some other patches I'm
> working on ;) ).
>
> The next three patches update how 'loose_fill_ref_dir' determines the type
> of ref cache entry to create (directory or regular). On platforms that
> include d_type information in 'struct dirent' (as far as I can tell, all
> except NonStop & certain versions of Cygwin), this allows us to skip calling
> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> by about 20%.

Yay.  That is a very obvious one, once it is pointed out.  Thanks
for noticing and improving.  Looking forward to reading the patches
themselves.

gitgitgadget · 2023-10-06T21:54:40Z

refs/ref-cache.c

@@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)



On the Git mailing list, Junio C Hamano wrote (reply to this):

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Victoria Dye <vdye@github.com> > > Update 'cache_ref_iterator_advance' to skip over refs that are not matched > by the given prefix. > > Currently, a ref entry is considered "matched" if the entry name is fully > contained within the prefix: > > * prefix: "refs/heads/v1" > * entry: "refs/heads/v1.0" > > OR if the prefix is fully contained in the entry name: > > * prefix: "refs/heads/v1.0" > * entry: "refs/heads/v1" > > The first case is always correct, but the second is only correct if the ref > cache entry is a directory, for example: > > * prefix: "refs/heads/example" > * entry: "refs/heads/" > > Modify the logic in 'cache_ref_iterator_advance' to reflect these > expectations: > > 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and > ref cache entry do not overlap at all. Skip this entry. > 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches > inside this entry if it is a directory. Skip if the entry is not a > directory, otherwise iterate over it. > 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating > that the cache entry (directory or not) is fully contained by or equal to > the prefix. Iterate over this entry. > > Note that condition 2 relies on the names of directory entries having the > appropriate trailing slash. The existing function documentation of > 'create_dir_entry' explicitly calls out the trailing slash requirement, so > this is a safe assumption to make. Thanks for explaining it very well and clearly. Allowing prefix="refs/heads/v1.0" to yield entry="refs/heads/v1" (case #2 above that this patch fixes the behaviour for) would cause ref_iterator_advance() to return a ref outside the hierarhcy, wouldn't it? So it appears to me that either one of the two would be true: * the code is structured in such a way that such a condition does not actually happen (in which case this patch would be a no-op), or * there is a bug in the current code that is fixed by this patch, whose externally observable behaviour can be verified with a test. It is not quite clear to me which is the case here. The code with the patch looks more logical than the original, but I am not sure how to demonstrate the existing breakage (if any). > Signed-off-by: Victoria Dye <vdye@github.com> > --- > refs/ref-cache.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/refs/ref-cache.c b/refs/ref-cache.c > index 2294c4564fb..6e3b725245c 100644 > --- a/refs/ref-cache.c > +++ b/refs/ref-cache.c > @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator) > > if (level->prefix_state == PREFIX_WITHIN_DIR) { > entry_prefix_state = overlaps_prefix(entry->name, iter->prefix); > - if (entry_prefix_state == PREFIX_EXCLUDES_DIR) > + if (entry_prefix_state == PREFIX_EXCLUDES_DIR || > + (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR))) > continue; > } else { > entry_prefix_state = level->prefix_state;

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Fri, Oct 06, 2023 at 02:51:24PM -0700, Junio C Hamano wrote: > "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > From: Victoria Dye <vdye@github.com> > > > > Update 'cache_ref_iterator_advance' to skip over refs that are not matched > > by the given prefix. > > > > Currently, a ref entry is considered "matched" if the entry name is fully > > contained within the prefix: > > > > * prefix: "refs/heads/v1" > > * entry: "refs/heads/v1.0" > > > > OR if the prefix is fully contained in the entry name: > > > > * prefix: "refs/heads/v1.0" > > * entry: "refs/heads/v1" > > > > The first case is always correct, but the second is only correct if the ref > > cache entry is a directory, for example: > > > > * prefix: "refs/heads/example" > > * entry: "refs/heads/" > > > > Modify the logic in 'cache_ref_iterator_advance' to reflect these > > expectations: > > > > 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and > > ref cache entry do not overlap at all. Skip this entry. > > 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches > > inside this entry if it is a directory. Skip if the entry is not a > > directory, otherwise iterate over it. > > 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating > > that the cache entry (directory or not) is fully contained by or equal to > > the prefix. Iterate over this entry. > > > > Note that condition 2 relies on the names of directory entries having the > > appropriate trailing slash. The existing function documentation of > > 'create_dir_entry' explicitly calls out the trailing slash requirement, so > > this is a safe assumption to make. > > Thanks for explaining it very well and clearly. > > Allowing prefix="refs/heads/v1.0" to yield entry="refs/heads/v1" > (case #2 above that this patch fixes the behaviour for) would cause > ref_iterator_advance() to return a ref outside the hierarhcy, > wouldn't it? So it appears to me that either one of the two would > be true: > > * the code is structured in such a way that such a condition does > not actually happen (in which case this patch would be a no-op), > or > > * there is a bug in the current code that is fixed by this patch, > whose externally observable behaviour can be verified with a > test. > > It is not quite clear to me which is the case here. The code with > the patch looks more logical than the original, but I am not sure > how to demonstrate the existing breakage (if any). Agreed, I also had a bit of a hard time to figure out whether this is an actual bug fix, a performance improvement or merely a refactoring. Patrick > > Signed-off-by: Victoria Dye <vdye@github.com> > > --- > > refs/ref-cache.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/refs/ref-cache.c b/refs/ref-cache.c > > index 2294c4564fb..6e3b725245c 100644 > > --- a/refs/ref-cache.c > > +++ b/refs/ref-cache.c > > @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator) > > > > if (level->prefix_state == PREFIX_WITHIN_DIR) { > > entry_prefix_state = overlaps_prefix(entry->name, iter->prefix); > > - if (entry_prefix_state == PREFIX_EXCLUDES_DIR) > > + if (entry_prefix_state == PREFIX_EXCLUDES_DIR || > > + (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR))) > > continue; > > } else { > > entry_prefix_state = level->prefix_state;

On the Git mailing list, Victoria Dye wrote (reply to this):

Patrick Steinhardt wrote: >> Allowing prefix="refs/heads/v1.0" to yield entry="refs/heads/v1" >> (case #2 above that this patch fixes the behaviour for) would cause >> ref_iterator_advance() to return a ref outside the hierarhcy, >> wouldn't it? So it appears to me that either one of the two would >> be true: >> >> * the code is structured in such a way that such a condition does >> not actually happen (in which case this patch would be a no-op), >> or >> >> * there is a bug in the current code that is fixed by this patch, >> whose externally observable behaviour can be verified with a >> test. >> >> It is not quite clear to me which is the case here. The code with >> the patch looks more logical than the original, but I am not sure >> how to demonstrate the existing breakage (if any). > > Agreed, I also had a bit of a hard time to figure out whether this is an > actual bug fix, a performance improvement or merely a refactoring. > I originally operated on the assumption that it was the first case, which is why I didn't include a test in this patch. Commands like 'for-each-ref', 'show-ref', etc. either use an empty prefix or a directory prefix with a trailing slash, which won't trigger this issue. I encountered the problem while working on a builtin that filtered refs by a user-specified prefix - the results included refs that should not have been matched, which led me to this fix. Scanning through the codebase again, though, I do see a way to replicate the issue: $ git update-ref refs/bisect/b HEAD $ git rev-parse --abbrev-ref --bisect refs/bisect/b Because 'rev-parse --bisect' uses the "refs/bisect/bad" prefix (no trailing slash) and does no additional filtering in its 'for_each_fullref_in' callback, refs like "refs/bisect/b" and "refs/bisect/ba" are (incorrectly) matched. I'll re-roll with the added test.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Victoria Dye <vdye@github.com> writes: > I originally operated on the assumption that it was the first case, which is > why I didn't include a test in this patch. Commands like 'for-each-ref', > 'show-ref', etc. either use an empty prefix or a directory prefix with a > trailing slash, which won't trigger this issue. Ah, yes, I didn't mention it but I suspected as such (i.e. the code is structured in such a way that this broken implementation does not matter to the current callers). > I encountered the problem > while working on a builtin that filtered refs by a user-specified prefix - > the results included refs that should not have been matched, which led me to > this fix. OK, perfectly understandable. > Scanning through the codebase again, though, I do see a way to replicate the > issue: > > $ git update-ref refs/bisect/b HEAD > $ git rev-parse --abbrev-ref --bisect > refs/bisect/b > > Because 'rev-parse --bisect' uses the "refs/bisect/bad" prefix (no trailing > slash) and does no additional filtering in its 'for_each_fullref_in' > callback, refs like "refs/bisect/b" and "refs/bisect/ba" are (incorrectly) > matched. I'll re-roll with the added test. Good find. Thanks!

gitgitgadget · 2023-10-06T22:03:28Z

diagnose.c

@@ -71,42 +71,6 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
 	return 0;


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Victoria Dye <vdye@github.com> > > Move 'get_dtype()' from 'diagnose.c' to 'dir.c' and add its declaration to > 'dir.h' so that it is accessible to callers in other files. The function and > its documentation are moved verbatim except for a small addition to the > description clarifying what the 'path' arg represents. > > Signed-off-by: Victoria Dye <vdye@github.com> > --- > diagnose.c | 36 ------------------------------------ > dir.c | 28 ++++++++++++++++++++++++++++ > dir.h | 11 +++++++++++ > 3 files changed, 39 insertions(+), 36 deletions(-) OK. diagnose.c should still have access to the function as it includes <dir.h>, and to anybody that includes <dir.h> and sees the declaration of get_dtype(), DT_FOO should be visible because <dir.h> includes <statinfo.h> that has fallback definition of DT_FOO. Looking simple and straight-forward. > diff --git a/diagnose.c b/diagnose.c > index 8430064000b..fc4d344bd63 100644 > --- a/diagnose.c > +++ b/diagnose.c > @@ -71,42 +71,6 @@ static int dir_file_stats(struct object_directory *object_dir, void *data) > return 0; > } > > -/* > - * Get the d_type of a dirent. If the d_type is unknown, derive it from > - * stat.st_mode. > - * > - * Note that 'path' is assumed to have a trailing slash. It is also modified > - * in-place during the execution of the function, but is then reverted to its > - * original value before returning. > - */ > -static unsigned char get_dtype(struct dirent *e, struct strbuf *path) > -{ > - struct stat st; > - unsigned char dtype = DTYPE(e); > - size_t base_path_len; > - > - if (dtype != DT_UNKNOWN) > - return dtype; > - > - /* d_type unknown in dirent, try to fall back on lstat results */ > - base_path_len = path->len; > - strbuf_addstr(path, e->d_name); > - if (lstat(path->buf, &st)) > - goto cleanup; > - > - /* determine d_type from st_mode */ > - if (S_ISREG(st.st_mode)) > - dtype = DT_REG; > - else if (S_ISDIR(st.st_mode)) > - dtype = DT_DIR; > - else if (S_ISLNK(st.st_mode)) > - dtype = DT_LNK; > - > -cleanup: > - strbuf_setlen(path, base_path_len); > - return dtype; > -} > - > static int count_files(struct strbuf *path) > { > DIR *dir = opendir(path->buf); > diff --git a/dir.c b/dir.c > index 8486e4d56ff..5e01af3a25e 100644 > --- a/dir.c > +++ b/dir.c > @@ -2235,6 +2235,34 @@ static int get_index_dtype(struct index_state *istate, > return DT_UNKNOWN; > } > > +unsigned char get_dtype(struct dirent *e, struct strbuf *path) > +{ > + struct stat st; > + unsigned char dtype = DTYPE(e); > + size_t base_path_len; > + > + if (dtype != DT_UNKNOWN) > + return dtype; > + > + /* d_type unknown in dirent, try to fall back on lstat results */ > + base_path_len = path->len; > + strbuf_addstr(path, e->d_name); > + if (lstat(path->buf, &st)) > + goto cleanup; > + > + /* determine d_type from st_mode */ > + if (S_ISREG(st.st_mode)) > + dtype = DT_REG; > + else if (S_ISDIR(st.st_mode)) > + dtype = DT_DIR; > + else if (S_ISLNK(st.st_mode)) > + dtype = DT_LNK; > + > +cleanup: > + strbuf_setlen(path, base_path_len); > + return dtype; > +} > + > static int resolve_dtype(int dtype, struct index_state *istate, > const char *path, int len) > { > diff --git a/dir.h b/dir.h > index ad06682fd54..28c630ce806 100644 > --- a/dir.h > +++ b/dir.h > @@ -363,6 +363,17 @@ struct dir_struct { > > struct dirent *readdir_skip_dot_and_dotdot(DIR *dirp); > > +/* > + * Get the d_type of a dirent. If the d_type is unknown, derive it from > + * stat.st_mode using the path to the dirent's containing directory (path) and > + * the name of the dirent itself. > + * > + * Note that 'path' is assumed to have a trailing slash. It is also modified > + * in-place during the execution of the function, but is then reverted to its > + * original value before returning. > + */ > +unsigned char get_dtype(struct dirent *e, struct strbuf *path); > + > /*Count the number of slashes for string s*/ > int count_slashes(const char *s);

gitgitgadget · 2023-10-06T22:15:22Z

refs/files-backend.c

@@ -246,10 +246,8 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 	int dirnamelen = strlen(dirname);


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes: > Unlike other existing usage of 'get_dtype', the 'follow_symlinks' arg is set > to 1 to replicate the existing handling of symlink dirents. This > unfortunately requires calling 'stat' on the associated entry regardless of > platform, but symlinks in the loose ref store are highly unlikely since > they'd need to be created manually by a user. Yeek. I wonder what breaks if we do not do this follow_symlinks() part, i.e., either just replace stat() with lstat() in the original without any of these four patches (which would be simple to figure out what breaks), or omit [3/4] and let get_dtype() yield DT_LNK. It seems that it comes from a7e66ae3 ([PATCH] Make do_each_ref() follow symlinks., 2005-08-16), and just like I commented on there in its log message back then, I still doubt that following a symbolic link is a great idea here in this codepath. But optimization without behaviour change is a good way to ensure that optimization does not introduce new bugs, and because keeping the historical behaviour like the patches [3/4] and this patch does is more work (meaning: if it proves unnecessary to dereference symbolic links, we can remove code instead of having to write new code to support the new behaviour), let's take the series as-is, and defer it to future developers to further clean-up the semantics. > Note that this patch also changes the condition for skipping creation of a > ref entry from "when 'stat' fails" to "when the d_type is anything other > than DT_REG or DT_DIR". If a dirent's d_type is DT_UNKNOWN (either because > the platform doesn't support d_type in dirents or some other reason) or > DT_LNK, 'get_dtype' will try to derive the underlying type with 'stat'. If > the 'stat' fails, the d_type will remain 'DT_UNKNOWN' and dirent will be > skipped. However, it will also be skipped if it is any other valid d_type > (e.g. DT_FIFO for named pipes, DT_LNK for a nested symlink). Git does not > handle these properly anyway, so we can safely constrain accepted types to > directories and regular files. Sounds good. > Signed-off-by: Victoria Dye <vdye@github.com> > --- > refs/files-backend.c | 14 +++++--------- > 1 file changed, 5 insertions(+), 9 deletions(-) Thanks.

gitgitgadget · 2023-10-07T00:09:47Z

This branch is now known as vd/loose-ref-iteration-optimization.

gitgitgadget · 2023-10-07T00:09:48Z

This patch series was integrated into seen via git@39859ad.

gitgitgadget · 2023-10-07T08:25:22Z

There was a status update in the "New Topics" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.
source: <pull.1594.git.1696615769.gitgitgadget@gmail.com>

gitgitgadget · 2023-10-09T10:07:59Z

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
> While investigating ref iteration performance in builtins like
> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
> 
> The first patch tweaks the logic around prefix matching in
> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> actually match a given prefix. The unnecessary iteration doesn't seem to be
> causing any bugs in the ref iteration commands that I've tested, but it
> doesn't hurt to be more precise (and it helps with some other patches I'm
> working on ;) ).
> 
> The next three patches update how 'loose_fill_ref_dir' determines the type
> of ref cache entry to create (directory or regular). On platforms that
> include d_type information in 'struct dirent' (as far as I can tell, all
> except NonStop & certain versions of Cygwin), this allows us to skip calling
> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> by about 20%.

I've done a small set of benchmarks with my usual test repositories,
which is linux.git with a bunch of references added. The repository
comes in four sizes:

- small: 50k references
- medium: 500k references
- high:  1.1m references
- huge: 12m references

Unfortunately, I couldn't really reproduce the performance improvements.
In fact, the new version runs consistently a tiny bit slower than the
old version:

    # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).

    Benchmark 1: git for-each-ref (revision=old,refcount=small)
      Time (mean ± σ):     135.5 ms ±   1.2 ms    [User: 76.4 ms, System: 59.0 ms]
      Range (min … max):   134.8 ms … 136.9 ms    3 runs

    Benchmark 2: git for-each-ref (revision=old,refcount=medium)
      Time (mean ± σ):     822.7 ms ±   2.2 ms    [User: 697.4 ms, System: 125.1 ms]
      Range (min … max):   821.1 ms … 825.2 ms    3 runs

    Benchmark 3: git for-each-ref (revision=old,refcount=high)
      Time (mean ± σ):      1.960 s ±  0.015 s    [User: 1.702 s, System: 0.257 s]
      Range (min … max):    1.944 s …  1.973 s    3 runs

    # New version, which is your tip.

    Benchmark 4: git for-each-ref (revision=old,refcount=huge)
      Time (mean ± σ):     16.815 s ±  0.054 s    [User: 15.091 s, System: 1.722 s]
      Range (min … max):   16.760 s … 16.869 s    3 runs

    Benchmark 5: git for-each-ref (revision=new,refcount=small)
      Time (mean ± σ):     136.0 ms ±   0.2 ms    [User: 78.8 ms, System: 57.1 ms]
      Range (min … max):   135.8 ms … 136.2 ms    3 runs

    Benchmark 6: git for-each-ref (revision=new,refcount=medium)
      Time (mean ± σ):     830.4 ms ±  21.2 ms    [User: 691.3 ms, System: 138.7 ms]
      Range (min … max):   814.2 ms … 854.5 ms    3 runs

    Benchmark 7: git for-each-ref (revision=new,refcount=high)
      Time (mean ± σ):      1.966 s ±  0.013 s    [User: 1.717 s, System: 0.249 s]
      Range (min … max):    1.952 s …  1.978 s    3 runs

    Benchmark 8: git for-each-ref (revision=new,refcount=huge)
      Time (mean ± σ):     16.945 s ±  0.037 s    [User: 15.182 s, System: 1.760 s]
      Range (min … max):   16.910 s … 16.983 s    3 runs

    Summary
      git for-each-ref (revision=old,refcount=small) ran
        1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
        6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
        6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
       14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
       14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
      124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
      125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)

The performance regression isn't all that concerning, but it makes me
wonder why I see things becoming slower rather than faster. My guess is
that this is because all my test repositories are well-packed and don't
have a lot of loose references. But I just wanted to confirm how you
benchmarked your change and what the underlying shape of your test repo
was.

Patrick

> Thanks!
> 
>  * Victoria
> 
> Victoria Dye (4):
>   ref-cache.c: fix prefix matching in ref iteration
>   dir.[ch]: expose 'get_dtype'
>   dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
>   files-backend.c: avoid stat in 'loose_fill_ref_dir'
> 
>  diagnose.c           | 42 +++---------------------------------------
>  dir.c                | 33 +++++++++++++++++++++++++++++++++
>  dir.h                | 16 ++++++++++++++++
>  refs/files-backend.c | 14 +++++---------
>  refs/ref-cache.c     |  3 ++-
>  5 files changed, 59 insertions(+), 49 deletions(-)
> 
> 
> base-commit: 3a06386e314565108ad56a9bdb8f7b80ac52fb69
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1594%2Fvdye%2Fvdye%2Fref-iteration-cleanup-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1594/vdye/vdye/ref-iteration-cleanup-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1594
> -- 
> gitgitgadget

gitgitgadget · 2023-10-09T10:08:03Z

User Patrick Steinhardt <ps@pks.im> has been added to the cc: list.

Update 'cache_ref_iterator_advance' to skip over refs that are not matched by the given prefix. Currently, a ref entry is considered "matched" if the entry name is fully contained within the prefix: * prefix: "refs/heads/v1" * entry: "refs/heads/v1.0" OR if the prefix is fully contained in the entry name: * prefix: "refs/heads/v1.0" * entry: "refs/heads/v1" The first case is always correct, but the second is only correct if the ref cache entry is a directory, for example: * prefix: "refs/heads/example" * entry: "refs/heads/" Modify the logic in 'cache_ref_iterator_advance' to reflect these expectations: 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and ref cache entry do not overlap at all. Skip this entry. 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches inside this entry if it is a directory. Skip if the entry is not a directory, otherwise iterate over it. 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating that the cache entry (directory or not) is fully contained by or equal to the prefix. Iterate over this entry. Note that condition 2 relies on the names of directory entries having the appropriate trailing slash. The existing function documentation of 'create_dir_entry' explicitly calls out the trailing slash requirement, so this is a safe assumption to make. This bug generally doesn't have any user-facing impact, since it requires: 1. using a non-empty prefix without a trailing slash in an iteration like 'for_each_fullref_in', 2. the callback to said iteration not reapplying the original filter (as for-each-ref does) to ensure unmatched refs are skipped, and 3. the repository having one or more refs that match part of, but not all of, the prefix. However, there are some niche scenarios that meet those criteria (specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add tests covering those cases to demonstrate the fix in this patch. Signed-off-by: Victoria Dye <vdye@github.com>

Move 'get_dtype()' from 'diagnose.c' to 'dir.c' and add its declaration to 'dir.h' so that it is accessible to callers in other files. The function and its documentation are moved verbatim except for a small addition to the description clarifying what the 'path' arg represents. Signed-off-by: Victoria Dye <vdye@github.com>

Add a 'follow_symlink' boolean option to 'get_type()'. If 'follow_symlink' is enabled, DT_LNK (in addition to DT_UNKNOWN) d_types triggers the stat-based d_type resolution, using 'stat' instead of 'lstat' to get the type of the followed symlink. Note that symlinks are not followed recursively, so a symlink pointing to another symlink will still resolve to DT_LNK. Update callers in 'diagnose.c' to specify 'follow_symlink = 0' to preserve current behavior. Signed-off-by: Victoria Dye <vdye@github.com>

Modify the 'readdir' loop in 'loose_fill_ref_dir' to, rather than 'stat' a file to determine whether it is a directory or not, use 'get_dtype'. Currently, the loop uses 'stat' to determine whether each dirent is a directory itself or not in order to construct the appropriate ref cache entry. If 'stat' fails (returning a negative value), the dirent is silently skipped; otherwise, 'S_ISDIR(st.st_mode)' is used to check whether the entry is a directory. On platforms that include an entry's d_type in in the 'dirent' struct, this extra 'stat' check is redundant. We can use the 'get_dtype' method to extract this information on platforms that support it (i.e. where NO_D_TYPE_IN_DIRENT is unset), and derive it with 'stat' on platforms that don't. Because 'stat' is an expensive call, this confers a modest-but-noticeable performance improvement when iterating over large numbers of refs (approximately 20% speedup in 'git for-each-ref' in a 30k ref repo). Unlike other existing usage of 'get_dtype', the 'follow_symlinks' arg is set to 1 to replicate the existing handling of symlink dirents. This unfortunately requires calling 'stat' on the associated entry regardless of platform, but symlinks in the loose ref store are highly unlikely since they'd need to be created manually by a user. Note that this patch also changes the condition for skipping creation of a ref entry from "when 'stat' fails" to "when the d_type is anything other than DT_REG or DT_DIR". If a dirent's d_type is DT_UNKNOWN (either because the platform doesn't support d_type in dirents or some other reason) or DT_LNK, 'get_dtype' will try to derive the underlying type with 'stat'. If the 'stat' fails, the d_type will remain 'DT_UNKNOWN' and dirent will be skipped. However, it will also be skipped if it is any other valid d_type (e.g. DT_FIFO for named pipes, DT_LNK for a nested symlink). Git does not handle these properly anyway, so we can safely constrain accepted types to directories and regular files. Signed-off-by: Victoria Dye <vdye@github.com>

gitgitgadget · 2023-10-09T21:53:38Z

On the Git mailing list, Victoria Dye wrote (reply to this):

Patrick Steinhardt wrote:
> On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
>> While investigating ref iteration performance in builtins like
>> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
>>
>> The first patch tweaks the logic around prefix matching in
>> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
>> actually match a given prefix. The unnecessary iteration doesn't seem to be
>> causing any bugs in the ref iteration commands that I've tested, but it
>> doesn't hurt to be more precise (and it helps with some other patches I'm
>> working on ;) ).
>>
>> The next three patches update how 'loose_fill_ref_dir' determines the type
>> of ref cache entry to create (directory or regular). On platforms that
>> include d_type information in 'struct dirent' (as far as I can tell, all
>> except NonStop & certain versions of Cygwin), this allows us to skip calling
>> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
>> by about 20%.
> 
> I've done a small set of benchmarks with my usual test repositories,
> which is linux.git with a bunch of references added. The repository
> comes in four sizes:
> 
> - small: 50k references
> - medium: 500k references
> - high:  1.1m references
> - huge: 12m references
> 
> Unfortunately, I couldn't really reproduce the performance improvements.
> In fact, the new version runs consistently a tiny bit slower than the
> old version:
> 
>     # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).
> 
>     Benchmark 1: git for-each-ref (revision=old,refcount=small)
>       Time (mean ± σ):     135.5 ms ±   1.2 ms    [User: 76.4 ms, System: 59.0 ms]
>       Range (min … max):   134.8 ms … 136.9 ms    3 runs
> 
>     Benchmark 2: git for-each-ref (revision=old,refcount=medium)
>       Time (mean ± σ):     822.7 ms ±   2.2 ms    [User: 697.4 ms, System: 125.1 ms]
>       Range (min … max):   821.1 ms … 825.2 ms    3 runs
> 
>     Benchmark 3: git for-each-ref (revision=old,refcount=high)
>       Time (mean ± σ):      1.960 s ±  0.015 s    [User: 1.702 s, System: 0.257 s]
>       Range (min … max):    1.944 s …  1.973 s    3 runs
> 
>     # New version, which is your tip.
> 
>     Benchmark 4: git for-each-ref (revision=old,refcount=huge)
>       Time (mean ± σ):     16.815 s ±  0.054 s    [User: 15.091 s, System: 1.722 s]
>       Range (min … max):   16.760 s … 16.869 s    3 runs
> 
>     Benchmark 5: git for-each-ref (revision=new,refcount=small)
>       Time (mean ± σ):     136.0 ms ±   0.2 ms    [User: 78.8 ms, System: 57.1 ms]
>       Range (min … max):   135.8 ms … 136.2 ms    3 runs
> 
>     Benchmark 6: git for-each-ref (revision=new,refcount=medium)
>       Time (mean ± σ):     830.4 ms ±  21.2 ms    [User: 691.3 ms, System: 138.7 ms]
>       Range (min … max):   814.2 ms … 854.5 ms    3 runs
> 
>     Benchmark 7: git for-each-ref (revision=new,refcount=high)
>       Time (mean ± σ):      1.966 s ±  0.013 s    [User: 1.717 s, System: 0.249 s]
>       Range (min … max):    1.952 s …  1.978 s    3 runs
> 
>     Benchmark 8: git for-each-ref (revision=new,refcount=huge)
>       Time (mean ± σ):     16.945 s ±  0.037 s    [User: 15.182 s, System: 1.760 s]
>       Range (min … max):   16.910 s … 16.983 s    3 runs
> 
>     Summary
>       git for-each-ref (revision=old,refcount=small) ran
>         1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
>         6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
>         6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
>        14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
>        14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
>       124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
>       125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)
> 
> The performance regression isn't all that concerning, but it makes me
> wonder why I see things becoming slower rather than faster. My guess is
> that this is because all my test repositories are well-packed and don't
> have a lot of loose references. But I just wanted to confirm how you
> benchmarked your change and what the underlying shape of your test repo
> was.

I ran my benchmark on my (Intel) Mac with a test repository (single commit,
one file) containing:

- 10k refs/heads/ references
- 10k refs/tags/ references
- 10k refs/special/ references 

All refs in the repository are loose. My Mac has historically been somewhat
slow and inconsistent when it comes to perf testing, though, so I re-ran the
benchmark a bit more formally on an Ubuntu VM (3 warmup iterations followed
by at least 10 iterations per test):

---

Benchmark 1: git for-each-ref (revision=old,refcount=3k)
  Time (mean ± σ):      40.6 ms ±   3.9 ms    [User: 13.2 ms, System: 27.1 ms]
  Range (min … max):    37.2 ms …  59.1 ms    76 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: git for-each-ref (revision=new,refcount=3k)
  Time (mean ± σ):      38.7 ms ±   4.4 ms    [User: 13.8 ms, System: 24.5 ms]
  Range (min … max):    35.1 ms …  57.2 ms    71 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 3: git for-each-ref (revision=old,refcount=30k)
  Time (mean ± σ):     419.4 ms ±  43.9 ms    [User: 136.4 ms, System: 274.1 ms]
  Range (min … max):   385.1 ms … 528.7 ms    10 runs
 
Benchmark 4: git for-each-ref (revision=new,refcount=30k)
  Time (mean ± σ):     390.4 ms ±  27.2 ms    [User: 133.1 ms, System: 251.6 ms]
  Range (min … max):   360.3 ms … 447.6 ms    10 runs
 
Benchmark 5: git for-each-ref (revision=old,refcount=300k)
  Time (mean ± σ):      4.171 s ±  0.052 s    [User: 1.400 s, System: 2.715 s]
  Range (min … max):    4.118 s …  4.283 s    10 runs
 
Benchmark 6: git for-each-ref (revision=new,refcount=300k)
  Time (mean ± σ):      3.939 s ±  0.054 s    [User: 1.403 s, System: 2.466 s]
  Range (min … max):    3.858 s …  4.026 s    10 runs
 
Summary
  'git for-each-ref (revision=new,refcount=3k)' ran
    1.05 ± 0.16 times faster than 'git for-each-ref (revision=old,refcount=3k)'
   10.08 ± 1.34 times faster than 'git for-each-ref (revision=new,refcount=30k)'
   10.83 ± 1.67 times faster than 'git for-each-ref (revision=old,refcount=30k)'
  101.68 ± 11.63 times faster than 'git for-each-ref (revision=new,refcount=300k)'
  107.67 ± 12.30 times faster than 'git for-each-ref (revision=old,refcount=300k)'

---

So it's not the 20% speedup I saw on my local test repo (it's more like
5-8%), but there does appear to be a consistent improvement. As for your
results, the changes in this series shouldn't affect packed ref operations,
and the difference between old & new doesn't seem to indicate a regression.

vdye · 2023-10-09T21:58:10Z

/submit

gitgitgadget · 2023-10-09T21:59:07Z

Submitted as pull.1594.v2.git.1696888736.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1594/vdye/vdye/ref-iteration-cleanup-v2

To fetch this version to local tag pr-1594/vdye/vdye/ref-iteration-cleanup-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1594/vdye/vdye/ref-iteration-cleanup-v2

gitgitgadget · 2023-10-09T22:47:24Z

This patch series was integrated into seen via git@a91bb38.

gitgitgadget · 2023-10-09T23:59:11Z

This patch series was integrated into seen via git@5c93f1f.

gitgitgadget · 2023-10-10T07:24:29Z

refs/ref-cache.c

@@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)



On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Oct 09, 2023 at 09:58:53PM +0000, Victoria Dye via GitGitGadget wrote: > From: Victoria Dye <vdye@github.com> > > Update 'cache_ref_iterator_advance' to skip over refs that are not matched > by the given prefix. > > Currently, a ref entry is considered "matched" if the entry name is fully > contained within the prefix: > > * prefix: "refs/heads/v1" > * entry: "refs/heads/v1.0" > > OR if the prefix is fully contained in the entry name: > > * prefix: "refs/heads/v1.0" > * entry: "refs/heads/v1" > > The first case is always correct, but the second is only correct if the ref > cache entry is a directory, for example: > > * prefix: "refs/heads/example" > * entry: "refs/heads/" > > Modify the logic in 'cache_ref_iterator_advance' to reflect these > expectations: > > 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and > ref cache entry do not overlap at all. Skip this entry. > 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches > inside this entry if it is a directory. Skip if the entry is not a > directory, otherwise iterate over it. > 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating > that the cache entry (directory or not) is fully contained by or equal to > the prefix. Iterate over this entry. > > Note that condition 2 relies on the names of directory entries having the > appropriate trailing slash. The existing function documentation of > 'create_dir_entry' explicitly calls out the trailing slash requirement, so > this is a safe assumption to make. > > This bug generally doesn't have any user-facing impact, since it requires: > > 1. using a non-empty prefix without a trailing slash in an iteration like > 'for_each_fullref_in', > 2. the callback to said iteration not reapplying the original filter (as > for-each-ref does) to ensure unmatched refs are skipped, and > 3. the repository having one or more refs that match part of, but not all > of, the prefix. > > However, there are some niche scenarios that meet those criteria > (specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add > tests covering those cases to demonstrate the fix in this patch. > > Signed-off-by: Victoria Dye <vdye@github.com> > --- > refs/ref-cache.c | 3 ++- > t/t1500-rev-parse.sh | 23 +++++++++++++++++++++++ > t/t4205-log-pretty-formats.sh | 30 ++++++++++++++++++++++++++++++ > 3 files changed, 55 insertions(+), 1 deletion(-) > > diff --git a/refs/ref-cache.c b/refs/ref-cache.c > index 2294c4564fb..6e3b725245c 100644 > --- a/refs/ref-cache.c > +++ b/refs/ref-cache.c > @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator) > > if (level->prefix_state == PREFIX_WITHIN_DIR) { > entry_prefix_state = overlaps_prefix(entry->name, iter->prefix); > - if (entry_prefix_state == PREFIX_EXCLUDES_DIR) > + if (entry_prefix_state == PREFIX_EXCLUDES_DIR || > + (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR))) > continue; > } else { > entry_prefix_state = level->prefix_state; > diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh > index 37ee5091b5c..3f9e7f62e45 100755 > --- a/t/t1500-rev-parse.sh > +++ b/t/t1500-rev-parse.sh > @@ -264,4 +264,27 @@ test_expect_success 'rev-parse --since= unsqueezed ordering' ' > test_cmp expect actual > ' > > +test_expect_success 'rev-parse --bisect includes bad, excludes good' ' > + test_commit_bulk 6 && > + > + git update-ref refs/bisect/bad-1 HEAD~1 && > + git update-ref refs/bisect/b HEAD~2 && > + git update-ref refs/bisect/bad-3 HEAD~3 && > + git update-ref refs/bisect/good-3 HEAD~3 && > + git update-ref refs/bisect/bad-4 HEAD~4 && > + git update-ref refs/bisect/go HEAD~4 && > + > + # Note: refs/bisect/b and refs/bisect/go should be ignored because they > + # do not match the refs/bisect/bad or refs/bisect/good prefixes. > + cat >expect <<-EOF && > + refs/bisect/bad-1 > + refs/bisect/bad-3 > + refs/bisect/bad-4 > + ^refs/bisect/good-3 > + EOF > + > + git rev-parse --symbolic-full-name --bisect >actual && > + test_cmp expect actual > +' > + > test_done > diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh > index 16626e4fe96..62c7bfed5d7 100755 > --- a/t/t4205-log-pretty-formats.sh > +++ b/t/t4205-log-pretty-formats.sh > @@ -956,6 +956,36 @@ test_expect_success '%S in git log --format works with other placeholders (part > test_cmp expect actual > ' > > +test_expect_success 'setup more commits for %S with --bisect' ' > + test_commit four && > + test_commit five && > + > + head1=$(git rev-parse --verify HEAD~0) && > + head2=$(git rev-parse --verify HEAD~1) && > + head3=$(git rev-parse --verify HEAD~2) && > + head4=$(git rev-parse --verify HEAD~3) > +' > + > +test_expect_success '%S with --bisect labels commits with refs/bisect/bad ref' ' > + git update-ref refs/bisect/bad-$head1 $head1 && > + git update-ref refs/bisect/go $head1 && > + git update-ref refs/bisect/bad-$head2 $head2 && > + git update-ref refs/bisect/b $head3 && > + git update-ref refs/bisect/bad-$head4 $head4 && > + git update-ref refs/bisect/good-$head4 $head4 && > + > + # We expect to see the range of commits betwee refs/bisect/good-$head4 Nit: s/betwee/between. Probably not worth rerolling this series only because of this typo though. Patrick > + # and refs/bisect/bad-$head1. The "source" ref is the nearest bisect ref > + # from which the commit is reachable. > + cat >expect <<-EOF && > + $head1 refs/bisect/bad-$head1 > + $head2 refs/bisect/bad-$head2 > + $head3 refs/bisect/bad-$head2 > + EOF > + git log --bisect --format="%H %S" >actual && > + test_cmp expect actual > +' > + > test_expect_success 'log --pretty=reference' ' > git log --pretty="tformat:%h (%s, %as)" >expect && > git log --pretty=reference >actual && > -- > gitgitgadget >

gitgitgadget · 2023-10-10T07:24:31Z

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Oct 09, 2023 at 02:49:14PM -0700, Victoria Dye wrote:
> Patrick Steinhardt wrote:
> > On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
> >> While investigating ref iteration performance in builtins like
> >> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
> >>
> >> The first patch tweaks the logic around prefix matching in
> >> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> >> actually match a given prefix. The unnecessary iteration doesn't seem to be
> >> causing any bugs in the ref iteration commands that I've tested, but it
> >> doesn't hurt to be more precise (and it helps with some other patches I'm
> >> working on ;) ).
> >>
> >> The next three patches update how 'loose_fill_ref_dir' determines the type
> >> of ref cache entry to create (directory or regular). On platforms that
> >> include d_type information in 'struct dirent' (as far as I can tell, all
> >> except NonStop & certain versions of Cygwin), this allows us to skip calling
> >> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> >> by about 20%.
> > 
> > I've done a small set of benchmarks with my usual test repositories,
> > which is linux.git with a bunch of references added. The repository
> > comes in four sizes:
> > 
> > - small: 50k references
> > - medium: 500k references
> > - high:  1.1m references
> > - huge: 12m references
> > 
> > Unfortunately, I couldn't really reproduce the performance improvements.
> > In fact, the new version runs consistently a tiny bit slower than the
> > old version:
> > 
> >     # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).
> > 
> >     Benchmark 1: git for-each-ref (revision=old,refcount=small)
> >       Time (mean ± σ):     135.5 ms ±   1.2 ms    [User: 76.4 ms, System: 59.0 ms]
> >       Range (min … max):   134.8 ms … 136.9 ms    3 runs
> > 
> >     Benchmark 2: git for-each-ref (revision=old,refcount=medium)
> >       Time (mean ± σ):     822.7 ms ±   2.2 ms    [User: 697.4 ms, System: 125.1 ms]
> >       Range (min … max):   821.1 ms … 825.2 ms    3 runs
> > 
> >     Benchmark 3: git for-each-ref (revision=old,refcount=high)
> >       Time (mean ± σ):      1.960 s ±  0.015 s    [User: 1.702 s, System: 0.257 s]
> >       Range (min … max):    1.944 s …  1.973 s    3 runs
> > 
> >     # New version, which is your tip.
> > 
> >     Benchmark 4: git for-each-ref (revision=old,refcount=huge)
> >       Time (mean ± σ):     16.815 s ±  0.054 s    [User: 15.091 s, System: 1.722 s]
> >       Range (min … max):   16.760 s … 16.869 s    3 runs
> > 
> >     Benchmark 5: git for-each-ref (revision=new,refcount=small)
> >       Time (mean ± σ):     136.0 ms ±   0.2 ms    [User: 78.8 ms, System: 57.1 ms]
> >       Range (min … max):   135.8 ms … 136.2 ms    3 runs
> > 
> >     Benchmark 6: git for-each-ref (revision=new,refcount=medium)
> >       Time (mean ± σ):     830.4 ms ±  21.2 ms    [User: 691.3 ms, System: 138.7 ms]
> >       Range (min … max):   814.2 ms … 854.5 ms    3 runs
> > 
> >     Benchmark 7: git for-each-ref (revision=new,refcount=high)
> >       Time (mean ± σ):      1.966 s ±  0.013 s    [User: 1.717 s, System: 0.249 s]
> >       Range (min … max):    1.952 s …  1.978 s    3 runs
> > 
> >     Benchmark 8: git for-each-ref (revision=new,refcount=huge)
> >       Time (mean ± σ):     16.945 s ±  0.037 s    [User: 15.182 s, System: 1.760 s]
> >       Range (min … max):   16.910 s … 16.983 s    3 runs
> > 
> >     Summary
> >       git for-each-ref (revision=old,refcount=small) ran
> >         1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
> >         6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
> >         6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
> >        14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
> >        14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
> >       124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
> >       125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)
> > 
> > The performance regression isn't all that concerning, but it makes me
> > wonder why I see things becoming slower rather than faster. My guess is
> > that this is because all my test repositories are well-packed and don't
> > have a lot of loose references. But I just wanted to confirm how you
> > benchmarked your change and what the underlying shape of your test repo
> > was.
> 
> I ran my benchmark on my (Intel) Mac with a test repository (single commit,
> one file) containing:
> 
> - 10k refs/heads/ references
> - 10k refs/tags/ references
> - 10k refs/special/ references 
> 
> All refs in the repository are loose. My Mac has historically been somewhat
> slow and inconsistent when it comes to perf testing, though, so I re-ran the
> benchmark a bit more formally on an Ubuntu VM (3 warmup iterations followed
> by at least 10 iterations per test):
> 
> ---
> 
> Benchmark 1: git for-each-ref (revision=old,refcount=3k)
>   Time (mean ± σ):      40.6 ms ±   3.9 ms    [User: 13.2 ms, System: 27.1 ms]
>   Range (min … max):    37.2 ms …  59.1 ms    76 runs
>  
>   Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
>  
> Benchmark 2: git for-each-ref (revision=new,refcount=3k)
>   Time (mean ± σ):      38.7 ms ±   4.4 ms    [User: 13.8 ms, System: 24.5 ms]
>   Range (min … max):    35.1 ms …  57.2 ms    71 runs
>  
>   Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
>  
> Benchmark 3: git for-each-ref (revision=old,refcount=30k)
>   Time (mean ± σ):     419.4 ms ±  43.9 ms    [User: 136.4 ms, System: 274.1 ms]
>   Range (min … max):   385.1 ms … 528.7 ms    10 runs
>  
> Benchmark 4: git for-each-ref (revision=new,refcount=30k)
>   Time (mean ± σ):     390.4 ms ±  27.2 ms    [User: 133.1 ms, System: 251.6 ms]
>   Range (min … max):   360.3 ms … 447.6 ms    10 runs
>  
> Benchmark 5: git for-each-ref (revision=old,refcount=300k)
>   Time (mean ± σ):      4.171 s ±  0.052 s    [User: 1.400 s, System: 2.715 s]
>   Range (min … max):    4.118 s …  4.283 s    10 runs
>  
> Benchmark 6: git for-each-ref (revision=new,refcount=300k)
>   Time (mean ± σ):      3.939 s ±  0.054 s    [User: 1.403 s, System: 2.466 s]
>   Range (min … max):    3.858 s …  4.026 s    10 runs
>  
> Summary
>   'git for-each-ref (revision=new,refcount=3k)' ran
>     1.05 ± 0.16 times faster than 'git for-each-ref (revision=old,refcount=3k)'
>    10.08 ± 1.34 times faster than 'git for-each-ref (revision=new,refcount=30k)'
>    10.83 ± 1.67 times faster than 'git for-each-ref (revision=old,refcount=30k)'
>   101.68 ± 11.63 times faster than 'git for-each-ref (revision=new,refcount=300k)'
>   107.67 ± 12.30 times faster than 'git for-each-ref (revision=old,refcount=300k)'
> 
> ---
> 
> So it's not the 20% speedup I saw on my local test repo (it's more like
> 5-8%), but there does appear to be a consistent improvement.

Thanks a bunch for re-doing the benchmark with a documented setup.

> As for your results, the changes in this series shouldn't affect
> packed ref operations, and the difference between old & new doesn't
> seem to indicate a regression. 

Yeah, I've also been surprised to see the performance regression for
packed-refs files. The regression is persistent and reproducable on my
machine though, so even though I tend to agree that the patches
shouldn't negatively impact packed-refs performance they somehow do. It
could just as well be something like different optimization choices by
the compler due to the added patches, or hitting different cache lines.
I dunno.

Anyway, I agree with your assessment. The regression I see is less than
1% for packed-refs, while the improvements for loose refs are a lot more
significant and conceptually make a lot of sense. So I didn't intend to
say that we shouldn't do these optimizations because of the miniscule
peformance regression with packed-refs.

Or in other words: this series looks good to me.

Thanks!
Patrick

gitgitgadget · 2023-10-10T22:45:33Z

This patch series was integrated into seen via git@62a5770.

gitgitgadget · 2023-10-10T23:55:43Z

This patch series was integrated into seen via git@29a5bc1.

gitgitgadget · 2023-10-11T01:37:03Z

There was a status update in the "Cooking" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

Will merge to 'next'?
source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com>

gitgitgadget · 2023-10-11T22:21:57Z

This patch series was integrated into seen via git@095f2de.

gitgitgadget · 2023-10-12T20:58:18Z

This patch series was integrated into seen via git@d9924f9.

gitgitgadget · 2023-10-12T20:58:21Z

This patch series was integrated into next via git@99e2f83.

gitgitgadget · 2023-10-12T23:25:02Z

There was a status update in the "Cooking" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

Will merge to 'master'.
source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com>

gitgitgadget · 2023-10-13T22:59:15Z

This patch series was integrated into seen via git@5bbb845.

gitgitgadget · 2023-10-14T20:14:51Z

There was a status update in the "Cooking" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

Will merge to 'master'.
source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com>

gitgitgadget · 2023-10-16T20:59:14Z

This patch series was integrated into seen via git@d6258bf.

gitgitgadget · 2023-10-17T22:16:27Z

This patch series was integrated into seen via git@6f0940a.

gitgitgadget · 2023-10-18T21:34:57Z

This patch series was integrated into seen via git@c064020.

gitgitgadget · 2023-10-18T23:39:03Z

There was a status update in the "Cooking" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

Will merge to 'master'.
source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com>

gitgitgadget · 2023-10-20T00:44:32Z

This patch series was integrated into seen via git@8622bdf.

gitgitgadget · 2023-10-21T00:13:25Z

This patch series was integrated into seen via git@6b1e225.

gitgitgadget · 2023-10-21T00:13:26Z

This patch series was integrated into master via git@6b1e225.

gitgitgadget · 2023-10-21T00:13:27Z

This patch series was integrated into next via git@6b1e225.

gitgitgadget · 2023-10-21T00:13:30Z

Closed via 6b1e225.

vdye self-assigned this Oct 6, 2023

gitgitgadget bot reviewed Oct 6, 2023

View reviewed changes

gitgitgadget bot added the seen label Oct 7, 2023

vdye added 4 commits October 9, 2023 12:13

gitgitgadget bot reviewed Oct 10, 2023

View reviewed changes

gitgitgadget bot added the next label Oct 12, 2023

gitgitgadget bot added the master label Oct 21, 2023

gitgitgadget bot closed this Oct 21, 2023

		@@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)

		@@ -71,42 +71,6 @@ static int dir_file_stats(struct object_directory object_dir, void data)
		return 0;

		@@ -246,10 +246,8 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
		int dirnamelen = strlen(dirname);

Conversation

vdye commented Oct 6, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes since V1

Uh oh!

vdye commented Oct 6, 2023

Uh oh!

gitgitgadget bot commented Oct 6, 2023

Uh oh!

gitgitgadget bot commented Oct 6, 2023

Uh oh!

gitgitgadget bot Oct 6, 2023

Choose a reason for hiding this comment

Uh oh!

gitgitgadget bot Oct 9, 2023

Choose a reason for hiding this comment

Uh oh!

gitgitgadget bot Oct 9, 2023

Choose a reason for hiding this comment

Uh oh!

gitgitgadget bot Oct 9, 2023

Choose a reason for hiding this comment

Uh oh!

gitgitgadget bot Oct 6, 2023

Choose a reason for hiding this comment

Uh oh!

gitgitgadget bot Oct 6, 2023

Choose a reason for hiding this comment

Uh oh!

gitgitgadget bot commented Oct 7, 2023

Uh oh!

gitgitgadget bot commented Oct 7, 2023

Uh oh!

gitgitgadget bot commented Oct 7, 2023

Uh oh!

gitgitgadget bot commented Oct 9, 2023

Uh oh!

gitgitgadget bot commented Oct 9, 2023

Uh oh!

gitgitgadget bot commented Oct 9, 2023

Uh oh!

vdye commented Oct 9, 2023

Uh oh!

gitgitgadget bot commented Oct 9, 2023

Uh oh!

gitgitgadget bot commented Oct 9, 2023

Uh oh!

gitgitgadget bot commented Oct 9, 2023

Uh oh!

gitgitgadget bot Oct 10, 2023

Choose a reason for hiding this comment

Uh oh!

gitgitgadget bot commented Oct 10, 2023

Uh oh!

gitgitgadget bot commented Oct 10, 2023

Uh oh!

gitgitgadget bot commented Oct 10, 2023

Uh oh!

gitgitgadget bot commented Oct 11, 2023

Uh oh!

gitgitgadget bot commented Oct 11, 2023

Uh oh!

gitgitgadget bot commented Oct 12, 2023

Uh oh!

gitgitgadget bot commented Oct 12, 2023

Uh oh!

gitgitgadget bot commented Oct 12, 2023

Uh oh!

gitgitgadget bot commented Oct 13, 2023

Uh oh!

gitgitgadget bot commented Oct 14, 2023

Uh oh!

gitgitgadget bot commented Oct 16, 2023

Uh oh!

gitgitgadget bot commented Oct 17, 2023

Uh oh!

gitgitgadget bot commented Oct 18, 2023

Uh oh!

gitgitgadget bot commented Oct 18, 2023

Uh oh!

vdye commented Oct 6, 2023 •

edited

Loading