Skip to content

Performance improvement & cleanup in loose ref iteration#1594

Closed
vdye wants to merge 4 commits intogitgitgadget:masterfrom
vdye:vdye/ref-iteration-cleanup
Closed

Performance improvement & cleanup in loose ref iteration#1594
vdye wants to merge 4 commits intogitgitgadget:masterfrom
vdye:vdye/ref-iteration-cleanup

Conversation

@vdye
Copy link

@vdye vdye commented Oct 6, 2023

While investigating ref iteration performance in builtins like 'for-each-ref' and 'show-ref', I found two small improvement opportunities.

The first patch tweaks the logic around prefix matching in 'cache_ref_iterator_advance' so that we correctly skip refs that do not actually match a given prefix. The unnecessary iteration doesn't seem to be causing any bugs in the ref iteration commands that I've tested, but it doesn't hurt to be more precise (and it helps with some other patches I'm working on ;) ).

The next three patches update how 'loose_fill_ref_dir' determines the type of ref cache entry to create (directory or regular). On platforms that include d_type information in 'struct dirent' (as far as I can tell, all except NonStop & certain versions of Cygwin), this allows us to skip calling 'stat'. Benchmarking against repos with various quantities of loose refs indicates a 5-8% speedup from these changes [1].

Changes since V1

  • Added tests in patch 1 to demonstrate the bugfix

Thanks!

  • Victoria

[1] https://lore.kernel.org/git/28ae03f5-7091-d3f3-8a70-56aba6639640@github.com/

cc: Patrick Steinhardt ps@pks.im

@vdye vdye self-assigned this Oct 6, 2023
@vdye
Copy link
Author

vdye commented Oct 6, 2023

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 6, 2023

Submitted as pull.1594.git.1696615769.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1594/vdye/vdye/ref-iteration-cleanup-v1

To fetch this version to local tag pr-1594/vdye/vdye/ref-iteration-cleanup-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1594/vdye/vdye/ref-iteration-cleanup-v1

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 6, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> While investigating ref iteration performance in builtins like
> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
>
> The first patch tweaks the logic around prefix matching in
> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> actually match a given prefix. The unnecessary iteration doesn't seem to be
> causing any bugs in the ref iteration commands that I've tested, but it
> doesn't hurt to be more precise (and it helps with some other patches I'm
> working on ;) ).
>
> The next three patches update how 'loose_fill_ref_dir' determines the type
> of ref cache entry to create (directory or regular). On platforms that
> include d_type information in 'struct dirent' (as far as I can tell, all
> except NonStop & certain versions of Cygwin), this allows us to skip calling
> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> by about 20%.

Yay.  That is a very obvious one, once it is pointed out.  Thanks
for noticing and improving.  Looking forward to reading the patches
themselves.

@@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Victoria Dye <vdye@github.com>
>
> Update 'cache_ref_iterator_advance' to skip over refs that are not matched
> by the given prefix.
>
> Currently, a ref entry is considered "matched" if the entry name is fully
> contained within the prefix:
>
> * prefix: "refs/heads/v1"
> * entry: "refs/heads/v1.0"
>
> OR if the prefix is fully contained in the entry name:
>
> * prefix: "refs/heads/v1.0"
> * entry: "refs/heads/v1"
>
> The first case is always correct, but the second is only correct if the ref
> cache entry is a directory, for example:
>
> * prefix: "refs/heads/example"
> * entry: "refs/heads/"
>
> Modify the logic in 'cache_ref_iterator_advance' to reflect these
> expectations:
>
> 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
>    ref cache entry do not overlap at all. Skip this entry.
> 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
>    inside this entry if it is a directory. Skip if the entry is not a
>    directory, otherwise iterate over it.
> 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
>    that the cache entry (directory or not) is fully contained by or equal to
>    the prefix. Iterate over this entry.
>
> Note that condition 2 relies on the names of directory entries having the
> appropriate trailing slash. The existing function documentation of
> 'create_dir_entry' explicitly calls out the trailing slash requirement, so
> this is a safe assumption to make.

Thanks for explaining it very well and clearly.  

Allowing prefix="refs/heads/v1.0" to yield entry="refs/heads/v1"
(case #2 above that this patch fixes the behaviour for) would cause
ref_iterator_advance() to return a ref outside the hierarhcy,
wouldn't it?  So it appears to me that either one of the two would
be true:

 * the code is structured in such a way that such a condition does
   not actually happen (in which case this patch would be a no-op),
   or

 * there is a bug in the current code that is fixed by this patch,
   whose externally observable behaviour can be verified with a
   test.

It is not quite clear to me which is the case here.  The code with
the patch looks more logical than the original, but I am not sure
how to demonstrate the existing breakage (if any).

> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
>  refs/ref-cache.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
> index 2294c4564fb..6e3b725245c 100644
> --- a/refs/ref-cache.c
> +++ b/refs/ref-cache.c
> @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
>  
>  		if (level->prefix_state == PREFIX_WITHIN_DIR) {
>  			entry_prefix_state = overlaps_prefix(entry->name, iter->prefix);
> -			if (entry_prefix_state == PREFIX_EXCLUDES_DIR)
> +			if (entry_prefix_state == PREFIX_EXCLUDES_DIR ||
> +			    (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR)))
>  				continue;
>  		} else {
>  			entry_prefix_state = level->prefix_state;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Fri, Oct 06, 2023 at 02:51:24PM -0700, Junio C Hamano wrote:
> "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > From: Victoria Dye <vdye@github.com>
> >
> > Update 'cache_ref_iterator_advance' to skip over refs that are not matched
> > by the given prefix.
> >
> > Currently, a ref entry is considered "matched" if the entry name is fully
> > contained within the prefix:
> >
> > * prefix: "refs/heads/v1"
> > * entry: "refs/heads/v1.0"
> >
> > OR if the prefix is fully contained in the entry name:
> >
> > * prefix: "refs/heads/v1.0"
> > * entry: "refs/heads/v1"
> >
> > The first case is always correct, but the second is only correct if the ref
> > cache entry is a directory, for example:
> >
> > * prefix: "refs/heads/example"
> > * entry: "refs/heads/"
> >
> > Modify the logic in 'cache_ref_iterator_advance' to reflect these
> > expectations:
> >
> > 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
> >    ref cache entry do not overlap at all. Skip this entry.
> > 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
> >    inside this entry if it is a directory. Skip if the entry is not a
> >    directory, otherwise iterate over it.
> > 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
> >    that the cache entry (directory or not) is fully contained by or equal to
> >    the prefix. Iterate over this entry.
> >
> > Note that condition 2 relies on the names of directory entries having the
> > appropriate trailing slash. The existing function documentation of
> > 'create_dir_entry' explicitly calls out the trailing slash requirement, so
> > this is a safe assumption to make.
> 
> Thanks for explaining it very well and clearly.  
> 
> Allowing prefix="refs/heads/v1.0" to yield entry="refs/heads/v1"
> (case #2 above that this patch fixes the behaviour for) would cause
> ref_iterator_advance() to return a ref outside the hierarhcy,
> wouldn't it?  So it appears to me that either one of the two would
> be true:
> 
>  * the code is structured in such a way that such a condition does
>    not actually happen (in which case this patch would be a no-op),
>    or
> 
>  * there is a bug in the current code that is fixed by this patch,
>    whose externally observable behaviour can be verified with a
>    test.
> 
> It is not quite clear to me which is the case here.  The code with
> the patch looks more logical than the original, but I am not sure
> how to demonstrate the existing breakage (if any).

Agreed, I also had a bit of a hard time to figure out whether this is an
actual bug fix, a performance improvement or merely a refactoring.

Patrick

> > Signed-off-by: Victoria Dye <vdye@github.com>
> > ---
> >  refs/ref-cache.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/refs/ref-cache.c b/refs/ref-cache.c
> > index 2294c4564fb..6e3b725245c 100644
> > --- a/refs/ref-cache.c
> > +++ b/refs/ref-cache.c
> > @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
> >  
> >  		if (level->prefix_state == PREFIX_WITHIN_DIR) {
> >  			entry_prefix_state = overlaps_prefix(entry->name, iter->prefix);
> > -			if (entry_prefix_state == PREFIX_EXCLUDES_DIR)
> > +			if (entry_prefix_state == PREFIX_EXCLUDES_DIR ||
> > +			    (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR)))
> >  				continue;
> >  		} else {
> >  			entry_prefix_state = level->prefix_state;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Victoria Dye wrote (reply to this):

Patrick Steinhardt wrote:
>> Allowing prefix="refs/heads/v1.0" to yield entry="refs/heads/v1"
>> (case #2 above that this patch fixes the behaviour for) would cause
>> ref_iterator_advance() to return a ref outside the hierarhcy,
>> wouldn't it?  So it appears to me that either one of the two would
>> be true:
>>
>>  * the code is structured in such a way that such a condition does
>>    not actually happen (in which case this patch would be a no-op),
>>    or
>>
>>  * there is a bug in the current code that is fixed by this patch,
>>    whose externally observable behaviour can be verified with a
>>    test.
>>
>> It is not quite clear to me which is the case here.  The code with
>> the patch looks more logical than the original, but I am not sure
>> how to demonstrate the existing breakage (if any).
> 
> Agreed, I also had a bit of a hard time to figure out whether this is an
> actual bug fix, a performance improvement or merely a refactoring.
> 

I originally operated on the assumption that it was the first case, which is
why I didn't include a test in this patch. Commands like 'for-each-ref',
'show-ref', etc. either use an empty prefix or a directory prefix with a
trailing slash, which won't trigger this issue. I encountered the problem
while working on a builtin that filtered refs by a user-specified prefix -
the results included refs that should not have been matched, which led me to
this fix.

Scanning through the codebase again, though, I do see a way to replicate the
issue:

$ git update-ref refs/bisect/b HEAD
$ git rev-parse --abbrev-ref --bisect
refs/bisect/b

Because 'rev-parse --bisect' uses the "refs/bisect/bad" prefix (no trailing
slash) and does no additional filtering in its 'for_each_fullref_in'
callback, refs like "refs/bisect/b" and "refs/bisect/ba" are (incorrectly)
matched. I'll re-roll with the added test.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Victoria Dye <vdye@github.com> writes:

> I originally operated on the assumption that it was the first case, which is
> why I didn't include a test in this patch. Commands like 'for-each-ref',
> 'show-ref', etc. either use an empty prefix or a directory prefix with a
> trailing slash, which won't trigger this issue.

Ah, yes, I didn't mention it but I suspected as such (i.e. the code
is structured in such a way that this broken implementation does not
matter to the current callers).

> I encountered the problem
> while working on a builtin that filtered refs by a user-specified prefix -
> the results included refs that should not have been matched, which led me to
> this fix.

OK, perfectly understandable.

> Scanning through the codebase again, though, I do see a way to replicate the
> issue:
>
> $ git update-ref refs/bisect/b HEAD
> $ git rev-parse --abbrev-ref --bisect
> refs/bisect/b
>
> Because 'rev-parse --bisect' uses the "refs/bisect/bad" prefix (no trailing
> slash) and does no additional filtering in its 'for_each_fullref_in'
> callback, refs like "refs/bisect/b" and "refs/bisect/ba" are (incorrectly)
> matched. I'll re-roll with the added test.

Good find.  Thanks!

@@ -71,42 +71,6 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
return 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Victoria Dye <vdye@github.com>
>
> Move 'get_dtype()' from 'diagnose.c' to 'dir.c' and add its declaration to
> 'dir.h' so that it is accessible to callers in other files. The function and
> its documentation are moved verbatim except for a small addition to the
> description clarifying what the 'path' arg represents.
>
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
>  diagnose.c | 36 ------------------------------------
>  dir.c      | 28 ++++++++++++++++++++++++++++
>  dir.h      | 11 +++++++++++
>  3 files changed, 39 insertions(+), 36 deletions(-)

OK.  diagnose.c should still have access to the function as it
includes <dir.h>, and to anybody that includes <dir.h> and sees the
declaration of get_dtype(), DT_FOO should be visible because <dir.h>
includes <statinfo.h> that has fallback definition of DT_FOO.

Looking simple and straight-forward.


> diff --git a/diagnose.c b/diagnose.c
> index 8430064000b..fc4d344bd63 100644
> --- a/diagnose.c
> +++ b/diagnose.c
> @@ -71,42 +71,6 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
>  	return 0;
>  }
>  
> -/*
> - * Get the d_type of a dirent. If the d_type is unknown, derive it from
> - * stat.st_mode.
> - *
> - * Note that 'path' is assumed to have a trailing slash. It is also modified
> - * in-place during the execution of the function, but is then reverted to its
> - * original value before returning.
> - */
> -static unsigned char get_dtype(struct dirent *e, struct strbuf *path)
> -{
> -	struct stat st;
> -	unsigned char dtype = DTYPE(e);
> -	size_t base_path_len;
> -
> -	if (dtype != DT_UNKNOWN)
> -		return dtype;
> -
> -	/* d_type unknown in dirent, try to fall back on lstat results */
> -	base_path_len = path->len;
> -	strbuf_addstr(path, e->d_name);
> -	if (lstat(path->buf, &st))
> -		goto cleanup;
> -
> -	/* determine d_type from st_mode */
> -	if (S_ISREG(st.st_mode))
> -		dtype = DT_REG;
> -	else if (S_ISDIR(st.st_mode))
> -		dtype = DT_DIR;
> -	else if (S_ISLNK(st.st_mode))
> -		dtype = DT_LNK;
> -
> -cleanup:
> -	strbuf_setlen(path, base_path_len);
> -	return dtype;
> -}
> -
>  static int count_files(struct strbuf *path)
>  {
>  	DIR *dir = opendir(path->buf);
> diff --git a/dir.c b/dir.c
> index 8486e4d56ff..5e01af3a25e 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -2235,6 +2235,34 @@ static int get_index_dtype(struct index_state *istate,
>  	return DT_UNKNOWN;
>  }
>  
> +unsigned char get_dtype(struct dirent *e, struct strbuf *path)
> +{
> +	struct stat st;
> +	unsigned char dtype = DTYPE(e);
> +	size_t base_path_len;
> +
> +	if (dtype != DT_UNKNOWN)
> +		return dtype;
> +
> +	/* d_type unknown in dirent, try to fall back on lstat results */
> +	base_path_len = path->len;
> +	strbuf_addstr(path, e->d_name);
> +	if (lstat(path->buf, &st))
> +		goto cleanup;
> +
> +	/* determine d_type from st_mode */
> +	if (S_ISREG(st.st_mode))
> +		dtype = DT_REG;
> +	else if (S_ISDIR(st.st_mode))
> +		dtype = DT_DIR;
> +	else if (S_ISLNK(st.st_mode))
> +		dtype = DT_LNK;
> +
> +cleanup:
> +	strbuf_setlen(path, base_path_len);
> +	return dtype;
> +}
> +
>  static int resolve_dtype(int dtype, struct index_state *istate,
>  			 const char *path, int len)
>  {
> diff --git a/dir.h b/dir.h
> index ad06682fd54..28c630ce806 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -363,6 +363,17 @@ struct dir_struct {
>  
>  struct dirent *readdir_skip_dot_and_dotdot(DIR *dirp);
>  
> +/*
> + * Get the d_type of a dirent. If the d_type is unknown, derive it from
> + * stat.st_mode using the path to the dirent's containing directory (path) and
> + * the name of the dirent itself.
> + *
> + * Note that 'path' is assumed to have a trailing slash. It is also modified
> + * in-place during the execution of the function, but is then reverted to its
> + * original value before returning.
> + */
> +unsigned char get_dtype(struct dirent *e, struct strbuf *path);
> +
>  /*Count the number of slashes for string s*/
>  int count_slashes(const char *s);

@@ -246,10 +246,8 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
int dirnamelen = strlen(dirname);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Unlike other existing usage of 'get_dtype', the 'follow_symlinks' arg is set
> to 1 to replicate the existing handling of symlink dirents. This
> unfortunately requires calling 'stat' on the associated entry regardless of
> platform, but symlinks in the loose ref store are highly unlikely since
> they'd need to be created manually by a user.

Yeek.  I wonder what breaks if we do not do this follow_symlinks()
part, i.e., either just replace stat() with lstat() in the original
without any of these four patches (which would be simple to figure
out what breaks), or omit [3/4] and let get_dtype() yield DT_LNK.

It seems that it comes from a7e66ae3 ([PATCH] Make do_each_ref()
follow symlinks., 2005-08-16), and just like I commented on there in
its log message back then, I still doubt that following a symbolic
link is a great idea here in this codepath.

But optimization without behaviour change is a good way to ensure
that optimization does not introduce new bugs, and because keeping
the historical behaviour like the patches [3/4] and this patch does
is more work (meaning: if it proves unnecessary to dereference
symbolic links, we can remove code instead of having to write new
code to support the new behaviour), let's take the series as-is, and
defer it to future developers to further clean-up the semantics.

> Note that this patch also changes the condition for skipping creation of a
> ref entry from "when 'stat' fails" to "when the d_type is anything other
> than DT_REG or DT_DIR". If a dirent's d_type is DT_UNKNOWN (either because
> the platform doesn't support d_type in dirents or some other reason) or
> DT_LNK, 'get_dtype' will try to derive the underlying type with 'stat'. If
> the 'stat' fails, the d_type will remain 'DT_UNKNOWN' and dirent will be
> skipped. However, it will also be skipped if it is any other valid d_type
> (e.g. DT_FIFO for named pipes, DT_LNK for a nested symlink). Git does not
> handle these properly anyway, so we can safely constrain accepted types to
> directories and regular files.

Sounds good.

> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
>  refs/files-backend.c | 14 +++++---------
>  1 file changed, 5 insertions(+), 9 deletions(-)

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 7, 2023

This branch is now known as vd/loose-ref-iteration-optimization.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 7, 2023

This patch series was integrated into seen via git@39859ad.

@gitgitgadget gitgitgadget bot added the seen label Oct 7, 2023
@gitgitgadget
Copy link

gitgitgadget bot commented Oct 7, 2023

There was a status update in the "New Topics" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.
source: <pull.1594.git.1696615769.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 9, 2023

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
> While investigating ref iteration performance in builtins like
> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
> 
> The first patch tweaks the logic around prefix matching in
> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> actually match a given prefix. The unnecessary iteration doesn't seem to be
> causing any bugs in the ref iteration commands that I've tested, but it
> doesn't hurt to be more precise (and it helps with some other patches I'm
> working on ;) ).
> 
> The next three patches update how 'loose_fill_ref_dir' determines the type
> of ref cache entry to create (directory or regular). On platforms that
> include d_type information in 'struct dirent' (as far as I can tell, all
> except NonStop & certain versions of Cygwin), this allows us to skip calling
> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> by about 20%.

I've done a small set of benchmarks with my usual test repositories,
which is linux.git with a bunch of references added. The repository
comes in four sizes:

- small: 50k references
- medium: 500k references
- high:  1.1m references
- huge: 12m references

Unfortunately, I couldn't really reproduce the performance improvements.
In fact, the new version runs consistently a tiny bit slower than the
old version:

    # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).

    Benchmark 1: git for-each-ref (revision=old,refcount=small)
      Time (mean ± σ):     135.5 ms ±   1.2 ms    [User: 76.4 ms, System: 59.0 ms]
      Range (min … max):   134.8 ms … 136.9 ms    3 runs

    Benchmark 2: git for-each-ref (revision=old,refcount=medium)
      Time (mean ± σ):     822.7 ms ±   2.2 ms    [User: 697.4 ms, System: 125.1 ms]
      Range (min … max):   821.1 ms … 825.2 ms    3 runs

    Benchmark 3: git for-each-ref (revision=old,refcount=high)
      Time (mean ± σ):      1.960 s ±  0.015 s    [User: 1.702 s, System: 0.257 s]
      Range (min … max):    1.944 s …  1.973 s    3 runs

    # New version, which is your tip.

    Benchmark 4: git for-each-ref (revision=old,refcount=huge)
      Time (mean ± σ):     16.815 s ±  0.054 s    [User: 15.091 s, System: 1.722 s]
      Range (min … max):   16.760 s … 16.869 s    3 runs

    Benchmark 5: git for-each-ref (revision=new,refcount=small)
      Time (mean ± σ):     136.0 ms ±   0.2 ms    [User: 78.8 ms, System: 57.1 ms]
      Range (min … max):   135.8 ms … 136.2 ms    3 runs

    Benchmark 6: git for-each-ref (revision=new,refcount=medium)
      Time (mean ± σ):     830.4 ms ±  21.2 ms    [User: 691.3 ms, System: 138.7 ms]
      Range (min … max):   814.2 ms … 854.5 ms    3 runs

    Benchmark 7: git for-each-ref (revision=new,refcount=high)
      Time (mean ± σ):      1.966 s ±  0.013 s    [User: 1.717 s, System: 0.249 s]
      Range (min … max):    1.952 s …  1.978 s    3 runs

    Benchmark 8: git for-each-ref (revision=new,refcount=huge)
      Time (mean ± σ):     16.945 s ±  0.037 s    [User: 15.182 s, System: 1.760 s]
      Range (min … max):   16.910 s … 16.983 s    3 runs

    Summary
      git for-each-ref (revision=old,refcount=small) ran
        1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
        6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
        6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
       14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
       14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
      124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
      125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)

The performance regression isn't all that concerning, but it makes me
wonder why I see things becoming slower rather than faster. My guess is
that this is because all my test repositories are well-packed and don't
have a lot of loose references. But I just wanted to confirm how you
benchmarked your change and what the underlying shape of your test repo
was.

Patrick

> Thanks!
> 
>  * Victoria
> 
> Victoria Dye (4):
>   ref-cache.c: fix prefix matching in ref iteration
>   dir.[ch]: expose 'get_dtype'
>   dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
>   files-backend.c: avoid stat in 'loose_fill_ref_dir'
> 
>  diagnose.c           | 42 +++---------------------------------------
>  dir.c                | 33 +++++++++++++++++++++++++++++++++
>  dir.h                | 16 ++++++++++++++++
>  refs/files-backend.c | 14 +++++---------
>  refs/ref-cache.c     |  3 ++-
>  5 files changed, 59 insertions(+), 49 deletions(-)
> 
> 
> base-commit: 3a06386e314565108ad56a9bdb8f7b80ac52fb69
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1594%2Fvdye%2Fvdye%2Fref-iteration-cleanup-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1594/vdye/vdye/ref-iteration-cleanup-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1594
> -- 
> gitgitgadget

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 9, 2023

User Patrick Steinhardt <ps@pks.im> has been added to the cc: list.

vdye added 4 commits October 9, 2023 12:13
Update 'cache_ref_iterator_advance' to skip over refs that are not matched
by the given prefix.

Currently, a ref entry is considered "matched" if the entry name is fully
contained within the prefix:

* prefix: "refs/heads/v1"
* entry: "refs/heads/v1.0"

OR if the prefix is fully contained in the entry name:

* prefix: "refs/heads/v1.0"
* entry: "refs/heads/v1"

The first case is always correct, but the second is only correct if the ref
cache entry is a directory, for example:

* prefix: "refs/heads/example"
* entry: "refs/heads/"

Modify the logic in 'cache_ref_iterator_advance' to reflect these
expectations:

1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
   ref cache entry do not overlap at all. Skip this entry.
2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
   inside this entry if it is a directory. Skip if the entry is not a
   directory, otherwise iterate over it.
3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
   that the cache entry (directory or not) is fully contained by or equal to
   the prefix. Iterate over this entry.

Note that condition 2 relies on the names of directory entries having the
appropriate trailing slash. The existing function documentation of
'create_dir_entry' explicitly calls out the trailing slash requirement, so
this is a safe assumption to make.

This bug generally doesn't have any user-facing impact, since it requires:

1. using a non-empty prefix without a trailing slash in an iteration like
   'for_each_fullref_in',
2. the callback to said iteration not reapplying the original filter (as
   for-each-ref does) to ensure unmatched refs are skipped, and
3. the repository having one or more refs that match part of, but not all
   of, the prefix.

However, there are some niche scenarios that meet those criteria
(specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add
tests covering those cases to demonstrate the fix in this patch.

Signed-off-by: Victoria Dye <vdye@github.com>
Move 'get_dtype()' from 'diagnose.c' to 'dir.c' and add its declaration to
'dir.h' so that it is accessible to callers in other files. The function and
its documentation are moved verbatim except for a small addition to the
description clarifying what the 'path' arg represents.

Signed-off-by: Victoria Dye <vdye@github.com>
Add a 'follow_symlink' boolean option to 'get_type()'. If 'follow_symlink'
is enabled, DT_LNK (in addition to DT_UNKNOWN) d_types triggers the
stat-based d_type resolution, using 'stat' instead of 'lstat' to get the
type of the followed symlink. Note that symlinks are not followed
recursively, so a symlink pointing to another symlink will still resolve to
DT_LNK.

Update callers in 'diagnose.c' to specify 'follow_symlink = 0' to preserve
current behavior.

Signed-off-by: Victoria Dye <vdye@github.com>
Modify the 'readdir' loop in 'loose_fill_ref_dir' to, rather than 'stat' a
file to determine whether it is a directory or not, use 'get_dtype'.

Currently, the loop uses 'stat' to determine whether each dirent is a
directory itself or not in order to construct the appropriate ref cache
entry. If 'stat' fails (returning a negative value), the dirent is silently
skipped; otherwise, 'S_ISDIR(st.st_mode)' is used to check whether the entry
is a directory.

On platforms that include an entry's d_type in in the 'dirent' struct, this
extra 'stat' check is redundant. We can use the 'get_dtype' method to
extract this information on platforms that support it (i.e. where
NO_D_TYPE_IN_DIRENT is unset), and derive it with 'stat' on platforms that
don't. Because 'stat' is an expensive call, this confers a
modest-but-noticeable performance improvement when iterating over large
numbers of refs (approximately 20% speedup in 'git for-each-ref' in a 30k
ref repo).

Unlike other existing usage of 'get_dtype', the 'follow_symlinks' arg is set
to 1 to replicate the existing handling of symlink dirents. This
unfortunately requires calling 'stat' on the associated entry regardless of
platform, but symlinks in the loose ref store are highly unlikely since
they'd need to be created manually by a user.

Note that this patch also changes the condition for skipping creation of a
ref entry from "when 'stat' fails" to "when the d_type is anything other
than DT_REG or DT_DIR". If a dirent's d_type is DT_UNKNOWN (either because
the platform doesn't support d_type in dirents or some other reason) or
DT_LNK, 'get_dtype' will try to derive the underlying type with 'stat'. If
the 'stat' fails, the d_type will remain 'DT_UNKNOWN' and dirent will be
skipped. However, it will also be skipped if it is any other valid d_type
(e.g. DT_FIFO for named pipes, DT_LNK for a nested symlink). Git does not
handle these properly anyway, so we can safely constrain accepted types to
directories and regular files.

Signed-off-by: Victoria Dye <vdye@github.com>
@gitgitgadget
Copy link

gitgitgadget bot commented Oct 9, 2023

On the Git mailing list, Victoria Dye wrote (reply to this):

Patrick Steinhardt wrote:
> On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
>> While investigating ref iteration performance in builtins like
>> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
>>
>> The first patch tweaks the logic around prefix matching in
>> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
>> actually match a given prefix. The unnecessary iteration doesn't seem to be
>> causing any bugs in the ref iteration commands that I've tested, but it
>> doesn't hurt to be more precise (and it helps with some other patches I'm
>> working on ;) ).
>>
>> The next three patches update how 'loose_fill_ref_dir' determines the type
>> of ref cache entry to create (directory or regular). On platforms that
>> include d_type information in 'struct dirent' (as far as I can tell, all
>> except NonStop & certain versions of Cygwin), this allows us to skip calling
>> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
>> by about 20%.
> 
> I've done a small set of benchmarks with my usual test repositories,
> which is linux.git with a bunch of references added. The repository
> comes in four sizes:
> 
> - small: 50k references
> - medium: 500k references
> - high:  1.1m references
> - huge: 12m references
> 
> Unfortunately, I couldn't really reproduce the performance improvements.
> In fact, the new version runs consistently a tiny bit slower than the
> old version:
> 
>     # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).
> 
>     Benchmark 1: git for-each-ref (revision=old,refcount=small)
>       Time (mean ± σ):     135.5 ms ±   1.2 ms    [User: 76.4 ms, System: 59.0 ms]
>       Range (min … max):   134.8 ms … 136.9 ms    3 runs
> 
>     Benchmark 2: git for-each-ref (revision=old,refcount=medium)
>       Time (mean ± σ):     822.7 ms ±   2.2 ms    [User: 697.4 ms, System: 125.1 ms]
>       Range (min … max):   821.1 ms … 825.2 ms    3 runs
> 
>     Benchmark 3: git for-each-ref (revision=old,refcount=high)
>       Time (mean ± σ):      1.960 s ±  0.015 s    [User: 1.702 s, System: 0.257 s]
>       Range (min … max):    1.944 s …  1.973 s    3 runs
> 
>     # New version, which is your tip.
> 
>     Benchmark 4: git for-each-ref (revision=old,refcount=huge)
>       Time (mean ± σ):     16.815 s ±  0.054 s    [User: 15.091 s, System: 1.722 s]
>       Range (min … max):   16.760 s … 16.869 s    3 runs
> 
>     Benchmark 5: git for-each-ref (revision=new,refcount=small)
>       Time (mean ± σ):     136.0 ms ±   0.2 ms    [User: 78.8 ms, System: 57.1 ms]
>       Range (min … max):   135.8 ms … 136.2 ms    3 runs
> 
>     Benchmark 6: git for-each-ref (revision=new,refcount=medium)
>       Time (mean ± σ):     830.4 ms ±  21.2 ms    [User: 691.3 ms, System: 138.7 ms]
>       Range (min … max):   814.2 ms … 854.5 ms    3 runs
> 
>     Benchmark 7: git for-each-ref (revision=new,refcount=high)
>       Time (mean ± σ):      1.966 s ±  0.013 s    [User: 1.717 s, System: 0.249 s]
>       Range (min … max):    1.952 s …  1.978 s    3 runs
> 
>     Benchmark 8: git for-each-ref (revision=new,refcount=huge)
>       Time (mean ± σ):     16.945 s ±  0.037 s    [User: 15.182 s, System: 1.760 s]
>       Range (min … max):   16.910 s … 16.983 s    3 runs
> 
>     Summary
>       git for-each-ref (revision=old,refcount=small) ran
>         1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
>         6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
>         6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
>        14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
>        14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
>       124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
>       125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)
> 
> The performance regression isn't all that concerning, but it makes me
> wonder why I see things becoming slower rather than faster. My guess is
> that this is because all my test repositories are well-packed and don't
> have a lot of loose references. But I just wanted to confirm how you
> benchmarked your change and what the underlying shape of your test repo
> was.

I ran my benchmark on my (Intel) Mac with a test repository (single commit,
one file) containing:

- 10k refs/heads/ references
- 10k refs/tags/ references
- 10k refs/special/ references 

All refs in the repository are loose. My Mac has historically been somewhat
slow and inconsistent when it comes to perf testing, though, so I re-ran the
benchmark a bit more formally on an Ubuntu VM (3 warmup iterations followed
by at least 10 iterations per test):

---

Benchmark 1: git for-each-ref (revision=old,refcount=3k)
  Time (mean ± σ):      40.6 ms ±   3.9 ms    [User: 13.2 ms, System: 27.1 ms]
  Range (min … max):    37.2 ms …  59.1 ms    76 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: git for-each-ref (revision=new,refcount=3k)
  Time (mean ± σ):      38.7 ms ±   4.4 ms    [User: 13.8 ms, System: 24.5 ms]
  Range (min … max):    35.1 ms …  57.2 ms    71 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 3: git for-each-ref (revision=old,refcount=30k)
  Time (mean ± σ):     419.4 ms ±  43.9 ms    [User: 136.4 ms, System: 274.1 ms]
  Range (min … max):   385.1 ms … 528.7 ms    10 runs
 
Benchmark 4: git for-each-ref (revision=new,refcount=30k)
  Time (mean ± σ):     390.4 ms ±  27.2 ms    [User: 133.1 ms, System: 251.6 ms]
  Range (min … max):   360.3 ms … 447.6 ms    10 runs
 
Benchmark 5: git for-each-ref (revision=old,refcount=300k)
  Time (mean ± σ):      4.171 s ±  0.052 s    [User: 1.400 s, System: 2.715 s]
  Range (min … max):    4.118 s …  4.283 s    10 runs
 
Benchmark 6: git for-each-ref (revision=new,refcount=300k)
  Time (mean ± σ):      3.939 s ±  0.054 s    [User: 1.403 s, System: 2.466 s]
  Range (min … max):    3.858 s …  4.026 s    10 runs
 
Summary
  'git for-each-ref (revision=new,refcount=3k)' ran
    1.05 ± 0.16 times faster than 'git for-each-ref (revision=old,refcount=3k)'
   10.08 ± 1.34 times faster than 'git for-each-ref (revision=new,refcount=30k)'
   10.83 ± 1.67 times faster than 'git for-each-ref (revision=old,refcount=30k)'
  101.68 ± 11.63 times faster than 'git for-each-ref (revision=new,refcount=300k)'
  107.67 ± 12.30 times faster than 'git for-each-ref (revision=old,refcount=300k)'

---

So it's not the 20% speedup I saw on my local test repo (it's more like
5-8%), but there does appear to be a consistent improvement. As for your
results, the changes in this series shouldn't affect packed ref operations,
and the difference between old & new doesn't seem to indicate a regression. 

@vdye
Copy link
Author

vdye commented Oct 9, 2023

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 9, 2023

Submitted as pull.1594.v2.git.1696888736.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1594/vdye/vdye/ref-iteration-cleanup-v2

To fetch this version to local tag pr-1594/vdye/vdye/ref-iteration-cleanup-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1594/vdye/vdye/ref-iteration-cleanup-v2

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 9, 2023

This patch series was integrated into seen via git@a91bb38.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 9, 2023

This patch series was integrated into seen via git@5c93f1f.

@@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Oct 09, 2023 at 09:58:53PM +0000, Victoria Dye via GitGitGadget wrote:
> From: Victoria Dye <vdye@github.com>
> 
> Update 'cache_ref_iterator_advance' to skip over refs that are not matched
> by the given prefix.
> 
> Currently, a ref entry is considered "matched" if the entry name is fully
> contained within the prefix:
> 
> * prefix: "refs/heads/v1"
> * entry: "refs/heads/v1.0"
> 
> OR if the prefix is fully contained in the entry name:
> 
> * prefix: "refs/heads/v1.0"
> * entry: "refs/heads/v1"
> 
> The first case is always correct, but the second is only correct if the ref
> cache entry is a directory, for example:
> 
> * prefix: "refs/heads/example"
> * entry: "refs/heads/"
> 
> Modify the logic in 'cache_ref_iterator_advance' to reflect these
> expectations:
> 
> 1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
>    ref cache entry do not overlap at all. Skip this entry.
> 2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
>    inside this entry if it is a directory. Skip if the entry is not a
>    directory, otherwise iterate over it.
> 3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
>    that the cache entry (directory or not) is fully contained by or equal to
>    the prefix. Iterate over this entry.
> 
> Note that condition 2 relies on the names of directory entries having the
> appropriate trailing slash. The existing function documentation of
> 'create_dir_entry' explicitly calls out the trailing slash requirement, so
> this is a safe assumption to make.
> 
> This bug generally doesn't have any user-facing impact, since it requires:
> 
> 1. using a non-empty prefix without a trailing slash in an iteration like
>    'for_each_fullref_in',
> 2. the callback to said iteration not reapplying the original filter (as
>    for-each-ref does) to ensure unmatched refs are skipped, and
> 3. the repository having one or more refs that match part of, but not all
>    of, the prefix.
> 
> However, there are some niche scenarios that meet those criteria
> (specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add
> tests covering those cases to demonstrate the fix in this patch.
> 
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
>  refs/ref-cache.c              |  3 ++-
>  t/t1500-rev-parse.sh          | 23 +++++++++++++++++++++++
>  t/t4205-log-pretty-formats.sh | 30 ++++++++++++++++++++++++++++++
>  3 files changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
> index 2294c4564fb..6e3b725245c 100644
> --- a/refs/ref-cache.c
> +++ b/refs/ref-cache.c
> @@ -412,7 +412,8 @@ static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
>  
>  		if (level->prefix_state == PREFIX_WITHIN_DIR) {
>  			entry_prefix_state = overlaps_prefix(entry->name, iter->prefix);
> -			if (entry_prefix_state == PREFIX_EXCLUDES_DIR)
> +			if (entry_prefix_state == PREFIX_EXCLUDES_DIR ||
> +			    (entry_prefix_state == PREFIX_WITHIN_DIR && !(entry->flag & REF_DIR)))
>  				continue;
>  		} else {
>  			entry_prefix_state = level->prefix_state;
> diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
> index 37ee5091b5c..3f9e7f62e45 100755
> --- a/t/t1500-rev-parse.sh
> +++ b/t/t1500-rev-parse.sh
> @@ -264,4 +264,27 @@ test_expect_success 'rev-parse --since= unsqueezed ordering' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'rev-parse --bisect includes bad, excludes good' '
> +	test_commit_bulk 6 &&
> +
> +	git update-ref refs/bisect/bad-1 HEAD~1 &&
> +	git update-ref refs/bisect/b HEAD~2 &&
> +	git update-ref refs/bisect/bad-3 HEAD~3 &&
> +	git update-ref refs/bisect/good-3 HEAD~3 &&
> +	git update-ref refs/bisect/bad-4 HEAD~4 &&
> +	git update-ref refs/bisect/go HEAD~4 &&
> +
> +	# Note: refs/bisect/b and refs/bisect/go should be ignored because they
> +	# do not match the refs/bisect/bad or refs/bisect/good prefixes.
> +	cat >expect <<-EOF &&
> +	refs/bisect/bad-1
> +	refs/bisect/bad-3
> +	refs/bisect/bad-4
> +	^refs/bisect/good-3
> +	EOF
> +
> +	git rev-parse --symbolic-full-name --bisect >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_done
> diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
> index 16626e4fe96..62c7bfed5d7 100755
> --- a/t/t4205-log-pretty-formats.sh
> +++ b/t/t4205-log-pretty-formats.sh
> @@ -956,6 +956,36 @@ test_expect_success '%S in git log --format works with other placeholders (part
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'setup more commits for %S with --bisect' '
> +	test_commit four &&
> +	test_commit five &&
> +
> +	head1=$(git rev-parse --verify HEAD~0) &&
> +	head2=$(git rev-parse --verify HEAD~1) &&
> +	head3=$(git rev-parse --verify HEAD~2) &&
> +	head4=$(git rev-parse --verify HEAD~3)
> +'
> +
> +test_expect_success '%S with --bisect labels commits with refs/bisect/bad ref' '
> +	git update-ref refs/bisect/bad-$head1 $head1 &&
> +	git update-ref refs/bisect/go $head1 &&
> +	git update-ref refs/bisect/bad-$head2 $head2 &&
> +	git update-ref refs/bisect/b $head3 &&
> +	git update-ref refs/bisect/bad-$head4 $head4 &&
> +	git update-ref refs/bisect/good-$head4 $head4 &&
> +
> +	# We expect to see the range of commits betwee refs/bisect/good-$head4

Nit: s/betwee/between. Probably not worth rerolling this series only
because of this typo though.

Patrick

> +	# and refs/bisect/bad-$head1. The "source" ref is the nearest bisect ref
> +	# from which the commit is reachable.
> +	cat >expect <<-EOF &&
> +	$head1 refs/bisect/bad-$head1
> +	$head2 refs/bisect/bad-$head2
> +	$head3 refs/bisect/bad-$head2
> +	EOF
> +	git log --bisect --format="%H %S" >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'log --pretty=reference' '
>  	git log --pretty="tformat:%h (%s, %as)" >expect &&
>  	git log --pretty=reference >actual &&
> -- 
> gitgitgadget
> 

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 10, 2023

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Oct 09, 2023 at 02:49:14PM -0700, Victoria Dye wrote:
> Patrick Steinhardt wrote:
> > On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget wrote:
> >> While investigating ref iteration performance in builtins like
> >> 'for-each-ref' and 'show-ref', I found two small improvement opportunities.
> >>
> >> The first patch tweaks the logic around prefix matching in
> >> 'cache_ref_iterator_advance' so that we correctly skip refs that do not
> >> actually match a given prefix. The unnecessary iteration doesn't seem to be
> >> causing any bugs in the ref iteration commands that I've tested, but it
> >> doesn't hurt to be more precise (and it helps with some other patches I'm
> >> working on ;) ).
> >>
> >> The next three patches update how 'loose_fill_ref_dir' determines the type
> >> of ref cache entry to create (directory or regular). On platforms that
> >> include d_type information in 'struct dirent' (as far as I can tell, all
> >> except NonStop & certain versions of Cygwin), this allows us to skip calling
> >> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-ref'
> >> by about 20%.
> > 
> > I've done a small set of benchmarks with my usual test repositories,
> > which is linux.git with a bunch of references added. The repository
> > comes in four sizes:
> > 
> > - small: 50k references
> > - medium: 500k references
> > - high:  1.1m references
> > - huge: 12m references
> > 
> > Unfortunately, I couldn't really reproduce the performance improvements.
> > In fact, the new version runs consistently a tiny bit slower than the
> > old version:
> > 
> >     # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04).
> > 
> >     Benchmark 1: git for-each-ref (revision=old,refcount=small)
> >       Time (mean ± σ):     135.5 ms ±   1.2 ms    [User: 76.4 ms, System: 59.0 ms]
> >       Range (min … max):   134.8 ms … 136.9 ms    3 runs
> > 
> >     Benchmark 2: git for-each-ref (revision=old,refcount=medium)
> >       Time (mean ± σ):     822.7 ms ±   2.2 ms    [User: 697.4 ms, System: 125.1 ms]
> >       Range (min … max):   821.1 ms … 825.2 ms    3 runs
> > 
> >     Benchmark 3: git for-each-ref (revision=old,refcount=high)
> >       Time (mean ± σ):      1.960 s ±  0.015 s    [User: 1.702 s, System: 0.257 s]
> >       Range (min … max):    1.944 s …  1.973 s    3 runs
> > 
> >     # New version, which is your tip.
> > 
> >     Benchmark 4: git for-each-ref (revision=old,refcount=huge)
> >       Time (mean ± σ):     16.815 s ±  0.054 s    [User: 15.091 s, System: 1.722 s]
> >       Range (min … max):   16.760 s … 16.869 s    3 runs
> > 
> >     Benchmark 5: git for-each-ref (revision=new,refcount=small)
> >       Time (mean ± σ):     136.0 ms ±   0.2 ms    [User: 78.8 ms, System: 57.1 ms]
> >       Range (min … max):   135.8 ms … 136.2 ms    3 runs
> > 
> >     Benchmark 6: git for-each-ref (revision=new,refcount=medium)
> >       Time (mean ± σ):     830.4 ms ±  21.2 ms    [User: 691.3 ms, System: 138.7 ms]
> >       Range (min … max):   814.2 ms … 854.5 ms    3 runs
> > 
> >     Benchmark 7: git for-each-ref (revision=new,refcount=high)
> >       Time (mean ± σ):      1.966 s ±  0.013 s    [User: 1.717 s, System: 0.249 s]
> >       Range (min … max):    1.952 s …  1.978 s    3 runs
> > 
> >     Benchmark 8: git for-each-ref (revision=new,refcount=huge)
> >       Time (mean ± σ):     16.945 s ±  0.037 s    [User: 15.182 s, System: 1.760 s]
> >       Range (min … max):   16.910 s … 16.983 s    3 runs
> > 
> >     Summary
> >       git for-each-ref (revision=old,refcount=small) ran
> >         1.00 ± 0.01 times faster than git for-each-ref (revision=new,refcount=small)
> >         6.07 ± 0.06 times faster than git for-each-ref (revision=old,refcount=medium)
> >         6.13 ± 0.17 times faster than git for-each-ref (revision=new,refcount=medium)
> >        14.46 ± 0.17 times faster than git for-each-ref (revision=old,refcount=high)
> >        14.51 ± 0.16 times faster than git for-each-ref (revision=new,refcount=high)
> >       124.09 ± 1.15 times faster than git for-each-ref (revision=old,refcount=huge)
> >       125.05 ± 1.12 times faster than git for-each-ref (revision=new,refcount=huge)
> > 
> > The performance regression isn't all that concerning, but it makes me
> > wonder why I see things becoming slower rather than faster. My guess is
> > that this is because all my test repositories are well-packed and don't
> > have a lot of loose references. But I just wanted to confirm how you
> > benchmarked your change and what the underlying shape of your test repo
> > was.
> 
> I ran my benchmark on my (Intel) Mac with a test repository (single commit,
> one file) containing:
> 
> - 10k refs/heads/ references
> - 10k refs/tags/ references
> - 10k refs/special/ references 
> 
> All refs in the repository are loose. My Mac has historically been somewhat
> slow and inconsistent when it comes to perf testing, though, so I re-ran the
> benchmark a bit more formally on an Ubuntu VM (3 warmup iterations followed
> by at least 10 iterations per test):
> 
> ---
> 
> Benchmark 1: git for-each-ref (revision=old,refcount=3k)
>   Time (mean ± σ):      40.6 ms ±   3.9 ms    [User: 13.2 ms, System: 27.1 ms]
>   Range (min … max):    37.2 ms …  59.1 ms    76 runs
>  
>   Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
>  
> Benchmark 2: git for-each-ref (revision=new,refcount=3k)
>   Time (mean ± σ):      38.7 ms ±   4.4 ms    [User: 13.8 ms, System: 24.5 ms]
>   Range (min … max):    35.1 ms …  57.2 ms    71 runs
>  
>   Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
>  
> Benchmark 3: git for-each-ref (revision=old,refcount=30k)
>   Time (mean ± σ):     419.4 ms ±  43.9 ms    [User: 136.4 ms, System: 274.1 ms]
>   Range (min … max):   385.1 ms … 528.7 ms    10 runs
>  
> Benchmark 4: git for-each-ref (revision=new,refcount=30k)
>   Time (mean ± σ):     390.4 ms ±  27.2 ms    [User: 133.1 ms, System: 251.6 ms]
>   Range (min … max):   360.3 ms … 447.6 ms    10 runs
>  
> Benchmark 5: git for-each-ref (revision=old,refcount=300k)
>   Time (mean ± σ):      4.171 s ±  0.052 s    [User: 1.400 s, System: 2.715 s]
>   Range (min … max):    4.118 s …  4.283 s    10 runs
>  
> Benchmark 6: git for-each-ref (revision=new,refcount=300k)
>   Time (mean ± σ):      3.939 s ±  0.054 s    [User: 1.403 s, System: 2.466 s]
>   Range (min … max):    3.858 s …  4.026 s    10 runs
>  
> Summary
>   'git for-each-ref (revision=new,refcount=3k)' ran
>     1.05 ± 0.16 times faster than 'git for-each-ref (revision=old,refcount=3k)'
>    10.08 ± 1.34 times faster than 'git for-each-ref (revision=new,refcount=30k)'
>    10.83 ± 1.67 times faster than 'git for-each-ref (revision=old,refcount=30k)'
>   101.68 ± 11.63 times faster than 'git for-each-ref (revision=new,refcount=300k)'
>   107.67 ± 12.30 times faster than 'git for-each-ref (revision=old,refcount=300k)'
> 
> ---
> 
> So it's not the 20% speedup I saw on my local test repo (it's more like
> 5-8%), but there does appear to be a consistent improvement.

Thanks a bunch for re-doing the benchmark with a documented setup.

> As for your results, the changes in this series shouldn't affect
> packed ref operations, and the difference between old & new doesn't
> seem to indicate a regression. 

Yeah, I've also been surprised to see the performance regression for
packed-refs files. The regression is persistent and reproducable on my
machine though, so even though I tend to agree that the patches
shouldn't negatively impact packed-refs performance they somehow do. It
could just as well be something like different optimization choices by
the compler due to the added patches, or hitting different cache lines.
I dunno.

Anyway, I agree with your assessment. The regression I see is less than
1% for packed-refs, while the improvements for loose refs are a lot more
significant and conceptually make a lot of sense. So I didn't intend to
say that we shouldn't do these optimizations because of the miniscule
peformance regression with packed-refs.

Or in other words: this series looks good to me.

Thanks!
Patrick

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 10, 2023

This patch series was integrated into seen via git@62a5770.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 10, 2023

This patch series was integrated into seen via git@29a5bc1.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 11, 2023

There was a status update in the "Cooking" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

Will merge to 'next'?
source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 11, 2023

This patch series was integrated into seen via git@095f2de.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 12, 2023

This patch series was integrated into seen via git@d9924f9.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 12, 2023

This patch series was integrated into next via git@99e2f83.

@gitgitgadget gitgitgadget bot added the next label Oct 12, 2023
@gitgitgadget
Copy link

gitgitgadget bot commented Oct 12, 2023

There was a status update in the "Cooking" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

Will merge to 'master'.
source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 13, 2023

This patch series was integrated into seen via git@5bbb845.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 14, 2023

There was a status update in the "Cooking" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

Will merge to 'master'.
source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 16, 2023

This patch series was integrated into seen via git@d6258bf.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 17, 2023

This patch series was integrated into seen via git@6f0940a.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 18, 2023

This patch series was integrated into seen via git@c064020.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 18, 2023

There was a status update in the "Cooking" section about the branch vd/loose-ref-iteration-optimization on the Git mailing list:

The code to iterate over loose references have been optimized to
reduce the number of lstat() system calls.

Will merge to 'master'.
source: <pull.1594.v2.git.1696888736.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 20, 2023

This patch series was integrated into seen via git@8622bdf.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 21, 2023

This patch series was integrated into seen via git@6b1e225.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 21, 2023

This patch series was integrated into master via git@6b1e225.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 21, 2023

This patch series was integrated into next via git@6b1e225.

@gitgitgadget gitgitgadget bot added the master label Oct 21, 2023
@gitgitgadget gitgitgadget bot closed this Oct 21, 2023
@gitgitgadget
Copy link

gitgitgadget bot commented Oct 21, 2023

Closed via 6b1e225.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant