pvanbuijtene

@pvanbuijtene pvanbuijtene commented Sep 30, 2025

Description

Reduce the number of allocations of Commit instances.

Related Issue

Fix #4680, Fix #4409

Motivation and Context

After some investigation, it turned out that many commits are re-created while calculating the version.
Implementing a cache for the commits reduced the overall memory consumption from ~10GB to ~400MB.
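
As a rough illustration of the idea (a minimal sketch, not the code from this PR), the snippet below caches one wrapper object per commit SHA so that repeated lookups reuse the same instance. The names `cachedCommits`, `GetOrCreate` and `allocations`, the `byte[]` placeholder for the wrapper, and the SHA strings are all hypothetical.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a commit cache: one wrapper object per commit SHA.
var cachedCommits = new ConcurrentDictionary<string, byte[]>();
var allocations = 0;

// Returns the cached wrapper for a SHA and allocates only on the first request.
byte[] GetOrCreate(string sha) =>
    cachedCommits.GetOrAdd(sha, _ =>
    {
        allocations++;           // count how many wrappers are actually built
        return new byte[1024];   // placeholder for an expensive wrapper object
    });

var first = GetOrCreate("aaa111");
var second = GetOrCreate("aaa111"); // cache hit: no new allocation
var other = GetOrCreate("bbb222");  // different SHA: one more allocation

Console.WriteLine(ReferenceEquals(first, second)); // True
Console.WriteLine(allocations);                    // 2
```

Keying by SHA is safe in this sketch because a commit's SHA identifies immutable content, so a cached entry never needs to be invalidated.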

How Has This Been Tested?

The solution was tested by running the project with the changes against the same repository, which resulted in lower memory consumption, as shown below.

The memory usage problem was reported internally on Linux environments and reproduced on Windows.
My machine has 64GB of RAM and 22 logical processors.

Screenshots (if appropriate):

The CPU and Memory consumption before the change, peaking at ~10GB.
before
The CPU and Memory consumption after the change, peaking at ~400MB.
after

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@arturcic arturcic self-requested a review September 30, 2025 15:40
Member

@asbjornu asbjornu left a comment

This looks very promising! 👍🏼

@arturcic
Member

arturcic commented Oct 2, 2025

@pvanbuijtene I'm quite happy with the direction you're heading with these changes; I will have a look later as well. Thanks for sharing the anonymous repo.

@pvanbuijtene pvanbuijtene force-pushed the cache-commits branch 2 times, most recently from 60a70cd to c295a82 Compare October 3, 2025 06:01
@pvanbuijtene
Author

As you might have noticed, there are some tests failing.

I think one of the reasons for this is that the cache's life span is longer than you would normally expect.

From a CLI perspective you would request the version using GitVersion and you're done.

Within the tests there are complete scenarios being checked and I'm not sure if and how the cache should behave.
Maybe it should be cleared with certain steps, or it should be disabled 🤔

As an example this one, it fails with caching of branches and succeeds without branch caching: https://github.com/GitTools/GitVersion/blob/main/src/GitVersion.Core.Tests/VersionCalculation/NextVersionCalculatorTests.cs#L124

    [Test]
    public void MergeFeatureIntoMainline()
    {
        var configuration = TrunkBasedConfigurationBuilder.New.Build();

        using var fixture = new EmptyRepositoryFixture();
        fixture.MakeACommit();
        fixture.ApplyTag("1.0.0");
        fixture.AssertFullSemver("1.0.0", configuration);

        fixture.BranchTo("feature/foo");
        fixture.MakeACommit();
        fixture.AssertFullSemver("1.1.0-foo.1", configuration);
        fixture.ApplyTag("1.1.0-foo.1");

        fixture.Checkout(MainBranch);
        fixture.MergeNoFF("feature/foo");
        fixture.AssertFullSemver("1.1.0", configuration);
    }

Member

@asbjornu asbjornu left a comment

> Within the tests there are complete scenarios being checked and I'm not sure if and how the cache should behave. Maybe it should be cleared with certain steps, or it should be disabled 🤔
>
> As an example this one, it fails with caching of branches and succeeds without branch caching […]

If the cache breaks tests, I'm thinking it might be uncovering a problem with the cache.

Since this PR was originally about caching the Commit class, perhaps we should keep it at that and ensure that works with 100% green tests before we move on with a future followup PR where we attempt to cache even more Git objects?

@arturcic
Member

arturcic commented Oct 3, 2025

> > Within the tests there are complete scenarios being checked and I'm not sure if and how the cache should behave. Maybe it should be cleared with certain steps, or it should be disabled 🤔
> > As an example this one, it fails with caching of branches and succeeds without branch caching […]
>
> If the cache breaks tests, I'm thinking it might be uncovering a problem with the cache.
>
> Since this PR was originally about caching the Commit class, perhaps we should keep it at that and ensure that works with 100% green tests before we move on with a future followup PR where we attempt to cache even more Git objects?

I will try to investigate why those tests are failing as well

@pvanbuijtene
Author

> > Within the tests there are complete scenarios being checked and I'm not sure if and how the cache should behave. Maybe it should be cleared with certain steps, or it should be disabled 🤔
> > As an example this one, it fails with caching of branches and succeeds without branch caching […]
>
> If the cache breaks tests, I'm thinking it might be uncovering a problem with the cache.
>
> Since this PR was originally about caching the Commit class, perhaps we should keep it at that and ensure that works with 100% green tests before we move on with a future followup PR where we attempt to cache even more Git objects?

I think I would agree with that; caching branches and tags is, as you might expect, not as straightforward as caching commits.

@arturcic
Member

arturcic commented Oct 7, 2025

@pvanbuijtene I see you managed to fix the CI build. Please rebase onto main and run dotnet format.

@pvanbuijtene
Author

@arturcic it was more a test to see if I could get it to work.

I couldn't get the tests to work with cached References and Remotes so I removed those.

Let me know what you think, and if you want some tests specifically for the GitCache.

@arturcic
Member

arturcic commented Oct 7, 2025

> @arturcic it was more a test to see if I could get it work.
>
> I couldn't get the tests to work with cached References and Remotes so I removed those.
>
> Let me know what you think, and if you want some tests specifically for the GitCache.

I think for now we can skip those 2.

@arturcic
Member

arturcic commented Oct 7, 2025

@pvanbuijtene have you run this version of GitVersion for that repo with the issues? If so can you provide some statistics/graphs with the comparison?

@pvanbuijtene
Author

> @pvanbuijtene have you run this version of GitVersion for that repo with the issues? If so can you provide some statistics/graphs with the comparison?

I didn't, but here are the results.

Main @ 9939c56:
image

This PR:
image

@pvanbuijtene pvanbuijtene marked this pull request as ready for review October 7, 2025 15:01

sonarqubecloud bot commented Oct 7, 2025

@arturcic
Member

arturcic commented Oct 7, 2025

> > @pvanbuijtene have you run this version of GitVersion for that repo with the issues? If so can you provide some statistics/graphs with the comparison?
>
> I didn't, but here are the results.
>
> Main @ 9939c56: image
>
> This PR: image

So there was quite a lot of pressure on the GC.

@arturcic arturcic mentioned this pull request Oct 7, 2025
2 tasks
@arturcic arturcic requested review from asbjornu and arturcic October 7, 2025 15:35
Member

@arturcic arturcic left a comment

Looks good to me, approved, thanks for the time you spent

@arturcic arturcic linked an issue Oct 7, 2025 that may be closed by this pull request
2 tasks
@arturcic arturcic enabled auto-merge October 7, 2025 18:51
@arturcic
Member

arturcic commented Oct 7, 2025

@asbjornu time for a second review?

Member

@asbjornu asbjornu left a comment

This is most excellent! Well done! 👏🏼 I have a few questions and comments that can be ignored if there's good reason to, which is why I'm approving. But I would appreciate it if they were addressed. :)

return new Branch(innerBranch, repoDiff, this);
}

var cacheKey = $"{innerBranch.CanonicalName}|{innerBranch.Tip.Sha}|{innerBranch.RemoteName}";
Member

Sorry for not discovering this earlier. It's not very important, but debugging will be easier if the format of the key more closely matches what one would expect to see in Git and GitHub.

Suggested change
- var cacheKey = $"{innerBranch.CanonicalName}|{innerBranch.Tip.Sha}|{innerBranch.RemoteName}";
+ var cacheKey = $"{innerBranch.RemoteName}/{innerBranch.CanonicalName}@{innerBranch.Tip.Sha}|";


public Tag GetOrCreate(LibGit2Sharp.Tag innerTag, Diff repoDiff)
{
var cacheKey = $"{innerTag.CanonicalName}|{innerTag.Target.Sha}";
Member

Suggested change
- var cacheKey = $"{innerTag.CanonicalName}|{innerTag.Target.Sha}";
+ var cacheKey = $"{innerTag.CanonicalName}@{innerTag.Target.Sha}";

public Tag GetOrCreate(LibGit2Sharp.Tag innerTag, Diff repoDiff)
{
var cacheKey = $"{innerTag.CanonicalName}|{innerTag.Target.Sha}";
return cachedTags.GetOrAdd(cacheKey, new Tag(innerTag, repoDiff, this));
Member

Suggested change
- return cachedTags.GetOrAdd(cacheKey, new Tag(innerTag, repoDiff, this));
+ return cachedTags.GetOrAdd(cacheKey, () => new Tag(innerTag, repoDiff, this));

}

public Commit GetOrCreate(LibGit2Sharp.Commit innerCommit, Diff repoDiff) =>
cachedCommits.GetOrAdd(innerCommit.Sha, new Commit(innerCommit, repoDiff, this));
Member

Suggested change
- cachedCommits.GetOrAdd(innerCommit.Sha, new Commit(innerCommit, repoDiff, this));
+ cachedCommits.GetOrAdd(innerCommit.Sha, () => new Commit(innerCommit, repoDiff, this));

}

var cacheKey = $"{innerBranch.CanonicalName}|{innerBranch.Tip.Sha}|{innerBranch.RemoteName}";
return cachedBranches.GetOrAdd(cacheKey, new Branch(innerBranch, repoDiff, this));
Member

Wouldn't lazy initialization here be even more optimal, avoiding the construction of a Branch instance altogether if it already exists?

Suggested change
- return cachedBranches.GetOrAdd(cacheKey, new Branch(innerBranch, repoDiff, this));
+ return cachedBranches.GetOrAdd(cacheKey, () => new Branch(innerBranch, repoDiff, this));

Member

Agree
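
To make the difference concrete, here is a small sketch (not taken from the PR) comparing the eager and lazy forms on a plain `ConcurrentDictionary`. It assumes the cache behaves like one; the helper `Build`, the counter `constructed` and the key string are hypothetical. Note that the built-in factory overload takes the key as its argument, so the sketch uses `_ => ...`; the `() => ...` form in the suggestions above would presumably need a parameterless-factory overload on the cache type, which is an assumption here.

```csharp
using System;
using System.Collections.Concurrent;

var cache = new ConcurrentDictionary<string, byte[]>();
var constructed = 0;

// Hypothetical stand-in for constructing a wrapper such as Branch or Tag.
byte[] Build()
{
    constructed++;
    return new byte[1024];
}

cache.GetOrAdd("refs/heads/main", Build());      // eager: Build() runs before the lookup
cache.GetOrAdd("refs/heads/main", Build());      // runs again although the key is cached
var eagerCount = constructed;

constructed = 0;
cache.GetOrAdd("refs/heads/main", _ => Build()); // lazy: factory is skipped on a cache hit
var lazyCount = constructed;

Console.WriteLine($"eager: {eagerCount}, lazy: {lazyCount}"); // eager: 2, lazy: 0
```

With the eager form the argument is evaluated on every call, so a new instance is allocated and then discarded on each cache hit. With `ConcurrentDictionary`, the factory may still run more than once under contention, but only one value is stored.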

Member

Why the rename from GitCache to GitRepository? I don't feel like the provided functionality is quite that of a repository, which is often used as an abstraction over a database. 🤔

@arturcic arturcic merged commit 5310b31 into GitTools:main Oct 7, 2025
85 checks passed
Contributor

mergify bot commented Oct 7, 2025

Thank you @pvanbuijtene for your contribution!

@asbjornu
Member

asbjornu commented Oct 7, 2025

Oh, it was set to auto-merge. 😅 Well well. You can work through my comments and see if you think they are worthy of a follow-up PR, @pvanbuijtene.

@arturcic
Member

arturcic commented Oct 8, 2025

> Oh, it was set to auto-merge. 😅 Well well. You can work through my comments and see if you think they are worthy a followup PR, @pvanbuijtene.

Oh, sorry about that. I might create a PR with the suggestions.

@HHobeck
Contributor

HHobeck commented Oct 8, 2025

@pvanbuijtene: Thank you for your contribution and the work you have done.

I have the following remarks/suggestions:

  • The classes Commit, Branch and so on are wrapper classes around the LibGit2Sharp objects. I would say the wrapper classes should not have a reference to GitRepository. Would it be an idea to extract the caching functionality into a separate class and use that instead?
  • Because of the nature of a wrapper class, I would expect that the LibGit2Sharp objects are reused as well and that their hash code is the same (otherwise we would still have a memory problem). You can probably use the hash code instead of an arbitrary string (needs to be tested).

@arturcic
Member

arturcic commented Oct 8, 2025

@asbjornu @HHobeck @pvanbuijtene follow up here #4685

Development

Successfully merging this pull request may close these issues.

[ISSUE]: High memory usage
[ISSUE]: gitversion 6.10.0 (and 6.5.0 as well) hangs for a while when calculating version
4 participants