Skip to content

Large simplification in how diagnostics are cached in the diagnostic service. #77111

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 30 commits into from
Feb 8, 2025

Conversation

CyrusNajmabadi
Copy link
Member

@CyrusNajmabadi CyrusNajmabadi commented Feb 7, 2025

It may be worthwhile to review this PR commit by commit.

The general realization is this:

The diagnostic service does have a caching mechanism to save some computed state that later requests could potentially read from. HOWEVER, the only thing actually saving that computed state was the 'ForceAnalyzeProjectAsync' call, which is what is called for the 'Run Code Analysis' feature. This feature itself doesn't ever use the cached state (it returns the value from the ForceAnalyzeProjectAsync itself). If that feature was never used, the cache was totally ineffective as no other codepaths would save their results to the cache. Note that this is actually fairly fine as most other diagnostic service consumers (like LSP pull diags) cache at their level, avoiding calling back into the service if the project checksum hasn't changed.

Now, once those other calls did go into the diagnostic service, then they might read that cached data, but only if a prior ForceAnalyzeProjectAsync had happened and if their project-dependent-version timestamp exactly matched the versions when that prior ForceAnalyzeProjectAsync call was made.

Note that project-dependent-version-timestamp represents: has this project (or dependents) changed in any way. We already have a better mechanism to represent that concept. The Project Instance itself. Whenever it or deps change, you get a new instance. So the entire caching mechanism can be replaced trivially with a CWT<Project, ... cached data ...>. The two entrypoints that previous might examine the cache then just examine this CWT to see if the result was already stored there. Otherwise they compute as appropriate.

THis MASSIVELY simplifies the code in this layer, leading to huge amounts of old, complex, buggy code that could be outright eliminated. I have doc'ed the PR to help walk through the logic leading to all of this.

@ghost ghost added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Feb 7, 2025
public bool TryGetResult(DiagnosticAnalyzer analyzer, out DiagnosticAnalysisResult result)
=> Result.TryGetValue(analyzer, out result);

public static async Task<ProjectAnalysisData> CreateAsync(Project project, ImmutableArray<StateSet> stateSets, CancellationToken cancellationToken)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this call effectively would never return anything useful. and as such, could be removed. At that point, this type just was holding onto a ImmDic<DiagAnalyzer, DiagResult>. So the entire type could be removed, and only that ImmDict used in its stead.

foreach (var stateSet in stateSets)
{
var state = stateSet.GetOrCreateProjectState(project.Id);
var result = await state.GetAnalysisDataAsync(project, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later on i will show that this call would always return nothign useful here (an 'empty/default' analysis data object). LEading to the caller always having to compute anyways.


if (existingData.Version == version)
return existingData;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would either:

  1. not ever succeed (in the case where ForceProjectAnalysisAsync had not run). In which case, it is fine to remove and just do the computation path below.
  2. succeed, if it had run and no other changes to the project had been made. However, the check for that is now pulled into the caller. So these checks are pointless and this method just bcomes a helper that always computes the results.

var version = await GetDiagnosticVersionAsync(project, cancellationToken).ConfigureAwait(false);

var ideAnalyzers = stateSets.Select(s => s.Analyzer).Where(a => a is ProjectDiagnosticAnalyzer or DocumentDiagnosticAnalyzer).ToImmutableArrayOrEmpty();

if (compilationWithAnalyzers != null && TryReduceAnalyzersToRun(compilationWithAnalyzers, version, existing, out var projectAnalyzersToRun, out var hostAnalyzersToRun))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TryReduceAnalyzersToRun would always return false. i'll show that below.

cancellationToken).ConfigureAwait(false);

var result = await ComputeDiagnosticsAsync(compilationWithReducedAnalyzers, project, ideAnalyzers, cancellationToken).ConfigureAwait(false);
return MergeExistingDiagnostics(version, existing, result);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only call of this method. So it can be removed entirely.

projectAnalyzers = compilationWithAnalyzers.ProjectAnalyzers.WhereAsArray(
static (analyzer, arg) =>
{
if (arg.existing.TryGetValue(analyzer, out var analysisResult) &&
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the crux to realize here is that 'existing' was never present (since it comes from the last cached data when ForceANalyzeProjectAsync was called, and that is not called unless user explicitly runs 'run code analysis').

And, if the user does that and the project is the same, then a higher layer reuses the cached results and we don't get here now. So this would always end up returning 'true' in this lambda (and hte lambda below).WHich would cause the arrays to stay the same length in the lowest if check in this method. leading us to return 'false' for the entire method. So the caller in the method above can delete all this unnecessary code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then a higher layer reuses the cached results and we don't get here now.

this being the new cwt with Project keys?

{
private partial class DiagnosticIncrementalAnalyzer
{
private static class InMemoryStorage
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all this code away. this was only used in the ForceAnalyzeProjectAsync codepath to cache its result. But now we just cache that trivially in a CWT, so this becomes entirely irrelevant.

@CyrusNajmabadi CyrusNajmabadi changed the title WIP: Update caching for diagnostic service. Massively simplify how diagnostics are cached in the diagnostic service. Feb 7, 2025
@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review February 7, 2025 20:39
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner February 7, 2025 20:39
@CyrusNajmabadi CyrusNajmabadi changed the title Massively simplify how diagnostics are cached in the diagnostic service. Large simplification in how diagnostics are cached in the diagnostic service. Feb 7, 2025
CompilationWithAnalyzersPair? compilationWithAnalyzers, Project project, ImmutableArray<StateSet> stateSets, CancellationToken cancellationToken)
{
using (Logger.LogBlock(FunctionId.Diagnostics_ProjectDiagnostic, GetProjectLogMessage, project, stateSets, cancellationToken))
{
try
{
var version = await GetDiagnosticVersionAsync(project, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'versions' ended up being virtually eliminated. GetDiagnosticVersionAsync returns project.GetDependentVersionAsync which is:

The most recent version of the project, its documents and all dependent projects and documents.

So any change to eh project at all (options, info, docs, dependencies) would rev this. no need for this at all now, since we can just use the Project instnace itself in a CWT.

var projectLoadedSuccessfully = await project.HasSuccessfullyLoadedAsync(cancellationToken).ConfigureAwait(false);
if (projectLoadedSuccessfully)
async Task<ImmutableDictionary<DiagnosticAnalyzer, DiagnosticAnalysisResult>> RemoveCompilerSemanticErrorsIfProjectNotLoadedAsync(
ImmutableDictionary<DiagnosticAnalyzer, DiagnosticAnalysisResult> result)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved all these helpers to be local functions to make it much clearer that they should not be used by anything else.

Copy link
Member

@dibarbet dibarbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -18,26 +19,26 @@ internal partial class DiagnosticAnalyzerService
{
private partial class DiagnosticIncrementalAnalyzer
{
private readonly ConditionalWeakTable<Project, StrongBox<(ImmutableArray<StateSet> stateSets, ProjectAnalysisData projectAnalysisData)>> _projectToForceAnalysisData = new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could doc this (e.g. why we're using a weak table, why project vs projectId, etc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs added .

projectAnalyzers = compilationWithAnalyzers.ProjectAnalyzers.WhereAsArray(
static (analyzer, arg) =>
{
if (arg.existing.TryGetValue(analyzer, out var analysisResult) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then a higher layer reuses the cached results and we don't get here now.

this being the new cwt with Project keys?

/// <em>all</em> the analyzers for the project. This data can then be used by <see
/// cref="DiagnosticGetter.ProduceDiagnosticsAsync"/> to speed up subsequent calls through the normal <see
/// cref="IDiagnosticAnalyzerService"/> entry points as long as the project hasn't changed at all.
/// </summary>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore my previous comment then :)

@CyrusNajmabadi
Copy link
Member Author

No regressions reported. So merging in.

@CyrusNajmabadi CyrusNajmabadi merged commit 0fcaef9 into dotnet:main Feb 8, 2025
25 checks passed
@CyrusNajmabadi CyrusNajmabadi deleted the diagNoMemorySave branch February 8, 2025 17:35
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Feb 8, 2025
CyrusNajmabadi added a commit that referenced this pull request Feb 9, 2025
…r a particular project in the diagnostics layer. (#77113)

Followup to #77111
@akhera99 akhera99 modified the milestones: Next, 17.14 P2 Feb 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead VSCode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants