Skip to content

Significant performance improvement#668

Open
gelverpl wants to merge 6 commits intoAcumatica:devfrom
gelverpl:playground
Open

Significant performance improvement#668
gelverpl wants to merge 6 commits intoAcumatica:devfrom
gelverpl:playground

Conversation

@gelverpl
Copy link
Copy Markdown

@gelverpl gelverpl commented May 6, 2026

Proposed changes

  1. NestedInvocationWalker now caches retrieved ISymbol instances using a new SymbolInfoCache.
  2. Fixed a serious issue that was causing a large amount of redundant work: multiple call graph traversals, which in turn caused an excessive number of semantic queries to resolve ISymbol.

Motivation

While profiling the analyzer, I noticed extremely high memory usage, and running solution analysis from the CLI took more than 40 minutes.
dotTrace pointed to Acuminator.Utilities.Roslyn.NestedInvocationWalker.GetSymbol(ExpressionSyntax) as a primary hotspot:
Screenshot 2026-05-06 201213
Screenshot 2026-05-06 201339

Memory profiling confirmed the same method:
Screenshot 2026-05-06 204141

Caching

As a load-reducing measure, I implemented caching. This significantly reduced both execution time and memory consumption.

Yes, caching is often a last-resort approach and ideally shouldn't be here. However, when implemented carefully it can also provide valuable diagnostic data that helps guide further optimization work.

Cache statistics

SymbolInfoCache can optionally collect statistics. Enable it by defining the SYMBOL_INFO_CACHE_STATISTICS constant, then inspect Acuminator.Utilities.Roslyn.SymbolInfoCache.SymbolInfoCacheStatisticsContext.GetStatistics under the debugger.

I intentionally did not add any logging of it because the statistics can become very large. It's much more practical to analyze it interactively (e.g., via LINQ in a debug session).

While digging into the stats, I noticed something weird: the same expression was cached under two different key types: InvocationExpressionSyntax and MemberAccessExpressionSyntax. Debugging led to VisitMemberAccessExpression.

foo.Bar() is both MemberAccessExpressionSyntax and InvocationExpressionSyntax

By filtering out redundant invocation processing inside MemberAccessExpressionSyntax, I added an additional fix. This change, similarly to caching, delivered a substantial improvement in both time and memory.

Benchmarks

I measured the impact of each fix independently, as well as both together.

Total runtime (wall-clock):
Screenshot 2026-05-06 211607

  • Speedup ranges from 1.30x to 7.43x (about 4x on average)

Memory allocation / GC pressure:
Screenshot 2026-05-06 211648

  • Allocations reduced by 1.59x to 8.78x (about 5x on average)
  • Significantly less GC overhead due to drastically fewer allocations

With both fixes applied, traces no longer show GetSymbol as a top hotspot; instead, the cache path (GetOrCreate) appears, and overall memory usage is dramatically improved.
Screenshot 2026-05-06 203630
Screenshot 2026-05-06 204713

Conclusion

Even though caching contributes less after the second fix, I'd still like to keep it (at least for upcoming optimization work) as a way to collect detailed statistics. The goal is to reach a hit rate that is 0 or as close to 0 as possible (i.e., eliminate the underlying redundant symbol queries altogether).

I plan to continue working on this area and believe we can achieve even more significant improvements.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets analyzer performance hotspots in NestedInvocationWalker by reducing redundant Roslyn semantic symbol queries and avoiding duplicate traversal work, with an additional (unrelated) addition of Rider/JetBrains project configuration files.

Changes:

  • Introduced SymbolInfoCache to cache SemanticModel.GetSymbolInfo(...) results per ExpressionSyntax.
  • Updated NestedInvocationWalker.GetSymbol<T> to use the cache and reduced redundant processing in VisitMemberAccessExpression for invocation callees.
  • Added Rider/JetBrains .idea configuration files under src/.idea/.idea.Acuminator/.idea/.

Reviewed changes

Copilot reviewed 2 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/Acuminator/Acuminator.Utilities/Roslyn/SymbolInfoCache.cs New symbol-info cache (and optional statistics collection) used to reduce repeated semantic queries.
src/Acuminator/Acuminator.Utilities/Roslyn/NestedInvocationWalker.cs Uses the cache in GetSymbol<T> and avoids redundant member-access processing for foo.Bar() patterns.
src/.idea/.idea.Acuminator/.idea/vcs.xml Adds Rider VCS mapping configuration.
src/.idea/.idea.Acuminator/.idea/indexLayout.xml Adds Rider index layout configuration.
src/.idea/.idea.Acuminator/.idea/.name / .gitignore Adds Rider project name and IDE-local ignore rules.
Files not reviewed (4)
  • src/.idea/.idea.Acuminator/.idea/.gitignore: Language not supported
  • src/.idea/.idea.Acuminator/.idea/.name: Language not supported
  • src/.idea/.idea.Acuminator/.idea/indexLayout.xml: Language not supported
  • src/.idea/.idea.Acuminator/.idea/vcs.xml: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +90 to +102
public StatisticsSnapshot GetSnapshot()
{
var lookupsTotal = _hitsTotal + _missesTotal;
return new StatisticsSnapshot(
_cacheId,
_owner,
_hitsTotal,
_missesTotal,
_setsTotal,
_byTypeCounter,
CalcRatio(_hitsTotal, lookupsTotal),
CalcRatio(_missesTotal, lookupsTotal)
);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gelverpl this is a good point. It's not really a snapshot if you are passing the original reference which will be mutated after the creation of the snapshot.

Comment on lines +73 to +88
public void Hit() => _hitsTotal++;

public void Miss() => _missesTotal++;

public void Set(Type type)
{
if (!_byTypeCounter.ContainsKey(type))
{
_byTypeCounter[type] = 1;
}
else
{
_byTypeCounter[type]++;
}
_setsTotal++;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a problem for now since each instance of symbol cache and related statistic is owned by only one dedicated nested invocation walker which is executed only by one thread.

Later, when this cache will be expanded to be shared with all analyzers, this indeed will be a problem.

Comment on lines +1 to +6
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="VcsDirectoryMappings">
<mapping directory="$PROJECT_DIR$/.." vcs="Git" />
</component>
</project> No newline at end of file
Copy link
Copy Markdown
Collaborator

@SENya1990 SENya1990 May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gelverpl this seems like IDE specific. I think it's better to update the .gitignore file in the repo to not track the entire .idea folder.

Comment thread src/Acuminator/Acuminator.Utilities/Roslyn/SymbolInfoCache.cs
@SENya1990 SENya1990 self-requested a review May 6, 2026 20:10
public sealed class SymbolInfoCache
{
private readonly Dictionary<ExpressionSyntax, SymbolInfo> _map = new();
#if SYMBOL_INFO_CACHE_STATISTICS
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gelverpl we spoke today about using some kind of feature flags for performance profiling features. Did you decide not to use it in the end?

namespace Acuminator.Utilities.Roslyn;

/// <summary>
/// Caches Roslyn symbol lookup results for expression syntax nodes visited by a syntax walker.
Copy link
Copy Markdown
Collaborator

@SENya1990 SENya1990 May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, this can be already a bit generalized to be lookup for syntax nodes, not just expression syntax nodes. There are many other types of nodes that can be passed to the semantic model:
https://learn.microsoft.com/en-us/dotnet/api/microsoft.codeanalysis.semanticmodel?view=roslyn-dotnet-5.0.0

private static double CalcRatio(long value, long total) => total == 0 ? 0 : (double)value / total;
}

public static class SymbolInfoCacheStatisticsContext
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gelverpl please use 1 type per file convention (except maybe 1 line records types) and move this to a separate file.


public static class SymbolInfoCacheStatisticsContext
{
private static readonly object Gate = new();
Copy link
Copy Markdown
Collaborator

@SENya1990 SENya1990 May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For private fields the current convention is to use underscore-prefixed camel case.

Also it's not very important, but the established naming convention for these objects is to call them _locker

}

#if SYMBOL_INFO_CACHE_STATISTICS
public sealed class SymbolInfoCacheStatistics
Copy link
Copy Markdown
Collaborator

@SENya1990 SENya1990 May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be moved to a separate file.
The type should probably be marked as internal instead of public.

}
}

public record StatisticsSnapshot(int CacheId, Type Owner, long HitsTotal, long MissesTotal, long SetsTotal, Dictionary<Type, long> ByTypeCounter, double HitRatio, double MissRatio);
Copy link
Copy Markdown
Collaborator

@SENya1990 SENya1990 May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho, this DTO can also be in a separate file, but I won't mind if it will be placed in a file with the SymbolInfoCacheStatistics type.

And it also should be internal.


public static StatisticsSnapshot[] GetStatistics()
{
lock (Gate)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a big improvement, since GetSnapshot doesn't do a lot currently, but you can reduce the time under the lock in this algorithm, if you make under the lock a copy of the Statistics list. Then you can call GetSnapshot on the copy of the list without locking the Statistics collection.

var result = new StatisticsSnapshot[Statistics.Count];
for (var i = 0; i < Statistics.Count; i++)
{
result[i] = Statistics[i].GetSnapshot();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's a static context, it can be shared by multiple analyzers running on different threads. Which means that while the call to GetStatistics is executed on one thread, the other thread may update the counters for a concrete statistic. This make this collection a bit different from a full snapshot and makes the measurement a bit inaccurate - snapshot for one statistic is collected at the moment X, snapshot for the next statistic will be collected for the moment X + some small delta.

For full snapshots you would need to use sync the updates of the counters with the retrieval of statistics... which will make things more difficult for us and introduce undesirable threads synchronization.

I think it's OK to leave it as is since we are not micro benchmarking and looking for milliseconds. But this should be a conscious decision, so I think a comment about this decision should be added here describing it.


private readonly ISet<(SyntaxNode, DiagnosticDescriptor)> _reportedDiagnostics = new HashSet<(SyntaxNode, DiagnosticDescriptor)>();

private readonly SymbolInfoCache _cache;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other caches in this type, so IMHO a more specific name should be used like _symbolsCache.

private static readonly List<SymbolInfoCacheStatistics> Statistics = new();
private static int _nextCacheId;

internal static SymbolInfoCacheStatistics Register(Type owner)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gelverpl there can be multiple statistics registered for the same owner (walker). Is it okay to not aggregate them into one statistic?

/// </summary>
public sealed class SymbolInfoCache
{
private readonly Dictionary<ExpressionSyntax, SymbolInfo> _map = new();
Copy link
Copy Markdown
Collaborator

@SENya1990 SENya1990 May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gelverpl we should cache ISymbol?, there is no need to keep intermediate symbol info DTO.

var semanticModel = GetSemanticModel(node.SyntaxTree);

if (semanticModel != null)
SymbolInfo? cached = _cache.GetOrCreate(node, () =>
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here should be the retrieval of the symbol from cache, not SymbolInfo.
As I wrote in https://github.com/Acumatica/Acuminator/pull/668/changes#r3197690911, cache should keep ISymbol?.

Also, since the symbol is not always cached, IMHO it's better to call it just symbol

Copy link
Copy Markdown
Collaborator

@SENya1990 SENya1990 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the PR content is good, there is no need to significantly change anything.
But I have some remarks that should be processed:

  • about the type of cached data,
  • some minor remarks about the concurrent code
  • remarks about the code style.

_cacheId = cacheId;
}

public void Hit() => _hitsTotal++;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All counting methods should be annotated with the System.Diagnostics.ConditionalAttribute instead of using pragma directives at their every usage.
https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.conditionalattribute?view=netstandard-2.0

{
if (_map.TryGetValue(key, out SymbolInfo cached))
{
#if SYMBOL_INFO_CACHE_STATISTICS
Copy link
Copy Markdown
Collaborator

@SENya1990 SENya1990 May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All counting methods can be annotated with the System.Diagnostics.ConditionalAttribute instead of using pragma directives at their every usage.
https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.conditionalattribute?view=netstandard-2.0

It won't compile without the pragma directive as is though. So, to make it compile and get rid of to many pragmas, maybe you can always keep the field with statistics object and for regular builds initialize it to null or an empty static object.

#if SYMBOL_INFO_CACHE_STATISTICS
public sealed class SymbolInfoCacheStatistics
{
private readonly Dictionary<Type, long> _byTypeCounter = new();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho, the name can indicate that the keys are type of syntax nodes. For example, it can be _byNodeTypeCounter


public void Miss() => _missesTotal++;

public void Set(Type type)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho, nodeType is clearer

#endif

SymbolInfo? potentialValue = factory();
if (potentialValue is not null)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not caching null results? It's not like the semantic model will return a different result with another call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants