Add support for writing HTML literals using UTF-8 strings#13052
Add support for writing HTML literals using UTF-8 strings#13052chsienki wants to merge 9 commits into
Conversation
Implement auto-detection of UTF-8 WriteLiteral support for legacy .cshtml code generation. When a page's @inherits base class has a callable WriteLiteral(ReadOnlySpan<byte>) overload, HTML literals are emitted as C# UTF-8 string literals ("..."u8).
- FullyQualifiedInherits: namespaced type with fully-qualified @inherits - ShortNameInherits_WithUsing: documents that short names don't resolve for UTF-8 detection (GetTypeByMetadataName requires full qualification) - PartiallyQualifiedInherits: documents partial names don't resolve - SwitchesWhenOverloadAddedOrRemoved: uses fresh drivers per edit step Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Just realized a flaw in this I need to fix. No need to review yet, I'll ping when its ready. |
|
|
||
| // Build a map of which @inherits base types support UTF-8 WriteLiteral. | ||
| var utf8SupportMap = parsedDocuments | ||
| .Select(static (item, _) => item.Item3.CodeDocument.GetInheritsDirectiveContent()) |
There was a problem hiding this comment.
Just to be horribly annoying, couldn't two different files have the same @inherits content, but different base types, due to @using directives?
There was a problem hiding this comment.
Yeah, I think that's possible. Given that we need to fully resolve the type anyway, I'll use the assembly qualified type name as the key so they can't clash.
Too late! 😛 Though I think, based on the commit message, you found the same issue I did anyway :) |
- Add GetInheritsDirectiveContent and GetUsingDirectives extension methods on RazorCodeDocument for extracting @inherits and @using directives - Resolve short/aliased type names via augmented compilation with the document's @using directives when GetTypeByMetadataName fails - Dual-lookup Utf8SupportMap: per-file (filePath -> FQN) + per-type (FQN -> bool) to handle same @inherits text resolving differently - Use GetFullName() for metadata name formatting - Call HasCallableUtf8WriteLiteralOverload via string overload to avoid cross-compilation symbol issues - Add InheritsInfo nested record on DefaultUtf8WriteLiteralFeature - Tests: short name with @using, alias via _ViewImports, file-level alias, alias shadowing (CS0576 graceful fallback), fully-qualified Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build one probe syntax tree with namespace-scoped usings for all entries that need resolution, instead of creating a separate augmented compilation per entry. This reduces O(N) AddSyntaxTrees calls to O(1). - Two-pass Create: fast path via GetTypeByMetadataName, then batch slow path - ResolveTypeNamesWithUsings takes CSharpCompilation directly - Split pipeline: extract @inherits first, then usings only for files that need it - Rename GetInheritsDirectiveContent to GetInheritsDirectiveValue - Make InheritsInfo fields non-nullable Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GetInheritsDirectiveValue() now searches import syntax trees when the main document has no @inherits directive. The most specific _ViewImports wins, and the page's own @inherits overrides everything. Added tests for @inherits in _ViewImports (global and namespaced types) and cascading _ViewImports with override precedence. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| private readonly ImmutableSortedDictionary<string, string> _fileToType; | ||
| // fully-qualified type name -> supports UTF-8 | ||
| private readonly ImmutableSortedDictionary<string, bool> _typeSupport; |
There was a problem hiding this comment.
Since these are only written to once, I wonder if a Dictionary might be better, for faster lookup.
There was a problem hiding this comment.
We need it for the equality checks. Dictionary doesn't have an ordering so you can't easily compare across two different iterations of the generator.
| var results = new List<(int, string)>(); | ||
|
|
||
| // Build a single probe tree with namespace-scoped usings for each entry. | ||
| var sb = new StringBuilder(); |
| CancellationToken cancellationToken) | ||
| { | ||
| if (_utf8Feature is null || | ||
| !codeDocument.FileKind.IsLegacy() || |
There was a problem hiding this comment.
So, as per my comment below in the resolved thread, is this restriction necessary? Feels like only supporting cshtml and needing an explicit @inherit directive are just making the feature harder to use without any benefit. Are we worried the ASP.NET base classes will add a compatible overload that doesn't mean "write a literal"?
There was a problem hiding this comment.
Are we worried the ASP.NET base classes will add a compatible overload that doesn't mean "write a literal"?
I'm certainly not. I do agree it feels restrictive to require that a explicit @inherits is present rather than just the default base class for the view type (MVC view or Razor Page). While we could relax this in the future if and when dotnet/aspnetcore#65605 is done, I'm not sure what benefit the guard based on @inherits adds.
As for support in .razor files, that's a whole other world as the rendering model of Razor Components is fundamentally different to that of .cshtml in that it's inherently delayed rendering, not immediate, and as such isn't a good fit for ReadOnlySpan<byte>. Not to say .razor couldn't be updated to realize a benefit, but the mechanics are just completely different, i.e. mapping a literal span to a write method vs. a method that inserts a node in the render tree.
In both the case of MVC and Blazor, it's almost certain that the UTF8 literal would need to be passed as ref via ReadOnlyMemory<byte> instead of a span and we'd need to do more work in the Razor Compiler to emit code that suits that in the most optimal (and safe) way.
There was a problem hiding this comment.
@davidwengier Apologies I missed that (requiring an explicit @inherits) comment on the previous thread.
We certainly don't have to limit to the inherits. I just wanted to start with the smallest change possible that we can open up in the future. Right now, we know that the default base class doesn't support it, so it didn't seem worth checking it. I put in a comment a couple lines down saying that in effect.
If we update the base classes to support it, it would need a new TFM, which would need a new SDK, which would mean a newer generator, so I don't think there is a scenario where the base class would get enabled and a user couldn't pick it up.
That being said if we want to just start checking it upfront we can do so.
|
@chsienki tried this out with Razor Slices and it works great! Found a bug in my existing code that pre-empted this feature too 😄 |
|
@chsienki OK I think I found a bug, views that inherit from generic base classes with a Can be worked around by putting a non-generic type in the inheritance chain (aspnet/Benchmarks@cc5ab3f) |
The slow path for resolving @inherits type names previously skipped entries with no Razor @using directives. Since .cshtml files always have default MVC imports, this filter was ineffective. Removing it ensures types resolvable via C# global usings or the compilation's existing context are not missed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The slow path now uses GetFullMetadataName() which builds a proper CLR metadata name (with backtick arity for generics and + for nested types) instead of GetFullName() which produces C# display syntax that cannot be resolved by GetTypeByMetadataName. Added tests for generic base classes (single and multiple type params), generics in namespaces, nested generics, and generics from metadata references. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Superseded by dotnet/roslyn#83457 -- ported onto |
Summary
Builds on #12848 by @DamianEdwards, refactoring the UTF-8 HTML literal detection to use a pipeline-friendly pre-computed map approach.
When a
.cshtmlpage's@inheritsbase class has a callableWriteLiteral(ReadOnlySpan<byte>)overload, HTML literals are emitted as C# UTF-8 string literals (\"...\"u8), enabling direct binding to the byte-span overload and avoiding UTF-16→UTF-8 transcoding at runtime.Key changes from the original PR
Utf8SupportMapinstead of per-file probe compilations — the source generator extracts@inheritsbase type names from parsed syntax trees, combines with the declaration compilation to build a value-comparable map, and passes it toProcessRemainingIUtf8WriteLiteralFeatureengine feature withDefaultUtf8WriteLiteralFeatureimplementation backed by the mapUtf8WriteLiteralDetectionPasstoCSharpnamespaceTests
u8vs string literals).cshtmlfiles with different@inherits, only one uses UTF-8@inheritsdirective → string literals (default base class)@inheritsdetectionCloses #8429