Investigate MVC view buffering subsystem changes to support UTF-8 HTML literal bytes end-to-end #65605

# Investigation: UTF-8 HTML Literal Support for MVC Razor Views

## Problem Statement

The Razor compiler PRs [dotnet/razor#12848](https://github.com/dotnet/razor/pull/12848) and [dotnet/razor#13052](https://github.com/dotnet/razor/pull/13052) add support for emitting HTML literal blocks as C# UTF-8 string literals (`"..."u8`) instead of regular string literals. As a result, the generated code will call `WriteLiteral(ReadOnlySpan<byte>)` instead of `WriteLiteral(string)` when a suitable `WriteLiteral(ReadOnlySpan<byte>)` overload is present on the view base class.

The goal is to eliminate the runtime cost of encoding HTML literal strings from UTF-16 (`string`) to UTF-8 bytes every time a view is rendered. Since the vast majority of ASP.NET Core apps serve UTF-8 responses, pre-encoding these literals at compile time should reduce CPU and allocation overhead.

**The key insight**: Simply adding a `WriteLiteral(ReadOnlySpan<byte>)` overload that converts the bytes back to a `string` (as the current PR's test code does) would be **worse** than the status quo — it would add an extra UTF-8→UTF-16 decode step on top of the existing UTF-16→UTF-8 encode step. To realize the benefit of the Razor compiler change, the MVC buffering subsystem must be updated to natively handle UTF-8 byte sequences end-to-end.
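
To make that cost concrete, here is a minimal sketch of the naive overload. `NaiveLiteralWriter` and `NaiveLiteralDemo` are hypothetical names, not MVC types, and a plain `StreamWriter` stands in for `HttpResponseStreamWriter`. The sketch only illustrates that the bytes are decoded to a `string` and then encoded back to the same UTF-8 bytes.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for a view base class; not MVC's RazorPageBase.
public sealed class NaiveLiteralWriter
{
    private readonly TextWriter _output;

    public NaiveLiteralWriter(TextWriter output) => _output = output;

    // Existing path: the string is encoded to UTF-8 later by the response writer.
    public void WriteLiteral(string value) => _output.Write(value);

    // Naive overload: decodes UTF-8 to UTF-16, allocating a string per call.
    public void WriteLiteral(ReadOnlySpan<byte> value)
        => WriteLiteral(Encoding.UTF8.GetString(value));
}

public static class NaiveLiteralDemo
{
    // Renders one UTF-8 literal through the naive overload and returns the response bytes.
    public static byte[] Render()
    {
        using var body = new MemoryStream();

        // Assumption: StreamWriter plays the role of HttpResponseStreamWriter here.
        using (var writer = new StreamWriter(body, new UTF8Encoding(false), 1024, leaveOpen: true))
        {
            new NaiveLiteralWriter(writer).WriteLiteral("<p>héllo</p>"u8);
        } // Disposing the writer encodes the UTF-16 text back to UTF-8.

        return body.ToArray();
    }
}
```

The bytes returned by `Render()` are identical to the original literal, so both conversions are pure overhead on a UTF-8 response.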

## Current Architecture Analysis

### The Write Pipeline (summary)

```
Generated Razor Code: WriteLiteral("html string")
        ↓
RazorPageBase.WriteLiteral(string) → Output.Write(value)  [no HTML encoding]
        ↓
ViewBufferTextWriter.Write(string) → Buffer.AppendHtml(value)
        ↓
ViewBuffer.AppendHtml(string) → ViewBufferPage.Append(new ViewBufferValue(value))
        ↓
[Content buffered in ViewBufferValue[] pages, pooled via MemoryPoolViewBufferScope]
        ↓
RazorView.RenderLayoutAsync() → bodyWriter.Buffer.WriteToAsync(writer, encoder)
        ↓
ViewBuffer.WriteToAsync() iterates pages:
  - string values → writer.WriteAsync(string)
  - IHtmlContent values → content.WriteTo(writer, encoder)
        ↓
PagedBufferedTextWriter (char[] buffering layer)
        ↓
HttpResponseStreamWriter: chars → Encoder.GetBytes() → response.Body stream
```

### Key Types

| Type | Role |
|------|------|
| `RazorPageBase` | View base class with `WriteLiteral(string)` and `Write(string)` methods |
| `ViewBuffer` | `IHtmlContentBuilder` backed by pooled `ViewBufferValue[]` pages |
| `ViewBufferValue` | Union struct holding either `string` (pre-encoded HTML) or `IHtmlContent` |
| `ViewBufferTextWriter` | `TextWriter` that buffers into `ViewBuffer`, flushes to inner writer |
| `ViewBufferPage` | A single page of `ViewBufferValue[]` with count/capacity tracking |
| `IViewBufferScope` | Manages pooling of `ViewBufferValue[]` arrays |
| `PagedBufferedTextWriter` | Char-buffering layer between `ViewBuffer` and `HttpResponseStreamWriter` |
| `HttpResponseStreamWriter` | Final `TextWriter` that encodes chars→bytes via `Encoding.GetEncoder()` |
| `ViewExecutor` | Creates the `HttpResponseStreamWriter` with resolved encoding (default UTF-8) |

### Critical Observations

1. **The entire pipeline is `string`/`char`-based.** `ViewBufferValue` stores `object` that is either `string` or `IHtmlContent`. There is no concept of raw bytes anywhere in the buffer.

2. **Layout pages require buffering.** The content of a page is buffered so it can be injected into `@RenderBody()` in the layout. This means we can't just stream bytes directly to the response — we need to buffer them.

3. **The final encoding step is in `HttpResponseStreamWriter`.** It takes chars from `TextWriter.Write(string)` calls and encodes them to bytes using `Encoding.GetEncoder()`. For UTF-8 output, this is where string→bytes conversion happens on every request.

4. **`WriteLiteral` writes pre-encoded HTML** (no HTML entity encoding needed). It goes through `Output.Write(value)` which calls `ViewBufferTextWriter.Write(string)` which calls `ViewBuffer.AppendHtml(string)`. The string is stored as-is and written as-is during flush.

5. **Non-UTF-8 encodings are supported** via `ResponseContentTypeHelper.ResolveContentTypeAndEncoding()`. If a response uses e.g. `text/html; charset=iso-8859-1`, the `HttpResponseStreamWriter` would use that encoding. UTF-8 byte literals would need to be re-encoded to the target encoding.

6. **Blazor SSR already uses a `TextWriter` abstraction** on top of a UTF-8 `HttpResponseStreamWriter`. It does NOT use raw byte paths. Its `BufferedTextWriter` buffers strings/chars, not bytes.

7. **`HttpResponse.BodyWriter` (PipeWriter)** exists and is used by `HttpResponseWritingExtensions` to write bytes directly, but MVC's view rendering pipeline does not use it.

## Critical Constraints

### No Implicit Flushing Behavior Changes

The current MVC view rendering model buffers all output during view execution. This is critical for exception handling: if an exception occurs partway through rendering, no bytes have been written to the response yet, so the framework can still set an error status code, render an error page, or take other corrective action. **Any approach that introduces implicit/automatic flushes to the response stream as part of handling UTF-8 byte literals would break this guarantee.** All approaches below must ensure that UTF-8 byte data is fully buffered alongside string data, and only written to the response at the same points where string data is flushed today (i.e., during `ViewBuffer.WriteToAsync` at the end of layout rendering, or when the user explicitly calls `FlushAsync()`).

### Pooled Byte Buffers (No Per-Literal Allocations)

A typical Razor view contains many HTML literal blocks interspersed with dynamic expressions. Allocating a `byte[]` for each `WriteLiteral(ReadOnlySpan<byte>)` call would create significant GC pressure — potentially dozens of `byte[]` allocations per request, per view. Instead, UTF-8 byte data should be copied into **pooled byte buffers** (e.g., `ArrayPool<byte>`-backed pages, analogous to the existing `ViewBufferValue[]` pages used for strings). The `ReadOnlySpan<byte>` data from each `WriteLiteral` call would be appended to the current pooled byte page, with new pages allocated only when the current one is full.

## Approaches

### Approach A: Dual-Type ViewBuffer (Hybrid String + Byte Pages)

**Description:** Extend the `ViewBuffer` to natively support both string-based entries and pooled UTF-8 byte regions. Rather than storing individual `byte[]` per literal, UTF-8 byte data is appended to pooled `byte[]` pages (from `ArrayPool<byte>`) and tracked as byte-range entries in the existing `ViewBufferValue[]` page structure. `ViewBufferValue` would support a third variant representing a byte range (offset + length into a pooled byte page). During the flush at the end of rendering (not implicitly during writes), byte entries are written directly to the underlying stream while string entries go through the normal `TextWriter` path.

**Changes required:**
- Extend `ViewBufferValue` to support a third variant: a reference to a byte region in a pooled page. Options:
  - Add a new struct `Utf8ByteSegment` (holding `byte[] page`, `int offset`, `int length`) and store it as the `object Value` in `ViewBufferValue` (would box)
  - Alternatively, redesign `ViewBufferValue` as a proper tagged union with explicit fields for each type (avoids boxing, but changes struct layout)
- Add pooled `byte[]` pages to `IViewBufferScope` / `MemoryPoolViewBufferScope` (using `ArrayPool<byte>`)
- Add `AppendUtf8(ReadOnlySpan<byte>)` to `ViewBuffer` that copies bytes into the current pooled byte page and appends a byte-range `ViewBufferValue`
- Add `WriteLiteral(ReadOnlySpan<byte>)` to `RazorPageBase` that calls `ViewBuffer.AppendUtf8()`
- Add a method to `ViewBufferTextWriter` to forward `WriteLiteral(ReadOnlySpan<byte>)` to the underlying buffer
- Modify `ViewBuffer.WriteTo`/`WriteToAsync` to handle byte entries during the **existing flush points only** (end of layout rendering, explicit `FlushAsync`):
  - Flush the intermediate `TextWriter` (char buffer) before writing byte entries directly to the underlying `Stream`
  - Resume `TextWriter` writing for subsequent string entries
  - **No new implicit flushes** — bytes are fully buffered just like strings and only written to the response at the same points
- Modify `ViewBuffer.CopyTo`/`MoveTo` to handle byte-range entries (copy byte data between buffers for layout scenarios)
- The flush path needs access to both `TextWriter` (for strings) and raw `Stream` (for bytes); pass `Stream` as additional parameter to `WriteToAsync`

**Pros:**
- Maximum performance: UTF-8 byte literals skip all `string` → `char[]` → `byte[]` conversion
- Pooled byte pages avoid per-literal `byte[]` allocations; data is copied into `ArrayPool<byte>`-backed pages (amortized)
- Existing string-based content continues to work unchanged
- Incremental adoption: views that don't opt in are unaffected
- No changes to implicit flushing behavior — bytes are buffered identically to strings
- Layout/section rendering works via byte-aware `CopyTo`/`MoveTo`

**Cons:**
- **Significantly increases complexity** of `ViewBuffer`, `ViewBufferValue`, and flush logic
- `ViewBufferValue.Value` is currently `object` — storing byte ranges efficiently may require redesigning the struct (or accepting boxing overhead for a wrapper struct)
- The flush path needs access to both `TextWriter` and raw `Stream`, requiring `WriteToAsync` signature changes
- **Non-UTF-8 encoding scenario**: byte entries must be decoded from UTF-8 to string and re-encoded to the target encoding (fallback path)
- `ViewBuffer` implements `IHtmlContentBuilder` which has no concept of bytes; internal methods needed outside the interface
- `byte[]` page lifecycle management adds complexity to `MemoryPoolViewBufferScope`
- `MoveTo` between `ViewBuffer` instances (used in layout rendering) must correctly transfer byte-range entries along with their backing byte pages

### Approach B: Parallel Byte-Based Pipeline

**Description:** Create an entirely separate output pipeline for UTF-8 byte content that bypasses `TextWriter` entirely. When UTF-8 HTML literals are enabled, the view would write to a byte-based buffer (`IBufferWriter<byte>` or pooled `byte[]` pages) for literal content, and switch to the string-based `TextWriter` for dynamic content (which would be encoded to UTF-8 and written to the same byte stream at flush time).

**Changes required:**
- Create a new byte-oriented buffer (e.g., `ViewByteBuffer` backed by pooled `byte[]` pages from `ArrayPool<byte>`)
- Add `WriteLiteral(ReadOnlySpan<byte>)` to `RazorPageBase` that writes to the byte buffer
- Modify the view execution pipeline to expose both a `TextWriter` (for `Write()`/`WriteLiteral(string)`) and a byte writer (for `WriteLiteral(ReadOnlySpan<byte>)`)
- Create an ordering mechanism (e.g., a sequence of "write string region N" / "write byte region M" commands) to maintain correct interleaving between string and byte content during flush
- All content (both string and byte) remains **fully buffered** until the existing flush points (end of layout rendering or explicit `FlushAsync`) — no implicit flushing
- Modify layout/section rendering to handle the new buffer type alongside existing `ViewBuffer`
- Create new `IViewByteBufferScope` for pooling byte pages
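
The ordering mechanism is the crux of this approach, so a minimal sketch may help. `DualBuffer` is a hypothetical type, and it copies each literal into its own array purely for brevity; a real design would use pooled byte pages as required above. The point is only that a separate command list has to record how string and byte regions interleave.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical dual buffer: text and bytes are stored separately, and a command
// list records the order in which the regions must be replayed at flush time.
public sealed class DualBuffer
{
    private readonly List<string> _text = new();
    private readonly List<byte[]> _bytes = new();
    private readonly List<(bool IsBytes, int Index)> _order = new();

    // Dynamic content goes to the string side.
    public void Write(string value)
    {
        _order.Add((false, _text.Count));
        _text.Add(value);
    }

    // Literal content goes to the byte side.
    public void WriteLiteral(ReadOnlySpan<byte> utf8)
    {
        _order.Add((true, _bytes.Count));
        _bytes.Add(utf8.ToArray()); // per-literal copy for brevity only
    }

    // Nothing reaches the stream until this explicit flush.
    public void FlushTo(Stream body)
    {
        foreach (var (isBytes, index) in _order)
        {
            var chunk = isBytes ? _bytes[index] : Encoding.UTF8.GetBytes(_text[index]);
            body.Write(chunk, 0, chunk.Length);
        }
    }
}
```

Every consumer that copies or moves content between buffers would have to keep the command list consistent with both stores, which is where the ordering risk listed in the cons comes from.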

**Pros:**
- Clean separation of concerns: string and byte data have independent buffer structures
- Pooled byte pages avoid per-literal allocations
- Could potentially allow the entire response to be byte-oriented in the future
- No modification to existing `ViewBuffer`/`ViewBufferValue` types
- No changes to implicit flushing behavior

**Cons:**
- **Massive complexity** of maintaining two parallel pipelines and an ordering/interleaving mechanism
- The interleaving sequence must be carefully maintained as a view alternates between `WriteLiteral(span)` for HTML and `Write(string)` for dynamic content
- Layout rendering, sections, partial views all need dual-pipeline support and correct sequencing
- Much larger API surface area to maintain
- Higher risk of subtle ordering bugs
- Far more invasive than Approach A

### Approach C: IHtmlContent Wrapper with Pooled Byte Backing Store

**Description:** Store UTF-8 bytes in the `ViewBuffer` as `IHtmlContent` wrapper entries, leveraging the existing `ViewBuffer` infrastructure. Instead of allocating a `byte[]` per literal, UTF-8 data is copied into pooled `byte[]` pages managed by `IViewBufferScope`. The wrapper object holds a reference to the byte page, offset, and length. During the existing flush path, the wrapper's `WriteTo` method handles writing the bytes. No structural changes to `ViewBufferValue`, `ViewBufferPage`, or `ViewBuffer` page management are needed.

**Changes required:**
- Create `Utf8LiteralContent : IHtmlContent` class holding a reference to a pooled `byte[]` page, offset, and length
- Add byte page pooling to `IViewBufferScope` / `MemoryPoolViewBufferScope` (using `ArrayPool<byte>`)
- Add `WriteLiteral(ReadOnlySpan<byte>)` to `RazorPageBase`:
  1. Copies bytes into the current pooled byte page via `IViewBufferScope`
  2. Creates `Utf8LiteralContent` with page reference + offset + length
  3. Calls `ViewBuffer.AppendHtml(content)` — stored as `IHtmlContent` in existing structure
- Modify `ViewBuffer.WriteToAsync` to detect `Utf8LiteralContent` (or a new `IUtf8HtmlContent` interface):
  - When UTF-8 encoding is active: flush the intermediate `TextWriter`'s char buffer, then write the bytes directly to the underlying `Stream`, then resume `TextWriter` writes. This happens **only during the existing flush points** (end of layout rendering, explicit `FlushAsync`).
  - When non-UTF-8 encoding: call the fallback `WriteTo(TextWriter, HtmlEncoder)` which decodes UTF-8 → string → writes through `TextWriter`
- `Utf8LiteralContent.WriteTo(TextWriter, HtmlEncoder)` fallback: `writer.Write(Encoding.UTF8.GetString(page, offset, length))`
- Thread the `Stream`/`PipeWriter` into the `WriteToAsync` path (pass as additional parameter or via a context object)
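
A minimal sketch of the wrapper follows. `IHtmlContentStandIn` is a local stand-in with the same `WriteTo` shape as `IHtmlContent`, declared only so the example compiles without an ASP.NET Core reference; the `WriteTo(Stream)` method is an assumed name for the byte-optimized path that `ViewBuffer.WriteToAsync` would call.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Encodings.Web;

// Local stand-in mirroring IHtmlContent.WriteTo(TextWriter, HtmlEncoder).
public interface IHtmlContentStandIn
{
    void WriteTo(TextWriter writer, HtmlEncoder encoder);
}

// Hypothetical wrapper: a byte range inside a (pooled) page.
public sealed class Utf8LiteralContent : IHtmlContentStandIn
{
    private readonly byte[] _page;
    private readonly int _offset;
    private readonly int _length;

    public Utf8LiteralContent(byte[] page, int offset, int length)
    {
        _page = page;
        _offset = offset;
        _length = length;
    }

    // Fast path for UTF-8 responses: the flush orchestrator writes the bytes directly.
    public void WriteTo(Stream stream) => stream.Write(_page, _offset, _length);

    // Fallback path: decode to a string and let the TextWriter re-encode it.
    // The literal is already HTML, so the encoder is intentionally unused.
    public void WriteTo(TextWriter writer, HtmlEncoder encoder)
        => writer.Write(Encoding.UTF8.GetString(_page, _offset, _length));
}
```

Because the fallback is the ordinary `IHtmlContent` contract, any consumer that is unaware of the byte path still produces correct output, only more slowly.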

**Pros:**
- **No structural changes** to `ViewBuffer`, `ViewBufferValue`, `ViewBufferPage` — bytes stored as `IHtmlContent`
- Layout buffering, sections, `CopyTo`/`MoveTo` all work without changes
- Pooled byte pages avoid per-literal `byte[]` allocations
- Graceful fallback for non-UTF-8 responses
- New types are internal, minimal API surface impact
- **No changes to implicit flushing** — all data remains buffered until existing flush points

**Cons:**
- Each `WriteLiteral(ReadOnlySpan<byte>)` call allocates a small `Utf8LiteralContent` wrapper object (class instance on heap). For views with many small HTML literals, this could add GC pressure. Could be mitigated by pooling the wrapper objects or using a struct-based approach.
- The `WriteTo(TextWriter, HtmlEncoder)` on `IHtmlContent` only exposes `TextWriter` — writing raw bytes requires additional information about the underlying stream. `ViewBuffer.WriteToAsync` must be the orchestrator that knows about both `TextWriter` and `Stream`, selectively calling the byte-optimized path instead of `WriteTo` for `IUtf8HtmlContent` entries.
- The `PagedBufferedTextWriter` layer between `ViewBuffer.WriteToAsync` and `HttpResponseStreamWriter` must be flushed before raw byte writes and resumed after, which adds complexity to the flush orchestration
- Type checking (`is IUtf8HtmlContent` or `is Utf8LiteralContent`) for each buffer entry during flush adds minor overhead

### Approach D: ViewBuffer with Pooled Byte Pages + Discriminated ViewBufferValue

**Description:** A variant of Approach A that redesigns `ViewBufferValue` as a proper discriminated union to avoid the boxing overhead of storing byte-range metadata in the `object Value` field. The struct would have explicit fields for the type tag, string reference, `IHtmlContent` reference, and byte-range data (page reference, offset, length). Pooled byte pages (from `ArrayPool<byte>`) store the actual byte data. No per-literal `byte[]` allocations, no wrapper object allocations, no boxing.

**Changes required:**
- Redesign `ViewBufferValue` as a discriminated union struct:
  ```csharp
  internal readonly struct ViewBufferValue
  {
      private readonly object _objectValue; // string or IHtmlContent
      private readonly byte[] _bytePage;     // pooled byte page (null for non-byte entries)
      private readonly int _byteOffset;
      private readonly int _byteLength;
      
      // Constructor overloads for string, IHtmlContent, and byte range
  }
  ```
- Add pooled `byte[]` pages to `IViewBufferScope` / `MemoryPoolViewBufferScope`
- Add `AppendUtf8(ReadOnlySpan<byte>)` to `ViewBuffer` and `ViewBufferTextWriter`
- Add `WriteLiteral(ReadOnlySpan<byte>)` to `RazorPageBase`
- Modify `ViewBuffer.WriteTo`/`WriteToAsync` to handle byte entries during existing flush points
- Modify `CopyTo`/`MoveTo` for byte entry handling
- Thread `Stream` access through flush path
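
One way the constructor overloads in the outline above could be fleshed out is sketched here. `TaggedBufferValue` and `BufferValueKind` are hypothetical names chosen so the sketch is not mistaken for the real `ViewBufferValue`, and the content variant is typed as `object` so the example compiles without ASP.NET Core. The tag is derived from the fields in this sketch; an explicit tag field is an equally plausible design.

```csharp
using System;

public enum BufferValueKind : byte { String, HtmlContent, Utf8Bytes }

// Hypothetical redesign for illustration; not the actual ViewBufferValue.
public readonly struct TaggedBufferValue
{
    private readonly object _objectValue; // string or IHtmlContent; null for byte entries
    private readonly byte[] _bytePage;    // pooled byte page; null for non-byte entries
    private readonly int _byteOffset;
    private readonly int _byteLength;

    private TaggedBufferValue(object objectValue, byte[] bytePage, int byteOffset, int byteLength)
    {
        _objectValue = objectValue;
        _bytePage = bytePage;
        _byteOffset = byteOffset;
        _byteLength = byteLength;
    }

    public static TaggedBufferValue FromString(string value) => new(value, null, 0, 0);

    // Would take IHtmlContent in MVC; object keeps the sketch dependency-free.
    public static TaggedBufferValue FromContent(object htmlContent) => new(htmlContent, null, 0, 0);

    public static TaggedBufferValue FromUtf8(byte[] page, int offset, int length)
        => new(null, page, offset, length);

    // The flush loop can switch on this tag instead of type-checking each entry.
    public BufferValueKind Kind
    {
        get
        {
            if (_bytePage != null) return BufferValueKind.Utf8Bytes;
            if (_objectValue is string) return BufferValueKind.String;
            return BufferValueKind.HtmlContent;
        }
    }

    // Empty for non-byte entries.
    public ReadOnlySpan<byte> Bytes
    {
        get
        {
            if (_bytePage == null) return ReadOnlySpan<byte>.Empty;
            return new ReadOnlySpan<byte>(_bytePage, _byteOffset, _byteLength);
        }
    }
}
```

With two references and two `int` fields, a struct shaped like this would be expected to occupy about 24 bytes on a 64-bit runtime.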

**Pros:**
- **Zero per-request allocations** for the UTF-8 path (no wrapper objects, no boxing, no per-literal `byte[]`)
- Maximum performance: bytes go directly from pooled pages to response stream
- Pooled byte pages via `ArrayPool<byte>` minimize allocations
- Clean discriminated union avoids type-checking overhead during flush (tag check instead)
- No changes to implicit flushing behavior

**Cons:**
- `ViewBufferValue` struct size increases (additional fields for byte page, offset, length): on a 64-bit runtime, from ~8 bytes (a single `object` reference) to ~24 bytes or more. Since `ViewBufferValue[]` pages can hold 256 entries, this increases per-page memory from ~2 KB to ~6 KB
- **Breaking change to public `ViewBufferValue` type** — it's a `public` struct with a `public object Value` property. Changing its layout requires careful API review
- More complex than Approach C: requires changes to `ViewBuffer`, `ViewBufferValue`, and all buffer consumers
- Still requires `Stream` threading through flush path
- `CopyTo`/`MoveTo` must handle byte pages carefully (copy byte data or transfer page ownership)

### Approach E: Approach C with Wrapper Object Pooling

**Description:** A refinement of Approach C that addresses the per-literal wrapper allocation concern. Instead of allocating a new `Utf8LiteralContent` object per `WriteLiteral(ReadOnlySpan<byte>)` call, use an object pool (or embed the byte-range tracking directly in the pooled byte page structure). This combines the simplicity of Approach C (no `ViewBuffer` structural changes) with the allocation efficiency of Approach D.

**Changes required:**
- Same as Approach C, but with pooled `Utf8LiteralContent` instances from an `ObjectPool<Utf8LiteralContent>` managed by `IViewBufferScope`
- `Utf8LiteralContent` becomes a mutable class that is initialized/reset for each use and returned to the pool when the `ViewBuffer` is cleared
- Alternatively: use a single `Utf8ByteRegionTracker` per `ViewBuffer` that acts as the `IHtmlContent` and uses an index to look up its byte range at write time
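
The pooled-wrapper variant could look roughly like the sketch below. `PooledUtf8Literal` and `Utf8LiteralPool` are hypothetical names, and the stack-based pool is a dependency-free stand-in for `ObjectPool<T>`. It is not thread-safe and assumes one pool per buffer scope.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable wrapper that is re-initialized for each literal.
public sealed class PooledUtf8Literal
{
    public byte[] Page { get; private set; }
    public int Offset { get; private set; }
    public int Length { get; private set; }

    public void Initialize(byte[] page, int offset, int length)
    {
        Page = page;
        Offset = offset;
        Length = length;
    }

    // Drops the page reference so a returned wrapper cannot expose stale data.
    public void Reset()
    {
        Page = null;
        Offset = 0;
        Length = 0;
    }
}

// Minimal per-scope pool; a stand-in for ObjectPool<T>.
public sealed class Utf8LiteralPool
{
    private readonly Stack<PooledUtf8Literal> _free = new();
    private readonly List<PooledUtf8Literal> _rented = new();

    // Number of wrappers actually allocated over the pool's lifetime.
    public int CreatedCount { get; private set; }

    public PooledUtf8Literal Rent(byte[] page, int offset, int length)
    {
        PooledUtf8Literal item;
        if (_free.Count > 0)
        {
            item = _free.Pop();
        }
        else
        {
            item = new PooledUtf8Literal();
            CreatedCount++;
        }

        item.Initialize(page, offset, length);
        _rented.Add(item);
        return item;
    }

    // Called when the owning buffer is cleared: every wrapper goes back to the pool.
    public void ReturnAll()
    {
        foreach (var item in _rented)
        {
            item.Reset();
            _free.Push(item);
        }
        _rented.Clear();
    }
}
```

The `ReturnAll` step is where the lifecycle risk sits: a wrapper that has been moved into another buffer must not be reset while that buffer still references it.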

**Pros:**
- No structural changes to `ViewBuffer`, `ViewBufferValue`, `ViewBufferPage`
- No per-request allocations: wrapper objects are pooled alongside byte pages
- Layout buffering, `CopyTo`/`MoveTo` work with existing `IHtmlContent` handling
- No changes to implicit flushing behavior
- Simpler than Approach D (no `ViewBufferValue` redesign)

**Cons:**
- Object pooling adds lifecycle management complexity (must return objects to pool at correct time)
- Mutable wrapper objects must be carefully managed during `MoveTo` between buffers (e.g., during layout rendering)
- Still needs `Stream` threading through flush path
- Type checking during flush still needed (though this is a minor cost)

## Non-UTF-8 Encoding Considerations

If the response encoding is not UTF-8 (e.g., `text/html; charset=iso-8859-1`), the pre-encoded UTF-8 byte literals cannot be written directly to the response. The fallback must:

1. Decode the UTF-8 bytes back to a `string` (`Encoding.UTF8.GetString(bytes)`)
2. Write the `string` through the normal `TextWriter` path, which will encode it to the target encoding

This means the non-UTF-8 path would actually be **slower** than the current string-based path (extra UTF-8 decode step). This is acceptable because:
- Non-UTF-8 encoding is extremely rare in modern web applications
- The `@utf8HtmlLiterals` directive is opt-in; users who opt in are presumably targeting UTF-8
- The feature could emit a runtime warning if used with non-UTF-8 encoding
- The fallback still produces correct output, just with reduced performance
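
The two paths can be sketched as follows. `Utf8LiteralFallback` is a hypothetical helper that writes straight to a `Stream` for brevity (the real fallback would go through the `TextWriter`), and comparing code pages is just one assumed way of detecting a UTF-8 response.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8LiteralFallback
{
    // Writes a UTF-8 literal to a response body that uses the given encoding.
    public static void Write(Stream body, Encoding responseEncoding, ReadOnlySpan<byte> utf8Literal)
    {
        if (responseEncoding.CodePage == Encoding.UTF8.CodePage)
        {
            // Fast path: the bytes are already in the target encoding.
            body.Write(utf8Literal);
            return;
        }

        // Fallback: UTF-8 bytes -> UTF-16 string -> target encoding.
        string text = Encoding.UTF8.GetString(utf8Literal);
        byte[] reencoded = responseEncoding.GetBytes(text);
        body.Write(reencoded, 0, reencoded.Length);
    }
}
```

For example, the literal `"café"u8` is five bytes on the fast path, while an ISO-8859-1 response (`Encoding.Latin1`) receives four bytes after the decode and re-encode.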

## ReadOnlySpan\<byte\> Storage and Allocation Strategy

`ReadOnlySpan<byte>` is a stack-only type and cannot be stored in heap-allocated structures like `ViewBuffer`. The data must be copied somewhere for buffering. The key design constraint is **no per-literal `byte[]` allocations per request**. The approaches fall into two categories:

### Pooled Byte Pages (Recommended)

Analogous to how `ViewBufferValue[]` pages are pooled via `ArrayPool<ViewBufferValue>`, byte data should be stored in pooled `byte[]` pages from `ArrayPool<byte>`. Each `WriteLiteral(ReadOnlySpan<byte>)` call copies the span data into the current byte page. When the page fills up, a new page is rented from the pool. This gives:
- **No steady-state allocations per request** (pages are rented from and returned to the pool)
- **Low overhead per literal**: just a `Span.CopyTo` into the pooled page
- **Amortized cost**: many small literals share a single pooled page
- **Simple lifecycle**: pages returned to pool when `ViewBuffer` is cleared (same as existing `ViewBufferValue[]` pages)

The `IViewBufferScope` / `MemoryPoolViewBufferScope` would be extended to manage `byte[]` pages alongside the existing `ViewBufferValue[]` pages.
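
A minimal sketch of such a page store is shown below. `PooledBytePages` and `ByteRange` are hypothetical names, and the 4096-byte default page size is an arbitrary assumption; the real change would live in `MemoryPoolViewBufferScope`.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;

// A byte range inside a pooled page (hypothetical type).
public readonly struct ByteRange
{
    public ByteRange(byte[] page, int offset, int length)
    {
        Page = page;
        Offset = offset;
        Length = length;
    }

    public byte[] Page { get; }
    public int Offset { get; }
    public int Length { get; }
    public ReadOnlySpan<byte> Span => Page.AsSpan(Offset, Length);
}

// Hypothetical pooled page store for UTF-8 literal data.
public sealed class PooledBytePages : IDisposable
{
    private readonly ArrayPool<byte> _pool;
    private readonly int _pageSize;
    private readonly List<byte[]> _pages = new();
    private int _used; // bytes used in the last page

    public PooledBytePages(ArrayPool<byte> pool, int pageSize = 4096)
    {
        _pool = pool;
        _pageSize = pageSize;
    }

    public int PageCount => _pages.Count;

    // Copies the literal into the current page, renting a new page when it does not fit.
    public ByteRange Append(ReadOnlySpan<byte> utf8)
    {
        if (_pages.Count == 0 || _pages[_pages.Count - 1].Length - _used < utf8.Length)
        {
            // Oversized literals get a page of their own.
            _pages.Add(_pool.Rent(Math.Max(_pageSize, utf8.Length)));
            _used = 0;
        }

        var page = _pages[_pages.Count - 1];
        utf8.CopyTo(page.AsSpan(_used));
        var range = new ByteRange(page, _used, utf8.Length);
        _used += utf8.Length;
        return range;
    }

    // Returns every page to the pool; ranges handed out earlier must not be used afterwards.
    public void Dispose()
    {
        foreach (var page in _pages)
        {
            _pool.Return(page);
        }
        _pages.Clear();
        _used = 0;
    }
}
```

Consecutive small literals land in the same page at increasing offsets, so the per-literal cost is a span copy plus a small value-type range.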

### Compiler-Generated Static Fields (Complementary Optimization)

As a separate optimization (requiring compiler changes), the Razor compiler could emit static `byte[]` fields instead of inline `u8` literals:
```csharp
private static readonly byte[] __htmlLiteral_0 = "..."u8.ToArray();
// ...
WriteLiteral(__htmlLiteral_0.AsSpan());
```

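This would allow `WriteLiteral` to receive a span backed by a static `byte[]`, and the implementation could potentially store just a reference to the static array, avoiding the copy into pooled pages entirely for static data. Because a `ReadOnlySpan<byte>` does not expose its backing array, detecting this case would likely require a dedicated overload (for example, one accepting `byte[]`) rather than inspection of the span. However, this is a secondary optimization that can be pursued independently.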

## Flush Path and Exception Handling Implications

### Current Behavior (Must Be Preserved)

Currently, all view output is buffered in the `ViewBuffer` during page execution. No bytes are written to the HTTP response until:
1. The user explicitly calls `FlushAsync()` from within the view
2. `RazorView.RenderLayoutAsync` completes and calls `ViewBuffer.WriteToAsync` at the very end

This means if an exception occurs during view rendering, the response has not yet started, and the framework can:
- Set a different HTTP status code (e.g., 500)
- Clear any headers
- Render an error page instead
- Return a completely different response

### Constraint for UTF-8 Byte Support

**Any approach for UTF-8 byte support must NOT introduce new implicit flush points.** Specifically:
- `WriteLiteral(ReadOnlySpan<byte>)` must only buffer data, never write to the response
- The transition from "string buffer entry" to "byte buffer entry" must not trigger a flush
- Byte data must remain fully buffered until the same flush points that exist today
- During the final flush (`WriteToAsync`), interleaving string entries (written via `TextWriter`) with byte entries (written directly to the `Stream`) requires draining the text path before each raw byte write, so that output order is preserved:
  - `PagedBufferedTextWriter`'s char pages must be flushed to `HttpResponseStreamWriter`, which encodes them into its own internal byte buffer. That buffer may or may not have been written to `response.Body` yet.
  - `HttpResponseStreamWriter`'s internal byte buffer must therefore be written to the stream before the raw bytes are written.
  - The result is a series of smaller writes to the response stream rather than a single large write. Because this happens only inside the existing flush points, it is an implementation detail of the flush path, not a behavior change visible to the view developer.
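
The ordering requirement during the final flush can be sketched as follows. `InterleavedFlush` is a hypothetical helper; a `StreamWriter` stands in for the `PagedBufferedTextWriter` and `HttpResponseStreamWriter` chain, and entries are simplified to `string` or `byte[]` objects rather than the buffer's real entry types.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class InterleavedFlush
{
    // Entries are either string (text path) or byte[] (raw UTF-8 path).
    public static void WriteAll(IReadOnlyList<object> entries, Stream body)
    {
        // Assumption: StreamWriter plays the role of the buffered text-writer chain.
        using var writer = new StreamWriter(body, new UTF8Encoding(false), 1024, leaveOpen: true);
        var textPending = false;

        foreach (var entry in entries)
        {
            if (entry is byte[] utf8)
            {
                if (textPending)
                {
                    // Push buffered text to the stream first so ordering is preserved.
                    writer.Flush();
                    textPending = false;
                }

                body.Write(utf8, 0, utf8.Length);
            }
            else
            {
                writer.Write((string)entry);
                textPending = true;
            }
        }
    } // Disposing the writer flushes any trailing text.
}
```

Each string-to-byte transition costs one flush of the text path, which is why the number of transitions per view matters for the overall benefit.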

## Approach Comparison Matrix

| Criterion | A: Hybrid Buffer | B: Parallel Pipeline | C: IHtmlContent Wrapper | D: Discriminated Union | E: C + Object Pooling |
|-----------|:-:|:-:|:-:|:-:|:-:|
| Per-literal allocation | None (pooled pages), or 1 boxed `Utf8ByteSegment` if stored in `object Value` | None (pooled pages) | 1 wrapper object | None | None (pooled wrappers) |
| Per-literal byte copy | Yes (into pool) | Yes (into pool) | Yes (into pool) | Yes (into pool) | Yes (into pool) |
| ViewBuffer changes | Major | None | Minor (flush path only) | Major | Minor (flush path only) |
| ViewBufferValue changes | Major | None | None | Major (public type!) | None |
| Flush behavior change | None | None | None | None | None |
| Layout/section support | Needs updates | Needs new pipeline | Works as-is | Needs updates | Works as-is |
| Non-UTF-8 fallback | Decode + re-encode | Decode + re-encode | Decode + re-encode | Decode + re-encode | Decode + re-encode |
| Implementation complexity | High | Very High | Medium | High | Medium-High |
| Runtime perf (UTF-8) | Best | Best | Good | Best | Good |
| Type-check during flush | Tag check | N/A | `is` check per entry | Tag check | `is` check per entry |

## Recommended Approaches for Further Investigation

### Primary Recommendation: Approach C or E (IHtmlContent Wrapper with Pooled Byte Pages)

**Approach C** (or its object-pooling refinement, **Approach E**) offers the best trade-off of implementation pragmatism vs. performance benefit:

- **No structural changes** to `ViewBuffer`, `ViewBufferValue`, `ViewBufferPage` — the existing buffering infrastructure works as-is
- Layout pages, sections, partial views, `CopyTo`/`MoveTo` all continue to work without modification
- UTF-8 byte data is stored in pooled `byte[]` pages (via `ArrayPool<byte>`, managed by `IViewBufferScope`), avoiding per-literal allocations
- **No changes to implicit flushing behavior** — all data remains buffered until existing flush points
- Graceful fallback for non-UTF-8 responses via `WriteTo(TextWriter, HtmlEncoder)` decoding
- Internal implementation: the only public API change is `WriteLiteral(ReadOnlySpan<byte>)` on `RazorPageBase`

The main cost is one `Utf8LiteralContent` wrapper object allocation per HTML literal per request (for Approach C), which Approach E mitigates via object pooling.

### Secondary Recommendation: Approach A or D (Dual-Type ViewBuffer)

If benchmarking shows that the wrapper object overhead in Approach C/E is significant, **Approach A** (or its discriminated-union refinement, **Approach D**) would eliminate that overhead entirely. However:
- **Approach D modifies the public `ViewBufferValue` struct** — this is a breaking API change that would need careful review
- **Approach A** could potentially store the pooled `byte[]` page directly as the `object Value` to avoid boxing, since `byte[]` is a reference type, but a bare array reference cannot carry the offset and length of the byte range, so that metadata would need additional tracking

The dual-type approaches should be considered only if the simpler wrapper-based approach proves insufficient in benchmarks.

## Key Questions for Further Investigation

1. **Flush path interleaving**: During `ViewBuffer.WriteToAsync`, switching between `TextWriter` writes (for string entries) and direct `Stream` writes (for byte entries) requires flushing the intermediate `PagedBufferedTextWriter` → `HttpResponseStreamWriter` chain at each transition. How many transitions occur in a typical view, and what is the overhead of these intermediate flushes? (These flushes happen only inside `WriteToAsync`, that is, at the existing flush points. They do move buffered bytes from `HttpResponseStreamWriter`'s internal buffer to the stream so that ordering is preserved, but they introduce no new flush points during view execution.)

2. **`ViewBufferValue` as `public` type**: `ViewBufferValue` is a public struct. Approach D would change its layout. Is this type used by third-party libraries? Could we obsolete the current struct and introduce a new internal one?

3. **Benchmarking**: What is the actual performance benefit? Need to measure:
   - Current cost of UTF-16 string → UTF-8 encoding in `HttpResponseStreamWriter` for typical view HTML literals
   - Cost of proposed approach: copy span → pooled byte page → direct stream write during flush
   - Overhead of wrapper object allocation/pooling (Approach C/E) vs. zero-allocation discriminated union (Approach A/D)
   - Impact on layout page scenarios (buffering + re-writing)
   - Impact of intermediate buffer flushes during string/byte interleaving

4. **Tag Helpers, View Components, Partial Views**: These all participate in the `ViewBuffer` system. `WriteLiteral(ReadOnlySpan<byte>)` would only be called from Razor-compiled code, not from these components (which generate string-based HTML). No changes should be needed, but this should be verified.

5. **Compiler code generation**: Should the Razor compiler continue to emit `WriteLiteral("..."u8)` (inline `ReadOnlySpan<byte>`) or should it emit static `byte[]` fields to avoid the copy into pooled pages? The latter is a separate optimization that could be explored independently.

6. **API surface**: Should `IUtf8HtmlContent` (or similar) be a public interface to allow third-party components to also produce UTF-8 byte content, or should it remain internal?

## Files of Interest

| Type | Path | Notes |
|------|------|-------|
| RazorPageBase | `src/Mvc/Mvc.Razor/src/RazorPageBase.cs` | Add `WriteLiteral(ReadOnlySpan<byte>)` |
| ViewBuffer | `src/Mvc/Mvc.ViewFeatures/src/Buffers/ViewBuffer.cs` | Core buffering, may need byte support |
| ViewBufferValue | `src/Mvc/Mvc.ViewFeatures/src/Buffers/ViewBufferValue.cs` | Union type, currently string \| IHtmlContent |
| ViewBufferTextWriter | `src/Mvc/Mvc.ViewFeatures/src/Buffers/ViewBufferTextWriter.cs` | May need byte write method |
| ViewBufferPage | `src/Mvc/Mvc.ViewFeatures/src/Buffers/ViewBufferPage.cs` | Page storage |
| IViewBufferScope | `src/Mvc/Mvc.ViewFeatures/src/Buffers/IViewBufferScope.cs` | Pooling |
| MemoryPoolViewBufferScope | `src/Mvc/Mvc.ViewFeatures/src/Buffers/MemoryPoolViewBufferScope.cs` | Pooling impl |
| PagedBufferedTextWriter | `src/Mvc/Mvc.ViewFeatures/src/Buffers/PagedBufferedTextWriter.cs` | Intermediate buffer |
| RazorView | `src/Mvc/Mvc.Razor/src/RazorView.cs` | View execution, flush orchestration |
| ViewExecutor | `src/Mvc/Mvc.ViewFeatures/src/ViewExecutor.cs` | Creates response writer |
| HttpResponseStreamWriter | `src/Http/WebUtilities/src/HttpResponseStreamWriter.cs` | Final char→byte encoding |
| IHtmlContent | `src/Html.Abstractions/src/IHtmlContent.cs` | Core interface |
| HtmlString | `src/Html.Abstractions/src/HtmlString.cs` | Pre-encoded HTML string |

