[wip] emulated SMMU #3458
Draft
jstarks wants to merge 28 commits into
Replace the GICv2m MSI controller with KVM's in-kernel GICv3 ITS for aarch64 PCIe MSI/MSI-X delivery. GICv2m maps MSI writes to a fixed pool of 64 SPIs, which doesn't scale (a single NVMe device with 128 queues exhausts it) and is incompatible with the ITS-based device ID model needed for future SMMU support. The ITS routes MSIs via LPIs using (DeviceID, EventID) lookup, supporting thousands of interrupt vectors across all devices.

KVM provides a complete in-kernel ITS (KVM_DEV_TYPE_ARM_VGIC_ITS) that handles all guest MMIO and command queue processing. The VMM creates the device, sets its base address, and initializes it. For emulated devices, MSIs are injected via KVM_SIGNAL_MSI with KVM_MSI_VALID_DEVID. For irqfd (VFIO passthrough), the kvm_irq_routing_msi entry carries the devid so the kernel signals the ITS directly.

The main design challenge is that PCIe devices don't know their own requester ID (bus/device/function), since bus numbers are assigned dynamically by guest firmware. This is solved with a per-device AssignedBusRange that the PCIe port updates atomically when the guest programs secondary/subordinate bus numbers. ITS wrappers (ItsSignalMsi, ItsIrqFd) compose the full 32-bit device ID as (segment << 16 | BDF) at interrupt delivery time, transparent to the devices themselves.

The SignalMsi trait changes from `rid: u32` (always passed as 0) to `devid: Option<u32>`, and IrqFdRoute::enable gains a matching parameter. This is a mechanical change across all backends (KVM, WHP, MSHV, HVF).

Also adds ACPI IORT (IO Remapping Table) generation for aarch64, with ITS Group and PCI Root Complex nodes with ID mappings. The MADT gains a GIC ITS entry. DeviceTree generation emits an ITS child node under the GIC when ITS is configured, with msi-parent on PCIe host bridges pointing to the ITS phandle instead of v2m.

ITS support is probed at KVM init time via KVM_CREATE_DEVICE_TEST, falling back to GICv2m on kernels or hardware without ITS.
A --gic-msi CLI option (auto/its/v2m) allows overriding the default selection. GICv2m remains available for GICv2-only configurations.
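The device ID composition described above can be sketched as follows. This is a minimal illustration, not the PR's code: `BusRange`, its packed layout, and the `devfn` parameter are hypothetical stand-ins for `AssignedBusRange`, assuming the port publishes the guest-programmed bus numbers through a single atomic word.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for the PR's `AssignedBusRange`: the port publishes
/// the guest-programmed secondary/subordinate bus numbers in one atomic word.
pub struct BusRange {
    // Assumed layout: bit 16 = valid, bits 15:8 = secondary, bits 7:0 = subordinate.
    packed: AtomicU32,
}

const VALID: u32 = 1 << 16;

impl BusRange {
    pub fn new() -> Self {
        Self { packed: AtomicU32::new(0) }
    }

    /// Called by the port when the guest writes the bus number registers.
    pub fn set(&self, secondary: u8, subordinate: u8) {
        let v = VALID | (u32::from(secondary) << 8) | u32::from(subordinate);
        self.packed.store(v, Ordering::Release);
    }

    /// Composes `(segment << 16) | BDF` for a function at `devfn` on the
    /// secondary bus, or `None` if the guest has not assigned a bus yet.
    pub fn compose_device_id(&self, segment: u16, devfn: u8) -> Option<u32> {
        let v = self.packed.load(Ordering::Acquire);
        if v & VALID == 0 {
            return None;
        }
        let secondary = (v >> 8) & 0xff;
        let bdf = (secondary << 8) | u32::from(devfn);
        Some((u32::from(segment) << 16) | bdf)
    }
}
```

Because the ID is read at delivery time rather than cached, a guest that renumbers its buses is picked up on the next interrupt without the device being involved.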
Add the smmu crate with spec-derived type definitions for the Arm SMMUv3 architecture (IHI 0070). This is the foundation for the SMMU emulator; all subsequent sub-phases import these types.

The spec module contains:
- registers: MMIO register offsets and bitfield types (IDR0-5, CR0/CR0ACK, STRTAB_BASE, CMDQ/EVTQ base/prod/cons, IRQ_CTRL, GERROR, MSI config)
- commands: command queue entry format and per-command bitfield types (CFGI_STE, CFGI_CD, TLBI_*, CMD_SYNC with MSI completion)
- events: event queue entry with typed header/flags/address fields and convenience constructors for fault events
- ste: stream table entry with typed quadword fields (SteDw0, SteDw1) and SteConfig/S1Fmt/Strw enums
- cd: context descriptor with typed quadword fields (CdDw0, CdDw1) and Tg0/Ips enums with helper methods
- pt: AArch64 VMSAv8 stage 1 page table descriptor with AP/shareability enums and output address extraction for all granule sizes

All types use zerocopy derives for safe guest memory access and bitfield_struct for field-level access. 65 unit tests verify bitfield round-trips, spec constant values, and address encoding.
Unknown IPS or TG0 values from the guest indicate a malformed CD. Silently substituting a default would produce incorrect translations, so return Option instead and let callers generate C_BAD_CD fault events.
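A minimal sketch of the Option-returning decode, assuming the CD fields use the same encodings as TCR_ELx.TG0 and TCR_ELx.IPS. The names `ips_bits`, `decode_cd`, and `Fault` are hypothetical and only illustrate how a caller might turn `None` into a C_BAD_CD event.

```rust
/// Translation granule for TTB0, decoded from the 2-bit TG0 field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tg0 {
    Size4K,
    Size64K,
    Size16K,
}

impl Tg0 {
    /// Returns `None` for the reserved encoding instead of guessing.
    pub fn from_bits(bits: u8) -> Option<Self> {
        match bits {
            0b00 => Some(Tg0::Size4K),
            0b01 => Some(Tg0::Size64K),
            0b10 => Some(Tg0::Size16K),
            _ => None,
        }
    }
}

/// Maps the 3-bit IPS field to an output address width in bits.
pub fn ips_bits(ips: u8) -> Option<u8> {
    match ips {
        0b000 => Some(32),
        0b001 => Some(36),
        0b010 => Some(40),
        0b011 => Some(42),
        0b100 => Some(44),
        0b101 => Some(48),
        0b110 => Some(52),
        _ => None,
    }
}

/// Hypothetical fault type standing in for a C_BAD_CD event.
#[derive(Debug, PartialEq, Eq)]
pub enum Fault {
    BadCd,
}

/// Caller side: any undecodable field becomes a fault, never a default.
pub fn decode_cd(tg0: u8, ips: u8) -> Result<(Tg0, u8), Fault> {
    let tg0 = Tg0::from_bits(tg0).ok_or(Fault::BadCd)?;
    let oas = ips_bits(ips).ok_or(Fault::BadCd)?;
    Ok((tg0, oas))
}
```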
Implement SmmuDevice as a ChipsetDevice with MmioIntercept for the SMMUv3 register file. The device emulates a 128KB MMIO region (page 0 + page 1) with all registers needed for the Linux SMMUv3 driver's hw_probe and enable sequence. IDR registers report: S1P, AArch64 TTF, COHACC, ASID16, MSI, LE endian, linear stream table, 16-bit SIDSIZE, 40-bit OAS, 4K granule, SMMUv3.3. CR0/CR0ACK echo protocol with immediate acknowledge. IRQ_CTRL/IRQ_CTRLACK echo. GBPA with UPDATE bit auto-clear. GERROR/GERRORN toggle protocol. Stream table base, command queue base/prod/cons, event queue base/prod/cons, and per-queue MSI config registers all readable and writable. 15 unit tests covering IDR readback, CR0 enable sequence, STRTAB_BASE round-trip, IRQ_CTRL ACK, GBPA update bit, page 1 access, read-only register write rejection, MSI config registers, and invalid access sizes.
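The GERROR/GERRORN toggle protocol mentioned above can be illustrated with a small model. This is a sketch under the assumption that an error is pending whenever a bit differs between the two registers; `GlobalErrors` and its methods are hypothetical names, and `CMDQ_ERR` is assumed to be bit 0.

```rust
/// Hypothetical model of the GERROR/GERRORN pair: an error is pending
/// whenever the corresponding bit differs between the two registers.
#[derive(Default)]
pub struct GlobalErrors {
    gerror: u32,
    gerrorn: u32,
}

/// Assumed bit position of GERROR.CMDQ_ERR.
pub const CMDQ_ERR: u32 = 1 << 0;

impl GlobalErrors {
    /// Device side: raise an error by toggling its GERROR bit, unless the
    /// same error is already pending. Returns true if it was newly raised.
    pub fn raise(&mut self, bit: u32) -> bool {
        if self.pending() & bit != 0 {
            return false;
        }
        self.gerror ^= bit;
        true
    }

    /// Guest side: acknowledge by writing the observed GERROR value to GERRORN.
    pub fn write_gerrorn(&mut self, value: u32) {
        self.gerrorn = value;
    }

    pub fn gerror(&self) -> u32 {
        self.gerror
    }

    /// Bits that are active but not yet acknowledged.
    pub fn pending(&self) -> u32 {
        self.gerror ^ self.gerrorn
    }
}
```

Note that a second error of the same kind toggles the bit back to 0 in GERROR, which is why the guest compares the two registers rather than testing GERROR alone.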
Implement CMDQ consumption: when the guest writes CMDQ_PROD, the emulator processes all pending commands up to PROD, advancing CONS. Supported commands: PREFETCH_CFG, CFGI_STE, CFGI_STE_RANGE, CFGI_CD, CFGI_CD_ALL, TLBI_NH_ALL, TLBI_NH_ASID, TLBI_NH_VA, TLBI_NH_VAA, TLBI_S12_VMALL, TLBI_NSNH_ALL (all no-ops for now), and CMD_SYNC. CMD_SYNC with CS=SIG_IRQ writes MSI data to the MSI address in guest memory, which is how Linux detects command completion. Unknown opcodes trigger CERROR_ILL in CMDQ_CONS and toggle GERROR.CMDQ_ERR, halting further processing until the guest acknowledges the error. 7 new unit tests (87 total) covering basic consumption, MSI sync writes, queue wrapping, unknown opcode error handling, the Linux reset sequence, error-stops-processing semantics, and disabled-CMDQ behavior.
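The consumption loop can be sketched as below. This is an illustration rather than the emulator's code: the queue is reduced to one opcode byte per slot, `is_known` stands in for the opcode dispatch, and the pointer layout (index in the low `log2size` bits, wrap bit above it) is stated as an assumption in the comments.

```rust
/// Outcome of draining the queue up to PROD.
#[derive(Debug, PartialEq, Eq)]
pub enum Drain {
    Done,
    IllegalCommand,
}

/// Assumed pointer layout: bits [log2size-1:0] are the index, bit
/// `log2size` is the wrap bit. Advancing carries into the wrap bit.
pub fn advance(ptr: u32, log2size: u32) -> u32 {
    (ptr + 1) & ((1 << (log2size + 1)) - 1)
}

/// Processes entries from CONS up to PROD. `queue` holds one opcode per
/// slot (a simplification of the real 16-byte entries).
pub fn drain(
    queue: &[u8],
    log2size: u32,
    prod: u32,
    cons: &mut u32,
    is_known: impl Fn(u8) -> bool,
) -> Drain {
    let index_mask = (1u32 << log2size) - 1;
    while *cons != prod {
        let opcode = queue[(*cons & index_mask) as usize];
        if !is_known(opcode) {
            // Stop here: CONS keeps pointing at the offending entry until
            // the guest acknowledges the error.
            return Drain::IllegalCommand;
        }
        *cons = advance(*cons, log2size);
    }
    Drain::Done
}
```

Comparing the full pointers, wrap bit included, is what lets the loop distinguish an empty queue from a completely full one.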
Implement EVTQ write logic: write_event() appends a 32-byte event record to the guest's event queue, advances EVTQ_PROD, and fires the EVTQ MSI interrupt if enabled via IRQ_CTRL.EVENTQ_IRQEN. When the queue is full (PROD and CONS differ only in the wrap bit), toggles GERROR.EVTQ_ABT_ERR and drops the event. EVTQ_CONS on page 1 is updated by the guest to signal consumed events, freeing queue space. 4 new unit tests (91 total) covering event write and read-back, MSI signaling, queue full behavior, and CONS freeing space.
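A minimal sketch of the full-queue check and producer advance described above. `EventQueue` is a hypothetical in-memory model: real events are written to guest memory, and the `overflow` flag here merely stands in for toggling GERROR.EVTQ_ABT_ERR.

```rust
/// Hypothetical in-memory event queue with 32-byte records.
pub struct EventQueue {
    slots: Vec<[u8; 32]>,
    log2size: u32,
    pub prod: u32,
    pub cons: u32,
    /// Stand-in for toggling GERROR.EVTQ_ABT_ERR.
    pub overflow: bool,
}

impl EventQueue {
    pub fn new(log2size: u32) -> Self {
        Self {
            slots: vec![[0; 32]; 1 << log2size],
            log2size,
            prod: 0,
            cons: 0,
            overflow: false,
        }
    }

    /// Full means PROD and CONS have equal indices but differ in the wrap bit.
    fn full(&self) -> bool {
        (self.prod ^ self.cons) == (1 << self.log2size)
    }

    /// Appends an event, or drops it and records the overflow if full.
    pub fn write_event(&mut self, event: [u8; 32]) -> bool {
        if self.full() {
            self.overflow = true;
            return false;
        }
        let index = (self.prod & ((1 << self.log2size) - 1)) as usize;
        self.slots[index] = event;
        // Advance PROD, carrying into the wrap bit at the end of the queue.
        self.prod = (self.prod + 1) & ((1 << (self.log2size + 1)) - 1);
        true
    }
}
```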
Implement sub-phases 1E and 1F of the SMMU emulator plan:

1E: Stream table and context descriptor parsing
- lookup_ste(): reads STE from guest memory, validates V bit, checks SID range
- ste_config_action(): dispatches on STE.Config (abort/bypass/S1 translate)
- lookup_cd(): reads CD from guest memory, validates V and AA64 bits
- translation_context(): extracts page table parameters (TTB0, T0SZ, TG0, IPS)

1F: AArch64 VMSAv8 stage 1 page table walker
- walk_s1(): walks 4K-granule page tables from TTB0 through up to 4 levels
- Supports block descriptors (1GB at L1, 2MB at L2) and page descriptors (L3)
- Permission checking (AP bits for write access)
- Access flag checking (AF=0 produces F_ACCESS fault)
- Output address size checking against IPS/OAS
- Input IOVA range checking against T0SZ

14 new unit tests (120 total).
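The 4K-granule walk can be sketched as follows. This is a simplified illustration, not `walk_s1()` itself: guest memory is modeled as a hypothetical `HashMap` from physical address to descriptor, the start level is passed in directly, and the T0SZ range check and output address size check listed above are omitted.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum WalkFault {
    Translation,
    AccessFlag,
    Permission,
}

const VALID: u64 = 1 << 0;
const TABLE_OR_PAGE: u64 = 1 << 1;
const AP_READ_ONLY: u64 = 1 << 7; // AP[2]
const AF: u64 = 1 << 10;
const ADDR_MASK: u64 = 0x0000_ffff_ffff_f000;

/// Walks 4K-granule stage 1 tables. `mem` maps a descriptor's physical
/// address to its value; missing entries read as 0 (invalid).
pub fn walk_s1_4k(
    mem: &HashMap<u64, u64>,
    ttb0: u64,
    start_level: u32,
    iova: u64,
    write: bool,
) -> Result<u64, WalkFault> {
    let mut table = ttb0;
    for level in start_level..=3 {
        // Each level resolves 9 bits; level 3 sits just above the page offset.
        let shift = 12 + 9 * (3 - level);
        let index = (iova >> shift) & 0x1ff;
        let desc = *mem.get(&(table + index * 8)).unwrap_or(&0);
        if desc & VALID == 0 {
            return Err(WalkFault::Translation);
        }
        let is_table = desc & TABLE_OR_PAGE != 0;
        if level < 3 && is_table {
            table = desc & ADDR_MASK;
            continue;
        }
        // Level 3 requires the page encoding; blocks are only legal at L1/L2.
        if (level == 3 && !is_table) || level == 0 {
            return Err(WalkFault::Translation);
        }
        if desc & AF == 0 {
            return Err(WalkFault::AccessFlag);
        }
        if write && desc & AP_READ_ONLY != 0 {
            return Err(WalkFault::Permission);
        }
        let offset_mask = (1u64 << shift) - 1;
        return Ok((desc & ADDR_MASK & !offset_mask) | (iova & offset_mask));
    }
    Err(WalkFault::Translation)
}
```

The same `shift` value serves both the index extraction and the leaf offset mask, so a 2MB block at L2 keeps the low 21 bits of the IOVA while a page at L3 keeps the low 12.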
Add FromBytes, IntoBytes, Immutable, KnownLayout derives to PtDesc so it can be used directly with GuestMemory::read_plain() instead of reading a raw u64 and converting. Also fix a manual_div_ceil clippy lint.
Store IDR0, IDR1, IDR5 as their proper bitfield types instead of raw u32, and add Inspect derives to those types. Replace the bespoke inspect_u32_hex/inspect_u64_hex helpers with #[inspect(hex)] for the remaining raw integer fields.
Add SmmuSharedState, SmmuTranslatingMemory, and SmmuSignalMsi for per-device IOVA translation through the SMMUv3 emulator. SmmuSharedState holds the SMMU configuration (stream table base, enable state) behind an RwLock, allowing concurrent translations (read path) while register writes are exclusive. SmmuDevice creates and owns the shared state, syncing CR0.SMMUEN and STRTAB_BASE/CFG changes to it. SmmuTranslatingMemory implements GuestMemoryAccess with mapping()=None, routing all reads/writes through the SMMU page table walker. It derives the stream ID at translation time from AssignedBusRange (shared with the ITS wrappers and PCIe port). Page-crossing accesses are split at 4K boundaries with independent translations per page. SmmuSignalMsi translates the MSI address (which may be an IOVA when Linux maps doorbells via iommu_dma_prepare_msi) before forwarding to the inner SignalMsi target. Device identity (devid) is passed through unchanged. The factory method create_device_context() produces paired (GuestMemory, SmmuSignalMsi) wrappers for each PCI device behind the SMMU. 14 new tests covering translated read/write, page crossing, bypass, abort, unmapped fault, unassigned bus, disabled SMMU, MSI translation, MSI bypass, MSI fault, and devid passthrough. 134 total tests pass.
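The page-crossing split can be sketched as a small helper. `split_at_pages` is a hypothetical name; the idea is only that each returned chunk lies within one 4K page and can therefore be translated independently.

```rust
/// Splits an access of `len` bytes at `addr` into chunks that never cross
/// a 4K boundary, so each chunk can be translated on its own.
pub fn split_at_pages(addr: u64, len: usize) -> Vec<(u64, usize)> {
    const PAGE: u64 = 4096;
    let mut chunks = Vec::new();
    let mut addr = addr;
    let mut remaining = len;
    while remaining > 0 {
        // Bytes left before the next page boundary (a full page if aligned).
        let to_boundary = (PAGE - (addr % PAGE)) as usize;
        let n = remaining.min(to_boundary);
        chunks.push((addr, n));
        addr += n as u64;
        remaining -= n;
    }
    chunks
}
```

An 8-byte read at IOVA 0xffc, for example, becomes two 4-byte accesses, and the second may map to an unrelated physical page or fault independently of the first.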
Add IortSmmuV3 struct to acpi_spec for the IORT SMMUv3 node (type 0x04, revision 4). Extend the IORT builder to optionally insert an SMMUv3 node between PCI root complexes and the ITS group when smmu_base is configured on AcpiTablesBuilder. When SMMU is present: RC → SMMUv3 → ITS Group. The SMMU node has COHACC set, generic model, zero GSIVs (MSI mode). Each RC's ID mapping targets the SMMUv3 node with (segment << 16) output_base. The SMMUv3 node's ID mapping targets the ITS group with a full identity map. When SMMU is not configured, the existing RC → ITS Group path is unchanged. Add DEFAULT_SMMU_BASE (0xEFFA_0000) address constant below the ITS region. Four new IORT tests: smmu+its topology, multi-RC with SMMU, no-SMMU regression, and SMMUv3 node field verification.
Move smmu_base from AcpiTablesBuilder (arch-neutral) into AcpiArchConfig::Aarch64 as smmu_bases: Vec<u64>. This correctly scopes SMMU configuration to aarch64 only, and supports multiple SMMU instances (each with its own MMIO base). The IORT builder creates one SMMUv3 node per entry. Currently all root complexes map to the first SMMU. Per-RC SMMU assignment can be added when needed. x86 construction sites no longer need to specify smmu_base: None.
Replace smmu_bases: Vec<u64> with smmu_base: Option<u64> in AcpiArchConfig::Aarch64. Multiple SMMUs can be added when per-RC SMMU assignment is actually needed.
Add --smmu CLI flag and wire SmmuDevice into the aarch64 chipset. When enabled, each PCIe device gets SmmuTranslatingMemory for DMA and SmmuSignalMsi for MSI address translation. The IORT table includes the SMMUv3 node in the RC→SMMUv3→ITS chain. SmmuDevice gets ChangeDeviceState and SaveRestore (not-supported) impls required by VmmChipsetDevice.
…ase 1J.1)
Add a comprehensive integration-style unit test that exercises the complete SMMU stack through the MMIO interface, mimicking the Linux SMMUv3 driver initialization sequence:

1. Probe: read IDR registers, verify feature bits
2. Reset: disable SMMU, program CR1, stream table, CMDQ/EVTQ, enable each subsystem in sequence (CMDQEN → EVTQEN → SMMUEN)
3. Command queue: issue CFGI_ALL, TLBI_NSNH_ALL, CFGI_STE, CFGI_CD, each followed by CMD_SYNC with MSI completion signaling
4. Attach: configure STE (S1_TRANS) and CD in guest memory, build a 3-level AArch64 4K page table hierarchy
5. DMA: read/write through SmmuTranslatingMemory at translated IOVAs
6. MSI: fire MSI through SmmuSignalMsi with IOVA-mapped doorbell page, verify address translation with intra-page offset
7. Fault: access unmapped IOVA, verify translation fault event in EVTQ with correct event type, stream ID, and faulting address
A guest can program CMDQ_BASE.LOG2SIZE and EVTQ_BASE.LOG2SIZE with values larger than what IDR1.CMDQS/EVENTQS advertise. Without bounds checking, this allows a malicious guest to force the SMMU to iterate over an excessively large command queue on each CMDQ_PROD write. Clamp the effective log2size to the IDR1-advertised maximum. This matches the CONSTRAINED UNPREDICTABLE behavior real hardware uses for out-of-range queue sizes.
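The clamp amounts to taking the minimum of the guest-programmed value and the advertised maximum. A minimal sketch, with hypothetical function names:

```rust
/// Clamps the guest-programmed LOG2SIZE to the maximum advertised in IDR1
/// (CMDQS for the command queue, EVENTQS for the event queue).
pub fn effective_log2size(requested: u8, idr1_max: u8) -> u8 {
    requested.min(idr1_max)
}

/// Number of entries the emulator will actually iterate over.
pub fn queue_entries(requested: u8, idr1_max: u8) -> u32 {
    1u32 << effective_log2size(requested, idr1_max)
}
```

With an advertised maximum of 8, a guest that programs the largest 5-bit value (31) still gets a 256-entry queue rather than a 2^31-entry one.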
Hot-plugged PCIe devices were bypassing the SMMU, receiving raw GuestMemory and only ITS-wrapped SignalMsi. Since the IORT advertises the root complex behind the SMMU, the guest programs IOVA mappings for hot-plugged devices too, and DMA with untranslated IOVAs would fail. Store the Arc<SmmuSharedState> in LoadedVmInner (previously it was a local variable used only during boot) and use it in the AddPcieDevice handler to wrap the device's GuestMemory and SignalMsi with SmmuTranslatingMemory and SmmuSignalMsi, matching the static device path.
Switch the SMMU from MSI-based interrupt delivery (IDR0.MSI=1) to wired SPI interrupts (IDR0.MSI=0), matching QEMU's approach. The previous MSI implementation used guest_memory.write_at() which silently fails for MMIO addresses like ITS doorbells, so EVTQ and GERROR interrupts were never delivered. With wired SPIs, the SMMU device gets LineInterrupt objects for EVTQ (SPI 35) and GERROR (SPI 36) from the chipset builder, and pulses them directly. CMD_SYNC completion continues to use the guest RAM polling path (MSIWrite), which is the mechanism Linux uses when IDR0.MSI=0. The IORT SMMUv3 node now carries populated GSIVs for the event and gerror interrupts. The device_id_mapping with DEVICEID_VALID is retained for ITS configurations because Linux's IORT MSI domain resolution requires it for the RC-to-SMMUv3-to-ITS node traversal. Also add SMMUv3 to the aarch64 device tree (FDT) for non-ACPI boot. The DT node uses the arm,smmu-v3 compatible string with eventq/gerror interrupt entries, and PCIe host bridge nodes get iommu-map properties linking RIDs to SMMU stream IDs.
When a device behind the SMMU programs its MSI-X table, the MSI address is an IOVA (the guest's IOMMU driver maps the doorbell page into the device's IOVA space). The irqfd path programs this address directly into the kernel's MSI routing table, bypassing SMMU translation entirely. This means the kernel route would be configured with an untranslated IOVA instead of the physical GIC/ITS address. Add SmmuIrqFd and SmmuIrqFdRoute wrappers that translate the MSI address through the SMMU page tables on IrqFdRoute::enable(), before forwarding to the inner irqfd route (which may itself be an ITS wrapper). The composition order is SmmuIrqFd(ItsIrqFd(partition.irqfd())), matching the userspace SmmuSignalMsi(ItsSignalMsi(...)) chain.
…pping
Phase 2 foundation: decouple SMMU stream IDs from PCI segment numbers and make translation stage support configurable.

- Add SmmuFeatures struct with s1_supported/s2_supported flags that control IDR0.S1P and IDR0.S2P advertisement to the guest. This allows creating S2-only SMMUs for VFIO scenarios.
- Replace segment: u16 with stream_id_base: u32 in all per-device SMMU wrappers (SmmuTranslatingMemory, SmmuSignalMsi, SmmuIrqFd, SmmuIrqFdRoute). The SMMU-local stream ID is now computed as stream_id_base + bdf, where stream_id_base comes from the IORT ID_MAPPING output_base for the root complex. This decouples SMMU stream table indexing from PCI segment numbers, enabling multiple root complexes to map into different regions of a single SMMU's stream table.
- Add AssignedBusRange::compose_stream_id() method for SMMU-specific stream ID composition (parallel to compose_device_id for ITS).
- Remove the segment==0 restriction in dispatch.rs; all segments behind the SMMU now get translation wrappers, not just segment 0.
- Add tests for non-zero stream_id_base translation, IDR0 feature bit configuration (S1-only, S1+S2, S2-only), and compose_stream_id.
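The stream ID arithmetic can be illustrated with a free function. This is a sketch only: the real `compose_stream_id()` is a method on `AssignedBusRange`, and the signature and the overflow handling shown here are assumptions.

```rust
/// Computes the SMMU-local stream ID as `stream_id_base + bdf`.
/// Returns `None` on overflow rather than wrapping (an assumption; the
/// PR does not say how overflow is handled).
pub fn compose_stream_id(stream_id_base: u32, bus: u8, devfn: u8) -> Option<u32> {
    let bdf = (u32::from(bus) << 8) | u32::from(devfn);
    stream_id_base.checked_add(bdf)
}
```

Two root complexes with bases 0 and 0x1_0000 then occupy disjoint 64K-entry regions of one stream table, even if both use the same bus numbers.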
Replace the single --smmu bool with a repeatable --smmu <rc-name> CLI arg. Each invocation creates an SMMU instance covering the named PCIe root complex. The internal config uses SmmuInstanceConfig with rc_name instead of segment, and the dispatch wiring builds per-port SMMU lookup maps from the root complex topology. IORT generation emits one SMMUv3 node per instance. Each root complex's ID mapping points to its SMMU (if configured) or directly to the ITS. Device tree generation similarly emits per-SMMU nodes with per-RC iommu-map entries. The hotplug path uses pcie_rc_names (parallel to pcie_host_bridges) to look up the SMMU shared state for dynamically added devices. SPI allocation for SMMU interrupts: instance N uses vectors (3+N*2) and (4+N*2), giving each SMMU its own event queue and global error SPIs.
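The SPI allocation formula can be written out as below. The names are hypothetical, and the `spi_to_intid` helper rests on an assumption: that the "SPI 35" and "SPI 36" quoted in the wired-interrupt commit are GIC INTIDs, which start at 32 for SPIs, so instance 0's vectors 3 and 4 correspond to them.

```rust
/// Per-instance interrupt vectors, following the formula in the text:
/// instance N uses (3 + N*2) for the event queue and (4 + N*2) for GERROR.
#[derive(Debug, PartialEq, Eq)]
pub struct SmmuSpis {
    pub evtq: u32,
    pub gerror: u32,
}

pub fn smmu_spis(instance: u32) -> SmmuSpis {
    SmmuSpis {
        evtq: 3 + instance * 2,
        gerror: 4 + instance * 2,
    }
}

/// GIC SPIs occupy INTIDs starting at 32. Treating the vectors above as
/// SPI indices is an assumption made for this illustration.
pub fn spi_to_intid(spi: u32) -> u32 {
    32 + spi
}
```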
VFIO assigned devices bypass the emulated SMMU's stage 1 page tables because the host IOMMU uses its own tables (VFIO type1 identity mapping). Until iommufd nested translation is available, block VFIO devices on root complexes covered by an S1-capable SMMU at configuration time with a clear diagnostic. A set of port names behind S1-capable SMMUs is built during chipset construction. Before resolving each PCIe device, the resource ID is checked — if it is "vfio" and the port is in the S1 set, an error is returned advising the user to move the device, use S2-only mode, or enable iommufd.
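The configuration-time check can be sketched as follows. `check_vfio_placement`, its parameters, and the error text are hypothetical; only the rule itself (reject a "vfio" resource on a port in the S1 set) comes from the description above.

```rust
use std::collections::HashSet;

/// Rejects a VFIO device placed on a port behind an S1-capable SMMU.
/// `s1_ports` is the set of port names built during chipset construction.
pub fn check_vfio_placement(
    s1_ports: &HashSet<String>,
    port: &str,
    resource_id: &str,
) -> Result<(), String> {
    if resource_id == "vfio" && s1_ports.contains(port) {
        return Err(format!(
            "VFIO device on port '{port}' is behind an S1-capable SMMU; \
             move the device, use S2-only mode, or enable iommufd"
        ));
    }
    Ok(())
}
```

Running the check before the device is resolved means a misconfiguration fails at startup with a diagnostic instead of surfacing later as DMA that silently bypasses the guest's page tables.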
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR adds support for an emulated SMMUv3 on aarch64 and updates PCIe MSI routing to support GICv3 ITS (device-id based routing) in addition to GICv2m.
Changes:
- Introduces SMMUv3 emulation (spec types + translation logic) and plumbs per-device bus-range identity to support ITS/SMMU requester/device ID composition.
- Adds ACPI IORT generation (and DT `iommu-map`) for PCIe interrupt/DMA remapping; adds MADT ITS entry and backend ITS capability detection (KVM).
- Updates MSI/irqfd plumbing to carry an optional device identity (`devid`) end-to-end.
Reviewed changes
Copilot reviewed 70 out of 71 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| vmm_core/vmotherboard/src/lib.rs | Re-exports PCIe bus-range identity type for consumers. |
| vmm_core/vmotherboard/src/chipset/builder/mod.rs | Threads optional PCIe device identity through builder registrations. |
| vmm_core/vmotherboard/src/chipset/backing/arc_mutex/services.rs | Extends PCIe registration API to accept optional device identity. |
| vmm_core/vmotherboard/src/chipset/backing/arc_mutex/pci.rs | Stores and forwards optional device identity during PCIe bus resolution. |
| vmm_core/vmotherboard/src/chipset/backing/arc_mutex/device.rs | Adds builder hook to attach PCIe bus-range identity to devices. |
| vmm_core/vmotherboard/src/base_chipset.rs | Forwards optional device identity into PCIe enumerator device attach. |
| vmm_core/virt_whp/src/synic.rs | Adapts SignalMsi signature to optional device identity. |
| vmm_core/virt_whp/src/lib.rs | Switches to topology-provided MSI controller config; advertises ITS support=false. |
| vmm_core/virt_whp/src/device.rs | Adapts SignalMsi signature to optional device identity. |
| vmm_core/virt_mshv/src/x86_64/mod.rs | Adapts SignalMsi signature to optional device identity. |
| vmm_core/virt_mshv/src/irqfd.rs | Adapts IrqFdRoute::enable signature to accept optional device identity. |
| vmm_core/virt_mshv/src/aarch64/mod.rs | Adapts MSI signaling for new SignalMsi API; advertises ITS support=false. |
| vmm_core/virt_kvm/src/lib.rs | Stores MSI controller config and ITS device FD; prepares KVM backend for ITS. |
| vmm_core/virt_kvm/src/gsi.rs | Plumbs optional devid into KVM irq routing builder path. |
| vmm_core/virt_kvm/src/arch/x86_64/mod.rs | Sets devid=None for x86 MSI routes; adapts SignalMsi signature. |
| vmm_core/virt_kvm/src/arch/aarch64/mod.rs | Probes ITS support, creates in-kernel ITS, adds ITS irqfd/MSI routing support. |
| vmm_core/virt_hvf/src/lib.rs | Advertises ITS support=false. |
| vmm_core/virt/src/x86/apic_software_device.rs | Adapts MSI forwarding to new SignalMsi API. |
| vmm_core/virt/src/generic.rs | Extends PlatformInfo with ITS capability and adapts SignalMsi signature. |
| vmm_core/virt/src/aarch64/gic_v2m.rs | Adapts SignalMsi signature to optional device identity. |
| vmm_core/virt/src/aarch64/gic_software_device.rs | Adapts SignalMsi signature to optional device identity. |
| vmm_core/src/device_builder.rs | Accepts per-device bus-range identity and passes into PCIe device builder. |
| vmm_core/src/acpi_builder.rs | Adds IORT construction + SMMU config, MADT ITS entries, and extensive tests. |
| vm/vmcore/vm_topology/src/processor/aarch64.rs | Replaces gic_v2m with gic_msi controller enum (None/V2m/Its). |
| vm/vmcore/src/irqfd.rs | Extends irqfd route enable API with optional device identity. |
| vm/kvm/src/lib.rs | Adds MSI route devid support and propagates flags into KVM irq routing. |
| vm/devices/virtio/virtio/src/transport/core.rs | Forces access_platform feature bit for virtio devices behind an IOMMU. |
| vm/devices/user_driver_emulated_mock/src/lib.rs | Updates MSI controller mock to ignore device identity. |
| vm/devices/storage/nvme_test/src/tests/test_helpers.rs | Updates MSI test helper to new SignalMsi signature. |
| vm/devices/storage/nvme/src/tests/test_helpers.rs | Updates MSI test helper to new SignalMsi signature. |
| vm/devices/pci/vpci/src/test_helpers/mod.rs | Updates MSI test helper to new SignalMsi signature. |
| vm/devices/pci/pcie/src/switch.rs | Uses port-side-effecting cfg write path; plumbs optional bus-range identity. |
| vm/devices/pci/pcie/src/root.rs | Plumbs optional bus-range identity, ensures port tracks bus-range on cfg writes, adds tests. |
| vm/devices/pci/pcie/src/port.rs | Adds shared assigned-bus-range tracking and cfg-write side effects. |
| vm/devices/pci/pcie/src/lib.rs | Exposes new bus_range + its modules. |
| vm/devices/pci/pcie/src/its.rs | Adds ITS wrappers for SignalMsi and IrqFd that inject device IDs. |
| vm/devices/pci/pcie/src/bus_range.rs | Adds shared atomic bus-range tracking and device/stream ID composition helpers. |
| vm/devices/pci/pcie/fuzz/fuzz_pcie.rs | Updates fuzz harness for new PCIe add-device signature. |
| vm/devices/pci/pcie/Cargo.toml | Adds pal_event dependency for irqfd route wrapper event access. |
| vm/devices/pci/pci_core/src/test_helpers/mod.rs | Updates MSI test helper to new SignalMsi signature. |
| vm/devices/pci/pci_core/src/msi.rs | Updates SignalMsi API; adds route/target helpers to pass optional device identity. |
| vm/devices/pci/pci_core/src/capabilities/msix.rs | Updates MSI-X delivery to new MsiTarget API. |
| vm/devices/iommu/smmu/src/translate.rs | Adds SMMUv3 STE/CD lookup and stage-1 page table walker + tests. |
| vm/devices/iommu/smmu/src/spec/ste.rs | Adds SMMUv3 STE layout/types + tests. |
| vm/devices/iommu/smmu/src/spec/registers.rs | Adds SMMUv3 register offsets/bitfields + tests. |
| vm/devices/iommu/smmu/src/spec/pt.rs | Adds AArch64 stage-1 page table descriptor helpers + tests. |
| vm/devices/iommu/smmu/src/spec/mod.rs | Exposes SMMU spec modules. |
| vm/devices/iommu/smmu/src/spec/events.rs | Adds SMMU event queue entry types + constructors + tests. |
| vm/devices/iommu/smmu/src/spec/commands.rs | Adds SMMU command queue entry types + helpers + tests. |
| vm/devices/iommu/smmu/src/spec/cd.rs | Adds SMMU context descriptor layout/types + tests. |
| vm/devices/iommu/smmu/src/lib.rs | Introduces new smmu crate module surface. |
| vm/devices/iommu/smmu/Cargo.toml | Adds new smmu crate definition + dependencies. |
| vm/acpi_spec/src/madt.rs | Adds MADT GIC ITS structure support. |
| vm/acpi_spec/src/lib.rs | Exposes new ACPI IORT module. |
| vm/acpi_spec/src/iort.rs | Adds IORT node/mapping structures used by ACPI builder. |
| tmk/tmk_vmm/src/run.rs | Updates aarch64 platform config to use gic_msi. |
| openvmm/openvmm_entry/src/lib.rs | Adds CLI/config wiring for GIC MSI controller selection and SMMU instances. |
| openvmm/openvmm_entry/src/cli_args.rs | Adds --gic-msi and --smmu CLI flags for aarch64. |
| openvmm/openvmm_defs/src/config.rs | Adds defaults for ITS/SMMU MMIO layout and SMMU/GIC MSI config structs. |
| openvmm/openvmm_core/src/worker/vm_loaders/linux.rs | Builds DT with ITS and SMMU nodes + iommu-map; passes SMMU configs. |
| openvmm/openvmm_core/src/worker/dispatch.rs | Selects ITS vs v2m, instantiates SMMU devices, wraps per-device MSI/irqfd/memory. |
| openvmm/openvmm_core/Cargo.toml | Adds smmu dependency to OpenVMM core. |
| openhcl/virt_mshv_vtl/src/lib.rs | Updates SignalMsi implementation signature. |
| openhcl/underhill_core/src/loader/mod.rs | Extends loader config to include (placeholder) SMMU base field. |
| openhcl/bootloader_fdt_parser/src/lib.rs | Updates parsed platform config to use gic_msi. |
| Guide/src/reference/emulated/pcie/overview.md | Documents aarch64 MSI routing via ITS vs v2m and the new CLI flag. |
| Guide/src/reference/devices/firmware/linux_direct.md | Updates docs to mention ITS/IORT in ACPI mode for PCIe routing. |
| Cargo.toml | Adds new workspace crate smmu. |
Comments suppressed due to low confidence (4)
vmm_core/src/acpi_builder.rs:1
- The IORT RC mapping logic uses a global `rc_mapping_count` and defaults an unmapped RC to `its_group_offset` even when there is no ITS. If `has_smmu == true` and `has_its == false` (and not every RC is covered by an SMMU), RCs without an SMMU will incorrectly map to offset `IORT_NODE_OFFSET` (which will be the first SMMU node), effectively claiming they are behind the wrong SMMU. Fix by computing the mapping count and target per root complex: emit an RC ID mapping only if that RC has an SMMU offset, or if an ITS is actually present; otherwise set that RC node's `mapping_count` to 0 and append no `IortIdMapping` entry.
vmm_core/src/acpi_builder.rs:1
- The test suite exercises IORT generation with ITS and with SMMU+ITS, but doesn't cover the important configuration where `has_smmu == true` and `has_its == false` (including the case where only a subset of RCs are covered by SMMUs). Adding tests for "SMMU without ITS" and "partial RC coverage" would catch incorrect RC mapping counts/targets (and would have exposed the current incorrect `unwrap_or(its_group_offset)` fallback when no ITS exists).
vm/devices/pci/pci_core/src/capabilities/msix.rs:217
- With the new optional `devid` plumbing intended for ITS routing, this MSI-X delivery path always signals with `devid=None`, which prevents identifying the correct PCI function for multi-function devices (where ITS device ID must include the function number). If multi-function endpoints are in scope for ITS mode, consider extending the MSI-X interrupt target state to carry the function's BDF (or RID) and signaling with `signal_msi_with_rid(...)` (or passing `Some(bdf)` down to the ITS wrapper) so the composed ITS device ID is accurate.

    fn deliver(&self) {
        let mut state = self.0.lock();
        if state.enabled {
            state.target.signal_msi(state.address, state.data);
        } else {
            state.pending = true;
        }
    }
Comment on lines +502 to +503

    // through its SMMU instance.
    node = node.add_u32_array(p_iommu_map, &[0, *phandle, 0, 0x10000])?;
Comment on lines +206 to +225

    fn compute_start_level(tg0: Tg0, t0sz: u8) -> Option<(u8, u8)> {
        let va_bits = 64u8.checked_sub(t0sz)?;
        let bits_per_level = tg0.bits_per_level()?;
        let page_shift = tg0.page_shift()?;

        // Number of address bits resolved by the page table walk (excluding page
        // offset). For 4K/9 bits per level: va_bits - 12 bits are resolved by
        // the walk.
        let resolve_bits = va_bits.checked_sub(page_shift)?;

        // Number of full levels needed = ceil(resolve_bits / bits_per_level).
        // Start level = 4 - num_levels (levels are numbered 0..3).
        let num_levels = resolve_bits.div_ceil(bits_per_level);
        if num_levels > 4 {
            return None;
        }
        let start_level = 4 - num_levels;

        Some((start_level, va_bits))
    }
Comment on lines 176 to 179

      if state.pending {
    -     state.target.signal_msi(0, address, data);
    +     state.target.signal_msi(address, data);
          state.pending = false;
      }
This PR modifies files containing `unsafe` Rust code. For more on why we check whole files, instead of just diffs, check out the Rustonomicon.