pcie: add GICv3 ITS support for aarch64 MSI delivery #3441

Open
jstarks wants to merge 8 commits into microsoft:main from jstarks:its

Member

@jstarks jstarks commented May 8, 2026

Replace the GICv2m MSI controller with KVM's in-kernel GICv3 ITS for aarch64 PCIe MSI/MSI-X delivery. GICv2m maps MSI writes to a fixed pool of 64 SPIs, which doesn't scale (a single NVMe device with 128 queues exhausts it) and is incompatible with the ITS-based device ID model needed for future SMMU support. The ITS routes MSIs via LPIs using (DeviceID, EventID) lookup, supporting thousands of interrupt vectors across all devices.

KVM provides a complete in-kernel ITS (KVM_DEV_TYPE_ARM_VGIC_ITS) that handles all guest MMIO and command queue processing. The VMM creates the device, sets its base address, and initializes it. For emulated devices, MSIs are injected via KVM_SIGNAL_MSI with KVM_MSI_VALID_DEVID. For irqfd (VFIO passthrough), the kvm_irq_routing_msi entry carries the devid so the kernel signals the ITS directly.
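As a rough sketch of the injection path for emulated devices, the payload handed to KVM_SIGNAL_MSI could be built as follows. The struct mirrors `struct kvm_msi` from the Linux UAPI headers; the `KvmMsi` type name and the `build_its_msi` helper are illustrative and not taken from this PR, and the ioctl call itself is left out.

```rust
/// Mirrors `struct kvm_msi` from the Linux UAPI headers (32 bytes).
#[repr(C)]
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct KvmMsi {
    pub address_lo: u32,
    pub address_hi: u32,
    pub data: u32,
    pub flags: u32,
    pub devid: u32,
    pub pad: [u8; 12],
}

/// Tells KVM that the `devid` field carries a valid device ID.
pub const KVM_MSI_VALID_DEVID: u32 = 1 << 0;

/// Hypothetical helper: fills in a `KvmMsi` for an MSI write.
/// With `Some(devid)` the ITS can look up (DeviceID, EventID);
/// with `None` the flag is left clear and `devid` is zero.
pub fn build_its_msi(address: u64, data: u32, devid: Option<u32>) -> KvmMsi {
    let (flags, devid) = match devid {
        Some(id) => (KVM_MSI_VALID_DEVID, id),
        None => (0, 0),
    };
    KvmMsi {
        address_lo: address as u32,         // low 32 bits of the doorbell address
        address_hi: (address >> 32) as u32, // high 32 bits
        data,                               // EventID for an ITS-targeted MSI
        flags,
        devid,
        pad: [0; 12],
    }
}
```

The resulting struct would then be passed by pointer to the KVM_SIGNAL_MSI ioctl on the VM file descriptor.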

The main design challenge is that PCIe devices don't know their own requester ID (bus/device/function), since bus numbers are assigned dynamically by guest firmware. This is solved with a per-device AssignedBusRange that the PCIe port updates atomically when the guest programs secondary/subordinate bus numbers. ITS wrappers (ItsSignalMsi, ItsIrqFd) compose the full 32-bit device ID as (segment << 16 | BDF) at interrupt delivery time, transparent to the devices themselves.
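To make the device ID arithmetic concrete, here is a minimal sketch of composing (segment << 16 | BDF). The `BusRange` snapshot type and both helper functions are hypothetical names, and the sketch assumes the endpoint sits on the port's secondary bus; the PR's actual helper in bus_range.rs may differ.

```rust
/// Hypothetical snapshot of the bus numbers a port has been programmed with.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BusRange {
    pub secondary: u8,
    pub subordinate: u8,
}

/// Packs bus/device/function into a 16-bit requester ID:
/// bus in bits 15:8, device in bits 7:3, function in bits 2:0.
pub fn bdf(bus: u8, device: u8, function: u8) -> u16 {
    ((bus as u16) << 8) | (((device & 0x1f) as u16) << 3) | ((function & 0x7) as u16)
}

/// Composes the 32-bit ITS device ID as (segment << 16) | BDF.
/// Assumption: the device lives on the port's secondary bus.
pub fn its_devid(segment: u16, range: BusRange, device: u8, function: u8) -> u32 {
    ((segment as u32) << 16) | bdf(range.secondary, device, function) as u32
}
```

For example, a device at device 0, function 0 behind a port whose secondary bus is 3 on segment 1 would get device ID 0x0001_0300.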

The SignalMsi trait changes from rid: u32 (always passed as 0) to devid: Option<u32>, and IrqFdRoute::enable gains a matching parameter. This is a mechanical change across all backends (KVM, WHP, MSHV, HVF).
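The shape of the new trait, and the way a wrapper can supply the device ID on a device's behalf, might look roughly like this. The `signal_msi` signature matches the one quoted in the review excerpts; the `Send + Sync` bound, the `WithDevid` wrapper, and the `Recorder` test double are assumptions made for illustration. The real wrappers compute the ID from the bus range at delivery time rather than storing a fixed value.

```rust
use std::sync::{Arc, Mutex};

/// Sketch of the trait after the change: `devid` replaces `rid: u32`.
pub trait SignalMsi: Send + Sync {
    fn signal_msi(&self, devid: Option<u32>, address: u64, data: u32);
}

/// Hypothetical wrapper that injects a device ID before forwarding.
pub struct WithDevid {
    inner: Arc<dyn SignalMsi>,
    devid: u32,
}

impl WithDevid {
    pub fn new(inner: Arc<dyn SignalMsi>, devid: u32) -> Self {
        Self { inner, devid }
    }
}

impl SignalMsi for WithDevid {
    fn signal_msi(&self, _devid: Option<u32>, address: u64, data: u32) {
        // The device passes no ID; the wrapper supplies it.
        self.inner.signal_msi(Some(self.devid), address, data);
    }
}

/// Test double that records every call it receives.
#[derive(Default)]
pub struct Recorder(pub Mutex<Vec<(Option<u32>, u64, u32)>>);

impl SignalMsi for Recorder {
    fn signal_msi(&self, devid: Option<u32>, address: u64, data: u32) {
        self.0.lock().unwrap().push((devid, address, data));
    }
}
```

Backends that have no use for a device ID can simply ignore the first argument, which is why the change is mechanical for them.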

Also adds ACPI IORT (IO Remapping Table) generation for aarch64, with ITS Group and PCI Root Complex nodes with ID mappings. The MADT gains a GIC ITS entry. DeviceTree generation emits an ITS child node under the GIC when ITS is configured, with msi-parent on PCIe host bridges pointing to the ITS phandle instead of v2m.
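For readers unfamiliar with IORT ID mappings, the sketch below models a single mapping entry and how it translates a PCI requester ID into an ITS device ID. The field layout follows the IORT ID mapping format (the count field holds the number of IDs minus one); the type and function names are hypothetical, and the assumption that each root complex maps its full 16-bit RID space to `segment << 16` is inferred from the device ID scheme described above rather than taken from the PR's code.

```rust
/// One IORT ID mapping entry (five 32-bit fields, 20 bytes).
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IortIdMapping {
    pub input_base: u32,
    /// Number of IDs in the range, minus one.
    pub id_count: u32,
    pub output_base: u32,
    /// Offset of the target node (here, the ITS Group) within the table.
    pub output_reference: u32,
    pub flags: u32,
}

impl IortIdMapping {
    /// Translates an input ID, or returns `None` if it is out of range.
    pub fn translate(&self, rid: u32) -> Option<u32> {
        let offset = rid.checked_sub(self.input_base)?;
        if offset <= self.id_count {
            self.output_base.checked_add(offset)
        } else {
            None
        }
    }
}

/// Hypothetical helper: maps all 65536 RIDs of a segment to
/// device IDs of the form (segment << 16) | RID.
pub fn mapping_for_segment(segment: u16, its_group_offset: u32) -> IortIdMapping {
    IortIdMapping {
        input_base: 0,
        id_count: 0xFFFF,
        output_base: (segment as u32) << 16,
        output_reference: its_group_offset,
        flags: 0,
    }
}
```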

ITS support is probed at KVM init time via KVM_CREATE_DEVICE_TEST, falling back to GICv2m on kernels or hardware without ITS. A --gic-msi CLI option (auto/its/v2m) allows overriding the default selection. GICv2m remains available for GICv2-only configurations.
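The selection logic can be sketched as a small pure function. The enum and function names here are hypothetical stand-ins for the PR's GicMsiConfig and GicMsiController types, and returning an error when `its` is forced on a host without ITS support is an assumption; the PR may handle that case differently.

```rust
/// Hypothetical mirror of the --gic-msi CLI values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GicMsiChoice {
    Auto,
    Its,
    V2m,
}

/// Hypothetical result of the selection.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MsiController {
    Its,
    V2m,
}

/// Picks the MSI controller from the CLI choice and the probe result
/// (for KVM, the outcome of the KVM_CREATE_DEVICE_TEST probe).
pub fn select_msi_controller(
    choice: GicMsiChoice,
    its_supported: bool,
) -> Result<MsiController, String> {
    match (choice, its_supported) {
        // auto: prefer ITS, fall back to GICv2m.
        (GicMsiChoice::Auto, true) => Ok(MsiController::Its),
        (GicMsiChoice::Auto, false) => Ok(MsiController::V2m),
        // Explicit ITS on an unsupported host is reported as an error
        // (assumed behavior).
        (GicMsiChoice::Its, true) => Ok(MsiController::Its),
        (GicMsiChoice::Its, false) => {
            Err("ITS requested but not supported by this host".to_string())
        }
        // Explicit v2m always wins.
        (GicMsiChoice::V2m, _) => Ok(MsiController::V2m),
    }
}
```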

Copilot AI review requested due to automatic review settings May 8, 2026 16:45
@github-actions github-actions Bot added the Guide and unsafe (Related to unsafe code) labels May 8, 2026

github-actions Bot commented May 8, 2026

⚠️ Unsafe Code Detected

This PR modifies files containing unsafe Rust code. Extra scrutiny is required during review.

For more on why we check whole files, instead of just diffs, check out the Rustonomicon.

Contributor

Copilot AI left a comment

Pull request overview

This PR upgrades aarch64 PCIe MSI/MSI-X delivery by adding support for KVM’s in-kernel GICv3 ITS (LPI-based MSI routing) while keeping GICv2m as a fallback, and threads the required device-identity plumbing through the PCIe and hypervisor layers.

Changes:

  • Add an aarch64 MSI-controller selection model (GicMsiController / --gic-msi) and probe KVM ITS support, falling back to GICv2m when needed.
  • Introduce PCIe bus-range tracking (AssignedBusRange) and ITS wrappers that compose (segment << 16) | BDF for MSI signaling and irqfd routing.
  • Generate the required firmware descriptions for ITS routing (ACPI MADT ITS entry + IORT; device tree ITS child + msi-parent updates).

Reviewed changes

Copilot reviewed 54 out of 55 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| vmm_core/vmotherboard/src/lib.rs | Re-export AssignedBusRange for PCIe identity tracking integration. |
| vmm_core/vmotherboard/src/chipset/builder/mod.rs | Thread optional AssignedBusRange through PCIe device registration. |
| vmm_core/vmotherboard/src/chipset/backing/arc_mutex/services.rs | Extend static PCIe registration to carry optional bus-range identity. |
| vmm_core/vmotherboard/src/chipset/backing/arc_mutex/pci.rs | Store and pass per-device optional AssignedBusRange during enumeration. |
| vmm_core/vmotherboard/src/chipset/backing/arc_mutex/device.rs | Add builder API to attach an AssignedBusRange to PCIe devices. |
| vmm_core/vmotherboard/src/base_chipset.rs | Propagate optional AssignedBusRange into PCIe device addition. |
| vmm_core/virt/src/x86/apic_software_device.rs | Adapt SignalMsi signature to Option<u32> device identity. |
| vmm_core/virt/src/generic.rs | Add supports_its to PlatformInfo; adapt SignalMsi signature. |
| vmm_core/virt/src/aarch64/gic_v2m.rs | Update SignalMsi signature for v2m signaler. |
| vmm_core/virt/src/aarch64/gic_software_device.rs | Update SignalMsi signature for software GIC implementation. |
| vmm_core/virt_whp/src/synic.rs | Update SignalMsi signature for WHP MSI injection path. |
| vmm_core/virt_whp/src/lib.rs | Replace gic_v2m with gic_msi controller selection; set supports_its=false. |
| vmm_core/virt_whp/src/device.rs | Update SignalMsi signature for WHP device interrupt injection. |
| vmm_core/virt_mshv/src/x86_64/mod.rs | Update SignalMsi signature for MSHV MSI injection. |
| vmm_core/virt_mshv/src/irqfd.rs | Extend irqfd route enable to accept optional device identity. |
| vmm_core/virt_mshv/src/aarch64/mod.rs | Add supports_its=false; adapt MSI signaling signature/behavior. |
| vmm_core/virt_kvm/src/lib.rs | Track ITS device lifetime; store gic_msi controller selection. |
| vmm_core/virt_kvm/src/gsi.rs | Extend KVM irqfd enable path to pass through optional device identity. |
| vmm_core/virt_kvm/src/arch/x86_64/mod.rs | Plumb devid through KVM routing entries (unused on x86). |
| vmm_core/virt_kvm/src/arch/aarch64/mod.rs | Probe/create/init ITS device; add ITS MSI injection and irqfd routing support; expose supports_its. |
| vmm_core/virt_hvf/src/lib.rs | Add supports_its=false for HVF platform info. |
| vmm_core/src/device_builder.rs | Pass optional AssignedBusRange into PCIe device build path. |
| vmm_core/src/acpi_builder.rs | Add IORT generation for aarch64 + MADT ITS entry; add unit tests for IORT. |
| vm/vmcore/vm_topology/src/processor/aarch64.rs | Replace gic_v2m option with GicMsiController (None/V2m/Its) in topology. |
| vm/vmcore/src/irqfd.rs | Extend IrqFdRoute::enable with optional devid for ITS routing. |
| vm/kvm/src/lib.rs | Extend KVM IRQ routing MSI entry to include optional devid + flags. |
| vm/devices/user_driver_emulated_mock/src/lib.rs | Adapt mock MSI controller to new SignalMsi signature. |
| vm/devices/storage/nvme/src/tests/test_helpers.rs | Update NVMe test MSI controller to new signature. |
| vm/devices/storage/nvme_test/src/tests/test_helpers.rs | Update NVMe_test MSI controller to new signature. |
| vm/devices/pci/vpci/src/test_helpers/mod.rs | Update VPCI test MSI signaling to new signature. |
| vm/devices/pci/pcie/src/switch.rs | Add optional AssignedBusRange propagation into downstream port setup; route cfg writes via write_cfg. |
| vm/devices/pci/pcie/src/root.rs | Add optional AssignedBusRange propagation to ports/hotplug; route cfg writes via write_cfg. |
| vm/devices/pci/pcie/src/port.rs | Implement port-side bus-range tracking and cfg-write side effects for identity updates. |
| vm/devices/pci/pcie/src/lib.rs | Export new bus_range and its modules. |
| vm/devices/pci/pcie/src/its.rs | Add ITS wrappers (ItsSignalMsi, ItsIrqFd) composing segment+BDF device IDs. |
| vm/devices/pci/pcie/src/bus_range.rs | Add shared atomic bus-range container and ITS devid composition helper. |
| vm/devices/pci/pcie/fuzz/fuzz_pcie.rs | Update fuzz harness to match new PCIe root API signature. |
| vm/devices/pci/pcie/Cargo.toml | Add pal_event dependency required by irqfd wrapper interface. |
| vm/devices/pci/pci_core/src/test_helpers/mod.rs | Update test MSI controller signature. |
| vm/devices/pci/pci_core/src/msi.rs | Change SignalMsi to Option<u32> devid; add route/target helpers for rid-aware signaling. |
| vm/devices/pci/pci_core/src/capabilities/msix.rs | Update MSI-X delivery path to new MsiTarget API and irqfd enable signature. |
| vm/acpi_spec/src/madt.rs | Add MADT GIC ITS entry type/struct. |
| vm/acpi_spec/src/lib.rs | Export new iort module. |
| vm/acpi_spec/src/iort.rs | Introduce IORT table/node/type definitions for aarch64 PCIe + ITS mapping. |
| tmk/tmk_vmm/src/run.rs | Update TMK aarch64 topology config to gic_msi=None. |
| openvmm/openvmm_entry/src/lib.rs | Wire --gic-msi into config (GicMsiConfig). |
| openvmm/openvmm_entry/src/cli_args.rs | Add --gic-msi CLI option and GicMsiCli enum. |
| openvmm/openvmm_defs/src/config.rs | Add ITS base/size constants and GicMsiConfig config enum. |
| openvmm/openvmm_core/src/worker/vm_loaders/linux.rs | Emit device tree ITS child node + msi-parent selection (ITS vs v2m). |
| openvmm/openvmm_core/src/worker/dispatch.rs | Select ITS vs v2m from platform/config; wrap PCIe MSI/irqfd with ITS devid injection; propagate AssignedBusRange. |
| openhcl/virt_mshv_vtl/src/lib.rs | Update OpenHCL MSI signaling signature to Option<u32>. |
| openhcl/bootloader_fdt_parser/src/lib.rs | Update parsed topology to gic_msi=None default. |
| Guide/src/reference/emulated/pcie/overview.md | Document aarch64 MSI routing via ITS vs v2m and the --gic-msi override. |
| Guide/src/reference/devices/firmware/linux_direct.md | Update ACPI table list to include ITS/IORT behavior for aarch64. |
| Cargo.lock | Lockfile update for added pal_event dependency in pcie crate. |

Comment thread openvmm/openvmm_entry/src/cli_args.rs
Comment thread openvmm/openvmm_core/src/worker/dispatch.rs Outdated
Comment thread openvmm/openvmm_core/src/worker/dispatch.rs Outdated
Comment thread vm/devices/pci/pcie/src/port.rs Outdated
@jstarks jstarks marked this pull request as ready for review May 8, 2026 17:14
@jstarks jstarks requested a review from a team as a code owner May 8, 2026 17:14
Copilot AI review requested due to automatic review settings May 8, 2026 17:14
@jstarks jstarks requested a review from a team as a code owner May 8, 2026 17:14
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 54 out of 55 changed files in this pull request and generated 4 comments.

Comment thread openvmm/openvmm_core/src/worker/dispatch.rs Outdated
Comment thread openvmm/openvmm_core/src/worker/dispatch.rs Outdated
Comment thread vm/devices/pci/pci_core/src/msi.rs
Comment thread vmm_core/virt_kvm/src/arch/aarch64/mod.rs
Comment thread openvmm/openvmm_core/src/worker/dispatch.rs Outdated
Comment thread vm/devices/pci/pcie/src/port.rs Outdated
Comment thread vm/devices/pci/pcie/src/root.rs Outdated
Comment thread vm/devices/pci/pcie/src/bus_range.rs Outdated
///
/// Clone is cheap (just an `Arc` bump).
#[derive(Clone, Debug)]
pub struct AssignedBusRange(Arc<AtomicU16>);
Contributor

Should this also include a segment number (i.e., SegmentBusRange)?

Member Author

I don't think so. This is meant to be segment-local. The segment is fixed, so it doesn't need atomic updates, and how it is interpreted and used depends on context.

Comment thread openvmm/openvmm_core/src/worker/dispatch.rs Outdated
Comment thread vm/devices/pci/pcie/src/port.rs Outdated
partition.irqfd(),
signal_msi,
irqfd,
Some(bus_range),
Contributor

I am a little confused about the ownership and updating here. Each downstream port (root port or switch port) has a bus range that it updates on config space writes, but then we also have a separate AssignedBusRange given to the endpoint devices? And since the endpoint's bus range is handed over to ItsSignalMsi / ItsIrqFd, how does it get any information about bus number configuration?

Member Author

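AssignedBusRange is a reference-counted handle to a shared object, which is a bit confusing at first. The port and the endpoint's ITS wrappers hold clones of the same underlying value, so when the port updates it on a config space write, the wrappers see the new bus numbers.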
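A minimal sketch of that sharing, assuming the struct definition quoted earlier in this thread: the `new`, `set`, and `get` methods and the packing of secondary and subordinate bus numbers into the two bytes of the `AtomicU16` are hypothetical and may not match bus_range.rs.

```rust
use std::sync::atomic::{AtomicU16, Ordering};
use std::sync::Arc;

/// Clone is cheap (just an `Arc` bump); all clones share one value.
#[derive(Clone, Debug)]
pub struct AssignedBusRange(Arc<AtomicU16>);

impl AssignedBusRange {
    /// Hypothetical constructor: starts with both bus numbers at 0.
    pub fn new() -> Self {
        Self(Arc::new(AtomicU16::new(0)))
    }

    /// Called by the port when the guest programs its bus numbers.
    /// Assumed packing: secondary in the low byte, subordinate in the high byte.
    pub fn set(&self, secondary: u8, subordinate: u8) {
        let packed = u16::from(secondary) | (u16::from(subordinate) << 8);
        self.0.store(packed, Ordering::Release);
    }

    /// Called by the ITS wrappers at interrupt delivery time.
    /// Returns (secondary, subordinate).
    pub fn get(&self) -> (u8, u8) {
        let packed = self.0.load(Ordering::Acquire);
        (packed as u8, (packed >> 8) as u8)
    }
}
```

The port keeps one clone and the endpoint's wrapper keeps another; a `set` on the port's clone is immediately visible through `get` on the wrapper's clone, with no message passing between them.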

Copilot AI review requested due to automatic review settings May 11, 2026 23:50
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 54 out of 55 changed files in this pull request and generated 3 comments.

Comment thread vm/devices/pci/pci_core/src/lib.rs
Comment thread openvmm/openvmm_core/src/worker/dispatch.rs
Comment on lines +1892 to +1896

// Query the switch's actual downstream port names instead of
// reconstructing them from the naming convention.
for p in switch_device.lock().downstream_ports() {
    port_info.insert(
Copilot AI review requested due to automatic review settings May 12, 2026 17:07
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 52 out of 53 changed files in this pull request and generated 4 comments.

Comment on lines +1966 to +1987
let signal_msi = partition.as_signal_msi(Vtl::Vtl0).map(|s| {
    if use_its {
        Arc::new(pcie::its::ItsSignalMsi::new(
            s,
            pi.bus_range.clone(),
            pi.segment,
        )) as Arc<dyn pci_core::msi::SignalMsi>
    } else {
        s
    }
});
let irqfd = partition.irqfd().map(|fd| {
    if use_its {
        Arc::new(pcie::its::ItsIrqFd::new(
            fd,
            pi.bus_range.clone(),
            pi.segment,
        )) as Arc<dyn vmcore::irqfd::IrqFd>
    } else {
        fd
    }
});
Comment on lines +1055 to +1069
impl pci_core::msi::SignalMsi for GicItsSignalMsi {
    fn signal_msi(&self, devid: Option<u32>, address: u64, data: u32) {
        if address != self.translater_addr {
            tracelimit::warn_ratelimited!(
                address,
                data,
                expected = self.translater_addr,
                "unexpected MSI address (expected ITS GITS_TRANSLATER)"
            );
            return;
        }
        let (flags, raw_devid) = match devid {
            Some(id) => (kvm::KVM_MSI_VALID_DEVID, id),
            None => (0, 0),
        };
Comment thread vm/devices/pci/pci_core/src/bus_range.rs
Comment thread openvmm/openvmm_entry/src/cli_args.rs