nvme_driver: disable keepalive for a specific class of devices #3460

Open
gurasinghMS wants to merge 2 commits into microsoft:release/1.7.2511 from gurasinghMS:cherrypick/release/1.7.2511/pr-3438
@gurasinghMS gurasinghMS commented May 12, 2026

THIS IS NOT A CLEAN CHERRY PICK. Changes to openhcl_servicing were omitted. The remaining changes are cherry-picked from #3438.

This change disables keepalive entirely for any device with VendorId = 0x1414 and DeviceId = 0xb111. The nvme_manager makes this decision when the device is first loaded: it reads the VendorID and DeviceID from the config space and stores a flag that records whether the device is keepalive compatible, so the config space is not read repeatedly.
The change also ensures that a device being restored is not automatically assumed to be keepalive compatible.

(cherry picked from commit 47303dc)
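
As a rough illustration of the detection described above, the following sketch parses the hex strings that Linux exposes in a PCI device's sysfs `vendor` and `device` files and evaluates the compatibility predicate once so the result can be cached. All function and constant names here (`parse_sysfs_hex_id`, `read_pci_ids`, `is_keepalive_compatible`) are hypothetical and chosen for illustration; they are not taken from the actual change.

```rust
use std::fs;
use std::io;
use std::path::Path;

// IDs named in the PR description; the constant names are hypothetical.
const KEEPALIVE_INCOMPATIBLE_VENDOR_ID: u16 = 0x1414;
const KEEPALIVE_INCOMPATIBLE_DEVICE_ID: u16 = 0xb111;

/// Parses a sysfs-style hex ID such as "0x1414\n" into a u16.
fn parse_sysfs_hex_id(raw: &str) -> Option<u16> {
    let trimmed = raw.trim();
    let digits = trimmed.strip_prefix("0x").unwrap_or(trimmed);
    u16::from_str_radix(digits, 16).ok()
}

/// Reads (vendor_id, device_id) from the `vendor` and `device` files in a
/// device directory, e.g. /sys/bus/pci/devices/<bdf>.
fn read_pci_ids(device_dir: &Path) -> io::Result<(u16, u16)> {
    let read = |name: &str| -> io::Result<u16> {
        let raw = fs::read_to_string(device_dir.join(name))?;
        parse_sysfs_hex_id(&raw).ok_or_else(|| {
            io::Error::new(
                io::ErrorKind::InvalidData,
                format!("invalid {name} id: {raw:?}"),
            )
        })
    };
    Ok((read("vendor")?, read("device")?))
}

/// Returns false only for the one vendor/device pair that must not use keepalive.
fn is_keepalive_compatible(vendor_id: u16, device_id: u16) -> bool {
    !(vendor_id == KEEPALIVE_INCOMPATIBLE_VENDOR_ID
        && device_id == KEEPALIVE_INCOMPATIBLE_DEVICE_ID)
}
```

A caller would invoke `read_pci_ids` once at load time and keep the boolean from `is_keepalive_compatible` alongside the device, which is the caching behavior the description calls out.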

…soft#3438)

This change is intended to disable keepalive entirely for any devices
with VendorId = 0x1414 and DeviceId = 0xb111. Disablement is done in the
nvme_manager when the device is first loaded by reading the VendorID and
DeviceID from the config space and the manager stores a flag that
determines keepalive compatibility for the device (so that we are not
repeatedly reading in the config space).
It also makes sure that even when a device is being restored, it is not
automatically assumed to be keepalive compatible.

(cherry picked from commit 47303dc)
Copilot AI review requested due to automatic review settings May 12, 2026 03:00
@gurasinghMS gurasinghMS requested review from a team as code owners May 12, 2026 03:00
@github-actions github-actions Bot added the release_1.7.2511 Targets the release/1.7.2511 branch. label May 12, 2026

Copilot AI left a comment

Pull request overview

This PR updates the OpenHCL NVMe manager to disable NVMe keepalive for devices with PCI Vendor ID 0x1414 and Device ID 0xb111, by detecting these IDs via sysfs at device load/restore time and caching a per-device “keepalive compatible” flag.

Changes:

  • Add a hardware-config fault that can override PCI Vendor/Device IDs reported by the NVMe fault-injection device (useful for testing).
  • Cache and enforce a per-device keepalive-compatibility flag in the NVMe manager (affecting shutdown behavior and save/restore eligibility).
  • Ensure restore does not assume keepalive compatibility solely from saved state.
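
The hardware-config override in the first bullet can be pictured as a pair of optional IDs that fall back to the emulated device's defaults when unset. The sketch below is an assumption about the general shape only: `PciIdOverride`, its fields, and its methods are hypothetical stand-ins and do not reproduce the real `HardwareConfigFaultConfig` API.

```rust
/// Hypothetical stand-in for a hardware-config fault: each ID is optional,
/// and an unset ID leaves the device's default untouched.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct PciIdOverride {
    vendor_id: Option<u16>,
    device_id: Option<u16>,
}

impl PciIdOverride {
    /// Builder-style setter for the vendor ID override.
    fn with_vendor_id(mut self, id: u16) -> Self {
        self.vendor_id = Some(id);
        self
    }

    /// Builder-style setter for the device ID override.
    fn with_device_id(mut self, id: u16) -> Self {
        self.device_id = Some(id);
        self
    }

    /// Picks the IDs to place in PCI config space: the override when present,
    /// otherwise the defaults supplied by the caller.
    fn resolve(&self, default_vendor: u16, default_device: u16) -> (u16, u16) {
        (
            self.vendor_id.unwrap_or(default_vendor),
            self.device_id.unwrap_or(default_device),
        )
    }
}
```

A test could then build `PciIdOverride::default().with_vendor_id(0x1414).with_device_id(0xb111)` to make the fault-injection device present itself as the keepalive-incompatible device without touching real hardware.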

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| vmm_tests/vmm_tests/tests/tests/multiarch/openhcl_servicing.rs | Adds an import for the new hardware-config fault type (currently unused). |
| vm/devices/storage/nvme_test/src/pci.rs | Applies optional hardware-config overrides when building PCI config space IDs. |
| vm/devices/storage/nvme_resources/src/fault.rs | Introduces HardwareConfigFaultConfig and adds it to FaultConfiguration. |
| openhcl/underhill_core/src/nvme_manager/mod.rs | Adds sysfs vendor/device ID read helper and keepalive-compatibility predicate. |
| openhcl/underhill_core/src/nvme_manager/manager.rs | Uses keepalive-compatibility to gate keepalive-related shutdown + save/restore paths. |
| openhcl/underhill_core/src/nvme_manager/device.rs | Stores keepalive_compatible on NvmeDriverManager and exposes an accessor. |

use nvme_resources::fault::AdminQueueFaultBehavior;
use nvme_resources::fault::AdminQueueFaultConfig;
use nvme_resources::fault::FaultConfiguration;
use nvme_resources::fault::HardwareConfigFaultConfig;
Comment on lines 497 to 501
self.context.vp_count,
true, // save_restore_supported is always `true` when restoring.
keepalive_compatible, // save_restore support is no longer guaranteed
keepalive_compatible,
Some(nvme_driver),
self.context.nvme_driver_spawner.clone(),
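
The hunk above replaces a hard-coded `true` with a computed `keepalive_compatible` value on the restore path. A minimal sketch of that idea follows, assuming the flag is derived from the device's IDs on both load and restore and then combined with the host's keepalive request at shutdown. `DeviceEntry` and its methods are hypothetical and only illustrate the gating; they are not the real `NvmeDriverManager` interface.

```rust
/// Hypothetical per-device record holding the cached compatibility flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DeviceEntry {
    keepalive_compatible: bool,
}

impl DeviceEntry {
    /// First load: derive the flag from the device's PCI IDs.
    fn load(vendor_id: u16, device_id: u16) -> Self {
        // The one pair named in the PR description is treated as incompatible.
        let incompatible = vendor_id == 0x1414 && device_id == 0xb111;
        Self {
            keepalive_compatible: !incompatible,
        }
    }

    /// Restore: the presence of saved state is deliberately ignored when
    /// computing the flag, so a restored device is re-evaluated like a new one.
    fn restore(_has_saved_state: bool, vendor_id: u16, device_id: u16) -> Self {
        Self::load(vendor_id, device_id)
    }

    /// Keepalive is used at shutdown only when it was requested and the
    /// device is compatible.
    fn keepalive_on_shutdown(&self, keepalive_requested: bool) -> bool {
        keepalive_requested && self.keepalive_compatible
    }
}
```

Under these assumptions, a restored device with IDs 0x1414/0xb111 never takes the keepalive path, even though saved state exists for it.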
