nvme_driver: disable keepalive for a specific class of devices #3460
Open
gurasinghMS wants to merge 2 commits into
…soft#3438) This change is intended to disable keepalive entirely for any device with VendorId = 0x1414 and DeviceId = 0xb111. The nvme_manager does this when the device is first loaded: it reads the VendorID and DeviceID from the config space and stores a flag that determines keepalive compatibility for the device (so that the config space is not read repeatedly). It also makes sure that a device being restored is not automatically assumed to be keepalive compatible. (cherry picked from commit 47303dc)
Pull request overview
This PR updates the OpenHCL NVMe manager to disable NVMe keepalive for devices with PCI Vendor ID 0x1414 and Device ID 0xb111, by detecting these IDs via sysfs at device load/restore time and caching a per-device “keepalive compatible” flag.
Changes:
- Add a hardware-config fault that can override PCI Vendor/Device IDs reported by the NVMe fault-injection device (useful for testing).
- Cache and enforce a per-device keepalive-compatibility flag in the NVMe manager (affecting shutdown behavior and save/restore eligibility).
- Ensure restore does not assume keepalive compatibility solely from saved state.
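The detection step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function names `parse_sysfs_id`, `read_vendor_device_ids`, and `is_keepalive_compatible` are hypothetical, and the only details taken from the PR are the two IDs. It assumes the usual Linux sysfs layout, where a PCI device directory exposes `vendor` and `device` files containing hex text such as `0x1414`.

```rust
use std::fs;
use std::io;
use std::path::Path;

// IDs named in the PR as the device class that must not use keepalive.
const NO_KEEPALIVE_VENDOR_ID: u16 = 0x1414;
const NO_KEEPALIVE_DEVICE_ID: u16 = 0xb111;

/// Parses a sysfs ID attribute such as "0x1414\n" into a u16.
/// (Hypothetical helper, not taken from the PR.)
fn parse_sysfs_id(text: &str) -> Option<u16> {
    let trimmed = text.trim();
    let hex = trimmed.strip_prefix("0x").unwrap_or(trimmed);
    u16::from_str_radix(hex, 16).ok()
}

/// Reads the `vendor` and `device` attributes from a PCI device directory,
/// e.g. /sys/bus/pci/devices/<address> on Linux.
/// (Hypothetical helper; the directory is a parameter so it can be tested.)
fn read_vendor_device_ids(device_dir: &Path) -> io::Result<(u16, u16)> {
    let read = |name: &str| -> io::Result<u16> {
        let text = fs::read_to_string(device_dir.join(name))?;
        parse_sysfs_id(&text).ok_or_else(|| {
            io::Error::new(io::ErrorKind::InvalidData, format!("bad {} id", name))
        })
    };
    Ok((read("vendor")?, read("device")?))
}

/// Returns false only for the vendor/device pair this PR excludes.
fn is_keepalive_compatible(vendor_id: u16, device_id: u16) -> bool {
    !(vendor_id == NO_KEEPALIVE_VENDOR_ID && device_id == NO_KEEPALIVE_DEVICE_ID)
}
```

A caller would evaluate `is_keepalive_compatible` once per device at load or restore time and store the result, which is what avoids re-reading the IDs later.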
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| vmm_tests/vmm_tests/tests/tests/multiarch/openhcl_servicing.rs | Adds an import for the new hardware-config fault type (currently unused). |
| vm/devices/storage/nvme_test/src/pci.rs | Applies optional hardware-config overrides when building PCI config space IDs. |
| vm/devices/storage/nvme_resources/src/fault.rs | Introduces HardwareConfigFaultConfig and adds it to FaultConfiguration. |
| openhcl/underhill_core/src/nvme_manager/mod.rs | Adds sysfs vendor/device ID read helper and keepalive-compatibility predicate. |
| openhcl/underhill_core/src/nvme_manager/manager.rs | Uses keepalive-compatibility to gate keepalive-related shutdown + save/restore paths. |
| openhcl/underhill_core/src/nvme_manager/device.rs | Stores keepalive_compatible on NvmeDriverManager and exposes an accessor. |
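For the `pci.rs` and `fault.rs` rows, the override behavior might look something like the sketch below. The real fields of `HardwareConfigFaultConfig` are not shown on this page, so the struct `HardwareConfigFault`, its field names, the `PciIds` type, and `effective_ids` are all hypothetical stand-ins; the only behavior taken from the review summary is that the overrides are optional and replace the IDs the test device would otherwise report.

```rust
/// Hypothetical stand-in for the PR's `HardwareConfigFaultConfig`.
/// The field names are assumptions; `None` means "do not override".
#[derive(Clone, Copy, Debug, Default)]
struct HardwareConfigFault {
    vendor_id_override: Option<u16>,
    device_id_override: Option<u16>,
}

/// Vendor/device ID pair placed in the PCI config space (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PciIds {
    vendor_id: u16,
    device_id: u16,
}

/// Applies any configured overrides on top of the device's default IDs.
fn effective_ids(defaults: PciIds, fault: &HardwareConfigFault) -> PciIds {
    PciIds {
        vendor_id: fault.vendor_id_override.unwrap_or(defaults.vendor_id),
        device_id: fault.device_id_override.unwrap_or(defaults.device_id),
    }
}
```

With a shape like this, a test can make the fault-injection device report 0x1414/0xb111 and check that the manager treats it as keepalive-incompatible, without needing the real hardware.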
use nvme_resources::fault::AdminQueueFaultBehavior;
use nvme_resources::fault::AdminQueueFaultConfig;
use nvme_resources::fault::FaultConfiguration;
use nvme_resources::fault::HardwareConfigFaultConfig;
Comment on lines 497 to 501
self.context.vp_count,
true, // save_restore_supported is always `true` when restoring.
keepalive_compatible, // save_restore support is no longer guaranteed
keepalive_compatible,
Some(nvme_driver),
self.context.nvme_driver_spawner.clone(),
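The restore fragment above appears to replace a hard-coded `true` with the cached `keepalive_compatible` value. A minimal sketch of that gating idea follows, under stated assumptions: `DeviceEntry`, `restore_with_keepalive`, and `keepalive_on_shutdown` are hypothetical names, and the real `NvmeDriverManager` carries far more state than a single flag.

```rust
// IDs named in the PR as the device class that must not use keepalive.
const NO_KEEPALIVE_VENDOR_ID: u16 = 0x1414;
const NO_KEEPALIVE_DEVICE_ID: u16 = 0xb111;

fn is_keepalive_compatible(vendor_id: u16, device_id: u16) -> bool {
    !(vendor_id == NO_KEEPALIVE_VENDOR_ID && device_id == NO_KEEPALIVE_DEVICE_ID)
}

/// Hypothetical per-device record; stands in for the flag that the PR
/// stores on `NvmeDriverManager`.
#[derive(Debug)]
struct DeviceEntry {
    keepalive_compatible: bool,
}

impl DeviceEntry {
    /// Computes the flag once, from IDs read when the device is loaded
    /// or restored, so later decisions never re-read the config space.
    fn new(vendor_id: u16, device_id: u16) -> Self {
        Self {
            keepalive_compatible: is_keepalive_compatible(vendor_id, device_id),
        }
    }

    /// Accessor for the cached flag.
    fn keepalive_compatible(&self) -> bool {
        self.keepalive_compatible
    }
}

/// Saved state alone is not enough: the device must also be compatible.
fn restore_with_keepalive(entry: &DeviceEntry, has_saved_state: bool) -> bool {
    has_saved_state && entry.keepalive_compatible()
}

/// Keepalive on shutdown needs both the caller's request and compatibility.
fn keepalive_on_shutdown(entry: &DeviceEntry, requested: bool) -> bool {
    requested && entry.keepalive_compatible()
}
```

The point of routing both decisions through the same cached flag is that an excluded device behaves consistently whether it was freshly loaded or brought back from saved state.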
THIS IS NOT A CLEAN CHERRY PICK. Changes to openhcl_servicing were omitted. Cherry picked changes from #3438.
This change is intended to disable keepalive entirely for any device with VendorId = 0x1414 and DeviceId = 0xb111. The nvme_manager does this when the device is first loaded: it reads the VendorID and DeviceID from the config space and stores a flag that determines keepalive compatibility for the device (so that the config space is not read repeatedly).
It also makes sure that a device being restored is not automatically assumed to be keepalive compatible.
(cherry picked from commit 47303dc)