nvme_driver: don't lazily restore interrupts (#3220) #3235
mattkur wants to merge 1 commit into `microsoft:release/1.7.2511`
Conversation
Works around a device-side issue: some devices cannot handle the case where interrupts are reprogrammed out of IV order. (cherry picked from commit 958b8a4)
Pull request overview
Adjusts NVMe driver servicing-restore behavior to avoid device bugs triggered when MSI-X interrupts are reprogrammed out of interrupt-vector (IV) order, by making the default restore path eagerly restore all I/O queues in IV-sorted order.
Changes:
- Default to an eager restore strategy that restores all I/O queues, sorted by interrupt vector, to preserve ascending IV programming order.
- Add `allow_lazy_restore: Option<bool>` to the saved state to optionally fall back to the prior eager+lazy split restore behavior.
- Update Underhill NVMe manager test helpers to populate the new saved-state field.
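As a hedged illustration of how an optional saved-state flag like this could gate the restore path, here is a minimal sketch. The type and function names (`SavedState`, `RestoreStrategy`, `pick_strategy`) are invented for illustration and are not the driver's actual API; the key point is that a missing flag (states saved by older builds) defaults to the new eager, IV-sorted behavior:

```rust
/// Illustrative stand-in for the restore strategies described in the PR.
#[derive(Debug, PartialEq)]
enum RestoreStrategy {
    /// New default: restore every I/O queue up front, sorted by
    /// interrupt vector, so IVs are programmed in ascending order.
    EagerIvSorted,
    /// Prior behavior: eager restore for some queues, lazy for the rest.
    Lazy,
}

/// Illustrative fragment of the saved state; only the new field is shown.
struct SavedState {
    /// `None` for states saved before this change.
    allow_lazy_restore: Option<bool>,
}

fn pick_strategy(state: &SavedState) -> RestoreStrategy {
    // A missing or false flag selects the eager, IV-sorted path.
    if state.allow_lazy_restore.unwrap_or(false) {
        RestoreStrategy::Lazy
    } else {
        RestoreStrategy::EagerIvSorted
    }
}

fn main() {
    let old_state = SavedState { allow_lazy_restore: None };
    assert_eq!(pick_strategy(&old_state), RestoreStrategy::EagerIvSorted);

    let lazy_state = SavedState { allow_lazy_restore: Some(true) };
    assert_eq!(pick_strategy(&lazy_state), RestoreStrategy::Lazy);
}
```

Defaulting `None` to the eager path means previously-saved states automatically pick up the workaround on restore.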
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| vm/devices/storage/disk_nvme/nvme_driver/src/driver.rs | Implements IV-sorted eager restoration of all I/O queues; adds saved-state flag to optionally enable the old lazy behavior. |
| openhcl/underhill_core/src/nvme_manager/save_restore_helpers.rs | Updates test-state construction to include the new allow_lazy_restore field. |
```rust
// Eager restore path: restore ALL queues sorted by interrupt
// vector for ordered VPci allocation (MSI-X ordering workaround).
//
// Devnote: Safety of inline new_self_drained(): This loop is fully
// synchronous (no .await). Although IoQueue::restore() spawns
// queue handler tasks, they don't poll until the async runtime
// yields — which happens only after .collect() completes. So all
// new_self_drained() and new_draining() calls finish before any
// handler can fire the drain-complete signal. If this loop is ever
// refactored to be async, the waiters for empty queues must be
// pre-created (as done in the lazy path above).
let mut sorted_io: Vec<_> = saved_state.worker_data.io.iter().collect();
sorted_io.sort_by_key(|q| q.iv);

worker.io = sorted_io
    .into_iter()
    .flat_map(|q| -> Result<IoQueue<D>, anyhow::Error> {
        let qid = q.queue_data.qid;
        let cpu = q.cpu;
        tracing::info!(qid, cpu, iv = q.iv, ?pci_id, "restoring queue");
        max_seen_qid = max_seen_qid.max(qid);
        let interrupt =
            worker.device.map_interrupt(q.iv, q.cpu).with_context(|| {
                format!(
                    "failed to map interrupt for {}, cpu {}, iv {}",
                    pci_id, q.cpu, q.iv
                )
            })?;
        tracing::info!(qid, cpu, ?pci_id, "restoring queue: search for mem block");
        let mem_block = restored_memory
            .iter()
            .find(|mem| {
                mem.len() == q.queue_data.mem_len
                    && q.queue_data.base_pfn == mem.pfns()[0]
            })
            .expect("unable to find restored mem block")
            .to_owned();
        tracing::info!(qid, cpu, ?pci_id, "restoring queue: restore IoQueue");
        let q = IoQueue::restore(
            driver.clone(),
            interrupt,
            registers.clone(),
            mem_block,
            &pci_id,
            q,
            bounce_buffer,
            if q.queue_data.handler_data.pending_cmds.commands.is_empty() {
                drain_after_restore_template.new_self_drained()
            } else {
                drain_after_restore_template.new_draining()
            },
            // … (diff excerpt ends here)
```
In the eager-restore path, empty queues get DrainAfterRestoreBuilder::new_self_drained() created inline while IoQueue::restore() is spawning queue-handler tasks. The devnote assumes those tasks won’t run until the async runtime yields, but VmTaskDriver explicitly allows spawned tasks to run on other executors/threads immediately. That reintroduces the missed-signal race you previously avoided by pre-creating waiters: a non-empty queue could drain and signal before later empty queues register their waiters, leaving them stuck in SelfDrained forever. Consider pre-creating the DrainAfterRestore handles for all queues before restoring/spawning any of them, or use the builder’s new_for_new_queue() (double-check) logic for empty queues so late registration can’t miss a completed drain.
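The reviewer's suggested fix — register every drain waiter before any queue task can run — can be sketched with a two-pass structure. The types below (`DrainTracker`, `DrainHandle`) are illustrative stand-ins for the driver's `DrainAfterRestore` machinery, not its real API; the point is only the ordering: all handles exist before any completion can be signaled, so a fast-draining queue cannot signal before later waiters register.

```rust
use std::sync::{Arc, Mutex};

/// Illustrative tracker: counts queues that still have work to drain.
#[derive(Clone)]
struct DrainTracker {
    pending: Arc<Mutex<usize>>,
}

/// Handle held by one queue's restore; completing it marks that
/// queue as drained.
struct DrainHandle {
    pending: Arc<Mutex<usize>>,
}

impl DrainTracker {
    fn new() -> Self {
        Self { pending: Arc::new(Mutex::new(0)) }
    }

    /// Register a waiter. In the suggested design this is called for
    /// every queue *before* any queue task is spawned.
    fn new_draining(&self) -> DrainHandle {
        *self.pending.lock().unwrap() += 1;
        DrainHandle { pending: self.pending.clone() }
    }

    fn all_drained(&self) -> bool {
        *self.pending.lock().unwrap() == 0
    }
}

impl DrainHandle {
    fn complete(self) {
        *self.pending.lock().unwrap() -= 1;
    }
}

fn main() {
    let tracker = DrainTracker::new();

    // Pass 1: pre-create one handle per queue. No queue work has run
    // yet, so no completion can be missed.
    let handles: Vec<_> = (0..3).map(|_| tracker.new_draining()).collect();
    assert!(!tracker.all_drained());

    // Pass 2: queue tasks may now complete in any order — even
    // immediately on another executor — without a missed-signal race,
    // because every waiter already exists.
    for h in handles {
        h.complete();
    }
    assert!(tracker.all_drained());
}
```

Splitting registration from restoration is what makes the pattern safe regardless of where the spawned queue-handler tasks execute.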
Clean cherry pick of PR #3220