aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorSagi Grimberg <sagi@grimberg.me>2022-02-01 14:54:19 +0200
committerChristoph Hellwig <hch@lst.de>2022-02-02 09:19:05 +0100
commit0fa0f99fc84e41057cbdd2efbfe91c6b2f47dd9d (patch)
tree6b318c83b0aa3008d0895649b79bd79f3f4255f7
parentb879f915bc48a18d4f4462729192435bb0f17052 (diff)
downloadlinux-dm-0fa0f99fc84e41057cbdd2efbfe91c6b2f47dd9d.tar.gz
nvme: fix a possible use-after-free in controller reset during load
Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl readiness for AER submission. This may lead to a use-after-free condition that was observed with nvme-tcp. The race condition may happen in the following scenario: 1. driver executes its reset_ctrl_work 2. -> nvme_stop_ctrl - flushes ctrl async_event_work 3. ctrl sends AEN which is received by the host, which in turn schedules AEN handling 4. teardown admin queue (which releases the queue socket) 5. AEN processed, submits another AER, calling the driver to submit 6. driver attempts to send the cmd ==> use-after-free In order to fix that, add ctrl state check to validate the ctrl is actually able to accept the AER submission. This addresses the above race in controller resets because the driver during teardown should: 1. change ctrl state to RESETTING 2. flush async_event_work (as well as other async work elements) So after 1,2, any other AER command will find the ctrl state to be RESETTING and bail out without submitting the AER. Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-rw-r--r--drivers/nvme/host/core.c9
1 files changed, 8 insertions, 1 deletions
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5e0bfda04bd7b1..961a5f8a44d2aa 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4253,7 +4253,14 @@ static void nvme_async_event_work(struct work_struct *work)
container_of(work, struct nvme_ctrl, async_event_work);
nvme_aen_uevent(ctrl);
- ctrl->ops->submit_async_event(ctrl);
+
+ /*
+ * The transport drivers must guarantee AER submission here is safe by
+ * flushing ctrl async_event_work after changing the controller state
+ * from LIVE and before freeing the admin queue.
+ */
+ if (ctrl->state == NVME_CTRL_LIVE)
+ ctrl->ops->submit_async_event(ctrl);
}
static bool nvme_ctrl_pp_status(struct nvme_ctrl *ctrl)