blk-mq: Defer freeing flush queue to SRCU callback

The freeing of the flush queue/request in blk_mq_exit_hctx() can race with
tag iterators that may still be accessing it. To prevent a potential
use-after-free, defer the deallocation until after an SRCU grace period
has elapsed. This also allows the big tags->lock in the tag iterator
code path to be replaced with SRCU to solve the issue.

This patch introduces an SRCU-based deferred freeing mechanism for the
flush queue.

The changes include:
- Adding a `rcu_head` to `struct blk_flush_queue`.
- Creating a new callback function, `blk_free_flush_queue_callback`,
  to handle the actual freeing.
- Replacing the direct call to `blk_free_flush_queue()` in
  `blk_mq_exit_hctx()` with `call_srcu()`, using the `tags_srcu`
  instance to ensure synchronization with tag iterators.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
commit 135b8521f2 (parent ad0d05dbdd)
Author: Ming Lei
Date: 2025-08-30 10:18:22 +08:00
Committed by: Jens Axboe
2 changed files with 11 additions and 1 deletion

@@ -3912,6 +3912,14 @@ static void blk_mq_clear_flush_rq_mapping(struct blk_mq_tags *tags,
 	spin_unlock_irqrestore(&tags->lock, flags);
 }
 
+static void blk_free_flush_queue_callback(struct rcu_head *head)
+{
+	struct blk_flush_queue *fq =
+		container_of(head, struct blk_flush_queue, rcu_head);
+
+	blk_free_flush_queue(fq);
+}
+
 /* hctx->ctxs will be freed in queue's release handler */
 static void blk_mq_exit_hctx(struct request_queue *q,
 		struct blk_mq_tag_set *set,
@@ -3931,7 +3939,8 @@ static void blk_mq_exit_hctx(struct request_queue *q,
 	if (set->ops->exit_hctx)
 		set->ops->exit_hctx(hctx, hctx_idx);
 
-	blk_free_flush_queue(hctx->fq);
+	call_srcu(&set->tags_srcu, &hctx->fq->rcu_head,
+			blk_free_flush_queue_callback);
 	hctx->fq = NULL;
 	xa_erase(&q->hctx_table, hctx_idx);