From: Jens Axboe <[email protected]>
To: Keith Busch <[email protected]>
Cc: Andrew Marshall <[email protected]>,
[email protected],
Greg Kroah-Hartman <[email protected]>,
stable <[email protected]>
Subject: Stable backport (was "Re: PROBLEM: io_uring hang causing uninterruptible sleep state on 6.6.59")
Date: Sun, 3 Nov 2024 19:38:30 -0700 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
[-- Attachment #1: Type: text/plain, Size: 1802 bytes --]
On 11/3/24 5:06 PM, Jens Axboe wrote:
> On 11/3/24 5:01 PM, Keith Busch wrote:
>> On Sun, Nov 03, 2024 at 04:53:27PM -0700, Jens Axboe wrote:
>>> On 11/3/24 4:47 PM, Andrew Marshall wrote:
>>>> I identified f4ce3b5d26ce149e77e6b8e8f2058aa80e5b034e as the likely
>>>> problematic commit simply by browsing git log. As indicated above;
>>>> reverting that atop 6.6.59 results in success. Since it is passing on
>>>> 6.11.6, I suspect there is some missing backport to 6.6.x, or some
>>>> other semantic merge conflict. Unfortunately I do not have a compact,
>>>> minimal reproducer, but can provide my large one (it is testing a
>>>> larger build process in a VM) if needed?there are some additional
>>>> details in the above-linked downstream bug report, though. I hope that
>>>> having identified the problematic commit is enough for someone with
>>>> more context to go off of. Happy to provide more information if
>>>> needed.
>>>
>>> Don't worry about not having a reproducer, having the backport commit
>>> pin pointed will do just fine. I'll take a look at this.
>>
>> I think stable is missing:
>>
>> 6b231248e97fc3 ("io_uring: consolidate overflow flushing")
>
> I think you need to go back further than that, this one already
> unconditionally holds ->uring_lock around overflow flushing...
Took a look, it's this one:
commit 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998
Author: Pavel Begunkov <[email protected]>
Date: Wed Apr 10 02:26:54 2024 +0100
io_uring: always lock __io_cqring_overflow_flush
Greg/stable, can you pick this one for 6.6-stable? It picks
cleanly.
For 6.1, which is the other stable of that age that has the backport,
the attached patch will do the trick.
With that, I believe it should be sorted. Hopefully that can make
6.6.60 and 6.1.116.
--
Jens Axboe
[-- Attachment #2: 0001-io_uring-always-lock-__io_cqring_overflow_flush.patch --]
[-- Type: text/x-patch, Size: 1966 bytes --]
From 3f1c33f03386c481caf2044a836f3ca611094098 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <[email protected]>
Date: Wed, 10 Apr 2024 02:26:54 +0100
Subject: [PATCH] io_uring: always lock __io_cqring_overflow_flush
Commit 8d09a88ef9d3cb7d21d45c39b7b7c31298d23998 upstream.
Conditional locking is never great, in case of
__io_cqring_overflow_flush(), which is a slow path, it's not justified.
Don't handle IOPOLL separately, always grab uring_lock for overflow
flushing.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/162947df299aa12693ac4b305dacedab32ec7976.1712708261.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
---
io_uring/io_uring.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index f902b161f02c..92c1aa8f3501 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -593,6 +593,8 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
bool all_flushed;
size_t cqe_size = sizeof(struct io_uring_cqe);
+ lockdep_assert_held(&ctx->uring_lock);
+
if (!force && __io_cqring_events(ctx) == ctx->cq_entries)
return false;
@@ -647,12 +649,9 @@ static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx)
bool ret = true;
if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) {
- /* iopoll syncs against uring_lock, not completion_lock */
- if (ctx->flags & IORING_SETUP_IOPOLL)
- mutex_lock(&ctx->uring_lock);
+ mutex_lock(&ctx->uring_lock);
ret = __io_cqring_overflow_flush(ctx, false);
- if (ctx->flags & IORING_SETUP_IOPOLL)
- mutex_unlock(&ctx->uring_lock);
+ mutex_unlock(&ctx->uring_lock);
}
return ret;
@@ -1405,6 +1404,8 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
int ret = 0;
unsigned long check_cq;
+ lockdep_assert_held(&ctx->uring_lock);
+
if (!io_allowed_run_tw(ctx))
return -EEXIST;
--
2.45.2
next prev parent reply other threads:[~2024-11-04 2:38 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-03 23:47 PROBLEM: io_uring hang causing uninterruptible sleep state on 6.6.59 Andrew Marshall
2024-11-03 23:53 ` Jens Axboe
2024-11-03 23:58 ` Jens Axboe
2024-11-04 0:01 ` Keith Busch
2024-11-04 0:06 ` Jens Axboe
2024-11-04 2:38 ` Jens Axboe [this message]
2024-11-04 4:25 ` Stable backport (was "Re: PROBLEM: io_uring hang causing uninterruptible sleep state on 6.6.59") Andrew Marshall
2024-11-04 13:17 ` Andrew Marshall
2024-11-04 15:58 ` Jens Axboe
2024-11-06 6:05 ` Greg Kroah-Hartman
2024-11-06 14:11 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox