From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Cc: Dennis Zhou <[email protected]>, Tejun Heo <[email protected]>,
Christoph Lameter <[email protected]>, Joakim Hassila <[email protected]>
Subject: Re: [PATCH 0/2] fix hangs with shared sqpoll
Date: Fri, 16 Apr 2021 14:12:34 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 16/04/2021 14:04, Jens Axboe wrote:
> On 4/15/21 6:26 PM, Pavel Begunkov wrote:
>> On 16/04/2021 01:22, Pavel Begunkov wrote:
>>> Late-caught 5.12 bug with nasty hangs. Thanks Jens for a reproducer.
>>
>> 1/2 is basically a rip-off of one of Jens' old patches, but I can't
>> find it anywhere. If you still have it, especially if it was
>> reviewed/etc., it may make sense to go with it instead
>
> I wonder if we can do something like the below instead - we don't
> care about a particularly stable count in terms of wakeup
> reliance, and it'd save a nasty sync atomic switch.
But we do care about it being monotonic. There are nuances with it.
I think an unsynchronized sum may put the waiter to sleep forever.
Are you looking to save on the switching? The ref is almost always
already dying by then from a prior ref_kill.
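
To illustrate the monotonicity point, here is a minimal userspace sketch,
not kernel code. The names `counters`, `true_total` and `racy_sum_replay`
are made up for the example. It replays one fixed interleaving: a reader
sums two per-CPU slots one at a time while a put lands on CPU 0 and a get
lands on CPU 1. The reader ends up with a value the real total never held.

```c
#include <assert.h>

#define NR_CPUS 2

/* Hypothetical stand-in for per-CPU reference counters (not the kernel's). */
long counters[NR_CPUS];

/* Exact total; only meaningful here because the replay is single-threaded. */
long true_total(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += counters[cpu];
	return sum;
}

/*
 * Replay one fixed interleaving of a reader summing slot by slot while a
 * put lands on CPU 0 and a get lands on CPU 1. Returns what the reader
 * computed and stores the largest real total in *max_true.
 */
long racy_sum_replay(long *max_true)
{
	long seen;

	counters[0] = 2;		/* two references, both taken on CPU 0 */
	counters[1] = 0;
	*max_true = true_total();	/* real total: 2 */

	seen = counters[0];		/* reader samples CPU 0: 2 */

	counters[0]--;			/* put on CPU 0, real total: 1 */
	if (true_total() > *max_true)
		*max_true = true_total();

	counters[1]++;			/* get on CPU 1, real total: 2 */
	if (true_total() > *max_true)
		*max_true = true_total();

	seen += counters[1];		/* reader samples CPU 1: 1 */

	assert(true_total() == 2);	/* the real count ends where it began */
	return seen;			/* 3, which the real total never reached */
}
```

Calling `racy_sum_replay(&max)` returns 3 while `max` stays at 2, so a
plain sum over the slots can report a count that never existed. This is
only one hand-picked interleaving; it does not model the wakeup path.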
>
> Totally untested...
>
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 6c182a3a221b..9edbcf01ea49 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -8928,7 +8928,7 @@ static void io_uring_cancel_sqpoll(struct io_ring_ctx *ctx)
>  	atomic_inc(&tctx->in_idle);
>  	do {
>  		/* read completions before cancelations */
> -		inflight = tctx_inflight(tctx, false);
> +		inflight = percpu_ref_sum(&ctx->refs);
>  		if (!inflight)
>  			break;
>  		io_uring_try_cancel_requests(ctx, current, NULL);
> @@ -8939,7 +8939,7 @@ static void io_uring_cancel_sqpoll(struct io_ring_ctx *ctx)
>  		 * avoids a race where a completion comes in before we did
>  		 * prepare_to_wait().
>  		 */
> -		if (inflight == tctx_inflight(tctx, false))
> +		if (inflight == percpu_ref_sum(&ctx->refs))
>  			schedule();
>  		finish_wait(&tctx->wait, &wait);
>  	} while (1);
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index 16c35a728b4c..2f29f34bc993 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -131,6 +131,7 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
>  void percpu_ref_resurrect(struct percpu_ref *ref);
>  void percpu_ref_reinit(struct percpu_ref *ref);
>  bool percpu_ref_is_zero(struct percpu_ref *ref);
> +long percpu_ref_sum(struct percpu_ref *ref);
>
>  /**
>   * percpu_ref_kill - drop the initial ref
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index a1071cdefb5a..b09ed9fdd32d 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -475,3 +475,31 @@ void percpu_ref_resurrect(struct percpu_ref *ref)
>  	spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
>  }
>  EXPORT_SYMBOL_GPL(percpu_ref_resurrect);
> +
> +/**
> + * percpu_ref_sum - return approximate ref counts
> + * @ref: percpu_ref to sum
> + *
> + * Note that this should only really be used to compare refs, as by the
> + * very nature of percpu references, the value may be stale even before it
> + * has been returned.
> + */
> +long percpu_ref_sum(struct percpu_ref *ref)
> +{
> +	unsigned long __percpu *percpu_count;
> +	long ret;
> +
> +	rcu_read_lock();
> +	if (__ref_is_percpu(ref, &percpu_count)) {
> +		ret = atomic_long_read(&ref->data->count);
> +	} else {
> +		int cpu;
> +
> +		ret = 0;
> +		for_each_possible_cpu(cpu)
> +			ret += *per_cpu_ptr(percpu_count, cpu);
> +	}
> +	rcu_read_unlock();
> +
> +	return ret;
> +}
>
--
Pavel Begunkov