From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Subject: Re: [PATCH v2] io-wq: forcefully cancel on io-wq destroy
Date: Thu, 1 Apr 2021 11:25:19 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 01/04/2021 02:17, Jens Axboe wrote:
> On 3/31/21 5:18 PM, Pavel Begunkov wrote:
>> [  491.222908] INFO: task thread-exit:2490 blocked for more than 122 seconds.
>> [  491.222957] Call Trace:
>> [  491.222967]  __schedule+0x36b/0x950
>> [  491.222985]  schedule+0x68/0xe0
>> [  491.222994]  schedule_timeout+0x209/0x2a0
>> [  491.223003]  ? tlb_flush_mmu+0x28/0x140
>> [  491.223013]  wait_for_completion+0x8b/0xf0
>> [  491.223023]  io_wq_destroy_manager+0x24/0x60
>> [  491.223037]  io_wq_put_and_exit+0x18/0x30
>> [  491.223045]  io_uring_clean_tctx+0x76/0xa0
>> [  491.223061]  __io_uring_files_cancel+0x1b9/0x2e0
>> [  491.223068]  ? blk_finish_plug+0x26/0x40
>> [  491.223085]  do_exit+0xc0/0xb40
>> [  491.223099]  ? syscall_trace_enter.isra.0+0x1a1/0x1e0
>> [  491.223109]  __x64_sys_exit+0x1b/0x20
>> [  491.223117]  do_syscall_64+0x38/0x50
>> [  491.223131]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [  491.223177] INFO: task iou-mgr-2490:2491 blocked for more than 122 seconds.
>> [  491.223194] Call Trace:
>> [  491.223198]  __schedule+0x36b/0x950
>> [  491.223206]  ? pick_next_task_fair+0xcf/0x3e0
>> [  491.223218]  schedule+0x68/0xe0
>> [  491.223225]  schedule_timeout+0x209/0x2a0
>> [  491.223236]  wait_for_completion+0x8b/0xf0
>> [  491.223246]  io_wq_manager+0xf1/0x1d0
>> [  491.223255]  ? recalc_sigpending+0x1c/0x60
>> [  491.223265]  ? io_wq_cpu_online+0x40/0x40
>> [  491.223272]  ret_from_fork+0x22/0x30
>>
>> When io-wq worker exits and sees IO_WQ_BIT_EXIT it tries not cancel all
>> left requests but to execute them, hence we may wait for the exiting
>> task for long until someone pushes it, e.g. with SIGKILL. Actively
>> cancel pending work items on io-wq destruction.
>>
>> note: io_run_cancel() moved up without any changes.
> 
> Just to pull some of the discussion in here - I don't think this is a
> good idea as-is. At the very least, this should be gated on UNBOUND,
> and just waiting for bounded requests while canceling unbounded ones.

Right, and this may be unexpected for userspace as well, e.g. for
sockets/pipes. Another approach would be to keep executing for some time
and, if that doesn't help, go and kill them all. Or a mixture of both.
That would at least give socket ops a chance to get done if things are
still moving and they aren't stuck waiting.
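
To make the "mixture of both" concrete, here is a rough userspace model of
the policy, not io-wq code: let pending items run for a grace period, then
cancel whatever unbound work is still pending and keep waiting for bounded
work. Everything in it is hypothetical and for illustration only: struct
work_item, its fields, drain_then_cancel() and the tick-based notion of
time are all made up, and waiting for bounded work is simply modelled as
the item completing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued work item; not struct io_wq_work. */
enum work_state { WORK_PENDING, WORK_DONE, WORK_CANCELED };

struct work_item {
	bool unbound;		/* e.g. socket/pipe I/O that may block forever */
	int ticks_needed;	/* simulated time until it completes by itself */
	enum work_state state;
};

/*
 * Let pending items execute for up to grace_ticks, then cancel the unbound
 * ones that are still pending. Bounded items are waited for, which this
 * model represents by marking them done. Returns the number of canceled
 * items.
 */
static size_t drain_then_cancel(struct work_item *items, size_t n,
				int grace_ticks)
{
	size_t canceled = 0;

	/* Phase 1: give everything a chance to finish on its own. */
	for (int t = 0; t < grace_ticks; t++) {
		for (size_t i = 0; i < n; i++) {
			if (items[i].state == WORK_PENDING &&
			    --items[i].ticks_needed <= 0)
				items[i].state = WORK_DONE;
		}
	}

	/* Phase 2: cancel unbound leftovers, wait for bounded ones. */
	for (size_t i = 0; i < n; i++) {
		if (items[i].state != WORK_PENDING)
			continue;
		if (items[i].unbound) {
			items[i].state = WORK_CANCELED;
			canceled++;
		} else {
			items[i].state = WORK_DONE;
		}
	}
	return canceled;
}
```

With a short unbound item, a stuck unbound item and a slow bounded item,
only the stuck unbound one should end up canceled. The open question from
above stays the same: how long the grace period should be, and who waits
for it.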

Though, as with the original problem, it blocks do_exit() for some time,
which is not nice, so maybe this final io-wq execution would need to be
deferred to async, letting do_exit() proceed.

-- 
Pavel Begunkov

Thread overview: 3+ messages
2021-03-31 23:18 [PATCH v2] io-wq: forcefully cancel on io-wq destroy Pavel Begunkov
2021-04-01  1:17 ` Jens Axboe
2021-04-01 10:25   ` Pavel Begunkov [this message]
