* [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout @ 2020-10-27 3:09 qiang.zhang 2020-10-27 13:35 ` Jens Axboe 0 siblings, 1 reply; 4+ messages in thread From: qiang.zhang @ 2020-10-27 3:09 UTC (permalink / raw) To: axboe; +Cc: io-uring, linux-kernel From: Zqiang <[email protected]> In 'io_wqe_worker' thread, if the work which in 'wqe->work_list' be finished, the 'wqe->work_list' is empty, and after that the '__io_worker_idle' func return false, the task state is TASK_RUNNING, need to be set TASK_INTERRUPTIBLE before call schedule_timeout func. Signed-off-by: Zqiang <[email protected]> --- fs/io-wq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/io-wq.c b/fs/io-wq.c index 02894df7656d..5f0626935b64 100644 --- a/fs/io-wq.c +++ b/fs/io-wq.c @@ -618,7 +618,7 @@ static int io_wqe_worker(void *data) raw_spin_unlock_irq(&wqe->lock); if (signal_pending(current)) flush_signals(current); - if (schedule_timeout(WORKER_IDLE_TIMEOUT)) + if (schedule_timeout_interruptible(WORKER_IDLE_TIMEOUT)) continue; /* timed out, exit unless we're the fixed worker */ if (test_bit(IO_WQ_BIT_EXIT, &wq->state) || -- 2.17.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout 2020-10-27 3:09 [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout qiang.zhang @ 2020-10-27 13:35 ` Jens Axboe 2020-10-28 2:47 ` 回复: " Zhang, Qiang 0 siblings, 1 reply; 4+ messages in thread From: Jens Axboe @ 2020-10-27 13:35 UTC (permalink / raw) To: qiang.zhang; +Cc: io-uring, linux-kernel On 10/26/20 9:09 PM, [email protected] wrote: > From: Zqiang <[email protected]> > > In 'io_wqe_worker' thread, if the work which in 'wqe->work_list' be > finished, the 'wqe->work_list' is empty, and after that the > '__io_worker_idle' func return false, the task state is TASK_RUNNING, > need to be set TASK_INTERRUPTIBLE before call schedule_timeout func. I don't think that's safe - what if someone added work right before you call schedule_timeout_interruptible? Something ala: io_wq_enqueue() set_current_state(TASK_INTERRUPTIBLE(); schedule_timeout(WORKER_IDLE_TIMEOUT); then we'll have work added and the task state set to running, but the worker itself just sets us to non-running and will hence wait WORKER_IDLE_TIMEOUT before the work is processed. The current situation will do one extra loop for this case, as the schedule_timeout() just ends up being a nop and we go around again checking for work. Since we already unused the mm, the next iteration will go to sleep properly unless new work came in. -- Jens Axboe ^ permalink raw reply [flat|nested] 4+ messages in thread
* 回复: [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout 2020-10-27 13:35 ` Jens Axboe @ 2020-10-28 2:47 ` Zhang, Qiang 2020-10-28 13:36 ` Jens Axboe 0 siblings, 1 reply; 4+ messages in thread From: Zhang, Qiang @ 2020-10-28 2:47 UTC (permalink / raw) To: Jens Axboe; +Cc: [email protected], [email protected] ________________________________________ 发件人: Jens Axboe <[email protected]> 发送时间: 2020年10月27日 21:35 收件人: Zhang, Qiang 抄送: [email protected]; [email protected] 主题: Re: [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout On 10/26/20 9:09 PM, [email protected] wrote: > From: Zqiang <[email protected]> > > In 'io_wqe_worker' thread, if the work which in 'wqe->work_list' be > finished, the 'wqe->work_list' is empty, and after that the > '__io_worker_idle' func return false, the task state is TASK_RUNNING, > need to be set TASK_INTERRUPTIBLE before call schedule_timeout func. > >I don't think that's safe - what if someone added work right before you >call schedule_timeout_interruptible? Something ala: > > >io_wq_enqueue() > set_current_state(TASK_INTERRUPTIBLE(); > schedule_timeout(WORKER_IDLE_TIMEOUT); > >then we'll have work added and the task state set to running, but the >worker itself just sets us to non-running and will hence wait >WORKER_IDLE_TIMEOUT before the work is processed. > >The current situation will do one extra loop for this case, as the >schedule_timeout() just ends up being a nop and we go around again although the worker task state is running, due to the call schedule_timeout, the current worker still possible to be switched out. if set current worker task is no-running, the current worker be switched out, but the schedule will call io_wq_worker_sleeping func to wake up free worker task, if wqe->free_list is not empty. >checking for work. Since we already unused the mm, the next iteration >will go to sleep properly unless new work came in. > >-- >Jens Axboe ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: 回复: [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout 2020-10-28 2:47 ` 回复: " Zhang, Qiang @ 2020-10-28 13:36 ` Jens Axboe 0 siblings, 0 replies; 4+ messages in thread From: Jens Axboe @ 2020-10-28 13:36 UTC (permalink / raw) To: Zhang, Qiang; +Cc: [email protected], [email protected] On 10/27/20 8:47 PM, Zhang, Qiang wrote: > > > ________________________________________ > 发件人: Jens Axboe <[email protected]> > 发送时间: 2020年10月27日 21:35 > 收件人: Zhang, Qiang > 抄送: [email protected]; [email protected] > 主题: Re: [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout > > On 10/26/20 9:09 PM, [email protected] wrote: >> From: Zqiang <[email protected]> >> >> In 'io_wqe_worker' thread, if the work which in 'wqe->work_list' be >> finished, the 'wqe->work_list' is empty, and after that the >> '__io_worker_idle' func return false, the task state is TASK_RUNNING, >> need to be set TASK_INTERRUPTIBLE before call schedule_timeout func. >> >> I don't think that's safe - what if someone added work right before you >> call schedule_timeout_interruptible? Something ala: >> >> >> io_wq_enqueue() >> set_current_state(TASK_INTERRUPTIBLE(); >> schedule_timeout(WORKER_IDLE_TIMEOUT); >> >> then we'll have work added and the task state set to running, but the >> worker itself just sets us to non-running and will hence wait >> WORKER_IDLE_TIMEOUT before the work is processed. >> >> The current situation will do one extra loop for this case, as the >> schedule_timeout() just ends up being a nop and we go around again > > although the worker task state is running, due to the call > schedule_timeout, the current worker still possible to be switched > out. if set current worker task is no-running, the current worker be > switched out, but the schedule will call io_wq_worker_sleeping func > to wake up free worker task, if wqe->free_list is not empty. It'll only be swapped out for TASK_RUNNING if we should be running other work, which would happen on next need-resched event anyway. And the miss you're describing is an expensive one, as it entails creating a new thread and switching to that. That's not a great way to handle a race. So I'm a bit puzzled here - yes we'll do an extra loop and check for the dropping of mm, but that's really minor. The solution is a _lot_ more expensive for hitting the race of needing a new worker, but missing it because you unconditionally set the task to non-running. On top of that, it's also not the idiomatic way to wait for events, which is typically: is event true, break if so set_current_state(TASK_INTERRUPTIBLE); event comes in, task set runnable check again, schedule doesn't schedule, since we were set runnable or variants thereof, using waitqueues. So while I'm of course not opposed to fixing the io-wq loop so that we don't do that last loop when going idle, a) it basically doesn't matter, and b) the proposed solution is much worse. If there was a more elegant solution without worse side effects, then we can discuss that. -- Jens Axboe ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-10-29 2:00 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-10-27 3:09 [PATCH] io-wq: set task TASK_INTERRUPTIBLE state before schedule_timeout qiang.zhang 2020-10-27 13:35 ` Jens Axboe 2020-10-28 2:47 ` 回复: " Zhang, Qiang 2020-10-28 13:36 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox