* Re: [PATCH] io_uring: handle signals before letting io-worker exit
[not found] <[email protected]>
@ 2021-05-27 13:46 ` Jens Axboe
2021-05-27 15:21 ` Olivier Langlois
0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2021-05-27 13:46 UTC (permalink / raw)
To: Olivier Langlois, Pavel Begunkov, io-uring, linux-kernel
On 5/26/21 12:21 PM, Olivier Langlois wrote:
> This is required for proper core dump generation.
>
> Because signals are normally serviced before resuming userspace and an
> io_worker thread will never resume userspace, it needs to explicitly
> call the signal servicing functions.
>
> Also, notice that it is possible to exit from the io_wqe_worker()
> function main loop while having a pending signal such as when
> the IO_WQ_BIT_EXIT bit is set.
>
> It is crucial to service any pending signal before calling do_exit()
> Proper coredump generation is relying on PF_SIGNALED to be set.
>
> More specifically, exit_mm() is using this flag to wait for the
> core dump completion before releasing its memory descriptor.
>
> Signed-off-by: Olivier Langlois <[email protected]>
> ---
> fs/io-wq.c | 22 ++++++++++++++++++++--
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/fs/io-wq.c b/fs/io-wq.c
> index 5361a9b4b47b..b76c61e9aff2 100644
> --- a/fs/io-wq.c
> +++ b/fs/io-wq.c
> @@ -9,8 +9,6 @@
> #include <linux/init.h>
> #include <linux/errno.h>
> #include <linux/sched/signal.h>
> -#include <linux/mm.h>
> -#include <linux/sched/mm.h>
> #include <linux/percpu.h>
> #include <linux/slab.h>
> #include <linux/rculist_nulls.h>
> @@ -193,6 +191,26 @@ static void io_worker_exit(struct io_worker *worker)
>
> kfree_rcu(worker, rcu);
> io_worker_ref_put(wqe->wq);
> + /*
> + * Because signals are normally serviced before resuming userspace and an
> + * io_worker thread will never resume userspace, it needs to explicitly
> + * call the signal servicing functions.
> + *
> + * Also notice that it is possible to exit from the io_wqe_worker()
> + * function main loop while having a pending signal such as when
> + * the IO_WQ_BIT_EXIT bit is set.
> + *
> + * It is crucial to service any pending signal before calling do_exit()
> + * Proper coredump generation is relying on PF_SIGNALED to be set.
> + *
> + * More specifically, exit_mm() is using this flag to wait for the
> + * core dump completion before releasing its memory descriptor.
> + */
> + if (signal_pending(current)) {
> + struct ksignal ksig;
> +
> + get_signal(&ksig);
> + }
> do_exit(0);
> }
Do we need the same thing in fs/io_uring.c:io_sq_thread()?
--
Jens Axboe
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] io_uring: handle signals before letting io-worker exit
2021-05-27 13:46 ` [PATCH] io_uring: handle signals before letting io-worker exit Jens Axboe
@ 2021-05-27 15:21 ` Olivier Langlois
2021-05-27 15:30 ` Jens Axboe
0 siblings, 1 reply; 4+ messages in thread
From: Olivier Langlois @ 2021-05-27 15:21 UTC (permalink / raw)
To: Jens Axboe, Pavel Begunkov, io-uring, linux-kernel
On Thu, 2021-05-27 at 07:46 -0600, Jens Axboe wrote:
> On 5/26/21 12:21 PM, Olivier Langlois wrote:
> > This is required for proper core dump generation.
> >
> > Because signals are normally serviced before resuming userspace and
> > an
> > io_worker thread will never resume userspace, it needs to
> > explicitly
> > call the signal servicing functions.
> >
> > Also, notice that it is possible to exit from the io_wqe_worker()
> > function main loop while having a pending signal such as when
> > the IO_WQ_BIT_EXIT bit is set.
> >
> > It is crucial to service any pending signal before calling
> > do_exit()
> > Proper coredump generation is relying on PF_SIGNALED to be set.
> >
> > More specifically, exit_mm() is using this flag to wait for the
> > core dump completion before releasing its memory descriptor.
> >
> > Signed-off-by: Olivier Langlois <[email protected]>
> > ---
> > fs/io-wq.c | 22 ++++++++++++++++++++--
> > 1 file changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/io-wq.c b/fs/io-wq.c
> > index 5361a9b4b47b..b76c61e9aff2 100644
> > --- a/fs/io-wq.c
> > +++ b/fs/io-wq.c
> > @@ -9,8 +9,6 @@
> > #include <linux/init.h>
> > #include <linux/errno.h>
> > #include <linux/sched/signal.h>
> > -#include <linux/mm.h>
> > -#include <linux/sched/mm.h>
> > #include <linux/percpu.h>
> > #include <linux/slab.h>
> > #include <linux/rculist_nulls.h>
> > @@ -193,6 +191,26 @@ static void io_worker_exit(struct io_worker
> > *worker)
> >
> > kfree_rcu(worker, rcu);
> > io_worker_ref_put(wqe->wq);
> > + /*
> > + * Because signals are normally serviced before resuming
> > userspace and an
> > + * io_worker thread will never resume userspace, it needs
> > to explicitly
> > + * call the signal servicing functions.
> > + *
> > + * Also notice that it is possible to exit from the
> > io_wqe_worker()
> > + * function main loop while having a pending signal such as
> > when
> > + * the IO_WQ_BIT_EXIT bit is set.
> > + *
> > + * It is crucial to service any pending signal before
> > calling do_exit()
> > + * Proper coredump generation is relying on PF_SIGNALED to
> > be set.
> > + *
> > + * More specifically, exit_mm() is using this flag to wait
> > for the
> > + * core dump completion before releasing its memory
> > descriptor.
> > + */
> > + if (signal_pending(current)) {
> > + struct ksignal ksig;
> > +
> > + get_signal(&ksig);
> > + }
> > do_exit(0);
> > }
>
> Do we need the same thing in fs/io_uring.c:io_sq_thread()?
>
Jens,
You are 100% correct. In fact, this is the same problem for ALL
currently existing and future io threads. Therefore, I start to think
that the right place for the fix might be straight into do_exit()...
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] io_uring: handle signals before letting io-worker exit
2021-05-27 15:21 ` Olivier Langlois
@ 2021-05-27 15:30 ` Jens Axboe
2021-05-27 16:09 ` Olivier Langlois
0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2021-05-27 15:30 UTC (permalink / raw)
To: Olivier Langlois, Pavel Begunkov, io-uring, linux-kernel
On 5/27/21 9:21 AM, Olivier Langlois wrote:
> On Thu, 2021-05-27 at 07:46 -0600, Jens Axboe wrote:
>> On 5/26/21 12:21 PM, Olivier Langlois wrote:
>>> This is required for proper core dump generation.
>>>
>>> Because signals are normally serviced before resuming userspace and
>>> an
>>> io_worker thread will never resume userspace, it needs to
>>> explicitly
>>> call the signal servicing functions.
>>>
>>> Also, notice that it is possible to exit from the io_wqe_worker()
>>> function main loop while having a pending signal such as when
>>> the IO_WQ_BIT_EXIT bit is set.
>>>
>>> It is crucial to service any pending signal before calling
>>> do_exit()
>>> Proper coredump generation is relying on PF_SIGNALED to be set.
>>>
>>> More specifically, exit_mm() is using this flag to wait for the
>>> core dump completion before releasing its memory descriptor.
>>>
>>> Signed-off-by: Olivier Langlois <[email protected]>
>>> ---
>>> fs/io-wq.c | 22 ++++++++++++++++++++--
>>> 1 file changed, 20 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>>> index 5361a9b4b47b..b76c61e9aff2 100644
>>> --- a/fs/io-wq.c
>>> +++ b/fs/io-wq.c
>>> @@ -9,8 +9,6 @@
>>> #include <linux/init.h>
>>> #include <linux/errno.h>
>>> #include <linux/sched/signal.h>
>>> -#include <linux/mm.h>
>>> -#include <linux/sched/mm.h>
>>> #include <linux/percpu.h>
>>> #include <linux/slab.h>
>>> #include <linux/rculist_nulls.h>
>>> @@ -193,6 +191,26 @@ static void io_worker_exit(struct io_worker
>>> *worker)
>>>
>>> kfree_rcu(worker, rcu);
>>> io_worker_ref_put(wqe->wq);
>>> + /*
>>> + * Because signals are normally serviced before resuming
>>> userspace and an
>>> + * io_worker thread will never resume userspace, it needs
>>> to explicitly
>>> + * call the signal servicing functions.
>>> + *
>>> + * Also notice that it is possible to exit from the
>>> io_wqe_worker()
>>> + * function main loop while having a pending signal such as
>>> when
>>> + * the IO_WQ_BIT_EXIT bit is set.
>>> + *
>>> + * It is crucial to service any pending signal before
>>> calling do_exit()
>>> + * Proper coredump generation is relying on PF_SIGNALED to
>>> be set.
>>> + *
>>> + * More specifically, exit_mm() is using this flag to wait
>>> for the
>>> + * core dump completion before releasing its memory
>>> descriptor.
>>> + */
>>> + if (signal_pending(current)) {
>>> + struct ksignal ksig;
>>> +
>>> + get_signal(&ksig);
>>> + }
>>> do_exit(0);
>>> }
>>
>> Do we need the same thing in fs/io_uring.c:io_sq_thread()?
>>
> Jens,
>
> You are 100% correct. In fact, this is the same problem for ALL
> currently existing and future io threads. Therefore, I start to think
> that the right place for the fix might be straight into do_exit()...
That is what I was getting at. To avoid poluting do_exit() with it, I
think it'd be best to add an io_thread_exit() that simply does:
void io_thread_exit(void)
{
if (signal_pending(current)) {
struct ksignal ksig;
get_signal(&ksig);
}
do_exit(0);
}
and convert the do_exit() calls in io_uring/io-wq to io_thread_exit()
instead.
--
Jens Axboe
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] io_uring: handle signals before letting io-worker exit
2021-05-27 15:30 ` Jens Axboe
@ 2021-05-27 16:09 ` Olivier Langlois
0 siblings, 0 replies; 4+ messages in thread
From: Olivier Langlois @ 2021-05-27 16:09 UTC (permalink / raw)
To: Jens Axboe, Pavel Begunkov, io-uring, linux-kernel
On Thu, 2021-05-27 at 09:30 -0600, Jens Axboe wrote:
> > >
> > Jens,
> >
> > You are 100% correct. In fact, this is the same problem for ALL
> > currently existing and future io threads. Therefore, I start to
> > think
> > that the right place for the fix might be straight into
> > do_exit()...
>
> That is what I was getting at. To avoid poluting do_exit() with it, I
> think it'd be best to add an io_thread_exit() that simply does:
>
> void io_thread_exit(void)
> {
> if (signal_pending(current)) {
> struct ksignal ksig;
> get_signal(&ksig);
> }
> do_exit(0);
> }
>
> and convert the do_exit() calls in io_uring/io-wq to io_thread_exit()
> instead.
>
IMHO, that would be an acceptable compromise because it does fix my
problem. However, I am of the opinion that it wouldn't be poluting
do_exit() and would in fact be the right place to do it considering
that create_io_thread() is in kernel and theoritically, anyone can call
it to create an io_thread and would be susceptible to get bitten by the
exact same problem and would have to come up with a similar solution if
it is not addressed directly by the kernel.
Also, since I have submitted the patch, I have made the following
realization:
I got bitten by the problem because of a race condition between the io-
mgr thread and its io-wrks threads for processing their pending SIGKILL
and the proposed patch does correct my problem.
The issue would have most likely been buried by 5.13 io-mgr removal...
BUT, even the proposed patch isn't 100% perfect. AFAIK, it is still
possible, but very unlikely, to get a signal between calling
signal_pending() and do_exit().
It might be possible to implement the solution and be 100% correct all
the time by doing it inside do_exit()... I am currently eyeing
exit_signals() as a potential good site for the patch...
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2021-05-27 16:09 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <[email protected]>
2021-05-27 13:46 ` [PATCH] io_uring: handle signals before letting io-worker exit Jens Axboe
2021-05-27 15:21 ` Olivier Langlois
2021-05-27 15:30 ` Jens Axboe
2021-05-27 16:09 ` Olivier Langlois
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox