* [RFC][BUG] io_uring: fix work corruption for poll_add
From: Pavel Begunkov @ 2020-07-23 18:12 UTC
To: Jens Axboe, io-uring

poll_add can have req->work initialised, which will be overwritten in
__io_arm_poll_handler() because of the union. Luckily, hash_node is
zeroed in the end, so the damage is limited to lost put for work.creds,
and probably corrupted work.list.

That's the easiest and really dirty fix, which rearranges members in the
union, arm_poll*() modifies and zeroes only work.files and work.mm,
which are never taken for poll add.
note: io_kiocb is exactly 4 cachelines now.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 32b0064f806e..58e6f7d938b6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -669,12 +669,12 @@ struct io_kiocb {
 		 * restore the work, if needed.
 		 */
 		struct {
-			struct callback_head	task_work;
-			struct hlist_node	hash_node;
 			struct async_poll	*apoll;
+			struct hlist_node	hash_node;
 		};
 		struct io_wq_work	work;
 	};
+	struct callback_head	task_work;
 };
 
 #define IO_PLUG_THRESHOLD 2
--
2.24.0

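To make the overlap described in the patch easier to see, here is a minimal userspace sketch. It is not kernel code: callback_head and hlist_node are reduced to two pointers each, and the field order of work_sketch (list, files, mm, creds, fs, flags) is an assumption chosen to match the overlaps the commit message describes. The helper ranges_overlap() and the OVERLAPS() macro are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>

/* Userspace stand-ins for illustration only; NOT the kernel definitions. */
struct callback_head { void *next; void (*func)(struct callback_head *); };
struct hlist_node { struct hlist_node *next, **pprev; };

/* Assumed field order, chosen to match the overlaps the patch describes. */
struct work_sketch {
	void *list;
	void *files;
	void *mm;
	const void *creds;
	void *fs;
	unsigned int flags;
};

/* Layout before the patch: the poll bookkeeping covers most of the work. */
struct req_before {
	union {
		struct {
			struct callback_head task_work;
			struct hlist_node hash_node;
			void *apoll;
		};
		struct work_sketch work;
	};
};

/* Layout after the patch: hash_node shares storage with files/mm only. */
struct req_after {
	union {
		struct {
			void *apoll;
			struct hlist_node hash_node;
		};
		struct work_sketch work;
	};
	struct callback_head task_work;
};

/* True if byte ranges [a, a + alen) and [b, b + blen) intersect. */
int ranges_overlap(size_t a, size_t alen, size_t b, size_t blen)
{
	return a < b + blen && b < a + alen;
}

/* Do members m1 and m2 of struct type T share any storage? */
#define OVERLAPS(T, m1, m2) \
	ranges_overlap(offsetof(T, m1), sizeof(((T *)0)->m1), \
		       offsetof(T, m2), sizeof(((T *)0)->m2))
```

Under these assumptions, OVERLAPS(struct req_before, hash_node, work.creds) is true, which corresponds to the lost put for work.creds, while OVERLAPS(struct req_after, hash_node, work.creds) is false because hash_node then only shares storage with work.files and work.mm. The real struct io_kiocb has many more members, so the sketch says nothing about the "exactly 4 cachelines" note.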
* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
From: Pavel Begunkov @ 2020-07-23 18:15 UTC
To: Jens Axboe, io-uring

On 23/07/2020 21:12, Pavel Begunkov wrote:
> poll_add can have req->work initialised, which will be overwritten in
> __io_arm_poll_handler() because of the union. Luckily, hash_node is
> zeroed in the end, so the damage is limited to lost put for work.creds,
> and probably corrupted work.list.
>
> That's the easiest and really dirty fix, which rearranges members in the
> union, arm_poll*() modifies and zeroes only work.files and work.mm,
> which are never taken for poll add.
> note: io_kiocb is exactly 4 cachelines now.

Please, tell me if anybody has a good lean solution, because I'm a bit
too tired at the moment to fix it properly.
BTW, that's for 5.8, for-5.9 it should be done differently because of
io_kiocb compaction.

>
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
>  fs/io_uring.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 32b0064f806e..58e6f7d938b6 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -669,12 +669,12 @@ struct io_kiocb {
>  		 * restore the work, if needed.
>  		 */
>  		struct {
> -			struct callback_head	task_work;
> -			struct hlist_node	hash_node;
>  			struct async_poll	*apoll;
> +			struct hlist_node	hash_node;
>  		};
>  		struct io_wq_work	work;
>  	};
> +	struct callback_head	task_work;
>  };
>
>  #define IO_PLUG_THRESHOLD 2
>

--
Pavel Begunkov

* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
From: Jens Axboe @ 2020-07-23 18:19 UTC
To: Pavel Begunkov, io-uring

On 7/23/20 12:15 PM, Pavel Begunkov wrote:
> On 23/07/2020 21:12, Pavel Begunkov wrote:
>> poll_add can have req->work initialised, which will be overwritten in
>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>> zeroed in the end, so the damage is limited to lost put for work.creds,
>> and probably corrupted work.list.
>>
>> That's the easiest and really dirty fix, which rearranges members in the
>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>> which are never taken for poll add.
>> note: io_kiocb is exactly 4 cachelines now.
>
> Please, tell me if anybody has a good lean solution, because I'm a bit
> too tired at the moment to fix it properly.
> BTW, that's for 5.8, for-5.9 it should be done differently because of
> io_kiocb compaction.

Do you have a test case that leaks the reference?

--
Jens Axboe

* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
From: Pavel Begunkov @ 2020-07-23 19:10 UTC
To: Jens Axboe, io-uring

On 23/07/2020 21:19, Jens Axboe wrote:
> On 7/23/20 12:15 PM, Pavel Begunkov wrote:
>> On 23/07/2020 21:12, Pavel Begunkov wrote:
>>> poll_add can have req->work initialised, which will be overwritten in
>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>> and probably corrupted work.list.
>>>
>>> That's the easiest and really dirty fix, which rearranges members in the
>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>> which are never taken for poll add.
>>> note: io_kiocb is exactly 4 cachelines now.
>>
>> Please, tell me if anybody has a good lean solution, because I'm a bit
>> too tired at the moment to fix it properly.
>> BTW, that's for 5.8, for-5.9 it should be done differently because of
>> io_kiocb compaction.
>
> Do you have a test case that leaks the reference?

link-timeout.c::test_timeout_link_chain2()
- add IOSQE_ASYNC after poll_add_prep() (probably, not even needed)
- close() pipes fds at the end.
- while(1) test_timeout_link_chain2()

That's what I did to test it. Confirmed with printk + it killed the system
in 10-30 minutes. I can get something faster sometime later.

--
Pavel Begunkov

* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
From: Jens Axboe @ 2020-07-23 22:16 UTC
To: Pavel Begunkov, io-uring

On 7/23/20 12:12 PM, Pavel Begunkov wrote:
> poll_add can have req->work initialised, which will be overwritten in
> __io_arm_poll_handler() because of the union. Luckily, hash_node is
> zeroed in the end, so the damage is limited to lost put for work.creds,
> and probably corrupted work.list.
>
> That's the easiest and really dirty fix, which rearranges members in the
> union, arm_poll*() modifies and zeroes only work.files and work.mm,
> which are never taken for poll add.
> note: io_kiocb is exactly 4 cachelines now.

I don't think there's a way around moving task_work out, just like it
was done on 5.9. The problem is that we could put the environment bits
before doing task_work_add(), but we might need them if the subsequent
queue ends up having to go async. So there's really no way to know when
we can put them, outside of when the request finishes. Hence, we are
kind of SOL here.

--
Jens Axboe

* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
From: Jens Axboe @ 2020-07-23 22:24 UTC
To: Pavel Begunkov, io-uring

On 7/23/20 4:16 PM, Jens Axboe wrote:
> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>> poll_add can have req->work initialised, which will be overwritten in
>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>> zeroed in the end, so the damage is limited to lost put for work.creds,
>> and probably corrupted work.list.
>>
>> That's the easiest and really dirty fix, which rearranges members in the
>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>> which are never taken for poll add.
>> note: io_kiocb is exactly 4 cachelines now.
>
> I don't think there's a way around moving task_work out, just like it
> was done on 5.9. The problem is that we could put the environment bits
> before doing task_work_add(), but we might need them if the subsequent
> queue ends up having to go async. So there's really no way to know when
> we can put them, outside of when the request finishes. Hence, we are
> kind of SOL here.

Actually, if we do go async, then we can just grab the environment
again. We're in the same task at that point. So maybe it'd be better to
work on ensuring that the request is either in the valid work state, or
empty work if using task_work.

Only potential complication with that is doing io_req_work_drop_env()
from the waitqueue handler, at least the ->needs_fs part won't like that
too much.

--
Jens Axboe

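The "either valid work state, or empty work" idea above can be modelled in a few lines of userspace C. This is only a toy sketch under stated assumptions: every name in it (cred_sketch, req_sketch, req_grab_env, req_drop_env, req_arm_poll, leaked_refs) is hypothetical, the environment is reduced to a single creds reference, and the waitqueue-context complication mentioned above is not modelled at all.

```c
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct cred_sketch { int refs; };

struct req_sketch {
	union {
		struct { struct cred_sketch *creds; } work; /* "valid work" view */
		struct { void *hash_node[2]; } poll;        /* poll/task_work view */
	};
	int work_initialized; /* kept outside the union, like a request flag */
};

/* Take the submitter's environment (here: only a creds reference). */
static void req_grab_env(struct req_sketch *req, struct cred_sketch *cur)
{
	if (req->work_initialized)
		return;
	memset(&req->work, 0, sizeof(req->work));
	req->work.creds = cur;
	cur->refs++;
	req->work_initialized = 1;
}

/* Put whatever the work holds and leave it in the "empty" state. */
static void req_drop_env(struct req_sketch *req)
{
	if (!req->work_initialized)
		return;
	if (req->work.creds)
		req->work.creds->refs--;
	memset(&req->work, 0, sizeof(req->work));
	req->work_initialized = 0;
}

/* Arm poll: reuses the overlapping storage and zeroes it at the end. */
static void req_arm_poll(struct req_sketch *req, int drop_first)
{
	if (drop_first)
		req_drop_env(req);
	memset(&req->poll, 0, sizeof(req->poll));
}

/* Number of creds references lost over one simulated request lifetime. */
int leaked_refs(int drop_first)
{
	struct cred_sketch cred = { .refs = 1 };
	struct req_sketch req;

	memset(&req, 0, sizeof(req));
	req_grab_env(&req, &cred);      /* work initialised at submission */
	req_arm_poll(&req, drop_first); /* poll storage overwrites the work */
	req_grab_env(&req, &cred);      /* request later has to go async */
	req_drop_env(&req);             /* request finishes */
	return cred.refs - 1;
}
```

In this model, leaked_refs(0) loses one reference because the poll storage zeroes work.creds while the work is still marked initialised, and leaked_refs(1) loses none because the environment is dropped before arming and grabbed again when going async.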
* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
From: Pavel Begunkov @ 2020-07-24 12:46 UTC
To: Jens Axboe, io-uring

On 24/07/2020 01:24, Jens Axboe wrote:
> On 7/23/20 4:16 PM, Jens Axboe wrote:
>> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>>> poll_add can have req->work initialised, which will be overwritten in
>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>> and probably corrupted work.list.
>>>
>>> That's the easiest and really dirty fix, which rearranges members in the
>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>> which are never taken for poll add.
>>> note: io_kiocb is exactly 4 cachelines now.
>>
>> I don't think there's a way around moving task_work out, just like it

+hash_node. I was thinking to do apoll alloc+memcpy as for rw, but this
one is ugly.

>> was done on 5.9. The problem is that we could put the environment bits
>> before doing task_work_add(), but we might need them if the subsequent
>> queue ends up having to go async. So there's really no way to know when
>> we can put them, outside of when the request finishes. Hence, we are
>> kind of SOL here.
>
> Actually, if we do go async, then we can just grab the environment
> again. We're in the same task at that point. So maybe it'd be better to
> work on ensuring that the request is either in the valid work state, or
> empty work if using task_work.
>
> Only potential complication with that is doing io_req_work_drop_env()
> from the waitqueue handler, at least the ->needs_fs part won't like that
> too much.

Considering that work->list is removed before executing io_wq_work, it
should work. And if done only for poll_add, which needs nothing and ends up
with creds, there shouldn't be any problems. I'll try this out.

--
Pavel Begunkov

* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
From: Pavel Begunkov @ 2020-07-24 12:52 UTC
To: Jens Axboe, io-uring

On 24/07/2020 15:46, Pavel Begunkov wrote:
> On 24/07/2020 01:24, Jens Axboe wrote:
>> On 7/23/20 4:16 PM, Jens Axboe wrote:
>>> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>>>> poll_add can have req->work initialised, which will be overwritten in
>>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>>> and probably corrupted work.list.
>>>>
>>>> That's the easiest and really dirty fix, which rearranges members in the
>>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>>> which are never taken for poll add.
>>>> note: io_kiocb is exactly 4 cachelines now.
>>>
>>> I don't think there's a way around moving task_work out, just like it
>
> +hash_node. I was thinking to do apoll alloc+memcpy as for rw, but this
> one is ugly.
>
>>> was done on 5.9. The problem is that we could put the environment bits
>>> before doing task_work_add(), but we might need them if the subsequent
>>> queue ends up having to go async. So there's really no way to know when
>>> we can put them, outside of when the request finishes. Hence, we are
>>> kind of SOL here.
>>
>> Actually, if we do go async, then we can just grab the environment
>> again. We're in the same task at that point. So maybe it'd be better to
>> work on ensuring that the request is either in the valid work state, or
>> empty work if using task_work.
>>
>> Only potential complication with that is doing io_req_work_drop_env()
>> from the waitqueue handler, at least the ->needs_fs part won't like that
>> too much.
>
> Considering that work->list is removed before executing io_wq_work, it
> should work. And if done only for poll_add, which needs nothing and ends up
> with creds, there shouldn't be any problems. I'll try this out

Except for custom ->creds assigned at the beginning with the personality
feature. Does poll ever use it?

--
Pavel Begunkov

* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
From: Jens Axboe @ 2020-07-24 14:12 UTC
To: Pavel Begunkov, io-uring

On 7/24/20 6:52 AM, Pavel Begunkov wrote:
> On 24/07/2020 15:46, Pavel Begunkov wrote:
>> On 24/07/2020 01:24, Jens Axboe wrote:
>>> On 7/23/20 4:16 PM, Jens Axboe wrote:
>>>> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>>>>> poll_add can have req->work initialised, which will be overwritten in
>>>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>>>> and probably corrupted work.list.
>>>>>
>>>>> That's the easiest and really dirty fix, which rearranges members in the
>>>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>>>> which are never taken for poll add.
>>>>> note: io_kiocb is exactly 4 cachelines now.
>>>>
>>>> I don't think there's a way around moving task_work out, just like it
>>
>> +hash_node. I was thinking to do apoll alloc+memcpy as for rw, but this
>> one is ugly.
>>
>>>> was done on 5.9. The problem is that we could put the environment bits
>>>> before doing task_work_add(), but we might need them if the subsequent
>>>> queue ends up having to go async. So there's really no way to know when
>>>> we can put them, outside of when the request finishes. Hence, we are
>>>> kind of SOL here.
>>>
>>> Actually, if we do go async, then we can just grab the environment
>>> again. We're in the same task at that point. So maybe it'd be better to
>>> work on ensuring that the request is either in the valid work state, or
>>> empty work if using task_work.
>>>
>>> Only potential complication with that is doing io_req_work_drop_env()
>>> from the waitqueue handler, at least the ->needs_fs part won't like that
>>> too much.
>>
>> Considering that work->list is removed before executing io_wq_work, it
>> should work. And if done only for poll_add, which needs nothing and ends up
>> with creds, there shouldn't be any problems. I'll try this out
>
> Except for custom ->creds assigned at the beginning with the personality
> feature. Does poll ever use it?

It's kind of annoying how we don't have a def->needs_creds, because lots
of things would never use it. For poll, it wouldn't be used at all,
which makes this issue doubly annoying.

--
Jens Axboe

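As an illustration of what a per-opcode needs_creds flag could look like, here is a small userspace sketch. It is purely hypothetical: the thread states that no such flag exists, and the reduced opcode list, the table values and all names (op_sketch, op_def_sketch, op_defs, grab_env_sketch, refs_after_grab) are assumptions made for the example, not the kernel's opcode definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced opcode set; not the kernel's enum. */
enum op_sketch { OP_READV, OP_POLL_ADD, OP_OPENAT, OP_LAST };

/* Hypothetical per-opcode definition with the flag the thread wishes for. */
struct op_def_sketch {
	bool needs_mm;
	bool needs_fs;
	bool needs_creds;
};

/* Illustrative values only. */
static const struct op_def_sketch op_defs[OP_LAST] = {
	[OP_READV]    = { .needs_mm = true, .needs_creds = true },
	[OP_POLL_ADD] = { 0 },
	[OP_OPENAT]   = { .needs_fs = true, .needs_creds = true },
};

struct cred_sketch { int refs; };
struct work_sketch { struct cred_sketch *creds; };

/* Take a creds reference only when the opcode declares that it needs one. */
static void grab_env_sketch(enum op_sketch op, struct work_sketch *work,
			    struct cred_sketch *cur)
{
	if (op_defs[op].needs_creds && !work->creds) {
		work->creds = cur;
		cur->refs++;
	}
}

/* Reference count on the submitter's creds after grabbing for one request. */
int refs_after_grab(enum op_sketch op)
{
	struct cred_sketch cred = { .refs = 1 };
	struct work_sketch work = { .creds = NULL };

	grab_env_sketch(op, &work, &cred);
	return cred.refs;
}
```

With such a table, a poll request would never hold a creds reference in its work, so there would be nothing to lose when the poll bookkeeping reuses that storage; refs_after_grab(OP_POLL_ADD) stays at 1 while refs_after_grab(OP_READV) goes to 2.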
* Re: [RFC][BUG] io_uring: fix work corruption for poll_add
From: Pavel Begunkov @ 2020-07-24 14:23 UTC
To: Jens Axboe, io-uring

On 24/07/2020 17:12, Jens Axboe wrote:
> On 7/24/20 6:52 AM, Pavel Begunkov wrote:
>> On 24/07/2020 15:46, Pavel Begunkov wrote:
>>> On 24/07/2020 01:24, Jens Axboe wrote:
>>>> On 7/23/20 4:16 PM, Jens Axboe wrote:
>>>>> On 7/23/20 12:12 PM, Pavel Begunkov wrote:
>>>>>> poll_add can have req->work initialised, which will be overwritten in
>>>>>> __io_arm_poll_handler() because of the union. Luckily, hash_node is
>>>>>> zeroed in the end, so the damage is limited to lost put for work.creds,
>>>>>> and probably corrupted work.list.
>>>>>>
>>>>>> That's the easiest and really dirty fix, which rearranges members in the
>>>>>> union, arm_poll*() modifies and zeroes only work.files and work.mm,
>>>>>> which are never taken for poll add.
>>>>>> note: io_kiocb is exactly 4 cachelines now.
>>>>>
>>>>> I don't think there's a way around moving task_work out, just like it
>>>
>>> +hash_node. I was thinking to do apoll alloc+memcpy as for rw, but this
>>> one is ugly.
>>>
>>>>> was done on 5.9. The problem is that we could put the environment bits
>>>>> before doing task_work_add(), but we might need them if the subsequent
>>>>> queue ends up having to go async. So there's really no way to know when
>>>>> we can put them, outside of when the request finishes. Hence, we are
>>>>> kind of SOL here.
>>>>
>>>> Actually, if we do go async, then we can just grab the environment
>>>> again. We're in the same task at that point. So maybe it'd be better to
>>>> work on ensuring that the request is either in the valid work state, or
>>>> empty work if using task_work.
>>>>
>>>> Only potential complication with that is doing io_req_work_drop_env()
>>>> from the waitqueue handler, at least the ->needs_fs part won't like that
>>>> too much.
>>>
>>> Considering that work->list is removed before executing io_wq_work, it
>>> should work. And if done only for poll_add, which needs nothing and ends up
>>> with creds, there shouldn't be any problems. I'll try this out
>>
>> Except for custom ->creds assigned at the beginning with the personality
>> feature. Does poll ever use it?
>
> It's kind of annoying how we don't have a def->needs_creds, because lots
> of things would never use it. For poll, it wouldn't be used at all,
> which makes this issue doubly annoying.

Then we don't have to care which one it has, and the scheme should work
well enough for a quick fix. I still don't like overwriting work.list
until it leaves io-wq, but that's something to think about for 5.9.

--
Pavel Begunkov
