* Does need memory barrier to synchronize req->result with req->iopoll_completed @ 2020-06-14 14:10 Xiaoguang Wang 2020-06-14 15:36 ` Jens Axboe 0 siblings, 1 reply; 5+ messages in thread From: Xiaoguang Wang @ 2020-06-14 14:10 UTC (permalink / raw) To: io-uring; +Cc: [email protected], joseph qi hi, I have taken some further thoughts about previous IPOLL race fix patch, if io_complete_rw_iopoll() is called in interrupt context, "req->result = res" and "WRITE_ONCE(req->iopoll_completed, 1);" are independent store operations. So in io_do_iopoll(), if iopoll_completed is ture, can we make sure that req->result has already been perceived by the cpu executing io_do_iopoll()? Regards, Xiaoguang Wang ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Does need memory barrier to synchronize req->result with req->iopoll_completed 2020-06-14 14:10 Does need memory barrier to synchronize req->result with req->iopoll_completed Xiaoguang Wang @ 2020-06-14 15:36 ` Jens Axboe 2020-06-15 2:10 ` Xiaoguang Wang 2020-06-16 17:31 ` Bijan Mottahedeh 0 siblings, 2 replies; 5+ messages in thread From: Jens Axboe @ 2020-06-14 15:36 UTC (permalink / raw) To: Xiaoguang Wang, io-uring; +Cc: joseph qi On 6/14/20 8:10 AM, Xiaoguang Wang wrote: > hi, > > I have taken some further thoughts about previous IPOLL race fix patch, > if io_complete_rw_iopoll() is called in interrupt context, "req->result = res" > and "WRITE_ONCE(req->iopoll_completed, 1);" are independent store operations. > So in io_do_iopoll(), if iopoll_completed is ture, can we make sure that > req->result has already been perceived by the cpu executing io_do_iopoll()? Good point, I think if we do something like the below, we should be totally safe against an IRQ completion. Since we batch the completions, we can get by with just a single smp_rmb() on the completion side. diff --git a/fs/io_uring.c b/fs/io_uring.c index 155f3d830ddb..74c2a4709b63 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1736,6 +1736,9 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events, struct req_batch rb; struct io_kiocb *req; + /* order with ->result store in io_complete_rw_iopoll() */ + smp_rmb(); + rb.to_free = rb.need_iter = 0; while (!list_empty(done)) { int cflags = 0; @@ -1976,6 +1979,8 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res, long res2) if (res != req->result) req_set_fail_links(req); req->result = res; + /* order with io_poll_complete() checking ->result */ + smp_wmb(); if (res != -EAGAIN) WRITE_ONCE(req->iopoll_completed, 1); } -- Jens Axboe ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: Does need memory barrier to synchronize req->result with req->iopoll_completed 2020-06-14 15:36 ` Jens Axboe @ 2020-06-15 2:10 ` Xiaoguang Wang 2020-06-16 17:31 ` Bijan Mottahedeh 1 sibling, 0 replies; 5+ messages in thread From: Xiaoguang Wang @ 2020-06-15 2:10 UTC (permalink / raw) To: Jens Axboe, io-uring; +Cc: joseph qi hi, > On 6/14/20 8:10 AM, Xiaoguang Wang wrote: >> hi, >> >> I have taken some further thoughts about previous IPOLL race fix patch, >> if io_complete_rw_iopoll() is called in interrupt context, "req->result = res" >> and "WRITE_ONCE(req->iopoll_completed, 1);" are independent store operations. >> So in io_do_iopoll(), if iopoll_completed is ture, can we make sure that >> req->result has already been perceived by the cpu executing io_do_iopoll()? > > Good point, I think if we do something like the below, we should be > totally safe against an IRQ completion. Since we batch the completions, > we can get by with just a single smp_rmb() on the completion side. Yes, agree. And thanks for confirming this issue, I'll make a formal patch with proper commit log. Regards, Xiaoguang Wang > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index 155f3d830ddb..74c2a4709b63 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -1736,6 +1736,9 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events, > struct req_batch rb; > struct io_kiocb *req; > > + /* order with ->result store in io_complete_rw_iopoll() */ > + smp_rmb(); > + > rb.to_free = rb.need_iter = 0; > while (!list_empty(done)) { > int cflags = 0; > @@ -1976,6 +1979,8 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res, long res2) > if (res != req->result) > req_set_fail_links(req); > req->result = res; > + /* order with io_poll_complete() checking ->result */ > + smp_wmb(); > if (res != -EAGAIN) > WRITE_ONCE(req->iopoll_completed, 1); > } > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Does need memory barrier to synchronize req->result with req->iopoll_completed 2020-06-14 15:36 ` Jens Axboe 2020-06-15 2:10 ` Xiaoguang Wang @ 2020-06-16 17:31 ` Bijan Mottahedeh 2020-06-16 19:35 ` Jens Axboe 1 sibling, 1 reply; 5+ messages in thread From: Bijan Mottahedeh @ 2020-06-16 17:31 UTC (permalink / raw) To: Jens Axboe; +Cc: io-uring On 6/14/2020 8:36 AM, Jens Axboe wrote: > On 6/14/20 8:10 AM, Xiaoguang Wang wrote: >> hi, >> >> I have taken some further thoughts about previous IPOLL race fix patch, >> if io_complete_rw_iopoll() is called in interrupt context, "req->result = res" >> and "WRITE_ONCE(req->iopoll_completed, 1);" are independent store operations. >> So in io_do_iopoll(), if iopoll_completed is ture, can we make sure that >> req->result has already been perceived by the cpu executing io_do_iopoll()? > Good point, I think if we do something like the below, we should be > totally safe against an IRQ completion. Since we batch the completions, > we can get by with just a single smp_rmb() on the completion side. > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index 155f3d830ddb..74c2a4709b63 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -1736,6 +1736,9 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events, > struct req_batch rb; > struct io_kiocb *req; > > + /* order with ->result store in io_complete_rw_iopoll() */ > + smp_rmb(); > + > rb.to_free = rb.need_iter = 0; > while (!list_empty(done)) { > int cflags = 0; > @@ -1976,6 +1979,8 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res, long res2) > if (res != req->result) > req_set_fail_links(req); > req->result = res; > + /* order with io_poll_complete() checking ->result */ > + smp_wmb(); > if (res != -EAGAIN) > WRITE_ONCE(req->iopoll_completed, 1); > } > I'm just trying to understand how the above smp_rmb() works. When io_complete_rw_iopoll() is called, all requests on the done list have already had ->iopoll_completed checked, and given the smp_wmb(),we know the two writes were ordered, so what does the smp_rmb() achieve here exactly? What ordering does it perform? Thanks. --bijan ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Does need memory barrier to synchronize req->result with req->iopoll_completed 2020-06-16 17:31 ` Bijan Mottahedeh @ 2020-06-16 19:35 ` Jens Axboe 0 siblings, 0 replies; 5+ messages in thread From: Jens Axboe @ 2020-06-16 19:35 UTC (permalink / raw) To: Bijan Mottahedeh; +Cc: io-uring On 6/16/20 11:31 AM, Bijan Mottahedeh wrote: > On 6/14/2020 8:36 AM, Jens Axboe wrote: >> On 6/14/20 8:10 AM, Xiaoguang Wang wrote: >>> hi, >>> >>> I have taken some further thoughts about previous IPOLL race fix patch, >>> if io_complete_rw_iopoll() is called in interrupt context, "req->result = res" >>> and "WRITE_ONCE(req->iopoll_completed, 1);" are independent store operations. >>> So in io_do_iopoll(), if iopoll_completed is ture, can we make sure that >>> req->result has already been perceived by the cpu executing io_do_iopoll()? >> Good point, I think if we do something like the below, we should be >> totally safe against an IRQ completion. Since we batch the completions, >> we can get by with just a single smp_rmb() on the completion side. >> >> diff --git a/fs/io_uring.c b/fs/io_uring.c >> index 155f3d830ddb..74c2a4709b63 100644 >> --- a/fs/io_uring.c >> +++ b/fs/io_uring.c >> @@ -1736,6 +1736,9 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events, >> struct req_batch rb; >> struct io_kiocb *req; >> >> + /* order with ->result store in io_complete_rw_iopoll() */ >> + smp_rmb(); >> + >> rb.to_free = rb.need_iter = 0; >> while (!list_empty(done)) { >> int cflags = 0; >> @@ -1976,6 +1979,8 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res, long res2) >> if (res != req->result) >> req_set_fail_links(req); >> req->result = res; >> + /* order with io_poll_complete() checking ->result */ >> + smp_wmb(); >> if (res != -EAGAIN) >> WRITE_ONCE(req->iopoll_completed, 1); >> } >> > I'm just trying to understand how the above smp_rmb() works. When > io_complete_rw_iopoll() is called, all requests on the done list have > already had ->iopoll_completed checked, and given the smp_wmb(),we know > the two writes were ordered, so what does the smp_rmb() achieve here > exactly? What ordering does it perform? Documentation/memory-barriers.txt actually has a good example of that, skip to line 2219 or so. -- Jens Axboe ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-06-16 19:35 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-06-14 14:10 Does need memory barrier to synchronize req->result with req->iopoll_completed Xiaoguang Wang 2020-06-14 15:36 ` Jens Axboe 2020-06-15 2:10 ` Xiaoguang Wang 2020-06-16 17:31 ` Bijan Mottahedeh 2020-06-16 19:35 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox