* Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression [not found] <20220527092432.GE11731@xsang-OptiPlex-9020> @ 2022-05-27 13:50 ` Jens Axboe 2022-06-08 8:00 ` Oliver Sang ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Jens Axboe @ 2022-05-27 13:50 UTC (permalink / raw) To: kernel test robot Cc: LKML, io-uring, lkp, lkp, ying.huang, feng.tang, zhengjun.xing, fengwei.yin, guobing.chen, ming.a.chen, frank.du, Shuhua.Fan, wangyang.guo, Wenhuan.Huang, jessica.ji, shan.kang, guangli.li, tiejun.li, yu.ma, dapeng1.mi, jiebin.sun, gengxin.xie, fan.zhao On 5/27/22 3:24 AM, kernel test robot wrote: > > > Greeting, > > FYI, we noticed a -10.2% regression of phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s due to commit: > > > commit: 584b0180f0f4d67d7145950fe68c625f06c88b10 ("io_uring: move read/write file prep state into actual opcode handler") > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > in testcase: phoronix-test-suite > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory > with following parameters: > > test: fio-1.14.1 > option_a: Sequential Write > option_b: IO_uring > option_c: Yes > option_d: Yes > option_e: 1MB > option_f: Default Test Directory > cpufreq_governor: performance > ucode: 0x500320a > > test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added. > test-url: http://www.phoronix-test-suite.com/ I'm a bit skeptical on this, but I'd like to try and run the test case. Since it's just a fio test case, why can't I find it somewhere? Seems very convoluted to have to setup lkp-tests just for this. Besides, I tried, but it doesn't work on aarch64... -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-05-27 13:50 ` [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression Jens Axboe @ 2022-06-08 8:00 ` Oliver Sang 2022-06-14 1:54 ` [LKP] " Yin Fengwei 2022-07-12 8:06 ` Yin Fengwei 2 siblings, 0 replies; 18+ messages in thread From: Oliver Sang @ 2022-06-08 8:00 UTC (permalink / raw) To: Jens Axboe Cc: LKML, io-uring, lkp, lkp, ying.huang, feng.tang, zhengjun.xing, fengwei.yin, guobing.chen, ming.a.chen, frank.du, Shuhua.Fan, wangyang.guo, Wenhuan.Huang, jessica.ji, shan.kang, guangli.li, tiejun.li, yu.ma, dapeng1.mi, jiebin.sun, gengxin.xie, fan.zhao Hi Jens Axboe, On Fri, May 27, 2022 at 07:50:27AM -0600, Jens Axboe wrote: > On 5/27/22 3:24 AM, kernel test robot wrote: > > > > > > Greeting, > > > > FYI, we noticed a -10.2% regression of phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s due to commit: > > > > > > commit: 584b0180f0f4d67d7145950fe68c625f06c88b10 ("io_uring: move read/write file prep state into actual opcode handler") > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > > > in testcase: phoronix-test-suite > > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory > > with following parameters: > > > > test: fio-1.14.1 > > option_a: Sequential Write > > option_b: IO_uring > > option_c: Yes > > option_d: Yes > > option_e: 1MB > > option_f: Default Test Directory > > cpufreq_governor: performance > > ucode: 0x500320a > > > > test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added. > > test-url: http://www.phoronix-test-suite.com/ > > I'm a bit skeptical on this, but I'd like to try and run the test case. > Since it's just a fio test case, why can't I find it somewhere? Seems > very convoluted to have to setup lkp-tests just for this. Besides, I > tried, but it doesn't work on aarch64... we just follow doc on http://www.phoronix-test-suite.com/ to run tests in PTS framework, so you don't need to care about lkp-tests. and for this fio test, the parameters we used just as: test: fio-1.14.1 option_a: Sequential Write option_b: IO_uring option_c: Yes option_d: Yes option_e: 1MB option_f: Default Test Directory and yeah, we most focus on x86_64 and don't support lkp-tests to run on aarch64... if you have some idea that we could run other tests, could you let us know? it will be great pleasure to run more tests to check for us if we can support. Thanks a lot! > > -- > Jens Axboe > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-05-27 13:50 ` [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression Jens Axboe 2022-06-08 8:00 ` Oliver Sang @ 2022-06-14 1:54 ` Yin Fengwei 2022-07-12 8:06 ` Yin Fengwei 2 siblings, 0 replies; 18+ messages in thread From: Yin Fengwei @ 2022-06-14 1:54 UTC (permalink / raw) To: Jens Axboe, kernel test robot Cc: LKML, io-uring, lkp, lkp, guobing.chen, ming.a.chen, frank.du, Shuhua.Fan, wangyang.guo, Wenhuan.Huang, jessica.ji, shan.kang, guangli.li, tiejun.li, yu.ma, dapeng1.mi, jiebin.sun, gengxin.xie, fan.zhao On 5/27/2022 9:50 PM, Jens Axboe wrote: > On 5/27/22 3:24 AM, kernel test robot wrote: >> >> >> Greeting, >> >> FYI, we noticed a -10.2% regression of phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s due to commit: >> >> >> commit: 584b0180f0f4d67d7145950fe68c625f06c88b10 ("io_uring: move read/write file prep state into actual opcode handler") >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master >> >> in testcase: phoronix-test-suite >> on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory >> with following parameters: >> >> test: fio-1.14.1 >> option_a: Sequential Write >> option_b: IO_uring >> option_c: Yes >> option_d: Yes >> option_e: 1MB >> option_f: Default Test Directory >> cpufreq_governor: performance >> ucode: 0x500320a >> >> test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added. >> test-url: http://www.phoronix-test-suite.com/ > > I'm a bit skeptical on this, but I'd like to try and run the test case. > Since it's just a fio test case, why can't I find it somewhere? Seems > very convoluted to have to setup lkp-tests just for this. Besides, I > tried, but it doesn't work on aarch64... > We re-run the test and still could get exactly same test result. We noticed following info from perf profile: 14.40 ± 21% +71.3 85.71 ± 2% perf-profile.calltrace.cycles-pp.io_wqe_worker.ret_from_fork 14.25 ± 21% +71.4 85.64 ± 2% perf-profile.calltrace.cycles-pp.io_worker_handle_work.io_wqe_worker.ret_from_fork 14.23 ± 21% +71.4 85.63 ± 2% perf-profile.calltrace.cycles-pp.io_issue_sqe.io_wq_submit_work.io_worker_handle_work.io_wqe_worker.ret_from_fork 14.23 ± 21% +71.4 85.64 ± 2% perf-profile.calltrace.cycles-pp.io_wq_submit_work.io_worker_handle_work.io_wqe_worker.ret_from_fork 14.22 ± 21% +71.4 85.63 ± 2% perf-profile.calltrace.cycles-pp.io_write.io_issue_sqe.io_wq_submit_work.io_worker_handle_work.io_wqe_worker 14.10 ± 21% +71.5 85.62 ± 2% perf-profile.calltrace.cycles-pp.ext4_buffered_write_iter.io_write.io_issue_sqe.io_wq_submit_work.io_worker_handle_work 0.00 +80.9 80.92 ± 2% perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write 0.00 +83.0 82.99 ± 2% perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write.io_issue_sqe 0.00 +83.6 83.63 ± 2% perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write.io_issue_sqe.io_wq_submit_work The above operations takes more time with the patch applied. It looks like the inode lock contention raised a lot with the patch. Frankly, we can't connect this behavior with the patch. Just list here for your information. Thanks. Regards Yin, Fengwei ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-05-27 13:50 ` [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression Jens Axboe 2022-06-08 8:00 ` Oliver Sang 2022-06-14 1:54 ` [LKP] " Yin Fengwei @ 2022-07-12 8:06 ` Yin Fengwei 2022-07-15 15:58 ` Jens Axboe 2 siblings, 1 reply; 18+ messages in thread From: Yin Fengwei @ 2022-07-12 8:06 UTC (permalink / raw) To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp Hi Jens, On 5/27/2022 9:50 PM, Jens Axboe wrote: > I'm a bit skeptical on this, but I'd like to try and run the test case. > Since it's just a fio test case, why can't I find it somewhere? Seems > very convoluted to have to setup lkp-tests just for this. Besides, I > tried, but it doesn't work on aarch64... Recheck this regression report. The regression could be reproduced if the following config file is used with fio (tag: fio-3.25) : [global] rw=write ioengine=io_uring iodepth=64 size=1g direct=1 buffered=1 startdelay=5 force_async=4 ramp_time=5 runtime=20 time_based clat_percentiles=0 disable_lat=1 disable_clat=1 disable_slat=1 filename=test_fiofile [test] name=test bs=1M stonewall Just FYI, a small change to commit: 584b0180f0f4d67d7145950fe68c625f06c88b10: diff --git a/fs/io_uring.c b/fs/io_uring.c index 969f65de9972..616d857f8fc6 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -3181,8 +3181,13 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct kiocb *kiocb = &req->rw.kiocb; unsigned ioprio; + struct file *file = req->file; int ret; + if (likely(file && (file->f_mode & FMODE_WRITE))) + if (!io_req_ffs_set(req)) + req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT; + kiocb->ki_pos = READ_ONCE(sqe->off); ioprio = READ_ONCE(sqe->ioprio); could make regression gone. No idea how req->flags impact the write performance. Thanks. Regards Yin, Fengwei ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-12 8:06 ` Yin Fengwei @ 2022-07-15 15:58 ` Jens Axboe 2022-07-18 0:58 ` Yin Fengwei 2022-07-18 3:30 ` Yin Fengwei 0 siblings, 2 replies; 18+ messages in thread From: Jens Axboe @ 2022-07-15 15:58 UTC (permalink / raw) To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp On 7/12/22 2:06 AM, Yin Fengwei wrote: > Hi Jens, > > On 5/27/2022 9:50 PM, Jens Axboe wrote: >> I'm a bit skeptical on this, but I'd like to try and run the test case. >> Since it's just a fio test case, why can't I find it somewhere? Seems >> very convoluted to have to setup lkp-tests just for this. Besides, I >> tried, but it doesn't work on aarch64... > Recheck this regression report. The regression could be reproduced if > the following config file is used with fio (tag: fio-3.25) : > > [global] > rw=write > ioengine=io_uring > iodepth=64 > size=1g > direct=1 > buffered=1 > startdelay=5 > force_async=4 > ramp_time=5 > runtime=20 > time_based > clat_percentiles=0 > disable_lat=1 > disable_clat=1 > disable_slat=1 > filename=test_fiofile > [test] > name=test > bs=1M > stonewall > > Just FYI, a small change to commit: 584b0180f0f4d67d7145950fe68c625f06c88b10: > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index 969f65de9972..616d857f8fc6 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -3181,8 +3181,13 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe) > { > struct kiocb *kiocb = &req->rw.kiocb; > unsigned ioprio; > + struct file *file = req->file; > int ret; > > + if (likely(file && (file->f_mode & FMODE_WRITE))) > + if (!io_req_ffs_set(req)) > + req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT; > + > kiocb->ki_pos = READ_ONCE(sqe->off); > > ioprio = READ_ONCE(sqe->ioprio); > > could make regression gone. No idea how req->flags impact the write > performance. Thanks. I can't really explain that either, at least not immediately. I tried running with and without that patch, and don't see any difference here. In terms of making this more obvious, does the below also fix it for you? And what filesystem is this being run on? diff --git a/fs/io_uring.c b/fs/io_uring.c index a01ea49f3017..797fad99780d 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -4269,9 +4269,6 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode) if (unlikely(!file || !(file->f_mode & mode))) return -EBADF; - if (!io_req_ffs_set(req)) - req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT; - kiocb->ki_flags = iocb_flags(file); ret = kiocb_set_rw_flags(kiocb, req->rw.flags); if (unlikely(ret)) @@ -8309,7 +8306,13 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags) else req->file = io_file_get_normal(req, req->cqe.fd); - return !!req->file; + if (unlikely(!req->file)) + return false; + + if (!io_req_ffs_set(req)) + req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT; + + return true; } static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags) -- Jens Axboe ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-15 15:58 ` Jens Axboe @ 2022-07-18 0:58 ` Yin Fengwei 2022-07-18 1:14 ` Jens Axboe 2022-07-18 3:30 ` Yin Fengwei 1 sibling, 1 reply; 18+ messages in thread From: Yin Fengwei @ 2022-07-18 0:58 UTC (permalink / raw) To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp On 7/15/2022 11:58 PM, Jens Axboe wrote: > I can't really explain that either, at least not immediately. I tried > running with and without that patch, and don't see any difference here. > In terms of making this more obvious, does the below also fix it for > you? I will try the fix and let you know the result. > > And what filesystem is this being run on? I am using ext4 and LKP are also using ext4. Thanks. Regards Yin, Fengwei > > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index a01ea49f3017..797fad99780d 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -4269,9 +4269,6 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode) > if (unlikely(!file || !(file->f_mode & mode))) > return -EBADF; > > - if (!io_req_ffs_set(req)) > - req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT; > - > kiocb->ki_flags = iocb_flags(file); > ret = kiocb_set_rw_flags(kiocb, req->rw.flags); > if (unlikely(ret)) > @@ -8309,7 +8306,13 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags) > else > req->file = io_file_get_normal(req, req->cqe.fd); > > - return !!req->file; > + if (unlikely(!req->file)) > + return false; > + > + if (!io_req_ffs_set(req)) > + req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT; > + > + return true; > } > > static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-18 0:58 ` Yin Fengwei @ 2022-07-18 1:14 ` Jens Axboe 0 siblings, 0 replies; 18+ messages in thread From: Jens Axboe @ 2022-07-18 1:14 UTC (permalink / raw) To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp On 7/17/22 6:58 PM, Yin Fengwei wrote: > > > On 7/15/2022 11:58 PM, Jens Axboe wrote: >> I can't really explain that either, at least not immediately. I tried >> running with and without that patch, and don't see any difference here. >> In terms of making this more obvious, does the below also fix it for >> you? > I will try the fix and let you know the result. > >> >> And what filesystem is this being run on? > I am using ext4 and LKP are also using ext4. Thanks. Thanks, I'll try ext4 as well (was on XFS). -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-15 15:58 ` Jens Axboe 2022-07-18 0:58 ` Yin Fengwei @ 2022-07-18 3:30 ` Yin Fengwei 2022-07-18 16:27 ` Jens Axboe 1 sibling, 1 reply; 18+ messages in thread From: Yin Fengwei @ 2022-07-18 3:30 UTC (permalink / raw) To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp Hi Jens, On 7/15/2022 11:58 PM, Jens Axboe wrote: > In terms of making this more obvious, does the below also fix it for > you? The regression is still there after applied the change you posted. Your change can't be applied to v5.18 (the latest commit on master branch). I changed it a little bit to be applied: diff --git a/fs/io_uring.c b/fs/io_uring.c index e0823f58f7959..0bf7f3d18d46e 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -3762,9 +3762,6 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode) if (unlikely(!file || !(file->f_mode & mode))) return -EBADF; - if (!io_req_ffs_set(req)) - req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT; - kiocb->ki_flags = iocb_flags(file); ret = kiocb_set_rw_flags(kiocb, req->rw.flags); if (unlikely(ret)) @@ -7114,8 +7111,13 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags) req->file = io_file_get_fixed(req, req->fd, issue_flags); else req->file = io_file_get_normal(req, req->fd); - if (req->file) + if (req->file) { + if (!io_req_ffs_set(req)) + req->flags |= io_file_get_flags(req->file) << + REQ_F_SUPPORT_NOWAIT_BIT; + return true; + } req_set_fail(req); req->result = -EBADF; Regards Yin, Fengwei ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-18 3:30 ` Yin Fengwei @ 2022-07-18 16:27 ` Jens Axboe 2022-07-19 0:27 ` Yin Fengwei 2022-07-19 2:16 ` Yin Fengwei 0 siblings, 2 replies; 18+ messages in thread From: Jens Axboe @ 2022-07-18 16:27 UTC (permalink / raw) To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp On 7/17/22 9:30 PM, Yin Fengwei wrote: > Hi Jens, > > On 7/15/2022 11:58 PM, Jens Axboe wrote: >> In terms of making this more obvious, does the below also fix it for >> you? > > The regression is still there after applied the change you posted. Still don't see the regression here, using ext4. I get about 1020-1045 IOPS with or without the patch you sent. This is running it in a vm, and the storage device is nvme. What is hosting your ext4 fs? -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-18 16:27 ` Jens Axboe @ 2022-07-19 0:27 ` Yin Fengwei 2022-07-19 2:16 ` Yin Fengwei 1 sibling, 0 replies; 18+ messages in thread From: Yin Fengwei @ 2022-07-19 0:27 UTC (permalink / raw) To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp On 7/19/2022 12:27 AM, Jens Axboe wrote: > On 7/17/22 9:30 PM, Yin Fengwei wrote: >> Hi Jens, >> >> On 7/15/2022 11:58 PM, Jens Axboe wrote: >>> In terms of making this more obvious, does the below also fix it for >>> you? >> >> The regression is still there after applied the change you posted. > > Still don't see the regression here, using ext4. I get about 1020-1045 > IOPS with or without the patch you sent. > > This is running it in a vm, and the storage device is nvme. What is > hosting your ext4 fs? > My local testing system is also a vm with SATA disk. LKP test platform is a native one with SATA disk. I could reproduce the regression on both environment. I will try to use nvme to host my local vm disk and check whether we could see something different. Regards Yin, Fengwei ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-18 16:27 ` Jens Axboe 2022-07-19 0:27 ` Yin Fengwei @ 2022-07-19 2:16 ` Yin Fengwei 2022-07-19 2:29 ` Jens Axboe 1 sibling, 1 reply; 18+ messages in thread From: Yin Fengwei @ 2022-07-19 2:16 UTC (permalink / raw) To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp Hi Jens, On 7/19/2022 12:27 AM, Jens Axboe wrote: > On 7/17/22 9:30 PM, Yin Fengwei wrote: >> Hi Jens, >> >> On 7/15/2022 11:58 PM, Jens Axboe wrote: >>> In terms of making this more obvious, does the below also fix it for >>> you? >> >> The regression is still there after applied the change you posted. > > Still don't see the regression here, using ext4. I get about 1020-1045 > IOPS with or without the patch you sent. > > This is running it in a vm, and the storage device is nvme. What is > hosting your ext4 fs? Just did more test with vm. The regression can't be reproduced with latest code (I tried the tag v5.19-rc7) whatever the underneath storage is SATA or NVME. But the regression and the debugging patch from me could be reproduced on both SATA and NVME if use commit 584b0180f0f4d6 as base commit (584b0180f0f4d6 vs 584b0180f0f4d6 with my debugging patch). Here is the test result I got: NVME as host storage: 5.19.0-rc7: write: IOPS=933, BW=937MiB/s (982MB/s)(18.3GiB/20020msec); 0 zone resets write: IOPS=993, BW=996MiB/s (1045MB/s)(19.5GiB/20020msec); 0 zone resets write: IOPS=1005, BW=1009MiB/s (1058MB/s)(19.7GiB/20020msec); 0 zone resets write: IOPS=985, BW=989MiB/s (1037MB/s)(19.3GiB/20020msec); 0 zone resets write: IOPS=1020, BW=1024MiB/s (1073MB/s)(20.0GiB/20020msec); 0 zone resets 5.19.0-rc7 with my debugging patch: write: IOPS=988, BW=992MiB/s (1040MB/s)(19.7GiB/20384msec); 0 zone resets write: IOPS=995, BW=998MiB/s (1047MB/s)(20.1GiB/20574msec); 0 zone resets write: IOPS=996, BW=1000MiB/s (1048MB/s)(19.5GiB/20020msec); 0 zone resets write: IOPS=995, BW=998MiB/s (1047MB/s)(19.5GiB/20020msec); 0 zone resets write: IOPS=1006, BW=1009MiB/s (1058MB/s)(19.7GiB/20019msec); 0 zone resets 584b0180f0: write: IOPS=1004, BW=1008MiB/s (1057MB/s)(19.7GiB/20020msec); 0 zone resets write: IOPS=968, BW=971MiB/s (1018MB/s)(19.4GiB/20468msec); 0 zone resets write: IOPS=982, BW=986MiB/s (1033MB/s)(19.3GiB/20020msec); 0 zone resets write: IOPS=1000, BW=1004MiB/s (1053MB/s)(20.1GiB/20461msec); 0 zone resets write: IOPS=903, BW=906MiB/s (950MB/s)(18.1GiB/20419msec); 0 zone resets 584b0180f0 with my debugging the patch: write: IOPS=1073, BW=1076MiB/s (1129MB/s)(21.1GiB/20036msec); 0 zone resets write: IOPS=1131, BW=1135MiB/s (1190MB/s)(22.2GiB/20022msec); 0 zone resets write: IOPS=1122, BW=1126MiB/s (1180MB/s)(22.1GiB/20071msec); 0 zone resets write: IOPS=1071, BW=1075MiB/s (1127MB/s)(21.1GiB/20071msec); 0 zone resets write: IOPS=1049, BW=1053MiB/s (1104MB/s)(21.1GiB/20482msec); 0 zone resets SATA disk as host storage: 5.19.0-rc7: write: IOPS=624, BW=627MiB/s (658MB/s)(12.3GiB/20023msec); 0 zone resets write: IOPS=655, BW=658MiB/s (690MB/s)(12.9GiB/20021msec); 0 zone resets write: IOPS=596, BW=600MiB/s (629MB/s)(12.1GiB/20586msec); 0 zone resets write: IOPS=647, BW=650MiB/s (682MB/s)(12.7GiB/20020msec); 0 zone resets write: IOPS=591, BW=594MiB/s (623MB/s)(12.1GiB/20787msec); 0 zone resets 5.19.0-rc7 with my debugging patch: write: IOPS=633, BW=637MiB/s (668MB/s)(12.6GiB/20201msec); 0 zone resets write: IOPS=614, BW=617MiB/s (647MB/s)(13.1GiB/21667msec); 0 zone resets write: IOPS=653, BW=657MiB/s (689MB/s)(12.8GiB/20020msec); 0 zone resets write: IOPS=618, BW=622MiB/s (652MB/s)(12.2GiB/20033msec); 0 zone resets write: IOPS=604, BW=608MiB/s (638MB/s)(12.1GiB/20314msec); 0 zone resets 584b0180f0: write: IOPS=635, BW=638MiB/s (669MB/s)(12.5GiB/20020msec); 0 zone resets write: IOPS=649, BW=652MiB/s (684MB/s)(12.8GiB/20066msec); 0 zone resets write: IOPS=639, BW=642MiB/s (674MB/s)(13.1GiB/20818msec); 0 zone resets 584b0180f0 with my debugging patch: write: IOPS=850, BW=853MiB/s (895MB/s)(17.1GiB/20474msec); 0 zone resets write: IOPS=738, BW=742MiB/s (778MB/s)(15.1GiB/20787msec); 0 zone resets write: IOPS=751, BW=755MiB/s (792MB/s)(15.1GiB/20432msec); 0 zone resets Regards Yin, Fengwei ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-19 2:16 ` Yin Fengwei @ 2022-07-19 2:29 ` Jens Axboe 2022-07-19 8:58 ` Yin Fengwei 0 siblings, 1 reply; 18+ messages in thread From: Jens Axboe @ 2022-07-19 2:29 UTC (permalink / raw) To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp On 7/18/22 8:16 PM, Yin Fengwei wrote: > Hi Jens, > > On 7/19/2022 12:27 AM, Jens Axboe wrote: >> On 7/17/22 9:30 PM, Yin Fengwei wrote: >>> Hi Jens, >>> >>> On 7/15/2022 11:58 PM, Jens Axboe wrote: >>>> In terms of making this more obvious, does the below also fix it for >>>> you? >>> >>> The regression is still there after applied the change you posted. >> >> Still don't see the regression here, using ext4. I get about 1020-1045 >> IOPS with or without the patch you sent. >> >> This is running it in a vm, and the storage device is nvme. What is >> hosting your ext4 fs? > Just did more test with vm. The regression can't be reproduced with latest > code (I tried the tag v5.19-rc7) whatever the underneath storage is SATA > or NVME. > > But the regression and the debugging patch from me could be reproduced > on both SATA and NVME if use commit 584b0180f0f4d6 as base commit > (584b0180f0f4d6 vs 584b0180f0f4d6 with my debugging patch). > > > Here is the test result I got: > NVME as host storage: > 5.19.0-rc7: > write: IOPS=933, BW=937MiB/s (982MB/s)(18.3GiB/20020msec); 0 zone resets > write: IOPS=993, BW=996MiB/s (1045MB/s)(19.5GiB/20020msec); 0 zone resets > write: IOPS=1005, BW=1009MiB/s (1058MB/s)(19.7GiB/20020msec); 0 zone resets > write: IOPS=985, BW=989MiB/s (1037MB/s)(19.3GiB/20020msec); 0 zone resets > write: IOPS=1020, BW=1024MiB/s (1073MB/s)(20.0GiB/20020msec); 0 zone resets > > 5.19.0-rc7 with my debugging patch: > write: IOPS=988, BW=992MiB/s (1040MB/s)(19.7GiB/20384msec); 0 zone resets > write: IOPS=995, BW=998MiB/s (1047MB/s)(20.1GiB/20574msec); 0 zone resets > write: IOPS=996, BW=1000MiB/s (1048MB/s)(19.5GiB/20020msec); 0 zone resets > write: IOPS=995, BW=998MiB/s (1047MB/s)(19.5GiB/20020msec); 0 zone resets > write: IOPS=1006, BW=1009MiB/s (1058MB/s)(19.7GiB/20019msec); 0 zone resets These two basically look identical, which may be why I get the same with and without your patch. I don't think it makes a difference for this. Curious how it came about? > 584b0180f0: > write: IOPS=1004, BW=1008MiB/s (1057MB/s)(19.7GiB/20020msec); 0 zone resets > write: IOPS=968, BW=971MiB/s (1018MB/s)(19.4GiB/20468msec); 0 zone resets > write: IOPS=982, BW=986MiB/s (1033MB/s)(19.3GiB/20020msec); 0 zone resets > write: IOPS=1000, BW=1004MiB/s (1053MB/s)(20.1GiB/20461msec); 0 zone resets > write: IOPS=903, BW=906MiB/s (950MB/s)(18.1GiB/20419msec); 0 zone resets > > 584b0180f0 with my debugging the patch: > write: IOPS=1073, BW=1076MiB/s (1129MB/s)(21.1GiB/20036msec); 0 zone resets > write: IOPS=1131, BW=1135MiB/s (1190MB/s)(22.2GiB/20022msec); 0 zone resets > write: IOPS=1122, BW=1126MiB/s (1180MB/s)(22.1GiB/20071msec); 0 zone resets > write: IOPS=1071, BW=1075MiB/s (1127MB/s)(21.1GiB/20071msec); 0 zone resets > write: IOPS=1049, BW=1053MiB/s (1104MB/s)(21.1GiB/20482msec); 0 zone resets Last one looks like it may be faster indeed. I do wonder if this is something else, though. There's no reason why -rc7 with that same patch applied should be any different than 584b0180f0 with it. these resu > > > SATA disk as host storage: > 5.19.0-rc7: > write: IOPS=624, BW=627MiB/s (658MB/s)(12.3GiB/20023msec); 0 zone resets > write: IOPS=655, BW=658MiB/s (690MB/s)(12.9GiB/20021msec); 0 zone resets > write: IOPS=596, BW=600MiB/s (629MB/s)(12.1GiB/20586msec); 0 zone resets > write: IOPS=647, BW=650MiB/s (682MB/s)(12.7GiB/20020msec); 0 zone resets > write: IOPS=591, BW=594MiB/s (623MB/s)(12.1GiB/20787msec); 0 zone resets > > 5.19.0-rc7 with my debugging patch: > write: IOPS=633, BW=637MiB/s (668MB/s)(12.6GiB/20201msec); 0 zone resets > write: IOPS=614, BW=617MiB/s (647MB/s)(13.1GiB/21667msec); 0 zone resets > write: IOPS=653, BW=657MiB/s (689MB/s)(12.8GiB/20020msec); 0 zone resets > write: IOPS=618, BW=622MiB/s (652MB/s)(12.2GiB/20033msec); 0 zone resets > write: IOPS=604, BW=608MiB/s (638MB/s)(12.1GiB/20314msec); 0 zone resets These again are probably the same, within variance. > 584b0180f0: > write: IOPS=635, BW=638MiB/s (669MB/s)(12.5GiB/20020msec); 0 zone resets > write: IOPS=649, BW=652MiB/s (684MB/s)(12.8GiB/20066msec); 0 zone resets > write: IOPS=639, BW=642MiB/s (674MB/s)(13.1GiB/20818msec); 0 zone resets > > 584b0180f0 with my debugging patch: > write: IOPS=850, BW=853MiB/s (895MB/s)(17.1GiB/20474msec); 0 zone resets > write: IOPS=738, BW=742MiB/s (778MB/s)(15.1GiB/20787msec); 0 zone resets > write: IOPS=751, BW=755MiB/s (792MB/s)(15.1GiB/20432msec); 0 zone resets But this one looks like a clear difference. I'll poke at this tomorrow. -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-19 2:29 ` Jens Axboe @ 2022-07-19 8:58 ` Yin Fengwei 2022-07-20 17:24 ` Jens Axboe 0 siblings, 1 reply; 18+ messages in thread From: Yin Fengwei @ 2022-07-19 8:58 UTC (permalink / raw) To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp Hi Jens, On 7/19/2022 10:29 AM, Jens Axboe wrote: > I'll poke at this tomorrow. Just FYI. Another finding (test is based on commit 584b0180f0): If the code block is put to different function, the fio performance result is different: Patch1: diff --git a/fs/io_uring.c b/fs/io_uring.c index 616d857f8fc6..b0578a3d063a 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -3184,10 +3184,6 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe) struct file *file = req->file; int ret; - if (likely(file && (file->f_mode & FMODE_WRITE))) - if (!io_req_ffs_set(req)) - req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT; - kiocb->ki_pos = READ_ONCE(sqe->off); ioprio = READ_ONCE(sqe->ioprio); @@ -7852,6 +7848,10 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req, return 0; } + if (likely(req->file)) + if (!io_req_ffs_set(req)) + req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT; + io_queue_sqe(req); return 0; Patch2: diff --git a/fs/io_uring.c b/fs/io_uring.c index b0578a3d063a..af705e7ba8d3 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -7639,6 +7639,11 @@ static void io_queue_sqe_fallback(struct io_kiocb *req) static inline void io_queue_sqe(struct io_kiocb *req) __must_hold(&req->ctx->uring_lock) { + + if (likely(req->file)) + if (!io_req_ffs_set(req)) + req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT; + if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL)))) __io_queue_sqe(req); else @@ -7848,10 +7853,6 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req, return 0; } - if (likely(req->file)) - if (!io_req_ffs_set(req)) - req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT; - io_queue_sqe(req); return 0; } Patch3: diff --git a/fs/io_uring.c b/fs/io_uring.c index af705e7ba8d3..5771d6d0ad8a 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -7598,6 +7598,10 @@ static inline void __io_queue_sqe(struct io_kiocb *req) struct io_kiocb *linked_timeout; int ret; + if (likely(req->file)) + if (!io_req_ffs_set(req)) + req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT; + ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER); if (req->flags & REQ_F_COMPLETE_INLINE) { @@ -7640,10 +7644,6 @@ static inline void io_queue_sqe(struct io_kiocb *req) __must_hold(&req->ctx->uring_lock) { - if (likely(req->file)) - if (!io_req_ffs_set(req)) - req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT; - if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL)))) __io_queue_sqe(req); else The test result (confirmed on my own test env and LKP): patch1 and patch2 have no regression. patch3 has regression. Regards Yin, Fengwei ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-19 8:58 ` Yin Fengwei @ 2022-07-20 17:24 ` Jens Axboe 2022-07-20 18:13 ` Jens Axboe 2022-07-21 3:08 ` Yin Fengwei 0 siblings, 2 replies; 18+ messages in thread From: Jens Axboe @ 2022-07-20 17:24 UTC (permalink / raw) To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp On 7/19/22 2:58 AM, Yin Fengwei wrote: > Hi Jens, > > On 7/19/2022 10:29 AM, Jens Axboe wrote: >> I'll poke at this tomorrow. > > Just FYI. Another finding (test is based on commit 584b0180f0): > If the code block is put to different function, the fio performance result is > different: I think this turned out to be a little bit of a goose chase. What's happening here is that later kernels defer the file assignment, which means it isn't set if a request is queued with IOSQE_ASYNC. That in turn, for writes, means that we don't hash it on io-wq insertion, and then it doesn't get serialized with other writes to that file. I'll come up with a patch for this that you can test. -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-20 17:24 ` Jens Axboe @ 2022-07-20 18:13 ` Jens Axboe 2022-07-20 23:25 ` Yin Fengwei 2022-07-21 2:59 ` Yin Fengwei 2022-07-21 3:08 ` Yin Fengwei 1 sibling, 2 replies; 18+ messages in thread From: Jens Axboe @ 2022-07-20 18:13 UTC (permalink / raw) To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp On 7/20/22 11:24 AM, Jens Axboe wrote: > On 7/19/22 2:58 AM, Yin Fengwei wrote: >> Hi Jens, >> >> On 7/19/2022 10:29 AM, Jens Axboe wrote: >>> I'll poke at this tomorrow. >> >> Just FYI. Another finding (test is based on commit 584b0180f0): >> If the code block is put to different function, the fio performance result is >> different: > > I think this turned out to be a little bit of a goose chase. What's > happening here is that later kernels defer the file assignment, which > means it isn't set if a request is queued with IOSQE_ASYNC. That in > turn, for writes, means that we don't hash it on io-wq insertion, and > then it doesn't get serialized with other writes to that file. > > I'll come up with a patch for this that you can test. Can you try this? It's against 5.19-rc7. diff --git a/fs/io_uring.c b/fs/io_uring.c index a01ea49f3017..34758e95990a 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2015,6 +2015,64 @@ static inline void io_arm_ltimeout(struct io_kiocb *req) __io_arm_ltimeout(req); } +static bool io_bdev_nowait(struct block_device *bdev) +{ + return !bdev || blk_queue_nowait(bdev_get_queue(bdev)); +} + +/* + * If we tracked the file through the SCM inflight mechanism, we could support + * any file. For now, just ensure that anything potentially problematic is done + * inline. + */ +static bool __io_file_supports_nowait(struct file *file, umode_t mode) +{ + if (S_ISBLK(mode)) { + if (IS_ENABLED(CONFIG_BLOCK) && + io_bdev_nowait(I_BDEV(file->f_mapping->host))) + return true; + return false; + } + if (S_ISSOCK(mode)) + return true; + if (S_ISREG(mode)) { + if (IS_ENABLED(CONFIG_BLOCK) && + io_bdev_nowait(file->f_inode->i_sb->s_bdev) && + file->f_op != &io_uring_fops) + return true; + return false; + } + + /* any ->read/write should understand O_NONBLOCK */ + if (file->f_flags & O_NONBLOCK) + return true; + return file->f_mode & FMODE_NOWAIT; +} + +static inline bool io_file_supports_nowait(struct io_kiocb *req) +{ + return req->flags & REQ_F_SUPPORT_NOWAIT; +} + +/* + * If we tracked the file through the SCM inflight mechanism, we could support + * any file. For now, just ensure that anything potentially problematic is done + * inline. + */ +static unsigned int io_file_get_flags(struct file *file) +{ + umode_t mode = file_inode(file)->i_mode; + unsigned int res = 0; + + if (S_ISREG(mode)) + res |= FFS_ISREG; + if (__io_file_supports_nowait(file, mode)) + res |= FFS_NOWAIT; + if (io_file_need_scm(file)) + res |= FFS_SCM; + return res; +} + static void io_prep_async_work(struct io_kiocb *req) { const struct io_op_def *def = &io_op_defs[req->opcode]; @@ -2031,6 +2089,9 @@ static void io_prep_async_work(struct io_kiocb *req) if (req->flags & REQ_F_FORCE_ASYNC) req->work.flags |= IO_WQ_WORK_CONCURRENT; + if (req->file && !io_req_ffs_set(req)) + req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT; + if (req->flags & REQ_F_ISREG) { if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL)) io_wq_hash_work(&req->work, file_inode(req->file)); @@ -3556,64 +3617,6 @@ static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags) } } -static bool io_bdev_nowait(struct block_device *bdev) -{ - return !bdev || blk_queue_nowait(bdev_get_queue(bdev)); -} - -/* - * If we tracked the file through the SCM inflight mechanism, we could support - * any file. For now, just ensure that anything potentially problematic is done - * inline. - */ -static bool __io_file_supports_nowait(struct file *file, umode_t mode) -{ - if (S_ISBLK(mode)) { - if (IS_ENABLED(CONFIG_BLOCK) && - io_bdev_nowait(I_BDEV(file->f_mapping->host))) - return true; - return false; - } - if (S_ISSOCK(mode)) - return true; - if (S_ISREG(mode)) { - if (IS_ENABLED(CONFIG_BLOCK) && - io_bdev_nowait(file->f_inode->i_sb->s_bdev) && - file->f_op != &io_uring_fops) - return true; - return false; - } - - /* any ->read/write should understand O_NONBLOCK */ - if (file->f_flags & O_NONBLOCK) - return true; - return file->f_mode & FMODE_NOWAIT; -} - -/* - * If we tracked the file through the SCM inflight mechanism, we could support - * any file. For now, just ensure that anything potentially problematic is done - * inline. - */ -static unsigned int io_file_get_flags(struct file *file) -{ - umode_t mode = file_inode(file)->i_mode; - unsigned int res = 0; - - if (S_ISREG(mode)) - res |= FFS_ISREG; - if (__io_file_supports_nowait(file, mode)) - res |= FFS_NOWAIT; - if (io_file_need_scm(file)) - res |= FFS_SCM; - return res; -} - -static inline bool io_file_supports_nowait(struct io_kiocb *req) -{ - return req->flags & REQ_F_SUPPORT_NOWAIT; -} - static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct kiocb *kiocb = &req->rw.kiocb; -- Jens Axboe ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-20 18:13 ` Jens Axboe @ 2022-07-20 23:25 ` Yin Fengwei 2022-07-21 2:59 ` Yin Fengwei 1 sibling, 0 replies; 18+ messages in thread From: Yin Fengwei @ 2022-07-20 23:25 UTC (permalink / raw) To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp On 7/21/2022 2:13 AM, Jens Axboe wrote: > Can you try this? It's against 5.19-rc7. Sure. I will try it and share the test result. Thanks. Regards Yin, Fengwei ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-20 18:13 ` Jens Axboe 2022-07-20 23:25 ` Yin Fengwei @ 2022-07-21 2:59 ` Yin Fengwei 1 sibling, 0 replies; 18+ messages in thread From: Yin Fengwei @ 2022-07-21 2:59 UTC (permalink / raw) To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp Hi Jens, On 7/21/2022 2:13 AM, Jens Axboe wrote: > Can you try this? It's against 5.19-rc7. > > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index a01ea49f3017..34758e95990a 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -2015,6 +2015,64 @@ static inline void io_arm_ltimeout(struct io_kiocb *req) > __io_arm_ltimeout(req); > } > > +static bool io_bdev_nowait(struct block_device *bdev) > +{ > + return !bdev || blk_queue_nowait(bdev_get_queue(bdev)); > +} > + > +/* > + * If we tracked the file through the SCM inflight mechanism, we could support > + * any file. For now, just ensure that anything potentially problematic is done > + * inline. > + */ > +static bool __io_file_supports_nowait(struct file *file, umode_t mode) > +{ > + if (S_ISBLK(mode)) { > + if (IS_ENABLED(CONFIG_BLOCK) && > + io_bdev_nowait(I_BDEV(file->f_mapping->host))) > + return true; > + return false; > + } > + if (S_ISSOCK(mode)) > + return true; > + if (S_ISREG(mode)) { > + if (IS_ENABLED(CONFIG_BLOCK) && > + io_bdev_nowait(file->f_inode->i_sb->s_bdev) && > + file->f_op != &io_uring_fops) > + return true; > + return false; > + } > + > + /* any ->read/write should understand O_NONBLOCK */ > + if (file->f_flags & O_NONBLOCK) > + return true; > + return file->f_mode & FMODE_NOWAIT; > +} > + > +static inline bool io_file_supports_nowait(struct io_kiocb *req) > +{ > + return req->flags & REQ_F_SUPPORT_NOWAIT; > +} > + > +/* > + * If we tracked the file through the SCM inflight mechanism, we could support > + * any file. For now, just ensure that anything potentially problematic is done > + * inline. > + */ > +static unsigned int io_file_get_flags(struct file *file) > +{ > + umode_t mode = file_inode(file)->i_mode; > + unsigned int res = 0; > + > + if (S_ISREG(mode)) > + res |= FFS_ISREG; > + if (__io_file_supports_nowait(file, mode)) > + res |= FFS_NOWAIT; > + if (io_file_need_scm(file)) > + res |= FFS_SCM; > + return res; > +} > + > static void io_prep_async_work(struct io_kiocb *req) > { > const struct io_op_def *def = &io_op_defs[req->opcode]; > @@ -2031,6 +2089,9 @@ static void io_prep_async_work(struct io_kiocb *req) > if (req->flags & REQ_F_FORCE_ASYNC) > req->work.flags |= IO_WQ_WORK_CONCURRENT; > > + if (req->file && !io_req_ffs_set(req)) > + req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT; > + > if (req->flags & REQ_F_ISREG) { > if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL)) > io_wq_hash_work(&req->work, file_inode(req->file)); > @@ -3556,64 +3617,6 @@ static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags) > } > } > > -static bool io_bdev_nowait(struct block_device *bdev) > -{ > - return !bdev || blk_queue_nowait(bdev_get_queue(bdev)); > -} > - > -/* > - * If we tracked the file through the SCM inflight mechanism, we could support > - * any file. For now, just ensure that anything potentially problematic is done > - * inline. > - */ > -static bool __io_file_supports_nowait(struct file *file, umode_t mode) > -{ > - if (S_ISBLK(mode)) { > - if (IS_ENABLED(CONFIG_BLOCK) && > - io_bdev_nowait(I_BDEV(file->f_mapping->host))) > - return true; > - return false; > - } > - if (S_ISSOCK(mode)) > - return true; > - if (S_ISREG(mode)) { > - if (IS_ENABLED(CONFIG_BLOCK) && > - io_bdev_nowait(file->f_inode->i_sb->s_bdev) && > - file->f_op != &io_uring_fops) > - return true; > - return false; > - } > - > - /* any ->read/write should understand O_NONBLOCK */ > - if (file->f_flags & O_NONBLOCK) > - return true; > - return file->f_mode & FMODE_NOWAIT; > -} > - > -/* > - * If we tracked the file through the SCM inflight mechanism, we could support > - * any file. For now, just ensure that anything potentially problematic is done > - * inline. > - */ > -static unsigned int io_file_get_flags(struct file *file) > -{ > - umode_t mode = file_inode(file)->i_mode; > - unsigned int res = 0; > - > - if (S_ISREG(mode)) > - res |= FFS_ISREG; > - if (__io_file_supports_nowait(file, mode)) > - res |= FFS_NOWAIT; > - if (io_file_need_scm(file)) > - res |= FFS_SCM; > - return res; > -} > - > -static inline bool io_file_supports_nowait(struct io_kiocb *req) > -{ > - return req->flags & REQ_F_SUPPORT_NOWAIT; > -} > - > static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe) > { > struct kiocb *kiocb = &req->rw.kiocb; > > -- Jens Axboe This change could make regression gone. The test result is as following: 28d3a5662d44077aa6eb42bfcfa is your patch 584b0180f0f4d67d v5.19-rc7 28d3a5662d44077aa6eb42bfcfa ---------------- --------------------------- --------------------------- fail:runs %reproduction fail:runs %reproduction fail:runs | | | | | 503:3 9297% 782:3 178% 509:3 dmesg.timestamp:last 3:3 0% 3:3 0% 3:3 pmeter.pmeter.fail :3 100% 3:3 100% 3:3 kmsg.I/O_error,dev_loop#,sector#op#:(READ)flags#phys_seg#prio_class :3 3755% 112:3 4016% 120:3 kmsg.timestamp:I/O_error,dev_loop#,sector#op#:(READ)flags#phys_seg#prio_class 465:3 9221% 742:3 235% 473:3 kmsg.timestamp:last %stddev %change %stddev %change %stddev \ | \ | \ 972.00 -0.3% 968.67 +11.4% 1082 phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.iops 975.00 -0.3% 972.33 +11.5% 1086 phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s Comparing to v5.19-rc7 and 584b0180f0f4d67d, it could bring 11% regression back. Thanks. Regards Yin, Fengwei ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression 2022-07-20 17:24 ` Jens Axboe 2022-07-20 18:13 ` Jens Axboe @ 2022-07-21 3:08 ` Yin Fengwei 1 sibling, 0 replies; 18+ messages in thread From: Yin Fengwei @ 2022-07-21 3:08 UTC (permalink / raw) To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp Hi Jens, On 7/21/2022 1:24 AM, Jens Axboe wrote: > I think this turned out to be a little bit of a goose chase. What's > happening here is that later kernels defer the file assignment, which > means it isn't set if a request is queued with IOSQE_ASYNC. That in > turn, for writes, means that we don't hash it on io-wq insertion, and > then it doesn't get serialized with other writes to that file. Thanks a lot for the detail behavior explanation. Regards Yin, Fengwei ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2022-07-21 3:08 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <20220527092432.GE11731@xsang-OptiPlex-9020> 2022-05-27 13:50 ` [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression Jens Axboe 2022-06-08 8:00 ` Oliver Sang 2022-06-14 1:54 ` [LKP] " Yin Fengwei 2022-07-12 8:06 ` Yin Fengwei 2022-07-15 15:58 ` Jens Axboe 2022-07-18 0:58 ` Yin Fengwei 2022-07-18 1:14 ` Jens Axboe 2022-07-18 3:30 ` Yin Fengwei 2022-07-18 16:27 ` Jens Axboe 2022-07-19 0:27 ` Yin Fengwei 2022-07-19 2:16 ` Yin Fengwei 2022-07-19 2:29 ` Jens Axboe 2022-07-19 8:58 ` Yin Fengwei 2022-07-20 17:24 ` Jens Axboe 2022-07-20 18:13 ` Jens Axboe 2022-07-20 23:25 ` Yin Fengwei 2022-07-21 2:59 ` Yin Fengwei 2022-07-21 3:08 ` Yin Fengwei
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox