From: Jens Axboe <[email protected]>
To: Bijan Mottahedeh <[email protected]>
Cc: [email protected]
Subject: Re: io_uring performance with block sizes > 128k
Date: Mon, 2 Mar 2020 22:01:54 -0700	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 3/2/20 4:57 PM, Jens Axboe wrote:
> On 3/2/20 4:55 PM, Bijan Mottahedeh wrote:
>> I'm seeing a sizeable drop in perf with polled fio tests for block
>> sizes > 128k:
>>
>> filename=/dev/nvme0n1
>> rw=randread
>> direct=1
>> time_based=1
>> randrepeat=1
>> gtod_reduce=1
>>
>> fio --readonly --ioengine=io_uring --iodepth 1024 --fixedbufs --hipri --numjobs=16
>> fio --readonly --ioengine=pvsync2 --iodepth 1024 --hipri --numjobs=16
>>
>>
>> Compared with the pvsync2 engine, the only major difference I could see
>> was the dio path: __blkdev_direct_IO() for io_uring vs.
>> __blkdev_direct_IO_simple() for pvsync2, because of the is_sync_kiocb()
>> check.
>>
>>
>> static ssize_t
>> blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>> {
>>          ...
>>          if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
>>                  return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>>
>>          return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
>> }
>>
>> Just as an experiment, I hacked the io_uring code to force it through
>> the _simple() path (a sketch of the hack follows the numbers below). I
>> get better numbers, though the variance is fairly high; the drop at
>> bs > 128k seems consistent:
>>
>>
>> # baseline
>> READ: bw=3167MiB/s (3321MB/s), 186MiB/s-208MiB/s (196MB/s-219MB/s)   #128k
>> READ: bw=898MiB/s (941MB/s), 51.2MiB/s-66.1MiB/s (53.7MB/s-69.3MB/s) #144k
>> READ: bw=1576MiB/s (1652MB/s), 81.8MiB/s-109MiB/s (85.8MB/s-114MB/s) #256k
>>
>> # hack
>> READ: bw=2705MiB/s (2836MB/s), 157MiB/s-174MiB/s (165MB/s-183MB/s) #128k
>> READ: bw=2901MiB/s (3042MB/s), 174MiB/s-194MiB/s (183MB/s-204MB/s) #144k
>> READ: bw=4194MiB/s (4398MB/s), 252MiB/s-271MiB/s (265MB/s-284MB/s) #256k
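>>
>> A minimal sketch of such a hack (hypothetical reconstruction, not
>> necessarily the exact change I ran) is to drop the is_sync_kiocb()
>> check, so anything that fits in a single bio takes the _simple() path:
>>
>> static ssize_t
>> blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>> {
>>          int nr_pages;
>>
>>          nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES + 1);
>>          if (!nr_pages)
>>                  return 0;
>>          /* was: is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES */
>>          if (nr_pages <= BIO_MAX_PAGES)
>>                  return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>>
>>          return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
>> }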
> 
> A quick guess would be that the IO is being split above 128K, and hence
> the polling only catches one of the parts?
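
If the IO is indeed being split, the threshold should line up with the
queue limits of the device, which is easy to check from sysfs (the 128
below is just a guess at a value that would explain the behavior, not a
known result):

$ cat /sys/block/nvme0n1/queue/max_sectors_kb
128
$ cat /sys/block/nvme0n1/queue/max_hw_sectors_kb
128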

Can you try and see if this makes a difference?


diff --git a/fs/io_uring.c b/fs/io_uring.c
index 571b510ef0e7..cf7599a2c503 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1725,8 +1725,10 @@ static int io_do_iopoll(struct io_ring_ctx *ctx, unsigned int *nr_events,
 		if (ret < 0)
 			break;
 
+#if 0
 		if (ret && spin)
 			spin = false;
+#endif
 		ret = 0;
 	}
 
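For context, the check being compiled out sits in io_do_iopoll()'s
reaping loop, roughly this shape (paraphrased and simplified from the
fs/io_uring.c of that era):

	list_for_each_entry_safe(req, tmp, &ctx->poll_list, list) {
		...
		ret = kiocb->ki_filp->f_op->iopoll(kiocb, spin);
		if (ret < 0)
			break;

		/* normally: once one poll finds a completion, stop
		 * spinning for the rest of the list */
		if (ret && spin)
			spin = false;
		ret = 0;
	}

With the downgrade compiled out we keep actively spinning on the
remaining entries, so if the IO really is being split, the other parts
still get polled for rather than just peeked at.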

-- 
Jens Axboe

