From: Bijan Mottahedeh <[email protected]>
To: Jens Axboe <[email protected]>
Cc: [email protected]
Subject: io_uring performance with block sizes > 128k
Date: Mon, 2 Mar 2020 15:55:46 -0800
Message-ID: <[email protected]>
I'm seeing a sizeable drop in perf with polled fio tests for block sizes
> 128k:
filename=/dev/nvme0n1
rw=randread
direct=1
time_based=1
randrepeat=1
gtod_reduce=1
fio --readonly --ioengine=io_uring --iodepth 1024 --fixedbufs --hipri
--numjobs=16
fio --readonly --ioengine=pvsync2 --iodepth 1024 --hipri --numjobs=16
Compared with the pvsync2 engine, the only major difference I could see
was the direct I/O path: io_uring goes through __blkdev_direct_IO()
while pvsync2 goes through __blkdev_direct_IO_simple(), because of the
is_sync_kiocb() check in blkdev_direct_IO():
static ssize_t
blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
{
	...
	if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
		return __blkdev_direct_IO_simple(iocb, iter, nr_pages);

	return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
}
Just as an experiment, I hacked the io_uring code to force it through the
_simple() path by clearing ki_complete so that is_sync_kiocb() returns
true (diff below, after the numbers). I get better numbers, though the
variance is fairly high, but the baseline drop at bs > 128k seems
consistent:
# baseline
READ: bw=3167MiB/s (3321MB/s), 186MiB/s-208MiB/s (196MB/s-219MB/s) #128k
READ: bw=898MiB/s (941MB/s), 51.2MiB/s-66.1MiB/s (53.7MB/s-69.3MB/s) #144k
READ: bw=1576MiB/s (1652MB/s), 81.8MiB/s-109MiB/s (85.8MB/s-114MB/s) #256k
# hack
READ: bw=2705MiB/s (2836MB/s), 157MiB/s-174MiB/s (165MB/s-183MB/s) #128k
READ: bw=2901MiB/s (3042MB/s), 174MiB/s-194MiB/s (183MB/s-204MB/s) #144k
READ: bw=4194MiB/s (4398MB/s), 252MiB/s-271MiB/s (265MB/s-284MB/s) #256k
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1972,12 +1972,12 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			return -EOPNOTSUPP;
 
 		kiocb->ki_flags |= IOCB_HIPRI;
-		kiocb->ki_complete = io_complete_rw_iopoll;
+		kiocb->ki_complete = NULL;
 		req->result = 0;
 	} else {
 		if (kiocb->ki_flags & IOCB_HIPRI)
 			return -EINVAL;
-		kiocb->ki_complete = io_complete_rw;
+		kiocb->ki_complete = NULL;
 	}
 
 	req->rw.addr = READ_ONCE(sqe->addr);
@@ -2005,7 +2005,12 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
 		ret = -EINTR;
 		/* fall through */
 	default:
-		kiocb->ki_complete(kiocb, ret, 0);
+		if (kiocb->ki_complete)
+			kiocb->ki_complete(kiocb, ret, 0);
+		else if (kiocb->ki_flags & IOCB_HIPRI)
+			io_complete_rw_iopoll(kiocb, ret, 0);
+		else
+			io_complete_rw(kiocb, ret, 0);
 	}
 }
With the baseline version, perf top shows a significant amount of time
for lock contention. I *think* it is nvmeq->sq_lock.
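For context, every command submitted to the device funnels through
nvme_submit_cmd() in drivers/nvme/host/pci.c, which serializes on that
per-queue lock; roughly the following (paraphrasing from memory, the
exact code varies by kernel version):

static void nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd,
			    bool write_sq)
{
	spin_lock(&nvmeq->sq_lock);
	/* copy the command into the SQ and advance the tail */
	memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes),
	       cmd, sizeof(*cmd));
	if (++nvmeq->sq_tail == nvmeq->q_depth)
		nvmeq->sq_tail = 0;
	nvme_write_sq_db(nvmeq, write_sq);	/* ring the SQ doorbell */
	spin_unlock(&nvmeq->sq_lock);
}

With 16 jobs hammering the same device, time spent spinning on that lock
would show up exactly the way perf top reports it.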
Does that make sense? I do realize the hack defeats the purpose of
io_uring, but I thought it might provide some clues as to what is going
on. Let me know if there is something else I can try.
Thanks.
--bijan