From: Jens Axboe <[email protected]>
To: Xiaoguang Wang <[email protected]>,
[email protected]
Cc: [email protected]
Subject: Re: [PATCH] __io_uring_get_cqe: eliminate unnecessary io_uring_enter() syscalls
Date: Mon, 2 Mar 2020 07:05:45 -0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 3/1/20 9:18 PM, Xiaoguang Wang wrote:
> When an application submits one sqe at a time and then waits for its
> completion event, __io_uring_get_cqe() issues many unnecessary
> syscalls. See the test program below:
>
> #define _GNU_SOURCE	/* for O_DIRECT */
> #include <fcntl.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/types.h>
> #include <sys/uio.h>
> #include <unistd.h>
> #include <liburing.h>
>
> int main(int argc, char *argv[])
> {
> 	struct io_uring ring;
> 	int fd, ret;
> 	struct io_uring_sqe *sqe;
> 	struct io_uring_cqe *cqe;
> 	struct iovec iov;
> 	off_t offset, filesize = 0;
> 	void *buf;
>
> 	if (argc < 2) {
> 		printf("%s: file\n", argv[0]);
> 		return 1;
> 	}
>
> 	ret = io_uring_queue_init(4, &ring, 0);
> 	if (ret < 0) {
> 		fprintf(stderr, "queue_init: %s\n", strerror(-ret));
> 		return 1;
> 	}
>
> 	fd = open(argv[1], O_RDONLY | O_DIRECT);
> 	if (fd < 0) {
> 		perror("open");
> 		return 1;
> 	}
>
> 	/* O_DIRECT needs an aligned buffer */
> 	if (posix_memalign(&buf, 4096, 4096))
> 		return 1;
> 	iov.iov_base = buf;
> 	iov.iov_len = 4096;
>
> 	offset = 0;
> 	do {
> 		/* submit a single read, then wait for its completion */
> 		sqe = io_uring_get_sqe(&ring);
> 		if (!sqe) {
> 			printf("here\n");
> 			break;
> 		}
> 		io_uring_prep_readv(sqe, fd, &iov, 1, offset);
>
> 		ret = io_uring_submit(&ring);
> 		if (ret < 0) {
> 			fprintf(stderr, "io_uring_submit: %s\n", strerror(-ret));
> 			return 1;
> 		}
>
> 		ret = io_uring_wait_cqe(&ring, &cqe);
> 		if (ret < 0) {
> 			fprintf(stderr, "io_uring_wait_cqe: %s\n", strerror(-ret));
> 			return 1;
> 		}
>
> 		/* res == 0 is end of file, res < 0 is an error */
> 		if (cqe->res <= 0) {
> 			if (cqe->res < 0) {
> 				fprintf(stderr, "got error: %d\n", cqe->res);
> 				ret = 1;
> 			}
> 			io_uring_cqe_seen(&ring, cqe);
> 			break;
> 		}
> 		offset += cqe->res;
> 		filesize += cqe->res;
> 		io_uring_cqe_seen(&ring, cqe);
> 	} while (1);
>
> 	printf("filesize: %ld\n", filesize);
> 	close(fd);
> 	io_uring_queue_exit(&ring);
> 	return 0;
> }
>
> dd if=/dev/zero of=testfile bs=4096 count=16
> ./test testfile
> Then use bpftrace to count io_uring_enter syscalls. With the original code:
> [lege@localhost ~]$ sudo bpftrace -e "tracepoint:syscalls:sys_enter_io_uring_enter {@c[tid] = count();}"
> Attaching 1 probe...
> @c[11184]: 49
> The test above issues 49 syscalls, which is counterintuitive. Looking
> at the code, the cause is that __io_uring_get_cqe() issues one extra
> syscall: once it has issued the first syscall, a cqe should already be
> ready, so there is no need to wait again.
>
> To fix this, set wait_nr to zero after the first syscall. With this
> patch, bpftrace shows that the number of io_uring_enter syscalls is 33.
Thanks, that's a nice fix, we definitely don't want to be doing
50% more system calls than we have to...
--
Jens Axboe