From: Ben Noordhuis <[email protected]>
To: Pavel Begunkov <[email protected]>
Cc: [email protected]
Subject: Re: Chaining accept+read
Date: Wed, 28 Sep 2022 12:55:08 +0200
Message-ID: <CAHQurc9e=BU3gXbc=brb1b+vLb7nmeyeVaGwqkgRoqnSyHT2AQ@mail.gmail.com>
In-Reply-To: <[email protected]>

On Wed, Sep 28, 2022 at 12:02 PM Pavel Begunkov <[email protected]> wrote:
>
> On 9/28/22 10:50, Ben Noordhuis wrote:
> > I'm trying to chain accept+read but it's not working.
> >
> > My code looks like this:
> >
> >      *sqe1 = (struct io_uring_sqe){
> >        .opcode     = IORING_OP_ACCEPT,
> >        .flags      = IOSQE_IO_LINK,
> >        .fd         = listenfd,
> >        .file_index = 42, // or 42+1
> >      };
> >      *sqe2 = (struct io_uring_sqe){
> >        .opcode     = IORING_OP_READ,
> >        .flags      = IOSQE_FIXED_FILE,
> >        .addr       = (u64) buf,
> >        .len        = len,
> >        .fd         = 42,
> >      };
> >      submit();
> >
> > Both ops fail immediately; accept with -ECANCELED, read with -EBADF,
> > presumably because fixed fd 42 doesn't exist at the time of submission.
> >
> > Would it be possible to support this pattern in io_uring or are there
> > reasons for why things are the way they are?
>
> It should already be supported. And errors look a bit odd, I'd rather
> expect -EBADF or some other for accept and -ECANCELED for the read.
> Do you have a test program / reproducer? Hopefully in C.

Of course, please see below. Error handling elided for brevity. Hope
I'm not doing anything stupid.

For me it immediately prints this:

0 res=-125
1 res=-9
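
For reference, cqe.res carries a negated errno on failure, so on Linux
-125 is -ECANCELED and -9 is -EBADF. A minimal sketch of a decoder
follows; the helper name cqe_res_name is hypothetical and not part of
the test program below.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: io_uring reports failures as negated errno
 * values in cqe->res, so flip the sign before looking the code up. */
static const char *cqe_res_name(int res) {
  if (res >= 0) return "OK";            /* success or a byte count */
  switch (-res) {
    case ECANCELED: return "ECANCELED"; /* 125 on Linux */
    case EBADF:     return "EBADF";     /* 9 on Linux */
    default:        return strerror(-res);
  }
}
```

Printing cqe_res_name(cqe[i].res) next to the raw value would make the
output above self-describing.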

Some observations:

- it's not included in the test case, but I can tell from the user_data
  field that the -EBADF comes from the read op
- replacing IORING_OP_READ with e.g. IORING_OP_NOP makes it work
  (accepts a connection)
- once the fd has been installed, I can successfully chain
  IOSQE_FIXED_FILE read and write ops
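
The user_data check mentioned in the first point can be sketched roughly
as follows. The kernel copies sqe.user_data into the matching
cqe.user_data unchanged, so a distinct tag per request is enough to tell
completions apart. The tag values and the helper names tag_sqe and
cqe_op_name are hypothetical, chosen only for illustration.

```c
#include <linux/io_uring.h>

/* Hypothetical tags; any distinct 64-bit values would do. */
enum { TAG_ACCEPT = 1, TAG_READ = 2 };

/* Stamp a request so that its completion can be identified later. */
static void tag_sqe(struct io_uring_sqe *sqe, __u64 tag) {
  sqe->user_data = tag;
}

/* Map a completion back to the request that produced it. */
static const char *cqe_op_name(const struct io_uring_cqe *cqe) {
  switch (cqe->user_data) {
    case TAG_ACCEPT: return "accept";
    case TAG_READ:   return "read";
    default:         return "unknown";
  }
}
```

With this in place, tag_sqe(&sqe[0], TAG_ACCEPT) and
tag_sqe(&sqe[1], TAG_READ) before submission, plus
cqe_op_name(&cqe[i]) in the print loop, would show which op each
result belongs to.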

I'm primarily testing against a 5.15 kernel. Is this something that's
been fixed since? I went through the commit history but I didn't find
anything relevant.

---

#include <linux/io_uring.h>
#include <netinet/in.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
int main(void) {
  /* Listen on 0.0.0.0:9000. */
  struct sockaddr_in sin = {
    .sin_family = AF_INET,
    .sin_addr   = (struct in_addr){ htonl(INADDR_ANY) },
    .sin_port   = htons(9000),
  };
  int listenfd = socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, 0);
  (void) bind(listenfd, (struct sockaddr *) &sin, sizeof(sin));
  (void) listen(listenfd, 128);

  /* Create the ring and register a sparse table of 64 fixed-file slots. */
  struct io_uring_params p = {};
  int ringfd = syscall(__NR_io_uring_setup, 32, &p);
  int files[64]; memset(files, -1, sizeof(files));
  syscall(__NR_io_uring_register,
          ringfd, IORING_REGISTER_FILES, files, 64);

  /* Map the SQ ring, the CQ ring and the SQE array. */
  __u8 *sq = mmap(0, p.sq_off.array + p.sq_entries * sizeof(__u32),
      PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE,
      ringfd, IORING_OFF_SQ_RING);
  __u8 *cq = mmap(
      0, p.cq_off.cqes + p.cq_entries * sizeof(struct io_uring_cqe),
      PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE,
      ringfd, IORING_OFF_CQ_RING);
  struct io_uring_sqe *sqe = mmap(0, p.sq_entries * sizeof(*sqe),
      PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE,
      ringfd, IORING_OFF_SQES);
  __u32 *sqtail  = (__u32 *) (sq + p.sq_off.tail);
  __u32 *sqarray = (__u32 *) (sq + p.sq_off.array);
  struct io_uring_cqe *cqe =
      (struct io_uring_cqe *) (cq + p.cq_off.cqes);

  /* Op 0: accept a connection into the fixed-file table, linked to op 1. */
  sqe[0] = (struct io_uring_sqe){
    .opcode     = IORING_OP_ACCEPT,
    .flags      = IOSQE_ASYNC|IOSQE_IO_LINK,
    .fd         = listenfd,
    .file_index = 42, /* or 42+1, as in the snippet quoted above */
  };

  /* Op 1: read from fixed file 42 once the accept has completed. */
  char buf[256];
  sqe[1] = (struct io_uring_sqe){
    .opcode     = IORING_OP_READ,
    .flags      = IOSQE_FIXED_FILE,
    .fd         = 42,
    .len        = sizeof(buf),
    .addr       = (__u64) buf,
  };

  /* Publish both SQEs, submit them and wait for at least one completion. */
  sqarray[0] = 0; sqarray[1] = 1;
  atomic_store((atomic_uint *) sqtail, 2);
  int n = syscall(__NR_io_uring_enter,
                  ringfd, 2, 1, IORING_ENTER_GETEVENTS, 0, 0);
  for (int i = 0; i < n; i++) printf("%d res=%d\n", i, cqe[i].res);
  return 0;
}

Thread overview: 5+ messages
2022-09-28  9:50 Chaining accept+read Ben Noordhuis
2022-09-28 10:00 ` Pavel Begunkov
2022-09-28 10:55   ` Ben Noordhuis [this message]
2022-09-28 11:59     ` Pavel Begunkov
2022-09-28 13:49       ` Jens Axboe
