* Zero-copy irq-driven data
From: Ricardo Ribalda @ 2020-12-03 15:26 UTC
To: io-uring

Hello

I have just started using io_uring so please bear with me.

I have a device that produces data at random time and I want to read
it with the lowest latency possible and hopefully zero copy.

In userspace:

I have a sqe with a bunch of io_uring_prep_read_fixed and when they
are ready I process them and push them again to the sqe, so it always
has operations.

In kernelspace:

I have implemented the read() file operation in my driver. The data
handling follows this loop:

loop():
 1) read() gets called by io_uring
 2) save the userpointer and the length into a structure
 3) go to sleep
 4) get an IRQ from the device, with new data
 5) dma/copy the data to the user
 6) wake up read() and return

I guess at this point you see my problem.... What happens if I get an
IRQ between 6 and 1?
Even if there are plenty of read_operations waiting in the sqe, that
data will be lost. :(

So I guess what I am asking is:

A) Am I doing something stupid?

B) Is there a way for a driver to call a callback when it receives
data and push it to a read operation on the cqe?

C) Can I ask the io_uring to call read() more than once if there are
more read_operations in the sqe?

D) Can the driver inspect what is in the sqe, to make an educated
decision of delaying the irq handling for some cycles if there are
more reads pending?

Thanks!

--
Ricardo Ribalda

* Re: Zero-copy irq-driven data
From: Pavel Begunkov @ 2020-12-04 16:06 UTC
To: Ricardo Ribalda; +Cc: io-uring

On 03/12/2020 15:26, Ricardo Ribalda wrote:
> Hello
>
> I have just started using io_uring so please bear with me.
>
> I have a device that produces data at random time and I want to read
> it with the lowest latency possible and hopefully zero copy.
>
> In userspace:
>
> I have a sqe with a bunch of io_uring_prep_read_fixed and when they
> are ready I process them and push them again to the sqe, so it always
> has operations.

SQ - submission queue, SQE - SQ entry.
To clear up the misunderstanding: I guess you meant to say that you have
an SQ filled with fixed read requests (i.e. SQEs prep'ed with
io_uring_prep_read_fixed()), and so on.

> In kernelspace:
>
> I have implemented the read() file operation in my driver. The data

I'd advise you to implement read_iter() instead, otherwise io_uring
won't be able to get all the performance out of it, especially for
fixed reqs.

> handling follows this loop:
>
> loop():
>  1) read() gets called by io_uring
>  2) save the userpointer and the length into a structure
>  3) go to sleep
>  4) get an IRQ from the device, with new data
>  5) dma/copy the data to the user
>  6) wake up read() and return
>
> I guess at this point you see my problem.... What happens if I get an
> IRQ between 6 and 1?
> Even if there are plenty of read_operations waiting in the sqe, that
> data will be lost. :(

Frankly, that's not related to io_uring; it's more of a device driver
writing question. This isn't the right list for these questions, though
I don't know which one would suit your case...

> So I guess what I am asking is:
>
> A) Am I doing something stupid?

In essence, since you're writing your own driver from scratch
(not on top of some framework), all that stuff is up to you to handle.
E.g. you may create a list and add a short entry with an address
to dma on each IRQ. And then dma and serve them only when you've got
a request. Or any other design. But for sure there will be plenty
of pitfalls on your way.

Also, I'd recommend making it work with good old read(2) first.

> B) Is there a way for a driver to call a callback when it receives
> data and push it to a read operation on the cqe?

In short: No

After you fill an SQE (which is also just a chunk of memory), io_uring
gets it and creates a request, which in your case will call ->read*().
So you'd get a driver-visible read request (not necessarily issued by
io_uring).

> C) Can I ask the io_uring to call read() more than once if there are
> more read_operations in the sqe?

"read_operations in the sqe": what does that mean?

> D) Can the driver inspect what is in the sqe, to make an educated

No, and it shouldn't be needed.

> decision of delaying the irq handling for some cycles if there are
> more reads pending?

--
Pavel Begunkov
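
Pavel's suggestion (add a short entry on each IRQ, then DMA and serve the entries only once a request arrives) can be modelled in plain userspace C. This is only a sketch under assumptions, not kernel code: the names `pending_ring`, `irq_note_data` and `serve_read`, the ring depth, and the use of memcpy() as a stand-in for the DMA transfer are all hypothetical, and a real driver would also need locking between the interrupt handler and the read path, plus a guarantee that the device keeps the data valid until it is served.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define RING_SLOTS 8 /* hypothetical depth; must be a power of two */

/* One short entry per interrupt: where the data sits and how long it is. */
struct pending {
    size_t dev_off; /* offset of the data in (simulated) device memory */
    size_t len;     /* number of bytes the device produced */
};

struct pending_ring {
    struct pending slot[RING_SLOTS];
    unsigned int head;     /* next slot the IRQ side fills */
    unsigned int tail;     /* oldest entry not yet served */
    unsigned long dropped; /* entries lost because the ring was full */
};

/* IRQ side: only record that data exists; no copy happens here. */
static bool irq_note_data(struct pending_ring *r, size_t dev_off, size_t len)
{
    if (r->head - r->tail == RING_SLOTS) {
        r->dropped++; /* ring full: this is the only place data is lost */
        return false;
    }
    r->slot[r->head % RING_SLOTS] = (struct pending){ dev_off, len };
    r->head++;
    return true;
}

/*
 * Read side: serve the oldest recorded entry into the caller's buffer.
 * memcpy() stands in for the DMA transfer. Returns the number of bytes
 * copied (truncated to cap), or 0 when nothing is pending.
 */
static size_t serve_read(struct pending_ring *r, const unsigned char *dev_mem,
                         void *dst, size_t cap)
{
    assert(dev_mem != NULL && dst != NULL);
    if (r->head == r->tail)
        return 0;
    struct pending *p = &r->slot[r->tail % RING_SLOTS];
    size_t n = p->len < cap ? p->len : cap;
    memcpy(dst, dev_mem + p->dev_off, n);
    r->tail++;
    return n;
}
```

In this model an interrupt that arrives while no read is waiting (the window "between 6 and 1") only adds an entry; the data is lost only if the ring itself overflows, which the `dropped` counter makes visible.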

* Re: Zero-copy irq-driven data
From: Ricardo Ribalda @ 2020-12-07 15:07 UTC
To: Pavel Begunkov; +Cc: io-uring

Hi Pavel

Thanks for your response

On Fri, Dec 4, 2020 at 5:09 PM Pavel Begunkov wrote:
>
> On 03/12/2020 15:26, Ricardo Ribalda wrote:
> > I have a sqe with a bunch of io_uring_prep_read_fixed and when they
> > are ready I process them and push them again to the sqe, so it always
> > has operations.
>
> SQ - submission queue, SQE - SQ entry.
> To clear up the misunderstanding: I guess you meant to say that you have
> an SQ filled with fixed read requests (i.e. SQEs prep'ed with
> io_uring_prep_read_fixed()), and so on.

Sorry, I am a mess with acronyms.

> > C) Can I ask the io_uring to call read() more than once if there are
> > more read_operations in the sqe?
>
> "read_operations in the sqe": what does that mean?

Let's say I have 3 read_operations in the SQ. A standard trace from the
driver will look like

read()
return
read()
return
read()
return

If I could get

read()
read()
read()
return
return
return

Then I would not lose any data during "read() reloading"

--
Ricardo Ribalda

* Re: Zero-copy irq-driven data
From: Jens Axboe @ 2020-12-07 23:55 UTC
To: Ricardo Ribalda, Pavel Begunkov; +Cc: io-uring

On 12/7/20 8:07 AM, Ricardo Ribalda wrote:
> On Fri, Dec 4, 2020 at 5:09 PM Pavel Begunkov wrote:
>>
>> On 03/12/2020 15:26, Ricardo Ribalda wrote:
>>> C) Can I ask the io_uring to call read() more than once if there are
>>> more read_operations in the sqe?
>>
>> "read_operations in the sqe": what does that mean?
>
> Let's say I have 3 read_operations in the SQ. A standard trace from the
> driver will look like
>
> read()
> return
> read()
> return
> read()
> return
>
> If I could get
>
> read()
> read()
> read()
> return
> return
> return

This is outside the hands of the driver, as Pavel said. If the
application is smart and knows it has 3 reads, then with io_uring it'll
submit all 3 at once. What happens after this is down to what kind of IO
it is, plugging, IO scheduling (if any), etc. The driver has no business
interfering with that; the responsibility of the driver is to do the IO
it is told to do.

--
Jens Axboe
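
The trace Ricardo asks for (read() read() read() return return return) amounts to the driver holding several read requests outstanding at once and completing them in arrival order as interrupts come in. The following is a userspace model of that bookkeeping only, written under assumptions: `read_req`, `req_queue`, `park_read`, `irq_complete_oldest` and the `MAX_PARKED` limit are hypothetical names, memcpy() stands in for the DMA transfer, and locking is omitted. Whether overlapping requests actually reach a given driver depends on how the application submits them and on how the driver implements its read path, as discussed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARKED 4 /* hypothetical limit; must be a power of two */

/* One outstanding read: the caller's buffer plus completion state. */
struct read_req {
    unsigned char *buf; /* destination supplied by the caller */
    size_t cap;         /* size of that destination */
    size_t done;        /* bytes delivered so far */
    bool completed;     /* set once data has been delivered */
};

struct req_queue {
    struct read_req *slot[MAX_PARKED];
    unsigned int head; /* next slot to park a request in */
    unsigned int tail; /* oldest request still waiting for data */
};

/* Read side: park the request instead of blocking until it is served. */
static bool park_read(struct req_queue *q, struct read_req *req)
{
    assert(req != NULL && req->buf != NULL);
    if (q->head - q->tail == MAX_PARKED)
        return false; /* too many outstanding reads */
    req->done = 0;
    req->completed = false;
    q->slot[q->head % MAX_PARKED] = req;
    q->head++;
    return true;
}

/*
 * IRQ side: deliver new data to the oldest parked request.
 * Returns false when no request is waiting, in which case the data
 * would have to be queued elsewhere or dropped.
 */
static bool irq_complete_oldest(struct req_queue *q, const void *data,
                                size_t len)
{
    if (q->head == q->tail)
        return false;
    struct read_req *req = q->slot[q->tail % MAX_PARKED];
    size_t n = len < req->cap ? len : req->cap; /* truncate to fit */
    memcpy(req->buf, data, n);
    req->done = n;
    req->completed = true;
    q->tail++;
    return true;
}
```

With three requests parked, two back-to-back interrupts complete the first two requests without waiting for a new read() call in between, which is the "reloading" gap Ricardo describes.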