* [PATCH 1/4] iov: add import_ubuf()
2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
@ 2022-11-07 17:56 ` Keith Busch
2022-11-08 6:55 ` Christoph Hellwig
2022-11-07 17:56 ` [PATCH 2/4] io_uring: switch network send/recv to ITER_UBUF Keith Busch
` (3 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Keith Busch @ 2022-11-07 17:56 UTC (permalink / raw)
To: viro, axboe, io-uring; +Cc: asml.silence, linux-fsdevel
From: Jens Axboe <[email protected]>
Like import_single_range(), but for ITER_UBUF.
Signed-off-by: Jens Axboe <[email protected]>
---
include/linux/uio.h | 1 +
lib/iov_iter.c | 11 +++++++++++
2 files changed, 12 insertions(+)
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 2e3134b14ffd..27575495c006 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -337,6 +337,7 @@ ssize_t __import_iovec(int type, const struct iovec __user *uvec,
struct iov_iter *i, bool compat);
int import_single_range(int type, void __user *buf, size_t len,
struct iovec *iov, struct iov_iter *i);
+int import_ubuf(int type, void __user *buf, size_t len, struct iov_iter *i);
static inline void iov_iter_ubuf(struct iov_iter *i, unsigned int direction,
void __user *buf, size_t count)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index c3ca28ca68a6..07adf18e5e40 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1855,6 +1855,17 @@ int import_single_range(int rw, void __user *buf, size_t len,
}
EXPORT_SYMBOL(import_single_range);
+int import_ubuf(int rw, void __user *buf, size_t len, struct iov_iter *i)
+{
+ if (len > MAX_RW_COUNT)
+ len = MAX_RW_COUNT;
+ if (unlikely(!access_ok(buf, len)))
+ return -EFAULT;
+
+ iov_iter_ubuf(i, rw, buf, len);
+ return 0;
+}
+
/**
* iov_iter_restore() - Restore a &struct iov_iter to the same state as when
* iov_iter_save_state() was called.
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/4] iov: add import_ubuf()
2022-11-07 17:56 ` [PATCH 1/4] iov: add import_ubuf() Keith Busch
@ 2022-11-08 6:55 ` Christoph Hellwig
2022-11-08 16:05 ` Keith Busch
0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2022-11-08 6:55 UTC (permalink / raw)
To: Keith Busch; +Cc: viro, axboe, io-uring, asml.silence, linux-fsdevel
On Mon, Nov 07, 2022 at 09:56:07AM -0800, Keith Busch wrote:
> From: Jens Axboe <[email protected]>
>
> Like import_single_range(), but for ITER_UBUF.
So what is the argument for not simplify switching
import_single_range to always do a ITER_UBUF? Maybe there is a reason
against that, but it should be clearly stated here.
>
> Signed-off-by: Jens Axboe <[email protected]>
Now that this went through your hands it also needs your signoff.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/4] iov: add import_ubuf()
2022-11-08 6:55 ` Christoph Hellwig
@ 2022-11-08 16:05 ` Keith Busch
0 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-08 16:05 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, viro, axboe, io-uring, asml.silence, linux-fsdevel
On Mon, Nov 07, 2022 at 10:55:49PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 07, 2022 at 09:56:07AM -0800, Keith Busch wrote:
> > From: Jens Axboe <[email protected]>
> >
> > Like import_single_range(), but for ITER_UBUF.
>
> So what is the argument for not simplify switching
> import_single_range to always do a ITER_UBUF? Maybe there is a reason
> against that, but it should be clearly stated here.
That may be a good idea if everyone uses the more efficient iter, but I
thought it'd be safer to keep them separate. There are just a few
import_single_range() users that expect the result be ITER_IOVEC. It
will take some extra time on my side to make sure that kind of change
won't break anything.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/4] io_uring: switch network send/recv to ITER_UBUF
2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
2022-11-07 17:56 ` [PATCH 1/4] iov: add import_ubuf() Keith Busch
@ 2022-11-07 17:56 ` Keith Busch
2022-11-07 17:56 ` [PATCH 3/4] io_uring: use ubuf for single range imports for read/write Keith Busch
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-07 17:56 UTC (permalink / raw)
To: viro, axboe, io-uring; +Cc: asml.silence, linux-fsdevel, Keith Busch
From: Jens Axboe <[email protected]>
This is more efficient than ITER_IOVEC.
Signed-off-by: Jens Axboe <[email protected]>
[merged to 6.1]
Signed-off-by: Keith Busch <[email protected]>
---
io_uring/net.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/io_uring/net.c b/io_uring/net.c
index 9a07e79cc0e6..12c68b5ec62d 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -170,7 +170,7 @@ static int io_setup_async_msg(struct io_kiocb *req,
if (async_msg->msg.msg_name)
async_msg->msg.msg_name = &async_msg->addr;
/* if were using fast_iov, set it to the new one */
- if (!kmsg->free_iov) {
+ if (iter_is_iovec(&kmsg->msg.msg_iter) && !kmsg->free_iov) {
size_t fast_idx = kmsg->msg.msg_iter.iov - kmsg->fast_iov;
async_msg->msg.msg_iter.iov = &async_msg->fast_iov[fast_idx];
}
@@ -333,7 +333,6 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
struct sockaddr_storage __address;
struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
struct msghdr msg;
- struct iovec iov;
struct socket *sock;
unsigned flags;
int min_ret = 0;
@@ -367,7 +366,7 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
if (unlikely(!sock))
return -ENOTSOCK;
- ret = import_single_range(WRITE, sr->buf, sr->len, &iov, &msg.msg_iter);
+ ret = import_ubuf(WRITE, sr->buf, sr->len, &msg.msg_iter);
if (unlikely(ret))
return ret;
@@ -752,10 +751,7 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
}
}
- kmsg->fast_iov[0].iov_base = buf;
- kmsg->fast_iov[0].iov_len = len;
- iov_iter_init(&kmsg->msg.msg_iter, READ, kmsg->fast_iov, 1,
- len);
+ iov_iter_ubuf(&kmsg->msg.msg_iter, READ, buf, len);
}
flags = sr->msg_flags;
@@ -824,7 +820,6 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags)
struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
struct msghdr msg;
struct socket *sock;
- struct iovec iov;
unsigned int cflags;
unsigned flags;
int ret, min_ret = 0;
@@ -849,7 +844,7 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags)
sr->buf = buf;
}
- ret = import_single_range(READ, sr->buf, len, &iov, &msg.msg_iter);
+ ret = import_ubuf(READ, sr->buf, len, &msg.msg_iter);
if (unlikely(ret))
goto out_free;
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/4] io_uring: use ubuf for single range imports for read/write
2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
2022-11-07 17:56 ` [PATCH 1/4] iov: add import_ubuf() Keith Busch
2022-11-07 17:56 ` [PATCH 2/4] io_uring: switch network send/recv to ITER_UBUF Keith Busch
@ 2022-11-07 17:56 ` Keith Busch
2022-11-07 17:56 ` [PATCH 4/4] iov_iter: move iter_ubuf check inside restore WARN Keith Busch
2022-11-08 6:54 ` [PATCH 0/4] io_uring: use ITER_UBUF Christoph Hellwig
4 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-07 17:56 UTC (permalink / raw)
To: viro, axboe, io-uring; +Cc: asml.silence, linux-fsdevel, Keith Busch
From: Jens Axboe <[email protected]>
This is more efficient than ITER_IOVEC.
Signed-off-by: Jens Axboe <[email protected]>
[merge to 6.1, random fixes]
Signed-off-by: Keith Busch <[email protected]>
---
io_uring/rw.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 1ce065709724..19551b5e8088 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -391,7 +391,7 @@ static struct iovec *__io_import_iovec(int ddir, struct io_kiocb *req,
rw->len = sqe_len;
}
- ret = import_single_range(ddir, buf, sqe_len, s->fast_iov, iter);
+ ret = import_ubuf(ddir, buf, sqe_len, iter);
if (ret)
return ERR_PTR(ret);
return NULL;
@@ -450,7 +450,10 @@ static ssize_t loop_rw_iter(int ddir, struct io_rw *rw, struct iov_iter *iter)
struct iovec iovec;
ssize_t nr;
- if (!iov_iter_is_bvec(iter)) {
+ if (iter_is_ubuf(iter)) {
+ iovec.iov_base = iter->ubuf + iter->iov_offset;
+ iovec.iov_len = iov_iter_count(iter);
+ } else if (!iov_iter_is_bvec(iter)) {
iovec = iov_iter_iovec(iter);
} else {
iovec.iov_base = u64_to_user_ptr(rw->addr);
@@ -495,7 +498,7 @@ static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
io->free_iovec = iovec;
io->bytes_done = 0;
/* can only be fixed buffers, no need to do anything */
- if (iov_iter_is_bvec(iter))
+ if (iov_iter_is_bvec(iter) || iter_is_ubuf(iter))
return;
if (!iovec) {
unsigned iov_off = 0;
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/4] iov_iter: move iter_ubuf check inside restore WARN
2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
` (2 preceding siblings ...)
2022-11-07 17:56 ` [PATCH 3/4] io_uring: use ubuf for single range imports for read/write Keith Busch
@ 2022-11-07 17:56 ` Keith Busch
2022-11-08 6:54 ` [PATCH 0/4] io_uring: use ITER_UBUF Christoph Hellwig
4 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-07 17:56 UTC (permalink / raw)
To: viro, axboe, io-uring; +Cc: asml.silence, linux-fsdevel, Keith Busch
From: Keith Busch <[email protected]>
io_uring is using iter_ubuf types for single vector requests. We expect
state restore may happen for this type now, and it is already handled
correctly, so move the check inside the warning to suppress it.
Signed-off-by: Keith Busch <[email protected]>
---
lib/iov_iter.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 07adf18e5e40..aa192a386bd7 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1880,8 +1880,8 @@ int import_ubuf(int rw, void __user *buf, size_t len, struct iov_iter *i)
*/
void iov_iter_restore(struct iov_iter *i, struct iov_iter_state *state)
{
- if (WARN_ON_ONCE(!iov_iter_is_bvec(i) && !iter_is_iovec(i)) &&
- !iov_iter_is_kvec(i) && !iter_is_ubuf(i))
+ if (WARN_ON_ONCE(!iov_iter_is_bvec(i) && !iter_is_iovec(i) &&
+ !iter_is_ubuf(i)) && !iov_iter_is_kvec(i))
return;
i->iov_offset = state->iov_offset;
i->count = state->count;
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 0/4] io_uring: use ITER_UBUF
2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
` (3 preceding siblings ...)
2022-11-07 17:56 ` [PATCH 4/4] iov_iter: move iter_ubuf check inside restore WARN Keith Busch
@ 2022-11-08 6:54 ` Christoph Hellwig
2022-11-08 20:25 ` Keith Busch
4 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2022-11-08 6:54 UTC (permalink / raw)
To: Keith Busch
Cc: viro, axboe, io-uring, asml.silence, linux-fsdevel, Keith Busch
On Mon, Nov 07, 2022 at 09:56:06AM -0800, Keith Busch wrote:
> 1. io_uring will always prefer using the _iter versions of read/write
> callbacks if file_operations implement both, where as the generic
> syscalls will use .read/.write (if implemented) for non-vectored IO.
There are very few file operations that have both, and for those
the difference matters, e.g. the strange vectors semantics for the
sound code. I would strongly suggest to mirror what the normal
read/write path does here.
> 2. io_uring will use the ITER_UBUF representation for single vector
> readv/writev, but the generic syscalls currently uses ITER_IOVEC for
> these.
Same here. It might be woth to use ITER_UBUF for single vector
readv/writev, but this should be the same for all interfaces. I'd
suggest to drop this for now and do a separate series with careful
review from Al for this.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/4] io_uring: use ITER_UBUF
2022-11-08 6:54 ` [PATCH 0/4] io_uring: use ITER_UBUF Christoph Hellwig
@ 2022-11-08 20:25 ` Keith Busch
0 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-08 20:25 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, viro, axboe, io-uring, asml.silence, linux-fsdevel
On Mon, Nov 07, 2022 at 10:54:06PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 07, 2022 at 09:56:06AM -0800, Keith Busch wrote:
> > 1. io_uring will always prefer using the _iter versions of read/write
> > callbacks if file_operations implement both, where as the generic
> > syscalls will use .read/.write (if implemented) for non-vectored IO.
>
> There are very few file operations that have both, and for those
> the difference matters, e.g. the strange vectors semantics for the
> sound code.
Yes, thankfully there are not many. Other than the two mentioned
file_operations, the only other fops I find implementing both are
'null_ops' and 'zero_ops'; those are fine. And one other implements
just .write/.write_iter: trace_events_user.c, which is also fine.
> I would strongly suggest to mirror what the normal
> read/write path does here.
I don't think we can change that now. io_uring has always used the
.{read,write}_iter callbacks if available ever since it introduced
non-vectored read/write (3a6820f2bb8a0). Altering the io_uring op's ABI
to align with the read/write syscalls seems risky.
But I don't think there are any real use cases affected by this series
anyway.
> > 2. io_uring will use the ITER_UBUF representation for single vector
> > readv/writev, but the generic syscalls currently uses ITER_IOVEC for
> > these.
>
> Same here. It might be woth to use ITER_UBUF for single vector
> readv/writev, but this should be the same for all interfaces. I'd
> suggest to drop this for now and do a separate series with careful
> review from Al for this.
I feel like that's a worthy longer term goal, but I'll start looking
into it now.
^ permalink raw reply [flat|nested] 9+ messages in thread