public inbox for [email protected]
* [PATCH v2 0/4] nxt propagation
@ 2020-03-02 20:45 Pavel Begunkov
  2020-03-02 20:45 ` [PATCH v2 1/4] io_uring: clean up io_close Pavel Begunkov
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-03-02 20:45 UTC (permalink / raw)
  To: Jens Axboe, io-uring, linux-kernel

*split the patchset*

The nxt propagation bits are done similarly to before, but:
- nxt stealing is now done at the top level, not hidden in the handlers
- REQ_F_DONT_STEAL_NEXT ensures nxt isn't stolen where that would be
  unsafe (see the sketch below)
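
For illustration, the top-level helper the series ends up with (copied
from patch 4 below; a sketch of the end state, not a separate change):

static struct io_kiocb *io_put_req_submission(struct io_kiocb *req)
{
	struct io_kiocb *nxt = NULL;

	/* do it first, while holding a ref */
	if (!(req->flags & REQ_F_DONT_STEAL_NEXT))
		io_req_find_next(req, &nxt);

	io_put_req(req);
	return nxt;
}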

v2:
- fix a race condition in io_put_req_submission()
- don't set REQ_F_DONT_STEAL_NEXT for sync poll_add

Pavel Begunkov (4):
  io_uring: clean up io_close
  io_uring: make submission ref putting consistent
  io_uring: remove @nxt from handlers
  io_uring: get next req on subm ref drop

 fs/io_uring.c | 366 +++++++++++++++++++++++---------------------------
 1 file changed, 167 insertions(+), 199 deletions(-)

-- 
2.24.0



* [PATCH v2 1/4] io_uring: clean up io_close
  2020-03-02 20:45 [PATCH v2 0/4] nxt propagation Pavel Begunkov
@ 2020-03-02 20:45 ` Pavel Begunkov
  2020-03-02 20:45 ` [PATCH v2 2/4] io_uring: make submission ref putting consistent Pavel Begunkov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-03-02 20:45 UTC (permalink / raw)
  To: Jens Axboe, io-uring, linux-kernel

Don't abuse labels for plain and straightforward code.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index fb8fe0bd5e18..ff6cc05b86c7 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3030,8 +3030,16 @@ static int io_close(struct io_kiocb *req, struct io_kiocb **nxt,
 		return ret;
 
 	/* if the file has a flush method, be safe and punt to async */
-	if (req->close.put_file->f_op->flush && !io_wq_current_is_worker())
-		goto eagain;
+	if (req->close.put_file->f_op->flush && force_nonblock) {
+		req->work.func = io_close_finish;
+		/*
+		 * Do manual async queue here to avoid grabbing files - we don't
+		 * need the files, and it'll cause io_close_finish() to close
+		 * the file again and cause a double CQE entry for this request
+		 */
+		io_queue_async_work(req);
+		return 0;
+	}
 
 	/*
 	 * No ->flush(), safely close from here and just punt the
@@ -3039,15 +3047,6 @@ static int io_close(struct io_kiocb *req, struct io_kiocb **nxt,
 	 */
 	__io_close_finish(req, nxt);
 	return 0;
-eagain:
-	req->work.func = io_close_finish;
-	/*
-	 * Do manual async queue here to avoid grabbing files - we don't
-	 * need the files, and it'll cause io_close_finish() to close
-	 * the file again and cause a double CQE entry for this request
-	 */
-	io_queue_async_work(req);
-	return 0;
 }
 
 static int io_prep_sfr(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-- 
2.24.0



* [PATCH v2 2/4] io_uring: make submission ref putting consistent
  2020-03-02 20:45 [PATCH v2 0/4] nxt propagation Pavel Begunkov
  2020-03-02 20:45 ` [PATCH v2 1/4] io_uring: clean up io_close Pavel Begunkov
@ 2020-03-02 20:45 ` Pavel Begunkov
  2020-03-02 20:45 ` [PATCH v2 3/4] io_uring: remove @nxt from handlers Pavel Begunkov
  2020-03-02 20:45 ` [PATCH v2 4/4] io_uring: get next req on subm ref drop Pavel Begunkov
  3 siblings, 0 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-03-02 20:45 UTC (permalink / raw)
  To: Jens Axboe, io-uring, linux-kernel

The rule is simple: any async handler gets a submission ref and should
put it at the end. Make them all follow it, making the code more
consistent.

This is a preparation patch, and as io_wq_assign_next() currently never
has any effect here, it doesn't bother to use io_put_req_find_next()
instead of io_put_req().
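
For illustration, every async finish handler then ends up with the same
shape; io_fsync_finish() after this patch, reconstructed from the hunks
below, looks roughly like:

static void io_fsync_finish(struct io_wq_work **workptr)
{
	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
	struct io_kiocb *nxt = NULL;

	if (io_req_cancelled(req))
		return;
	__io_fsync(req, &nxt);
	io_put_req(req); /* drop submission reference */
	if (nxt)
		io_wq_assign_next(workptr, nxt);
}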

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ff6cc05b86c7..ad8046a9bc0f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2550,7 +2550,7 @@ static bool io_req_cancelled(struct io_kiocb *req)
 	if (req->work.flags & IO_WQ_WORK_CANCEL) {
 		req_set_fail_links(req);
 		io_cqring_add_event(req, -ECANCELED);
-		io_put_req(req);
+		io_double_put_req(req);
 		return true;
 	}
 
@@ -2600,6 +2600,7 @@ static void io_fsync_finish(struct io_wq_work **workptr)
 	if (io_req_cancelled(req))
 		return;
 	__io_fsync(req, &nxt);
+	io_put_req(req); /* drop submission reference */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
@@ -2609,7 +2610,6 @@ static int io_fsync(struct io_kiocb *req, struct io_kiocb **nxt,
 {
 	/* fsync always requires a blocking context */
 	if (force_nonblock) {
-		io_put_req(req);
 		req->work.func = io_fsync_finish;
 		return -EAGAIN;
 	}
@@ -2621,9 +2621,6 @@ static void __io_fallocate(struct io_kiocb *req, struct io_kiocb **nxt)
 {
 	int ret;
 
-	if (io_req_cancelled(req))
-		return;
-
 	ret = vfs_fallocate(req->file, req->sync.mode, req->sync.off,
 				req->sync.len);
 	if (ret < 0)
@@ -2637,7 +2634,10 @@ static void io_fallocate_finish(struct io_wq_work **workptr)
 	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
 	struct io_kiocb *nxt = NULL;
 
+	if (io_req_cancelled(req))
+		return;
 	__io_fallocate(req, &nxt);
+	io_put_req(req); /* drop submission reference */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
@@ -2659,7 +2659,6 @@ static int io_fallocate(struct io_kiocb *req, struct io_kiocb **nxt,
 {
 	/* fallocate always requiring blocking context */
 	if (force_nonblock) {
-		io_put_req(req);
 		req->work.func = io_fallocate_finish;
 		return -EAGAIN;
 	}
@@ -3015,6 +3014,7 @@ static void io_close_finish(struct io_wq_work **workptr)
 
 	/* not cancellable, don't do io_req_cancelled() */
 	__io_close_finish(req, &nxt);
+	io_put_req(req); /* drop submission reference */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
@@ -3038,6 +3038,8 @@ static int io_close(struct io_kiocb *req, struct io_kiocb **nxt,
 		 * the file again and cause a double CQE entry for this request
 		 */
 		io_queue_async_work(req);
+		/* submission ref will be dropped, take it for async */
+		refcount_inc_not_zero(&req->refs);
 		return 0;
 	}
 
@@ -3088,6 +3090,7 @@ static void io_sync_file_range_finish(struct io_wq_work **workptr)
 	if (io_req_cancelled(req))
 		return;
 	__io_sync_file_range(req, &nxt);
+	io_put_req(req); /* put submission ref */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
@@ -3097,7 +3100,6 @@ static int io_sync_file_range(struct io_kiocb *req, struct io_kiocb **nxt,
 {
 	/* sync_file_range always requires a blocking context */
 	if (force_nonblock) {
-		io_put_req(req);
 		req->work.func = io_sync_file_range_finish;
 		return -EAGAIN;
 	}
@@ -3464,11 +3466,10 @@ static void io_accept_finish(struct io_wq_work **workptr)
 	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
 	struct io_kiocb *nxt = NULL;
 
-	io_put_req(req);
-
 	if (io_req_cancelled(req))
 		return;
 	__io_accept(req, &nxt, false);
+	io_put_req(req); /* drop submission reference */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
@@ -4733,17 +4734,14 @@ static void io_wq_submit_work(struct io_wq_work **workptr)
 		} while (1);
 	}
 
-	/* drop submission reference */
-	io_put_req(req);
-
 	if (ret) {
 		req_set_fail_links(req);
 		io_cqring_add_event(req, ret);
 		io_put_req(req);
 	}
 
-	/* if a dependent link is ready, pass it back */
-	if (!ret && nxt)
+	io_put_req(req); /* drop submission reference */
+	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
 
-- 
2.24.0



* [PATCH v2 3/4] io_uring: remove @nxt from handlers
  2020-03-02 20:45 [PATCH v2 0/4] nxt propagation Pavel Begunkov
  2020-03-02 20:45 ` [PATCH v2 1/4] io_uring: clean up io_close Pavel Begunkov
  2020-03-02 20:45 ` [PATCH v2 2/4] io_uring: make submission ref putting consistent Pavel Begunkov
@ 2020-03-02 20:45 ` Pavel Begunkov
  2020-03-02 20:45 ` [PATCH v2 4/4] io_uring: get next req on subm ref drop Pavel Begunkov
  3 siblings, 0 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-03-02 20:45 UTC (permalink / raw)
  To: Jens Axboe, io-uring, linux-kernel

There will be no use for @nxt in the handlers, and it doesn't work
anyway, so purge it.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 204 +++++++++++++++++++++-----------------------------
 1 file changed, 86 insertions(+), 118 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ad8046a9bc0f..daf7c2095523 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1804,17 +1804,6 @@ static void io_complete_rw(struct kiocb *kiocb, long res, long res2)
 	io_put_req(req);
 }
 
-static struct io_kiocb *__io_complete_rw(struct kiocb *kiocb, long res)
-{
-	struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
-	struct io_kiocb *nxt = NULL;
-
-	io_complete_rw_common(kiocb, res);
-	io_put_req_find_next(req, &nxt);
-
-	return nxt;
-}
-
 static void io_complete_rw_iopoll(struct kiocb *kiocb, long res, long res2)
 {
 	struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
@@ -2009,14 +1998,14 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
 	}
 }
 
-static void kiocb_done(struct kiocb *kiocb, ssize_t ret, struct io_kiocb **nxt)
+static void kiocb_done(struct kiocb *kiocb, ssize_t ret)
 {
 	struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
 
 	if (req->flags & REQ_F_CUR_POS)
 		req->file->f_pos = kiocb->ki_pos;
 	if (ret >= 0 && kiocb->ki_complete == io_complete_rw)
-		*nxt = __io_complete_rw(kiocb, ret);
+		io_complete_rw(kiocb, ret, 0);
 	else
 		io_rw_done(kiocb, ret);
 }
@@ -2265,8 +2254,7 @@ static int io_read_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	return 0;
 }
 
-static int io_read(struct io_kiocb *req, struct io_kiocb **nxt,
-		   bool force_nonblock)
+static int io_read(struct io_kiocb *req, bool force_nonblock)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct kiocb *kiocb = &req->rw.kiocb;
@@ -2306,7 +2294,7 @@ static int io_read(struct io_kiocb *req, struct io_kiocb **nxt,
 
 		/* Catch -EAGAIN return for forced non-blocking submission */
 		if (!force_nonblock || ret2 != -EAGAIN) {
-			kiocb_done(kiocb, ret2, nxt);
+			kiocb_done(kiocb, ret2);
 		} else {
 copy_iov:
 			ret = io_setup_async_rw(req, io_size, iovec,
@@ -2355,8 +2343,7 @@ static int io_write_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	return 0;
 }
 
-static int io_write(struct io_kiocb *req, struct io_kiocb **nxt,
-		    bool force_nonblock)
+static int io_write(struct io_kiocb *req, bool force_nonblock)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct kiocb *kiocb = &req->rw.kiocb;
@@ -2420,7 +2407,7 @@ static int io_write(struct io_kiocb *req, struct io_kiocb **nxt,
 		if (ret2 == -EOPNOTSUPP && (kiocb->ki_flags & IOCB_NOWAIT))
 			ret2 = -EAGAIN;
 		if (!force_nonblock || ret2 != -EAGAIN) {
-			kiocb_done(kiocb, ret2, nxt);
+			kiocb_done(kiocb, ret2);
 		} else {
 copy_iov:
 			ret = io_setup_async_rw(req, io_size, iovec,
@@ -2477,8 +2464,7 @@ static bool io_splice_punt(struct file *file)
 	return !(file->f_mode & O_NONBLOCK);
 }
 
-static int io_splice(struct io_kiocb *req, struct io_kiocb **nxt,
-		     bool force_nonblock)
+static int io_splice(struct io_kiocb *req, bool force_nonblock)
 {
 	struct io_splice *sp = &req->splice;
 	struct file *in = sp->file_in;
@@ -2505,7 +2491,7 @@ static int io_splice(struct io_kiocb *req, struct io_kiocb **nxt,
 	io_cqring_add_event(req, ret);
 	if (ret != sp->len)
 		req_set_fail_links(req);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 }
 
@@ -2578,7 +2564,7 @@ static void io_wq_assign_next(struct io_wq_work **workptr, struct io_kiocb *nxt)
 	}
 }
 
-static void __io_fsync(struct io_kiocb *req, struct io_kiocb **nxt)
+static void __io_fsync(struct io_kiocb *req)
 {
 	loff_t end = req->sync.off + req->sync.len;
 	int ret;
@@ -2589,7 +2575,7 @@ static void __io_fsync(struct io_kiocb *req, struct io_kiocb **nxt)
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 }
 
 static void io_fsync_finish(struct io_wq_work **workptr)
@@ -2599,25 +2585,24 @@ static void io_fsync_finish(struct io_wq_work **workptr)
 
 	if (io_req_cancelled(req))
 		return;
-	__io_fsync(req, &nxt);
+	__io_fsync(req);
 	io_put_req(req); /* drop submission reference */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
 
-static int io_fsync(struct io_kiocb *req, struct io_kiocb **nxt,
-		    bool force_nonblock)
+static int io_fsync(struct io_kiocb *req, bool force_nonblock)
 {
 	/* fsync always requires a blocking context */
 	if (force_nonblock) {
 		req->work.func = io_fsync_finish;
 		return -EAGAIN;
 	}
-	__io_fsync(req, nxt);
+	__io_fsync(req);
 	return 0;
 }
 
-static void __io_fallocate(struct io_kiocb *req, struct io_kiocb **nxt)
+static void __io_fallocate(struct io_kiocb *req)
 {
 	int ret;
 
@@ -2626,7 +2611,7 @@ static void __io_fallocate(struct io_kiocb *req, struct io_kiocb **nxt)
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 }
 
 static void io_fallocate_finish(struct io_wq_work **workptr)
@@ -2636,7 +2621,7 @@ static void io_fallocate_finish(struct io_wq_work **workptr)
 
 	if (io_req_cancelled(req))
 		return;
-	__io_fallocate(req, &nxt);
+	__io_fallocate(req);
 	io_put_req(req); /* drop submission reference */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
@@ -2654,8 +2639,7 @@ static int io_fallocate_prep(struct io_kiocb *req,
 	return 0;
 }
 
-static int io_fallocate(struct io_kiocb *req, struct io_kiocb **nxt,
-			bool force_nonblock)
+static int io_fallocate(struct io_kiocb *req, bool force_nonblock)
 {
 	/* fallocate always requiring blocking context */
 	if (force_nonblock) {
@@ -2663,7 +2647,7 @@ static int io_fallocate(struct io_kiocb *req, struct io_kiocb **nxt,
 		return -EAGAIN;
 	}
 
-	__io_fallocate(req, nxt);
+	__io_fallocate(req);
 	return 0;
 }
 
@@ -2736,8 +2720,7 @@ static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	return 0;
 }
 
-static int io_openat2(struct io_kiocb *req, struct io_kiocb **nxt,
-		      bool force_nonblock)
+static int io_openat2(struct io_kiocb *req, bool force_nonblock)
 {
 	struct open_flags op;
 	struct file *file;
@@ -2768,15 +2751,14 @@ static int io_openat2(struct io_kiocb *req, struct io_kiocb **nxt,
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 }
 
-static int io_openat(struct io_kiocb *req, struct io_kiocb **nxt,
-		     bool force_nonblock)
+static int io_openat(struct io_kiocb *req, bool force_nonblock)
 {
 	req->open.how = build_open_how(req->open.how.flags, req->open.how.mode);
-	return io_openat2(req, nxt, force_nonblock);
+	return io_openat2(req, force_nonblock);
 }
 
 static int io_epoll_ctl_prep(struct io_kiocb *req,
@@ -2804,8 +2786,7 @@ static int io_epoll_ctl_prep(struct io_kiocb *req,
 #endif
 }
 
-static int io_epoll_ctl(struct io_kiocb *req, struct io_kiocb **nxt,
-			bool force_nonblock)
+static int io_epoll_ctl(struct io_kiocb *req, bool force_nonblock)
 {
 #if defined(CONFIG_EPOLL)
 	struct io_epoll *ie = &req->epoll;
@@ -2818,7 +2799,7 @@ static int io_epoll_ctl(struct io_kiocb *req, struct io_kiocb **nxt,
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 #else
 	return -EOPNOTSUPP;
@@ -2840,8 +2821,7 @@ static int io_madvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 #endif
 }
 
-static int io_madvise(struct io_kiocb *req, struct io_kiocb **nxt,
-		      bool force_nonblock)
+static int io_madvise(struct io_kiocb *req, bool force_nonblock)
 {
 #if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
 	struct io_madvise *ma = &req->madvise;
@@ -2854,7 +2834,7 @@ static int io_madvise(struct io_kiocb *req, struct io_kiocb **nxt,
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 #else
 	return -EOPNOTSUPP;
@@ -2872,8 +2852,7 @@ static int io_fadvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	return 0;
 }
 
-static int io_fadvise(struct io_kiocb *req, struct io_kiocb **nxt,
-		      bool force_nonblock)
+static int io_fadvise(struct io_kiocb *req, bool force_nonblock)
 {
 	struct io_fadvise *fa = &req->fadvise;
 	int ret;
@@ -2893,7 +2872,7 @@ static int io_fadvise(struct io_kiocb *req, struct io_kiocb **nxt,
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 }
 
@@ -2930,8 +2909,7 @@ static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	return 0;
 }
 
-static int io_statx(struct io_kiocb *req, struct io_kiocb **nxt,
-		    bool force_nonblock)
+static int io_statx(struct io_kiocb *req, bool force_nonblock)
 {
 	struct io_open *ctx = &req->open;
 	unsigned lookup_flags;
@@ -2968,7 +2946,7 @@ static int io_statx(struct io_kiocb *req, struct io_kiocb **nxt,
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 }
 
@@ -2995,7 +2973,7 @@ static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 }
 
 /* only called when __close_fd_get_file() is done */
-static void __io_close_finish(struct io_kiocb *req, struct io_kiocb **nxt)
+static void __io_close_finish(struct io_kiocb *req)
 {
 	int ret;
 
@@ -3004,7 +2982,7 @@ static void __io_close_finish(struct io_kiocb *req, struct io_kiocb **nxt)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
 	fput(req->close.put_file);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 }
 
 static void io_close_finish(struct io_wq_work **workptr)
@@ -3013,14 +2991,13 @@ static void io_close_finish(struct io_wq_work **workptr)
 	struct io_kiocb *nxt = NULL;
 
 	/* not cancellable, don't do io_req_cancelled() */
-	__io_close_finish(req, &nxt);
+	__io_close_finish(req);
 	io_put_req(req); /* drop submission reference */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
 
-static int io_close(struct io_kiocb *req, struct io_kiocb **nxt,
-		    bool force_nonblock)
+static int io_close(struct io_kiocb *req, bool force_nonblock)
 {
 	int ret;
 
@@ -3047,7 +3024,7 @@ static int io_close(struct io_kiocb *req, struct io_kiocb **nxt,
 	 * No ->flush(), safely close from here and just punt the
 	 * fput() to async context.
 	 */
-	__io_close_finish(req, nxt);
+	__io_close_finish(req);
 	return 0;
 }
 
@@ -3069,7 +3046,7 @@ static int io_prep_sfr(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	return 0;
 }
 
-static void __io_sync_file_range(struct io_kiocb *req, struct io_kiocb **nxt)
+static void __io_sync_file_range(struct io_kiocb *req)
 {
 	int ret;
 
@@ -3078,7 +3055,7 @@ static void __io_sync_file_range(struct io_kiocb *req, struct io_kiocb **nxt)
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 }
 
 
@@ -3089,14 +3066,13 @@ static void io_sync_file_range_finish(struct io_wq_work **workptr)
 
 	if (io_req_cancelled(req))
 		return;
-	__io_sync_file_range(req, &nxt);
+	__io_sync_file_range(req);
 	io_put_req(req); /* put submission ref */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
 
-static int io_sync_file_range(struct io_kiocb *req, struct io_kiocb **nxt,
-			      bool force_nonblock)
+static int io_sync_file_range(struct io_kiocb *req, bool force_nonblock)
 {
 	/* sync_file_range always requires a blocking context */
 	if (force_nonblock) {
@@ -3104,7 +3080,7 @@ static int io_sync_file_range(struct io_kiocb *req, struct io_kiocb **nxt,
 		return -EAGAIN;
 	}
 
-	__io_sync_file_range(req, nxt);
+	__io_sync_file_range(req);
 	return 0;
 }
 
@@ -3156,8 +3132,7 @@ static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 #endif
 }
 
-static int io_sendmsg(struct io_kiocb *req, struct io_kiocb **nxt,
-		      bool force_nonblock)
+static int io_sendmsg(struct io_kiocb *req, bool force_nonblock)
 {
 #if defined(CONFIG_NET)
 	struct io_async_msghdr *kmsg = NULL;
@@ -3211,15 +3186,14 @@ static int io_sendmsg(struct io_kiocb *req, struct io_kiocb **nxt,
 	io_cqring_add_event(req, ret);
 	if (ret < 0)
 		req_set_fail_links(req);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 #else
 	return -EOPNOTSUPP;
 #endif
 }
 
-static int io_send(struct io_kiocb *req, struct io_kiocb **nxt,
-		   bool force_nonblock)
+static int io_send(struct io_kiocb *req, bool force_nonblock)
 {
 #if defined(CONFIG_NET)
 	struct socket *sock;
@@ -3262,7 +3236,7 @@ static int io_send(struct io_kiocb *req, struct io_kiocb **nxt,
 	io_cqring_add_event(req, ret);
 	if (ret < 0)
 		req_set_fail_links(req);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 #else
 	return -EOPNOTSUPP;
@@ -3303,8 +3277,7 @@ static int io_recvmsg_prep(struct io_kiocb *req,
 #endif
 }
 
-static int io_recvmsg(struct io_kiocb *req, struct io_kiocb **nxt,
-		      bool force_nonblock)
+static int io_recvmsg(struct io_kiocb *req, bool force_nonblock)
 {
 #if defined(CONFIG_NET)
 	struct io_async_msghdr *kmsg = NULL;
@@ -3360,15 +3333,14 @@ static int io_recvmsg(struct io_kiocb *req, struct io_kiocb **nxt,
 	io_cqring_add_event(req, ret);
 	if (ret < 0)
 		req_set_fail_links(req);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 #else
 	return -EOPNOTSUPP;
 #endif
 }
 
-static int io_recv(struct io_kiocb *req, struct io_kiocb **nxt,
-		   bool force_nonblock)
+static int io_recv(struct io_kiocb *req, bool force_nonblock)
 {
 #if defined(CONFIG_NET)
 	struct socket *sock;
@@ -3412,7 +3384,7 @@ static int io_recv(struct io_kiocb *req, struct io_kiocb **nxt,
 	io_cqring_add_event(req, ret);
 	if (ret < 0)
 		req_set_fail_links(req);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 #else
 	return -EOPNOTSUPP;
@@ -3440,8 +3412,7 @@ static int io_accept_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 }
 
 #if defined(CONFIG_NET)
-static int __io_accept(struct io_kiocb *req, struct io_kiocb **nxt,
-		       bool force_nonblock)
+static int __io_accept(struct io_kiocb *req, bool force_nonblock)
 {
 	struct io_accept *accept = &req->accept;
 	unsigned file_flags;
@@ -3457,7 +3428,7 @@ static int __io_accept(struct io_kiocb *req, struct io_kiocb **nxt,
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 }
 
@@ -3468,20 +3439,19 @@ static void io_accept_finish(struct io_wq_work **workptr)
 
 	if (io_req_cancelled(req))
 		return;
-	__io_accept(req, &nxt, false);
+	__io_accept(req, false);
 	io_put_req(req); /* drop submission reference */
 	if (nxt)
 		io_wq_assign_next(workptr, nxt);
 }
 #endif
 
-static int io_accept(struct io_kiocb *req, struct io_kiocb **nxt,
-		     bool force_nonblock)
+static int io_accept(struct io_kiocb *req, bool force_nonblock)
 {
 #if defined(CONFIG_NET)
 	int ret;
 
-	ret = __io_accept(req, nxt, force_nonblock);
+	ret = __io_accept(req, force_nonblock);
 	if (ret == -EAGAIN && force_nonblock) {
 		req->work.func = io_accept_finish;
 		return -EAGAIN;
@@ -3516,8 +3486,7 @@ static int io_connect_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 #endif
 }
 
-static int io_connect(struct io_kiocb *req, struct io_kiocb **nxt,
-		      bool force_nonblock)
+static int io_connect(struct io_kiocb *req, bool force_nonblock)
 {
 #if defined(CONFIG_NET)
 	struct io_async_ctx __io, *io;
@@ -3555,7 +3524,7 @@ static int io_connect(struct io_kiocb *req, struct io_kiocb **nxt,
 	if (ret < 0)
 		req_set_fail_links(req);
 	io_cqring_add_event(req, ret);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 	return 0;
 #else
 	return -EOPNOTSUPP;
@@ -3951,7 +3920,7 @@ static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
 	return 0;
 }
 
-static int io_poll_add(struct io_kiocb *req, struct io_kiocb **nxt)
+static int io_poll_add(struct io_kiocb *req)
 {
 	struct io_poll_iocb *poll = &req->poll;
 	struct io_ring_ctx *ctx = req->ctx;
@@ -3973,7 +3942,7 @@ static int io_poll_add(struct io_kiocb *req, struct io_kiocb **nxt)
 
 	if (mask) {
 		io_cqring_ev_posted(ctx);
-		io_put_req_find_next(req, nxt);
+		io_put_req(req);
 	}
 	return ipt.error;
 }
@@ -4222,7 +4191,7 @@ static int io_async_cancel_one(struct io_ring_ctx *ctx, void *sqe_addr)
 
 static void io_async_find_and_cancel(struct io_ring_ctx *ctx,
 				     struct io_kiocb *req, __u64 sqe_addr,
-				     struct io_kiocb **nxt, int success_ret)
+				     int success_ret)
 {
 	unsigned long flags;
 	int ret;
@@ -4248,7 +4217,7 @@ static void io_async_find_and_cancel(struct io_ring_ctx *ctx,
 
 	if (ret < 0)
 		req_set_fail_links(req);
-	io_put_req_find_next(req, nxt);
+	io_put_req(req);
 }
 
 static int io_async_cancel_prep(struct io_kiocb *req,
@@ -4264,11 +4233,11 @@ static int io_async_cancel_prep(struct io_kiocb *req,
 	return 0;
 }
 
-static int io_async_cancel(struct io_kiocb *req, struct io_kiocb **nxt)
+static int io_async_cancel(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 
-	io_async_find_and_cancel(ctx, req, req->cancel.addr, nxt, 0);
+	io_async_find_and_cancel(ctx, req, req->cancel.addr, 0);
 	return 0;
 }
 
@@ -4475,7 +4444,7 @@ static void io_cleanup_req(struct io_kiocb *req)
 }
 
 static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
-			struct io_kiocb **nxt, bool force_nonblock)
+			bool force_nonblock)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 	int ret;
@@ -4492,7 +4461,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret < 0)
 				break;
 		}
-		ret = io_read(req, nxt, force_nonblock);
+		ret = io_read(req, force_nonblock);
 		break;
 	case IORING_OP_WRITEV:
 	case IORING_OP_WRITE_FIXED:
@@ -4502,7 +4471,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret < 0)
 				break;
 		}
-		ret = io_write(req, nxt, force_nonblock);
+		ret = io_write(req, force_nonblock);
 		break;
 	case IORING_OP_FSYNC:
 		if (sqe) {
@@ -4510,7 +4479,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret < 0)
 				break;
 		}
-		ret = io_fsync(req, nxt, force_nonblock);
+		ret = io_fsync(req, force_nonblock);
 		break;
 	case IORING_OP_POLL_ADD:
 		if (sqe) {
@@ -4518,7 +4487,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_poll_add(req, nxt);
+		ret = io_poll_add(req);
 		break;
 	case IORING_OP_POLL_REMOVE:
 		if (sqe) {
@@ -4534,7 +4503,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret < 0)
 				break;
 		}
-		ret = io_sync_file_range(req, nxt, force_nonblock);
+		ret = io_sync_file_range(req, force_nonblock);
 		break;
 	case IORING_OP_SENDMSG:
 	case IORING_OP_SEND:
@@ -4544,9 +4513,9 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 				break;
 		}
 		if (req->opcode == IORING_OP_SENDMSG)
-			ret = io_sendmsg(req, nxt, force_nonblock);
+			ret = io_sendmsg(req, force_nonblock);
 		else
-			ret = io_send(req, nxt, force_nonblock);
+			ret = io_send(req, force_nonblock);
 		break;
 	case IORING_OP_RECVMSG:
 	case IORING_OP_RECV:
@@ -4556,9 +4525,9 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 				break;
 		}
 		if (req->opcode == IORING_OP_RECVMSG)
-			ret = io_recvmsg(req, nxt, force_nonblock);
+			ret = io_recvmsg(req, force_nonblock);
 		else
-			ret = io_recv(req, nxt, force_nonblock);
+			ret = io_recv(req, force_nonblock);
 		break;
 	case IORING_OP_TIMEOUT:
 		if (sqe) {
@@ -4582,7 +4551,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_accept(req, nxt, force_nonblock);
+		ret = io_accept(req, force_nonblock);
 		break;
 	case IORING_OP_CONNECT:
 		if (sqe) {
@@ -4590,7 +4559,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_connect(req, nxt, force_nonblock);
+		ret = io_connect(req, force_nonblock);
 		break;
 	case IORING_OP_ASYNC_CANCEL:
 		if (sqe) {
@@ -4598,7 +4567,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_async_cancel(req, nxt);
+		ret = io_async_cancel(req);
 		break;
 	case IORING_OP_FALLOCATE:
 		if (sqe) {
@@ -4606,7 +4575,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_fallocate(req, nxt, force_nonblock);
+		ret = io_fallocate(req, force_nonblock);
 		break;
 	case IORING_OP_OPENAT:
 		if (sqe) {
@@ -4614,7 +4583,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_openat(req, nxt, force_nonblock);
+		ret = io_openat(req, force_nonblock);
 		break;
 	case IORING_OP_CLOSE:
 		if (sqe) {
@@ -4622,7 +4591,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_close(req, nxt, force_nonblock);
+		ret = io_close(req, force_nonblock);
 		break;
 	case IORING_OP_FILES_UPDATE:
 		if (sqe) {
@@ -4638,7 +4607,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_statx(req, nxt, force_nonblock);
+		ret = io_statx(req, force_nonblock);
 		break;
 	case IORING_OP_FADVISE:
 		if (sqe) {
@@ -4646,7 +4615,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_fadvise(req, nxt, force_nonblock);
+		ret = io_fadvise(req, force_nonblock);
 		break;
 	case IORING_OP_MADVISE:
 		if (sqe) {
@@ -4654,7 +4623,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_madvise(req, nxt, force_nonblock);
+		ret = io_madvise(req, force_nonblock);
 		break;
 	case IORING_OP_OPENAT2:
 		if (sqe) {
@@ -4662,7 +4631,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_openat2(req, nxt, force_nonblock);
+		ret = io_openat2(req, force_nonblock);
 		break;
 	case IORING_OP_EPOLL_CTL:
 		if (sqe) {
@@ -4670,7 +4639,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret)
 				break;
 		}
-		ret = io_epoll_ctl(req, nxt, force_nonblock);
+		ret = io_epoll_ctl(req, force_nonblock);
 		break;
 	case IORING_OP_SPLICE:
 		if (sqe) {
@@ -4678,7 +4647,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			if (ret < 0)
 				break;
 		}
-		ret = io_splice(req, nxt, force_nonblock);
+		ret = io_splice(req, force_nonblock);
 		break;
 	default:
 		ret = -EINVAL;
@@ -4722,7 +4691,7 @@ static void io_wq_submit_work(struct io_wq_work **workptr)
 
 	if (!ret) {
 		do {
-			ret = io_issue_sqe(req, NULL, &nxt, false);
+			ret = io_issue_sqe(req, NULL, false);
 			/*
 			 * We can get EAGAIN for polled IO even though we're
 			 * forcing a sync submission from here, since we can't
@@ -4868,8 +4837,7 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
 
 	if (prev) {
 		req_set_fail_links(prev);
-		io_async_find_and_cancel(ctx, req, prev->user_data, NULL,
-						-ETIME);
+		io_async_find_and_cancel(ctx, req, prev->user_data, -ETIME);
 		io_put_req(prev);
 	} else {
 		io_cqring_add_event(req, -ETIME);
@@ -4938,7 +4906,7 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 			old_creds = override_creds(req->work.creds);
 	}
 
-	ret = io_issue_sqe(req, sqe, &nxt, true);
+	ret = io_issue_sqe(req, sqe, true);
 
 	/*
 	 * We async punt it if the file wasn't marked NOWAIT, or if the file
-- 
2.24.0



* [PATCH v2 4/4] io_uring: get next req on subm ref drop
  2020-03-02 20:45 [PATCH v2 0/4] nxt propagation Pavel Begunkov
                   ` (2 preceding siblings ...)
  2020-03-02 20:45 ` [PATCH v2 3/4] io_uring: remove @nxt from handlers Pavel Begunkov
@ 2020-03-02 20:45 ` Pavel Begunkov
  2020-03-03  4:26   ` Jens Axboe
  3 siblings, 1 reply; 11+ messages in thread
From: Pavel Begunkov @ 2020-03-02 20:45 UTC (permalink / raw)
  To: Jens Axboe, io-uring, linux-kernel

Get the next request when dropping the submission reference. However, if
there is an asynchronous counterpart (e.g. read/write, timeout, etc.),
that would be dangerous to do, so ignore them using the new
REQ_F_DONT_STEAL_NEXT flag.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 129 ++++++++++++++++++++++++++------------------------
 1 file changed, 66 insertions(+), 63 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index daf7c2095523..e05710ca63d5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -488,6 +488,7 @@ enum {
 	REQ_F_NEED_CLEANUP_BIT,
 	REQ_F_OVERFLOW_BIT,
 	REQ_F_POLLED_BIT,
+	REQ_F_DONT_STEAL_NEXT_BIT,
 };
 
 enum {
@@ -532,6 +533,8 @@ enum {
 	REQ_F_OVERFLOW		= BIT(REQ_F_OVERFLOW_BIT),
 	/* already went through poll handler */
 	REQ_F_POLLED		= BIT(REQ_F_POLLED_BIT),
+	/* don't try to get next req, it's async and racy */
+	REQ_F_DONT_STEAL_NEXT	= BIT(REQ_F_DONT_STEAL_NEXT_BIT),
 };
 
 struct async_poll {
@@ -1218,6 +1221,27 @@ static void io_cqring_add_event(struct io_kiocb *req, long res)
 	io_cqring_ev_posted(ctx);
 }
 
+static void io_link_work_cb(struct io_wq_work **workptr)
+{
+	struct io_wq_work *work = *workptr;
+	struct io_kiocb *link = work->data;
+
+	io_queue_linked_timeout(link);
+	io_wq_submit_work(workptr);
+}
+
+static void io_wq_assign_next(struct io_wq_work **workptr, struct io_kiocb *nxt)
+{
+	struct io_kiocb *link;
+
+	*workptr = &nxt->work;
+	link = io_prep_linked_timeout(nxt);
+	if (link) {
+		nxt->work.func = io_link_work_cb;
+		nxt->work.data = link;
+	}
+}
+
 static inline bool io_is_fallback_req(struct io_kiocb *req)
 {
 	return req == (struct io_kiocb *)
@@ -1518,25 +1542,35 @@ static void io_free_req(struct io_kiocb *req)
 		io_queue_async_work(nxt);
 }
 
-/*
- * Drop reference to request, return next in chain (if there is one) if this
- * was the last reference to this request.
- */
-__attribute__((nonnull))
-static void io_put_req_find_next(struct io_kiocb *req, struct io_kiocb **nxtptr)
-{
-	if (refcount_dec_and_test(&req->refs)) {
-		io_req_find_next(req, nxtptr);
-		__io_free_req(req);
-	}
-}
-
 static void io_put_req(struct io_kiocb *req)
 {
 	if (refcount_dec_and_test(&req->refs))
 		io_free_req(req);
 }
 
+__attribute__((warn_unused_result))
+static struct io_kiocb *io_put_req_submission(struct io_kiocb *req)
+{
+	struct io_kiocb *nxt = NULL;
+
+	/* do it first, while holding a ref */
+	if (!(req->flags & REQ_F_DONT_STEAL_NEXT))
+		io_req_find_next(req, &nxt);
+
+	io_put_req(req);
+	return nxt;
+}
+
+static void io_put_req_async_submission(struct io_kiocb *req,
+					struct io_wq_work **workptr)
+{
+	struct io_kiocb *nxt;
+
+	nxt = io_put_req_submission(req);
+	if (nxt)
+		io_wq_assign_next(workptr, nxt);
+}
+
 /*
  * Must only be used if we don't need to care about links, usually from
  * within the completion handling itself.
@@ -1979,8 +2013,11 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 
 static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
 {
+	struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
+
 	switch (ret) {
 	case -EIOCBQUEUED:
+		req->flags |= REQ_F_DONT_STEAL_NEXT;
 		break;
 	case -ERESTARTSYS:
 	case -ERESTARTNOINTR:
@@ -2526,6 +2563,7 @@ static int io_prep_fsync(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	if (unlikely(req->sync.flags & ~IORING_FSYNC_DATASYNC))
 		return -EINVAL;
 
+	req->flags |= REQ_F_DONT_STEAL_NEXT;
 	req->sync.off = READ_ONCE(sqe->off);
 	req->sync.len = READ_ONCE(sqe->len);
 	return 0;
@@ -2543,27 +2581,6 @@ static bool io_req_cancelled(struct io_kiocb *req)
 	return false;
 }
 
-static void io_link_work_cb(struct io_wq_work **workptr)
-{
-	struct io_wq_work *work = *workptr;
-	struct io_kiocb *link = work->data;
-
-	io_queue_linked_timeout(link);
-	io_wq_submit_work(workptr);
-}
-
-static void io_wq_assign_next(struct io_wq_work **workptr, struct io_kiocb *nxt)
-{
-	struct io_kiocb *link;
-
-	*workptr = &nxt->work;
-	link = io_prep_linked_timeout(nxt);
-	if (link) {
-		nxt->work.func = io_link_work_cb;
-		nxt->work.data = link;
-	}
-}
-
 static void __io_fsync(struct io_kiocb *req)
 {
 	loff_t end = req->sync.off + req->sync.len;
@@ -2581,14 +2598,11 @@ static void __io_fsync(struct io_kiocb *req)
 static void io_fsync_finish(struct io_wq_work **workptr)
 {
 	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
-	struct io_kiocb *nxt = NULL;
 
 	if (io_req_cancelled(req))
 		return;
 	__io_fsync(req);
-	io_put_req(req); /* drop submission reference */
-	if (nxt)
-		io_wq_assign_next(workptr, nxt);
+	io_put_req_async_submission(req, workptr);
 }
 
 static int io_fsync(struct io_kiocb *req, bool force_nonblock)
@@ -2617,14 +2631,11 @@ static void __io_fallocate(struct io_kiocb *req)
 static void io_fallocate_finish(struct io_wq_work **workptr)
 {
 	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
-	struct io_kiocb *nxt = NULL;
 
 	if (io_req_cancelled(req))
 		return;
 	__io_fallocate(req);
-	io_put_req(req); /* drop submission reference */
-	if (nxt)
-		io_wq_assign_next(workptr, nxt);
+	io_put_req_async_submission(req, workptr);
 }
 
 static int io_fallocate_prep(struct io_kiocb *req,
@@ -2988,13 +2999,10 @@ static void __io_close_finish(struct io_kiocb *req)
 static void io_close_finish(struct io_wq_work **workptr)
 {
 	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
-	struct io_kiocb *nxt = NULL;
 
 	/* not cancellable, don't do io_req_cancelled() */
 	__io_close_finish(req);
-	io_put_req(req); /* drop submission reference */
-	if (nxt)
-		io_wq_assign_next(workptr, nxt);
+	io_put_req_async_submission(req, workptr);
 }
 
 static int io_close(struct io_kiocb *req, bool force_nonblock)
@@ -3016,6 +3024,7 @@ static int io_close(struct io_kiocb *req, bool force_nonblock)
 		 */
 		io_queue_async_work(req);
 		/* submission ref will be dropped, take it for async */
+		req->flags |= REQ_F_DONT_STEAL_NEXT;
 		refcount_inc_not_zero(&req->refs);
 		return 0;
 	}
@@ -3062,14 +3071,11 @@ static void __io_sync_file_range(struct io_kiocb *req)
 static void io_sync_file_range_finish(struct io_wq_work **workptr)
 {
 	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
-	struct io_kiocb *nxt = NULL;
 
 	if (io_req_cancelled(req))
 		return;
 	__io_sync_file_range(req);
-	io_put_req(req); /* put submission ref */
-	if (nxt)
-		io_wq_assign_next(workptr, nxt);
+	io_put_req_async_submission(req, workptr);
 }
 
 static int io_sync_file_range(struct io_kiocb *req, bool force_nonblock)
@@ -3435,14 +3441,11 @@ static int __io_accept(struct io_kiocb *req, bool force_nonblock)
 static void io_accept_finish(struct io_wq_work **workptr)
 {
 	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
-	struct io_kiocb *nxt = NULL;
 
 	if (io_req_cancelled(req))
 		return;
 	__io_accept(req, false);
-	io_put_req(req); /* drop submission reference */
-	if (nxt)
-		io_wq_assign_next(workptr, nxt);
+	io_put_req_async_submission(req, workptr);
 }
 #endif
 
@@ -3859,9 +3862,10 @@ static void io_poll_task_handler(struct io_kiocb *req, struct io_kiocb **nxt)
 	hash_del(&req->hash_node);
 	io_poll_complete(req, req->result, 0);
 	req->flags |= REQ_F_COMP_LOCKED;
-	io_put_req_find_next(req, nxt);
 	spin_unlock_irq(&ctx->completion_lock);
 
+	req->flags &= ~REQ_F_DONT_STEAL_NEXT;
+	*nxt = io_put_req_submission(req);
 	io_cqring_ev_posted(ctx);
 }
 
@@ -3943,7 +3947,10 @@ static int io_poll_add(struct io_kiocb *req)
 	if (mask) {
 		io_cqring_ev_posted(ctx);
 		io_put_req(req);
+	} else {
+		req->flags |= REQ_F_DONT_STEAL_NEXT;
 	}
+
 	return ipt.error;
 }
 
@@ -4066,6 +4073,7 @@ static int io_timeout_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	if (flags & ~IORING_TIMEOUT_ABS)
 		return -EINVAL;
 
+	req->flags |= REQ_F_DONT_STEAL_NEXT;
 	req->timeout.count = READ_ONCE(sqe->off);
 
 	if (!req->io && io_alloc_async_ctx(req))
@@ -4680,7 +4688,6 @@ static void io_wq_submit_work(struct io_wq_work **workptr)
 {
 	struct io_wq_work *work = *workptr;
 	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
-	struct io_kiocb *nxt = NULL;
 	int ret = 0;
 
 	/* if NO_CANCEL is set, we must still run the work */
@@ -4709,9 +4716,7 @@ static void io_wq_submit_work(struct io_wq_work **workptr)
 		io_put_req(req);
 	}
 
-	io_put_req(req); /* drop submission reference */
-	if (nxt)
-		io_wq_assign_next(workptr, nxt);
+	io_put_req_async_submission(req, workptr);
 }
 
 static int io_req_needs_file(struct io_kiocb *req, int fd)
@@ -4935,10 +4940,6 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	}
 
 err:
-	nxt = NULL;
-	/* drop submission reference */
-	io_put_req_find_next(req, &nxt);
-
 	if (linked_timeout) {
 		if (!ret)
 			io_queue_linked_timeout(linked_timeout);
@@ -4952,6 +4953,8 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		req_set_fail_links(req);
 		io_put_req(req);
 	}
+
+	nxt = io_put_req_submission(req);
 	if (nxt) {
 		req = nxt;
 
-- 
2.24.0



* Re: [PATCH v2 4/4] io_uring: get next req on subm ref drop
  2020-03-02 20:45 ` [PATCH v2 4/4] io_uring: get next req on subm ref drop Pavel Begunkov
@ 2020-03-03  4:26   ` Jens Axboe
  2020-03-03  6:54     ` Pavel Begunkov
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2020-03-03  4:26 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring, linux-kernel

On 3/2/20 1:45 PM, Pavel Begunkov wrote:
> Get next request when dropping the submission reference. However, if
> there is an asynchronous counterpart (i.e. read/write, timeout, etc),
> that would be dangerous to do, so ignore them using new
> REQ_F_DONT_STEAL_NEXT flag.

Hmm, not so sure I like this one. It's not quite clear to me where we
need REQ_F_DONT_STEAL_NEXT. If we have an async component, then we set
REQ_F_DONT_STEAL_NEXT. So this is generally the case where our
io_put_req() for submit is not the last drop. And for the other case,
the put is generally in the caller anyway. So I don't really see what
this extra flag buys us?

Few more comments below.

> +static void io_put_req_async_submission(struct io_kiocb *req,
> +					struct io_wq_work **workptr)
> +{
> +	struct io_kiocb *nxt;
> +
> +	nxt = io_put_req_submission(req);
> +	if (nxt)
> +		io_wq_assign_next(workptr, nxt);
> +}

This really should be called io_put_req_async_completion() since it's
called on completion. The naming is confusing.

> @@ -2581,14 +2598,11 @@ static void __io_fsync(struct io_kiocb *req)
>  static void io_fsync_finish(struct io_wq_work **workptr)
>  {
>  	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> -	struct io_kiocb *nxt = NULL;
>  
>  	if (io_req_cancelled(req))
>  		return;
>  	__io_fsync(req);
> -	io_put_req(req); /* drop submission reference */
> -	if (nxt)
> -		io_wq_assign_next(workptr, nxt);
> +	io_put_req_async_submission(req, workptr);
>  }
>  
>  static int io_fsync(struct io_kiocb *req, bool force_nonblock)
> @@ -2617,14 +2631,11 @@ static void __io_fallocate(struct io_kiocb *req)
>  static void io_fallocate_finish(struct io_wq_work **workptr)
>  {
>  	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> -	struct io_kiocb *nxt = NULL;
>  
>  	if (io_req_cancelled(req))
>  		return;
>  	__io_fallocate(req);
> -	io_put_req(req); /* drop submission reference */
> -	if (nxt)
> -		io_wq_assign_next(workptr, nxt);
> +	io_put_req_async_submission(req, workptr);
>  }

All of these cleanups are nice (except the naming, as mentioned).

> @@ -3943,7 +3947,10 @@ static int io_poll_add(struct io_kiocb *req)
>  	if (mask) {
>  		io_cqring_ev_posted(ctx);
>  		io_put_req(req);
> +	} else {
> +		req->flags |= REQ_F_DONT_STEAL_NEXT;
>  	}
> +
>  	return ipt.error;
>  }

Is this racy? I guess it doesn't matter since we're still holding the
completion reference.

-- 
Jens Axboe



* Re: [PATCH v2 4/4] io_uring: get next req on subm ref drop
  2020-03-03  4:26   ` Jens Axboe
@ 2020-03-03  6:54     ` Pavel Begunkov
  2020-03-03 10:46       ` Pavel Begunkov
  2020-03-03 16:03       ` Jens Axboe
  0 siblings, 2 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-03-03  6:54 UTC (permalink / raw)
  To: Jens Axboe, io-uring, linux-kernel

On 03/03/2020 07:26, Jens Axboe wrote:
> On 3/2/20 1:45 PM, Pavel Begunkov wrote:
>> Get next request when dropping the submission reference. However, if
>> there is an asynchronous counterpart (i.e. read/write, timeout, etc),
>> that would be dangerous to do, so ignore them using new
>> REQ_F_DONT_STEAL_NEXT flag.
> 
> Hmm, not so sure I like this one. It's not quite clear to me where we
> need REQ_F_DONT_STEAL_NEXT. If we have an async component, then we set
> REQ_F_DONT_STEAL_NEXT. So this is generally the case where our
> io_put_req() for submit is not the last drop. And for the other case,
> the put is generally in the caller anyway. So I don't really see what
> this extra flag buys us?

Because io_put_work() holds a reference, no async handler can achieve req->refs
== 0, so it won't return next upon dropping the submission ref (i.e. by
put_find_nxt()). And I want to have next before io_put_work(), so that instead
of the current:

run_work(work);
assign_cur_work(NULL); // spinlock + unlock worker->lock
new_work = put_work(work);
assign_cur_work(new_work); // the second time

do:

new_work = run_work(work);
assign_cur_work(new_work); // need new_work here
put_work(work);


The other way:

io_wq_submit_work() // for all async handlers
{
	...
	// Drop submission reference.
	// One extra ref will be put in io_put_work() right
	// after return, and it'll be done in the same thread
	if (atomic_dec_and_get(req) == 1)
		steal_next(req);
}

Maybe cleaner, but looks fragile as well. Would you prefer it?

> Few more comments below.
> 
>> +static void io_put_req_async_submission(struct io_kiocb *req,
>> +					struct io_wq_work **workptr)
>> +{
>> +	struct io_kiocb *nxt;
>> +
>> +	nxt = io_put_req_submission(req);
>> +	if (nxt)
>> +		io_wq_assign_next(workptr, nxt);
>> +}
> 
> This really should be called io_put_req_async_completion() since it's
> called on completion. The naming is confusing.

Ok

>> @@ -2581,14 +2598,11 @@ static void __io_fsync(struct io_kiocb *req)
>>  static void io_fsync_finish(struct io_wq_work **workptr)
>>  {
>>  	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>> -	struct io_kiocb *nxt = NULL;
>>  
>>  	if (io_req_cancelled(req))
>>  		return;
>>  	__io_fsync(req);
>> -	io_put_req(req); /* drop submission reference */
>> -	if (nxt)
>> -		io_wq_assign_next(workptr, nxt);
>> +	io_put_req_async_submission(req, workptr);
>>  }
>>  
>>  static int io_fsync(struct io_kiocb *req, bool force_nonblock)
>> @@ -2617,14 +2631,11 @@ static void __io_fallocate(struct io_kiocb *req)
>>  static void io_fallocate_finish(struct io_wq_work **workptr)
>>  {
>>  	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>> -	struct io_kiocb *nxt = NULL;
>>  
>>  	if (io_req_cancelled(req))
>>  		return;
>>  	__io_fallocate(req);
>> -	io_put_req(req); /* drop submission reference */
>> -	if (nxt)
>> -		io_wq_assign_next(workptr, nxt);
>> +	io_put_req_async_submission(req, workptr);
>>  }
> 
> All of these cleanups are nice (except the naming, as mentioned).
> 
>> @@ -3943,7 +3947,10 @@ static int io_poll_add(struct io_kiocb *req)
>>  	if (mask) {
>>  		io_cqring_ev_posted(ctx);
>>  		io_put_req(req);
>> +	} else {
>> +		req->flags |= REQ_F_DONT_STEAL_NEXT;
>>  	}
>> +
>>  	return ipt.error;
>>  }
> 
> Is this racy? I guess it doesn't matter since we're still holding the
> completion reference.

It's done by the same thread that uses it. There could be a race if the async
counterpart were to change req->flags, but we tolerate the false negative (i.e.
put_req() will handle it).

-- 
Pavel Begunkov


* Re: [PATCH v2 4/4] io_uring: get next req on subm ref drop
  2020-03-03  6:54     ` Pavel Begunkov
@ 2020-03-03 10:46       ` Pavel Begunkov
  2020-03-03 16:04         ` Jens Axboe
  2020-03-03 16:03       ` Jens Axboe
  1 sibling, 1 reply; 11+ messages in thread
From: Pavel Begunkov @ 2020-03-03 10:46 UTC (permalink / raw)
  To: Jens Axboe, io-uring, linux-kernel

On 3/3/2020 9:54 AM, Pavel Begunkov wrote:
> On 03/03/2020 07:26, Jens Axboe wrote:
>> On 3/2/20 1:45 PM, Pavel Begunkov wrote:
>>> Get next request when dropping the submission reference. However, if
>>> there is an asynchronous counterpart (i.e. read/write, timeout, etc),
>>> that would be dangerous to do, so ignore them using new
>>> REQ_F_DONT_STEAL_NEXT flag.
>>
>> Hmm, not so sure I like this one. It's not quite clear to me where we
>> need REQ_F_DONT_STEAL_NEXT. If we have an async component, then we set
>> REQ_F_DONT_STEAL_NEXT. So this is generally the case where our
>> io_put_req() for submit is not the last drop. And for the other case,
>> the put is generally in the caller anyway. So I don't really see what
>> this extra flag buys us?
> 
> Because io_put_work() holds a reference, no async handler can achive req->refs
> == 0, so it won't return next upon dropping the submission ref (i.e. by
> put_find_nxt()). And I want to have next before io_put_work(), to, instead of as
> currently:
> 
> run_work(work);
> assign_cur_work(NULL); // spinlock + unlock worker->lock
> new_work = put_work(work);
> assign_cur_work(new_work); // the second time
> 
> do:
> 
> new_work = run_work(work);
> assign_cur_work(new_work); // need new_work here
> put_work(work);
> 
> 
> The other way:
> 
> io_wq_submit_work() // for all async handlers
> {
> 	...
> 	// Drop submission reference.
> 	// One extra ref will be put in io_put_work() right
> 	// after return, and it'll be done in the same thread
> 	if (atomic_dec_and_get(req) == 1)
> 		steal_next(req);
> }
> 
> Maybe cleaner, but looks fragile as well. Would you prefer it?

Any chance you've measured your next-work fix? I wonder how much it hurts
performance, and whether we need a terse patch for 5.6.


>> Few more comments below.
>>
>>> +static void io_put_req_async_submission(struct io_kiocb *req,
>>> +					struct io_wq_work **workptr)
>>> +{
>>> +	struct io_kiocb *nxt;
>>> +
>>> +	nxt = io_put_req_submission(req);
>>> +	if (nxt)
>>> +		io_wq_assign_next(workptr, nxt);
>>> +}
>>
>> This really should be called io_put_req_async_completion() since it's
>> called on completion. The naming is confusing.
> 
> Ok
> 
>>> @@ -2581,14 +2598,11 @@ static void __io_fsync(struct io_kiocb *req)
>>>  static void io_fsync_finish(struct io_wq_work **workptr)
>>>  {
>>>  	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>>> -	struct io_kiocb *nxt = NULL;
>>>  
>>>  	if (io_req_cancelled(req))
>>>  		return;
>>>  	__io_fsync(req);
>>> -	io_put_req(req); /* drop submission reference */
>>> -	if (nxt)
>>> -		io_wq_assign_next(workptr, nxt);
>>> +	io_put_req_async_submission(req, workptr);
>>>  }
>>>  
>>>  static int io_fsync(struct io_kiocb *req, bool force_nonblock)
>>> @@ -2617,14 +2631,11 @@ static void __io_fallocate(struct io_kiocb *req)
>>>  static void io_fallocate_finish(struct io_wq_work **workptr)
>>>  {
>>>  	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>>> -	struct io_kiocb *nxt = NULL;
>>>  
>>>  	if (io_req_cancelled(req))
>>>  		return;
>>>  	__io_fallocate(req);
>>> -	io_put_req(req); /* drop submission reference */
>>> -	if (nxt)
>>> -		io_wq_assign_next(workptr, nxt);
>>> +	io_put_req_async_submission(req, workptr);
>>>  }
>>
>> All of these cleanups are nice (except the naming, as mentioned).
>>
>>> @@ -3943,7 +3947,10 @@ static int io_poll_add(struct io_kiocb *req)
>>>  	if (mask) {
>>>  		io_cqring_ev_posted(ctx);
>>>  		io_put_req(req);
>>> +	} else {
>>> +		req->flags |= REQ_F_DONT_STEAL_NEXT;
>>>  	}
>>> +
>>>  	return ipt.error;
>>>  }
>>
>> Is this racy? I guess it doesn't matter since we're still holding the
>> completion reference.
> 
> It's done by the same thread, that uses it. There could be a race if the async
> counterpart is going to change req->flags, but we tolerate false negative (i.e.
> put_req() will handle it).
> 

-- 
Pavel Begunkov


* Re: [PATCH v2 4/4] io_uring: get next req on subm ref drop
  2020-03-03  6:54     ` Pavel Begunkov
  2020-03-03 10:46       ` Pavel Begunkov
@ 2020-03-03 16:03       ` Jens Axboe
  1 sibling, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2020-03-03 16:03 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring, linux-kernel

On 3/2/20 11:54 PM, Pavel Begunkov wrote:
> On 03/03/2020 07:26, Jens Axboe wrote:
>> On 3/2/20 1:45 PM, Pavel Begunkov wrote:
>>> Get next request when dropping the submission reference. However, if
>>> there is an asynchronous counterpart (i.e. read/write, timeout, etc),
>>> that would be dangerous to do, so ignore them using new
>>> REQ_F_DONT_STEAL_NEXT flag.
>>
>> Hmm, not so sure I like this one. It's not quite clear to me where we
>> need REQ_F_DONT_STEAL_NEXT. If we have an async component, then we set
>> REQ_F_DONT_STEAL_NEXT. So this is generally the case where our
>> io_put_req() for submit is not the last drop. And for the other case,
>> the put is generally in the caller anyway. So I don't really see what
>> this extra flag buys us?
> 
> Because io_put_work() holds a reference, no async handler can achive req->refs
> == 0, so it won't return next upon dropping the submission ref (i.e. by
> put_find_nxt()). And I want to have next before io_put_work(), to, instead of as
> currently:
> 
> run_work(work);
> assign_cur_work(NULL); // spinlock + unlock worker->lock
> new_work = put_work(work);
> assign_cur_work(new_work); // the second time
> 
> do:
> 
> new_work = run_work(work);
> assign_cur_work(new_work); // need new_work here
> put_work(work);
> 
> 
> The other way:
> 
> io_wq_submit_work() // for all async handlers
> {
> 	...
> 	// Drop submission reference.
> 	// One extra ref will be put in io_put_work() right
> 	// after return, and it'll be done in the same thread
> 	if (atomic_dec_and_get(req) == 1)
> 		steal_next(req);
> }
> 
> Maybe cleaner, but looks fragile as well. Would you prefer it?

I think I prefer that, since it doesn't need random setting of a
no-steal flag throughout. And it should be pretty solid, since we know
that we hold one and that can only be our reference. Just needs a nice
comment explaining that fact as well.
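
Roughly, with that comment, something like this (untested sketch, with
the refcount check just approximating your atomic_dec_and_get()):

static void io_wq_submit_work(struct io_wq_work **workptr)
{
	struct io_wq_work *work = *workptr;
	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
	struct io_kiocb *nxt = NULL;

	/* ... run the request and post the completion as before ... */

	/*
	 * io_put_work() still holds a reference that is dropped right
	 * after we return, in this same thread. So if only that reference
	 * and the submission reference we're about to drop remain, nobody
	 * else can free or complete the request, and it's safe to grab
	 * the next linked request here.
	 */
	if (refcount_read(&req->refs) == 2)
		io_req_find_next(req, &nxt);
	io_put_req(req); /* drop submission reference */
	if (nxt)
		io_wq_assign_next(workptr, nxt);
}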

>>> @@ -3943,7 +3947,10 @@ static int io_poll_add(struct io_kiocb *req)
>>>  	if (mask) {
>>>  		io_cqring_ev_posted(ctx);
>>>  		io_put_req(req);
>>> +	} else {
>>> +		req->flags |= REQ_F_DONT_STEAL_NEXT;
>>>  	}
>>> +
>>>  	return ipt.error;
>>>  }
>>
>> Is this racy? I guess it doesn't matter since we're still holding the
>> completion reference.
> 
> It's done by the same thread, that uses it. There could be a race if
> the async counterpart is going to change req->flags, but we tolerate
> false negative (i.e.  put_req() will handle it).

It's relying on the fact that it's the task itself that'll run the task
work, which can't be done by this time. Just caught my eye as something
to look out for.

-- 
Jens Axboe



* Re: [PATCH v2 4/4] io_uring: get next req on subm ref drop
  2020-03-03 10:46       ` Pavel Begunkov
@ 2020-03-03 16:04         ` Jens Axboe
  2020-03-03 16:29           ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2020-03-03 16:04 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring, linux-kernel

On 3/3/20 3:46 AM, Pavel Begunkov wrote:
> On 3/3/2020 9:54 AM, Pavel Begunkov wrote:
>> On 03/03/2020 07:26, Jens Axboe wrote:
>>> On 3/2/20 1:45 PM, Pavel Begunkov wrote:
>>>> Get next request when dropping the submission reference. However, if
>>>> there is an asynchronous counterpart (i.e. read/write, timeout, etc),
>>>> that would be dangerous to do, so ignore them using new
>>>> REQ_F_DONT_STEAL_NEXT flag.
>>>
>>> Hmm, not so sure I like this one. It's not quite clear to me where we
>>> need REQ_F_DONT_STEAL_NEXT. If we have an async component, then we set
>>> REQ_F_DONT_STEAL_NEXT. So this is generally the case where our
>>> io_put_req() for submit is not the last drop. And for the other case,
>>> the put is generally in the caller anyway. So I don't really see what
>>> this extra flag buys us?
>>
>> Because io_put_work() holds a reference, no async handler can achive req->refs
>> == 0, so it won't return next upon dropping the submission ref (i.e. by
>> put_find_nxt()). And I want to have next before io_put_work(), to, instead of as
>> currently:
>>
>> run_work(work);
>> assign_cur_work(NULL); // spinlock + unlock worker->lock
>> new_work = put_work(work);
>> assign_cur_work(new_work); // the second time
>>
>> do:
>>
>> new_work = run_work(work);
>> assign_cur_work(new_work); // need new_work here
>> put_work(work);
>>
>>
>> The other way:
>>
>> io_wq_submit_work() // for all async handlers
>> {
>> 	...
>> 	// Drop submission reference.
>> 	// One extra ref will be put in io_put_work() right
>> 	// after return, and it'll be done in the same thread
>> 	if (atomic_dec_and_get(req) == 1)
>> 		steal_next(req);
>> }
>>
>> Maybe cleaner, but looks fragile as well. Would you prefer it?
> 
> Any chance you've measured your next-work fix? I wonder how much does it
> hurt performance, and whether we need a terse patch for 5.6.

Unless I'm missing something, the worker will pick up the next work
without sleeping, since the request will have finished. So it really
should not add any extra overhead, except you'll do an extra wqe lock
roundtrip.

But I'll run some testing to be totally sure.

-- 
Jens Axboe



* Re: [PATCH v2 4/4] io_uring: get next req on subm ref drop
  2020-03-03 16:04         ` Jens Axboe
@ 2020-03-03 16:29           ` Jens Axboe
  0 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2020-03-03 16:29 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring, linux-kernel

On 3/3/20 9:04 AM, Jens Axboe wrote:
> On 3/3/20 3:46 AM, Pavel Begunkov wrote:
>> On 3/3/2020 9:54 AM, Pavel Begunkov wrote:
>>> On 03/03/2020 07:26, Jens Axboe wrote:
>>>> On 3/2/20 1:45 PM, Pavel Begunkov wrote:
>>>>> Get next request when dropping the submission reference. However, if
>>>>> there is an asynchronous counterpart (i.e. read/write, timeout, etc),
>>>>> that would be dangerous to do, so ignore them using new
>>>>> REQ_F_DONT_STEAL_NEXT flag.
>>>>
>>>> Hmm, not so sure I like this one. It's not quite clear to me where we
>>>> need REQ_F_DONT_STEAL_NEXT. If we have an async component, then we set
>>>> REQ_F_DONT_STEAL_NEXT. So this is generally the case where our
>>>> io_put_req() for submit is not the last drop. And for the other case,
>>>> the put is generally in the caller anyway. So I don't really see what
>>>> this extra flag buys us?
>>>
>>> Because io_put_work() holds a reference, no async handler can achive req->refs
>>> == 0, so it won't return next upon dropping the submission ref (i.e. by
>>> put_find_nxt()). And I want to have next before io_put_work(), to, instead of as
>>> currently:
>>>
>>> run_work(work);
>>> assign_cur_work(NULL); // spinlock + unlock worker->lock
>>> new_work = put_work(work);
>>> assign_cur_work(new_work); // the second time
>>>
>>> do:
>>>
>>> new_work = run_work(work);
>>> assign_cur_work(new_work); // need new_work here
>>> put_work(work);
>>>
>>>
>>> The other way:
>>>
>>> io_wq_submit_work() // for all async handlers
>>> {
>>> 	...
>>> 	// Drop submission reference.
>>> 	// One extra ref will be put in io_put_work() right
>>> 	// after return, and it'll be done in the same thread
>>> 	if (atomic_dec_and_get(req) == 1)
>>> 		steal_next(req);
>>> }
>>>
>>> Maybe cleaner, but looks fragile as well. Would you prefer it?
>>
>> Any chance you've measured your next-work fix? I wonder how much does it
>> hurt performance, and whether we need a terse patch for 5.6.
> 
> Unless I'm missing something, the worker will pick up the next work
> without sleeping, since the request will have finished. So it really
> should not add any extra overhead, except you'll do an extra wqe lock
> roundtrip.
> 
> But I'll run some testing to be totally sure.

Testing with link-cp, not seeing much if anything of a difference. Not
in wqe load either.

-- 
Jens Axboe


