public inbox for [email protected]
 help / color / mirror / Atom feed
* [GIT PULL] io_uring updates for 5.14-rc1
@ 2021-06-29 20:43 Jens Axboe
  2021-06-30 20:05 ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-06-29 20:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: io-uring

Hi Linus,

This pull request contains the io_uring updates for the 5.14-rc1 merge
window. In detail:

- Support for mkdirat, symlinkat, and linkat for io_uring (Dmitry)

- Multi-queue iopoll improvement (Fam)

- Allow configurable io-wq CPU masks (me)

- renameat/linkat tightening (me)

- poll re-arm improvement (Olivier)

- SQPOLL race fix (Olivier)

- Cancelation unification (Pavel)

- SQPOLL cleanups (Pavel)

- Enable file backed buffers for shmem/memfd (Pavel)

- A ton of cleanups and performance improvements (Pavel)

- Followup and misc fixes (Colin, Fam, Hao, Olivier)

Please pull!


The following changes since commit 009c9aa5be652675a06d5211e1640e02bbb1c33d:

  Linux 5.13-rc6 (2021-06-13 14:43:10 -0700)

are available in the Git repository at:

  git://git.kernel.dk/linux-block.git tags/for-5.14/io_uring-2021-06-29

for you to fetch changes up to bdfe4dc5bfddf8f3b4080ac4a11a4c5843cbe928:

  io_uring: code clean for kiocb_done() (2021-06-27 16:22:42 -0600)

----------------------------------------------------------------
for-5.14/io_uring-2021-06-29

----------------------------------------------------------------
Colin Ian King (2):
      io_uring: Fix incorrect sizeof operator for copy_from_user call
      io-wq: remove redundant initialization of variable ret

Dmitry Kadashev (9):
      fs: make do_mkdirat() take struct filename
      io_uring: add support for IORING_OP_MKDIRAT
      fs: make do_mknodat() take struct filename
      fs: make do_symlinkat() take struct filename
      namei: add getname_uflags()
      fs: make do_linkat() take struct filename
      fs: update do_*() helpers to return ints
      io_uring: add support for IORING_OP_SYMLINKAT
      io_uring: add support for IORING_OP_LINKAT

Fam Zheng (1):
      io_uring: Fix comment of io_get_sqe

Hao Xu (2):
      io_uring: spin in iopoll() only when reqs are in a single queue
      io_uring: code clean for kiocb_done()

Jens Axboe (4):
      io-wq: use private CPU mask
      io_uring: allow user configurable IO thread CPU affinity
      io_uring: add IOPOLL and reserved field checks to IORING_OP_RENAMEAT
      io_uring: add IOPOLL and reserved field checks to IORING_OP_UNLINKAT

Olivier Langlois (6):
      io_uring: Add to traces the req pointer when available
      io_uring: minor clean up in trace events definition
      io-wq: remove header files not needed anymore
      io_uring: Fix race condition when sqp thread goes to sleep
      io_uring: Create define to modify a SQPOLL parameter
      io_uring: reduce latency by reissueing the operation

Pavel Begunkov (68):
      io_uring: improve sqpoll event/state handling
      io_uring: improve sq_thread waiting check
      io_uring: remove unused park_task_work
      io_uring: simplify waking sqo_sq_wait
      io_uring: get rid of files in exit cancel
      io_uring: make fail flag not link specific
      io_uring: shuffle rarely used ctx fields
      io_uring: better locality for rsrc fields
      io_uring: remove dependency on ring->sq/cq_entries
      io_uring: deduce cq_mask from cq_entries
      io_uring: kill cached_cq_overflow
      io_uring: rename io_get_cqring
      io_uring: don't bounce submit_state cachelines
      io_uring: enable shmem/memfd memory registration
      io_uring: fix blocking inline submission
      io-wq: embed wqe ptr array into struct io_wq
      io-wq: remove unused io-wq refcounting
      io_uring: refactor io_iopoll_req_issued
      io_uring: rename function *task_file
      io-wq: don't repeat IO_WQ_BIT_EXIT check by worker
      io-wq: simplify worker exiting
      io_uring: hide rsrc tag copy into generic helpers
      io_uring: remove rsrc put work irq save/restore
      io_uring: add helpers for 2 level table alloc
      io_uring: don't vmalloc rsrc tags
      io_uring: cache task struct refs
      io_uring: unify SQPOLL and user task cancellations
      io_uring: inline io_iter_do_read()
      io_uring: keep SQ pointers in a single cacheline
      io_uring: move ctx->flags from SQ cacheline
      io_uring: shuffle more fields into SQ ctx section
      io_uring: refactor io_get_sqe()
      io_uring: don't cache number of dropped SQEs
      io_uring: optimise completion timeout flushing
      io_uring: small io_submit_sqe() optimisation
      io_uring: clean up check_overflow flag
      io_uring: wait heads renaming
      io_uring: move uring_lock location
      io_uring: refactor io_req_defer()
      io_uring: optimise non-drain path
      io_uring: fix min types mismatch in table alloc
      io_uring: switch !DRAIN fast path when possible
      io_uring: shove more drain bits out of hot path
      io_uring: optimise io_commit_cqring()
      io_uring: fix false WARN_ONCE
      io_uring: refactor io_submit_flush_completions()
      io_uring: move creds from io-wq work to io_kiocb
      io_uring: track request creds with a flag
      io_uring: simplify iovec freeing in io_clean_op()
      io_uring: clean all flags in io_clean_op() at once
      io_uring: refactor io_get_sequence()
      io_uring: inline __tctx_task_work()
      io_uring: optimise task_work submit flushing
      io_uring: refactor tctx task_work list splicing
      io_uring: don't resched with empty task_list
      io_uring: improve in tctx_task_work() resubmission
      io_uring: don't change sqpoll creds if not needed
      io_uring: refactor io_sq_thread()
      io_uring: fix code style problems
      io_uring: update sqe layout build checks
      io_uring: simplify struct io_uring_sqe layout
      io_uring: refactor io_openat2()
      io_uring: refactor io_arm_poll_handler()
      io_uring: mainstream sqpoll task_work running
      io_uring: remove not needed PF_EXITING check
      io_uring: optimise hot path restricted checks
      io_uring: refactor io_submit_flush_completions
      io_uring: pre-initialise some of req fields

 fs/exec.c                       |    8 +-
 fs/internal.h                   |    8 +-
 fs/io-wq.c                      |  103 ++-
 fs/io-wq.h                      |    3 +-
 fs/io_uring.c                   | 1524 +++++++++++++++++++++++----------------
 fs/namei.c                      |  137 ++--
 include/linux/fs.h              |    1 +
 include/trace/events/io_uring.h |  106 ++-
 include/uapi/linux/io_uring.h   |   32 +-
 9 files changed, 1177 insertions(+), 745 deletions(-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-06-29 20:43 [GIT PULL] io_uring updates for 5.14-rc1 Jens Axboe
@ 2021-06-30 20:05 ` Linus Torvalds
  2021-06-30 20:14   ` Jens Axboe
  2021-07-02 11:32   ` Dmitry Kadashev
  0 siblings, 2 replies; 12+ messages in thread
From: Linus Torvalds @ 2021-06-30 20:05 UTC (permalink / raw)
  To: Jens Axboe, Al Viro, Dmitry Kadashev; +Cc: io-uring

On Tue, Jun 29, 2021 at 1:43 PM Jens Axboe <[email protected]> wrote:
>
> - Support for mkdirat, symlinkat, and linkat for io_uring (Dmitry)

I pulled this, and then I unpulled it again.

I hate how makes the rules for when "putname()" is called completely
arbitrary and very confusing. It ends up with multiple cases of
something like

        error = -ENOENT;
        goto out_putnames;

that didn't exist before.

And worse still ends up being that unbelievably ugly hack with

        // On error `new` is freed by __filename_create, prevent extra freeing
        // below
        new = ERR_PTR(error);
        goto out_putpath;

that ends up intentionally undoing one of the putnames because the
name has already been used.

And none of the commits have acks by Al. I realize that he can
sometimes be a bit unresponsive, but this is just *UGLY*. And we've
had too many io_uring issues for me to just say "I'm sure it's fine".

I can see a few ways to at least de-uglify things:

 - Maybe we can make putname() just do nothing for IS_ERR_OR_NULL() names.

   We have that kind of rules for a number of path walking things,
where passing in an error pointer is fine. Things like
link_path_walk() or filename_lookup() act that way very much by
design, exactly to make it easy to handle error conditions.

 - callers of __filename_create() and similar thar eat the name (and
return a dentry or whatever) could then set the name to NULL, not as
part of the error handling, but unconditionally as a "it's been used".

So I think this is fixable.

But I'm *VERY* tired of io_uring being so buggy and doing "exciting"
things, and when it then starts affecting very core functionality and
I don't even see ack's from the one person who understands all of
that, I put my foot down.

No more flaky io_uring pulls. Give me the unambiguously good stuff,
not this kind of unacked ugly stuff.

               Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-06-30 20:05 ` Linus Torvalds
@ 2021-06-30 20:14   ` Jens Axboe
  2021-06-30 20:18     ` Linus Torvalds
  2021-07-02 11:32   ` Dmitry Kadashev
  1 sibling, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-06-30 20:14 UTC (permalink / raw)
  To: Linus Torvalds, Al Viro, Dmitry Kadashev; +Cc: io-uring

On 6/30/21 2:05 PM, Linus Torvalds wrote:
> On Tue, Jun 29, 2021 at 1:43 PM Jens Axboe <[email protected]> wrote:
>>
>> - Support for mkdirat, symlinkat, and linkat for io_uring (Dmitry)
> 
> I pulled this, and then I unpulled it again.
> 
> I hate how makes the rules for when "putname()" is called completely
> arbitrary and very confusing. It ends up with multiple cases of
> something like
> 
>         error = -ENOENT;
>         goto out_putnames;
> 
> that didn't exist before.
> 
> And worse still ends up being that unbelievably ugly hack with
> 
>         // On error `new` is freed by __filename_create, prevent extra freeing
>         // below
>         new = ERR_PTR(error);
>         goto out_putpath;
> 
> that ends up intentionally undoing one of the putnames because the
> name has already been used.
> 
> And none of the commits have acks by Al. I realize that he can
> sometimes be a bit unresponsive, but this is just *UGLY*. And we've
> had too many io_uring issues for me to just say "I'm sure it's fine".

I think Dmitry has been beyond patient for this series, I think we
beyond 6 months of very little feedback on the core changes. I've
pinged Al multiple times, but we did have Christian Brauner make
suggestions and acking it. That's not to say that it's perfect,
we'll make changes and improve things.

> I can see a few ways to at least de-uglify things:
> 
>  - Maybe we can make putname() just do nothing for IS_ERR_OR_NULL() names.
> 
>    We have that kind of rules for a number of path walking things,
> where passing in an error pointer is fine. Things like
> link_path_walk() or filename_lookup() act that way very much by
> design, exactly to make it easy to handle error conditions.
> 
>  - callers of __filename_create() and similar thar eat the name (and
> return a dentry or whatever) could then set the name to NULL, not as
> part of the error handling, but unconditionally as a "it's been used".
> 
> So I think this is fixable.

I like the first suggestion in particular, that should be straight
forward. I'll drop this series for now and hopefully we can get it
in an acceptable form soon.

> But I'm *VERY* tired of io_uring being so buggy and doing "exciting"
> things, and when it then starts affecting very core functionality and
> I don't even see ack's from the one person who understands all of
> that, I put my foot down.
>
> No more flaky io_uring pulls. Give me the unambiguously good stuff,
> not this kind of unacked ugly stuff.

I think you're being very unfair here, last release was pretty
uneventful, and of course 5.12 was a bit too exciting with the
thread changes, but that was both expected and a necessary evil.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-06-30 20:14   ` Jens Axboe
@ 2021-06-30 20:18     ` Linus Torvalds
  2021-06-30 20:21       ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-06-30 20:18 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Al Viro, Dmitry Kadashev, io-uring

On Wed, Jun 30, 2021 at 1:14 PM Jens Axboe <[email protected]> wrote:
>
> I think you're being very unfair here, last release was pretty
> uneventful, and of course 5.12 was a bit too exciting with the
> thread changes, but that was both expected and a necessary evil.

The last one may have been uneventful, but I really want to keep it that way.

And the threading changes were an improvement overall, but they were
by no means the first time io_uring was flaky and had issues. It has
caused a _lot_ of churn over the years.

And I refuse to have that churn now affect fs/namei.c

This had better be done right, or not at all.

                 Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-06-30 20:18     ` Linus Torvalds
@ 2021-06-30 20:21       ` Jens Axboe
  0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2021-06-30 20:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Al Viro, Dmitry Kadashev, io-uring

On 6/30/21 2:18 PM, Linus Torvalds wrote:
> On Wed, Jun 30, 2021 at 1:14 PM Jens Axboe <[email protected]> wrote:
>>
>> I think you're being very unfair here, last release was pretty
>> uneventful, and of course 5.12 was a bit too exciting with the
>> thread changes, but that was both expected and a necessary evil.
> 
> The last one may have been uneventful, but I really want to keep it
> that way.

We certainly share that goal.

> And the threading changes were an improvement overall, but they were
> by no means the first time io_uring was flaky and had issues. It has
> caused a _lot_ of churn over the years.

In some ways that was almost inevitable with treading new waters, but
I'm hopeful that we're way beyond the hump on that.

I totally agree that it should not be disruptive to core code, and
I'm definitely fine rejecting those bits to avoid having that be the
case. We'll get it sorted.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [GIT PULL] io_uring updates for 5.14-rc1
@ 2021-07-01 15:24 Jens Axboe
  2021-07-01 19:20 ` pr-tracker-bot
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2021-07-01 15:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: io-uring

Hi Linus,

This pull request contains the io_uring updates for the 5.14-rc1 merge
window. Identical to the pull request sent the other day, just with the
vfs changes dropped, and hence the io_uring mkdirat, symlinkat, and
linkat removed as well.

- Multi-queue iopoll improvement (Fam)

- Allow configurable io-wq CPU masks (me)

- renameat/linkat tightening (me)

- poll re-arm improvement (Olivier)

- SQPOLL race fix (Olivier)

- Cancelation unification (Pavel)

- SQPOLL cleanups (Pavel)

- Enable file backed buffers for shmem/memfd (Pavel)

- A ton of cleanups and performance improvements (Pavel)

- Followup and misc fixes (Colin, Fam, Hao, Olivier)

Please pull!


The following changes since commit 009c9aa5be652675a06d5211e1640e02bbb1c33d:

  Linux 5.13-rc6 (2021-06-13 14:43:10 -0700)

are available in the Git repository at:

  git://git.kernel.dk/linux-block.git tags/for-5.14/io_uring-2021-06-30

for you to fetch changes up to e149bd742b2db6a63fc078b1ea6843dc9b22678d:

  io_uring: code clean for kiocb_done() (2021-06-30 14:15:40 -0600)

----------------------------------------------------------------
for-5.14/io_uring-2021-06-30

----------------------------------------------------------------
Colin Ian King (2):
      io_uring: Fix incorrect sizeof operator for copy_from_user call
      io-wq: remove redundant initialization of variable ret

Fam Zheng (1):
      io_uring: Fix comment of io_get_sqe

Hao Xu (2):
      io_uring: spin in iopoll() only when reqs are in a single queue
      io_uring: code clean for kiocb_done()

Jens Axboe (4):
      io-wq: use private CPU mask
      io_uring: allow user configurable IO thread CPU affinity
      io_uring: add IOPOLL and reserved field checks to IORING_OP_RENAMEAT
      io_uring: add IOPOLL and reserved field checks to IORING_OP_UNLINKAT

Olivier Langlois (6):
      io_uring: Add to traces the req pointer when available
      io_uring: minor clean up in trace events definition
      io-wq: remove header files not needed anymore
      io_uring: Fix race condition when sqp thread goes to sleep
      io_uring: Create define to modify a SQPOLL parameter
      io_uring: reduce latency by reissueing the operation

Pavel Begunkov (68):
      io_uring: improve sqpoll event/state handling
      io_uring: improve sq_thread waiting check
      io_uring: remove unused park_task_work
      io_uring: simplify waking sqo_sq_wait
      io_uring: get rid of files in exit cancel
      io_uring: make fail flag not link specific
      io_uring: shuffle rarely used ctx fields
      io_uring: better locality for rsrc fields
      io_uring: remove dependency on ring->sq/cq_entries
      io_uring: deduce cq_mask from cq_entries
      io_uring: kill cached_cq_overflow
      io_uring: rename io_get_cqring
      io_uring: don't bounce submit_state cachelines
      io_uring: enable shmem/memfd memory registration
      io_uring: fix blocking inline submission
      io-wq: embed wqe ptr array into struct io_wq
      io-wq: remove unused io-wq refcounting
      io_uring: refactor io_iopoll_req_issued
      io_uring: rename function *task_file
      io-wq: don't repeat IO_WQ_BIT_EXIT check by worker
      io-wq: simplify worker exiting
      io_uring: hide rsrc tag copy into generic helpers
      io_uring: remove rsrc put work irq save/restore
      io_uring: add helpers for 2 level table alloc
      io_uring: don't vmalloc rsrc tags
      io_uring: cache task struct refs
      io_uring: unify SQPOLL and user task cancellations
      io_uring: inline io_iter_do_read()
      io_uring: keep SQ pointers in a single cacheline
      io_uring: move ctx->flags from SQ cacheline
      io_uring: shuffle more fields into SQ ctx section
      io_uring: refactor io_get_sqe()
      io_uring: don't cache number of dropped SQEs
      io_uring: optimise completion timeout flushing
      io_uring: small io_submit_sqe() optimisation
      io_uring: clean up check_overflow flag
      io_uring: wait heads renaming
      io_uring: move uring_lock location
      io_uring: refactor io_req_defer()
      io_uring: optimise non-drain path
      io_uring: fix min types mismatch in table alloc
      io_uring: switch !DRAIN fast path when possible
      io_uring: shove more drain bits out of hot path
      io_uring: optimise io_commit_cqring()
      io_uring: fix false WARN_ONCE
      io_uring: refactor io_submit_flush_completions()
      io_uring: move creds from io-wq work to io_kiocb
      io_uring: track request creds with a flag
      io_uring: simplify iovec freeing in io_clean_op()
      io_uring: clean all flags in io_clean_op() at once
      io_uring: refactor io_get_sequence()
      io_uring: inline __tctx_task_work()
      io_uring: optimise task_work submit flushing
      io_uring: refactor tctx task_work list splicing
      io_uring: don't resched with empty task_list
      io_uring: improve in tctx_task_work() resubmission
      io_uring: don't change sqpoll creds if not needed
      io_uring: refactor io_sq_thread()
      io_uring: fix code style problems
      io_uring: update sqe layout build checks
      io_uring: simplify struct io_uring_sqe layout
      io_uring: refactor io_openat2()
      io_uring: refactor io_arm_poll_handler()
      io_uring: mainstream sqpoll task_work running
      io_uring: remove not needed PF_EXITING check
      io_uring: optimise hot path restricted checks
      io_uring: refactor io_submit_flush_completions
      io_uring: pre-initialise some of req fields

 fs/io-wq.c                      |  103 ++-
 fs/io-wq.h                      |    3 +-
 fs/io_uring.c                   | 1324 +++++++++++++++++++++------------------
 include/trace/events/io_uring.h |  106 ++--
 include/uapi/linux/io_uring.h   |   28 +-
 5 files changed, 874 insertions(+), 690 deletions(-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-07-01 15:24 Jens Axboe
@ 2021-07-01 19:20 ` pr-tracker-bot
  0 siblings, 0 replies; 12+ messages in thread
From: pr-tracker-bot @ 2021-07-01 19:20 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linus Torvalds, io-uring

The pull request you sent on Thu, 1 Jul 2021 09:24:44 -0600:

> git://git.kernel.dk/linux-block.git tags/for-5.14/io_uring-2021-06-30

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/c288d9cd710433e5991d58a0764c4d08a933b871

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-06-30 20:05 ` Linus Torvalds
  2021-06-30 20:14   ` Jens Axboe
@ 2021-07-02 11:32   ` Dmitry Kadashev
  2021-07-02 18:33     ` Linus Torvalds
  1 sibling, 1 reply; 12+ messages in thread
From: Dmitry Kadashev @ 2021-07-02 11:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jens Axboe, Al Viro, io-uring

On Thu, Jul 1, 2021 at 3:06 AM Linus Torvalds
<[email protected]> wrote:
>
> On Tue, Jun 29, 2021 at 1:43 PM Jens Axboe <[email protected]> wrote:
> >
> > - Support for mkdirat, symlinkat, and linkat for io_uring (Dmitry)
>
> I pulled this, and then I unpulled it again.

First of all, I totally agree that parts of it are ugly. And I'm happy
to work on improving it with some guidance. But it's been sort of hard
getting feedback on the namei part of things (Christian Brauner was very
supportive though), and in cases where I was not sure what's the best
way to do something I just went with the "pick the least ugly way I can
see at the moment and collect the feedback" approach. And yes, in some
cases "least ugly" is still very ugly.

Part of the issue is I'm very new to the kernel code, I've seen the
unlinkat / renameat changes and thought that surely I can do the same
for mkdirat. It turned out to be much larger and not just about mkdirat
in the end.

> I hate how makes the rules for when "putname()" is called completely
> arbitrary and very confusing.

In my opinion, this is because of the "consume the name on error, but
not on success" logic in __filename_create / __filename_lookup. This
behavior was in fact suggested by Al back in January:
https://lore.kernel.org/io-uring/20210201150042.GQ740243@zeniv-ca/

And it works reasonably well when there is just one struct filename
passed in. But when there are two the state potentially becomes very
confusing: we might end up with cases like the second filename was
freed, but the first was not (which is basically where the hacks below
live).

> It ends up with multiple cases of something like
>
>         error = -ENOENT;
>         goto out_putnames;
>
> that didn't exist before.

Before it was just "return -ENOENT". But now there are two filenames
passed in that the function is responsible for freeing. I'm not sure how
this can be avoided.

> And worse still ends up being that unbelievably ugly hack with
>
>         // On error `new` is freed by __filename_create, prevent extra freeing
>         // below
>         new = ERR_PTR(error);
>         goto out_putpath;
>
> that ends up intentionally undoing one of the putnames because the
> name has already been used.

Yes, this is ugly, and teaching putname to deal with NULLs would mean we
could just set it to NULL here, but we can't just set it to NULL when it
was used, see below.

> And none of the commits have acks by Al. I realize that he can
> sometimes be a bit unresponsive, but this is just *UGLY*. And we've
> had too many io_uring issues for me to just say "I'm sure it's fine".
>
> I can see a few ways to at least de-uglify things:
>
>  - Maybe we can make putname() just do nothing for IS_ERR_OR_NULL() names.
>
>    We have that kind of rules for a number of path walking things,
> where passing in an error pointer is fine. Things like
> link_path_walk() or filename_lookup() act that way very much by
> design, exactly to make it easy to handle error conditions.

This sounds great to me, will make some paths much cleaner. But will
help with "new = ERR_PTR(error);" only partially (by using NULL instead
of ERR_PTR(error)).

>  - callers of __filename_create() and similar thar eat the name (and
> return a dentry or whatever) could then set the name to NULL, not as
> part of the error handling, but unconditionally as a "it's been used".

The problem is we have to keep the filenames around for retries on
ESTALE. It's not consumed by __filename_create() on success. So it's not
as simple as setting the name to NULL after calling __filename_create().
If it was not for ESTALE it'd be possible just to use filename_create()
that consumes the name passed to it unconditionally and it'd make the
logic much simpler indeed.

I'll do my best to improve things, but if there are any suggestions
they'd be appreciated.

-- 
Dmitry Kadashev

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-07-02 11:32   ` Dmitry Kadashev
@ 2021-07-02 18:33     ` Linus Torvalds
  2021-07-03 15:26       ` Dmitry Kadashev
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-07-02 18:33 UTC (permalink / raw)
  To: Dmitry Kadashev; +Cc: Jens Axboe, Al Viro, io-uring

On Fri, Jul 2, 2021 at 4:32 AM Dmitry Kadashev <[email protected]> wrote:
>
> The problem is we have to keep the filenames around for retries on
> ESTALE. It's not consumed by __filename_create() on success. So it's not
> as simple as setting the name to NULL after calling __filename_create().

I wonder if the semantics couldn't be that __filename_create() never
eats the name, and filename_create() keeps the old semantics?

You kind of made it go halfway, with filename_create() eating it only
on success.

That would make the filename_create() wrapper much simpler too, ie
we'd have it be just

    struct dentry *res = __filename_create(dfd, name, path, lookup_flags);
    putname(name);
    return res;

so it would remove a lot of conditionals, and leave it to callers
whether they want to keep the name live or not.

But I didn't check all the other cases, so maybe that causes its own
set of inconveniences..

            Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-07-02 18:33     ` Linus Torvalds
@ 2021-07-03 15:26       ` Dmitry Kadashev
  2021-07-05 21:40         ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Dmitry Kadashev @ 2021-07-03 15:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jens Axboe, Al Viro, io-uring

On Sat, Jul 3, 2021 at 1:33 AM Linus Torvalds
<[email protected]> wrote:
>
> I wonder if the semantics couldn't be that __filename_create() never
> eats the name, and filename_create() keeps the old semantics?

This is how I originally thought it is going to be, but Al suggested
that it eats the name on the failure. Now that I spent slightly
more time with it I *suspect* the reason (or a part of it) is making it
keep the name on failure would require A LOT of changes, since a lot of
functions that __filename_create() calls eat the name on failure.

How about we change the functions (__filename_create(),
__filename_lookup()) to do what filename_parentat() does since
5c31b6cedb675 ("namei: saner calling conventions for
filename_parentat()"), namely return struct filename *.

The nice thing here is we end up having the functions working the same
way, so it's much easier to reason about. In particular, all of the
functions seem to be expected to consume struct filename that was passed
to it. And it is going to be the same with these two new ones instead of
introducing a class of functions that do not do that.

The not-so-nice thing is for __filename_create we'd have to move struct
dentry to args. But as far as I can see there are quite a few "out" args
anyway, and struct path is an "out" arg already. So maybe it's not a big
deal. Maybe it is. Comments are welcome.

The code in question then will look like this:

    old = __filename_lookup(olddfd, old, how, &old_path, NULL);
    if (IS_ERR(old))
        goto out_putnames;

    new = __filename_create(newdfd, new, &new_path, &new_dentry,
                    (how & LOOKUP_REVAL));
    if (IS_ERR(new))
        goto out_putpath;

Thoughts?

-- 
Dmitry Kadashev

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-07-03 15:26       ` Dmitry Kadashev
@ 2021-07-05 21:40         ` Linus Torvalds
  2021-07-06 12:06           ` Dmitry Kadashev
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-07-05 21:40 UTC (permalink / raw)
  To: Dmitry Kadashev; +Cc: Jens Axboe, Al Viro, io-uring

[ back looking at this from doing other merge window stuff ]

On Sat, Jul 3, 2021 at 8:27 AM Dmitry Kadashev <[email protected]> wrote:
>
> This is how I originally thought it is going to be, but Al suggested
> that it eats the name on the failure. Now that I spent slightly
> more time with it I *suspect* the reason (or a part of it) is making it
> keep the name on failure would require A LOT of changes, since a lot of
> functions that __filename_create() calls eat the name on failure.

Ok.

Let's try to keep the changes minimal and simple.

Your original approach with __filename_create() looks reasonable, if
we then just clean up the end result by making putname() ok with an
error pointer.

The odd conditional putname() calls were the main issue that made me
go "this is just very ugly and confusing"

How do the patches end up looking with just that cleanup (and some
clarifying comment about the very odd "out_putpath" case it had?)

               Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [GIT PULL] io_uring updates for 5.14-rc1
  2021-07-05 21:40         ` Linus Torvalds
@ 2021-07-06 12:06           ` Dmitry Kadashev
  0 siblings, 0 replies; 12+ messages in thread
From: Dmitry Kadashev @ 2021-07-06 12:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jens Axboe, Al Viro, io-uring

On Tue, Jul 6, 2021 at 4:40 AM Linus Torvalds
<[email protected]> wrote:
>
> [ back looking at this from doing other merge window stuff ]
>
> On Sat, Jul 3, 2021 at 8:27 AM Dmitry Kadashev <[email protected]> wrote:
> >
> > This is how I originally thought it is going to be, but Al suggested
> > that it eats the name on the failure. Now that I spent slightly
> > more time with it I *suspect* the reason (or a part of it) is making it
> > keep the name on failure would require A LOT of changes, since a lot of
> > functions that __filename_create() calls eat the name on failure.
>
> Ok.
>
> Let's try to keep the changes minimal and simple.
>
> Your original approach with __filename_create() looks reasonable, if
> we then just clean up the end result by making putname() ok with an
> error pointer.
>
> The odd conditional putname() calls were the main issue that made me
> go "this is just very ugly and confusing"
>
> How do the patches end up looking with just that cleanup (and some
> clarifying comment about the very odd "out_putpath" case it had?)

OK, I'll send v7 of the series soon and will CC you.

Also, I've looked at the code again, and I think making
__filename_create() and friends not consume the name (even on error) is
not that much of extra work (but some existing code like
filename_parentat() has to be adjusted, so it's still not completely
trivial). I'll try to do that and will send RFC on top of this series,
so if it turns out to look better / be easier to reason about we can
just switch to that (in another release probably).

Thank you for your help, Linus!

--
Dmitry Kadashev

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-07-06 12:34 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-06-29 20:43 [GIT PULL] io_uring updates for 5.14-rc1 Jens Axboe
2021-06-30 20:05 ` Linus Torvalds
2021-06-30 20:14   ` Jens Axboe
2021-06-30 20:18     ` Linus Torvalds
2021-06-30 20:21       ` Jens Axboe
2021-07-02 11:32   ` Dmitry Kadashev
2021-07-02 18:33     ` Linus Torvalds
2021-07-03 15:26       ` Dmitry Kadashev
2021-07-05 21:40         ` Linus Torvalds
2021-07-06 12:06           ` Dmitry Kadashev
  -- strict thread matches above, loose matches on Subject: below --
2021-07-01 15:24 Jens Axboe
2021-07-01 19:20 ` pr-tracker-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox