Re: relative openat dirfd reference on submit

public inbox for [email protected]
 help / color / mirror / Atom feed

From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, Vito Caputo <[email protected]>,
	[email protected]
Subject: Re: relative openat dirfd reference on submit
Date: Tue, 3 Nov 2020 00:17:53 +0000	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 03/11/2020 00:05, Jens Axboe wrote:
> On 11/2/20 1:52 PM, Vito Caputo wrote:
>> Hello list,
>>
>> I've been tinkering a bit with some async continuation passing style
>> IO-oriented code employing liburing.  This exposed a kind of awkward
>> behavior I suspect could be better from an ergonomics perspective.
>>
>> Imagine a bunch of OPENAT SQEs have been prepared, and they're all
>> relative to a common dirfd.  Once io_uring_submit() has consumed all
>> these SQEs across the syscall boundary, logically it seems the dirfd
>> should be safe to close, since these dirfd-dependent operations have
>> all been submitted to the kernel.
>>
>> But when I attempted this, the subsequent OPENAT CQE results were all
>> -EBADFD errors.  It appeared the submit didn't add any references to
>> the dependent dirfd.
>>
>> To work around this, I resorted to stowing the dirfd and maintaining a
>> shared refcount in the closures associated with these SQEs and
>> executed on their CQEs.  This effectively forced replicating the
>> batched relationship implicit in the shared parent dirfd, where I
>> otherwise had zero need to.  Just so I could defer closing the dirfd
>> until once all these closures had run on their respective CQE arrivals
>> and the refcount for the batch had reached zero.
>>
>> It doesn't seem right.  If I ensure sufficient queue depth and
>> explicitly flush all the dependent SQEs beforehand
>> w/io_uring_submit(), it seems like I should be able to immediately
>> close(dirfd) and have the close be automagically deferred until the
>> last dependent CQE removes its reference from the kernel side.
> 
> We pass the 'dfd' straight on, and only the async part acts on it.
> Which is why it needs to be kept open. But I wonder if we can get
> around it by just pinning the fd for the duration. Since you didn't
> include a test case, can you try with this patch applied? Totally
> untested...

afaik this doesn't pin an fd in a file table, so the app closes and
dfd right after submit and then do_filp_open() tries to look up
closed dfd. Doesn't seem to work, and we need to pass that struct
file to do_filp_open().

> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 1f555e3c44cd..b3a647dd206b 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3769,6 +3769,9 @@ static int __io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
>  		req->open.how.flags |= O_LARGEFILE;
>  
>  	req->open.dfd = READ_ONCE(sqe->fd);
> +	if (req->open.dfd != -1 && req->open.dfd != AT_FDCWD)
> +		req->file = fget(req->open.dfd);
> +
>  	fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
>  	req->open.filename = getname(fname);
>  	if (IS_ERR(req->open.filename)) {o 
> @@ -3841,6 +3844,8 @@ static int io_openat2(struct io_kiocb *req, bool force_nonblock)
>  	}
>  err:
>  	putname(req->open.filename);
> +	if (req->file)
> +		fput(req->file);
>  	req->flags &= ~REQ_F_NEED_CLEANUP;
>  	if (ret < 0)
>  		req_set_fail_links(req);
> @@ -5876,6 +5881,8 @@ static void __io_clean_op(struct io_kiocb *req)
>  		case IORING_OP_OPENAT2:
>  			if (req->open.filename)
>  				putname(req->open.filename);
> +			if (req->file)
> +				fput(req->file);
>  			break;
>  		}
>  		req->flags &= ~REQ_F_NEED_CLEANUP;
> 

-- 
Pavel Begunkov

next prev parent reply	other threads:[~2020-11-03  0:20 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-02 20:52 relative openat dirfd reference on submit Vito Caputo
2020-11-03  0:05 ` Jens Axboe
2020-11-03  0:17   ` Pavel Begunkov [this message]
2020-11-03  0:34     ` Jens Axboe
2020-11-03  0:41       ` Pavel Begunkov
2020-11-04 23:43         ` Jens Axboe
2020-11-05  8:45           ` Stefan Metzmacher
2020-11-05 14:09             ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox