public inbox for [email protected]
 help / color / mirror / Atom feed
From: Bijan Mottahedeh <[email protected]>
To: Pavel Begunkov <[email protected]>,
	[email protected], [email protected]
Subject: Re: [PATCH v2 13/13] io_uring: support buffer registration sharing
Date: Wed, 6 Jan 2021 16:50:50 -0800	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 12/18/2020 10:06 AM, Bijan Mottahedeh wrote:
> 
>>> @@ -8415,6 +8421,12 @@ static int io_sqe_buffers_unregister(struct 
>>> io_ring_ctx *ctx)
>>>       if (!data)
>>>           return -ENXIO;
>>> +    if (ctx->flags & IORING_SETUP_ATTACH_BUF) {
>>> +        io_detach_buf_data(ctx);
>>> +        ctx->nr_user_bufs = 0;
>>
>> nr_user_bufs is a part of invariant and should stay together with
>> stuff in io_detach_buf_data().
> 
> Moved to io_detach_buf_data.
> 
> 
>>> @@ -8724,9 +8740,17 @@ static int io_sqe_buffers_register(struct 
>>> io_ring_ctx *ctx, void __user *arg,
>>>       struct fixed_rsrc_ref_node *ref_node;
>>>       struct fixed_rsrc_data *buf_data;
>>> +    if (ctx->flags & IORING_SETUP_ATTACH_BUF) {
>>> +        if (!ctx->buf_data)
>>> +            return -EFAULT;
>>> +        ctx->nr_user_bufs = ctx->buf_data->ctx->nr_user_bufs;
>>
>> Why? Once a table is initialised it shouldn't change its size, would
>> be racy otherwise.
> 
> ctx->buf_data is set at ring setup time but the sharing process 
> (SETUP_SHARE) may do the actual buffer registration at an arbitrary time 
> later, so the attaching process must ensure to get the updated value of 
> nr_user_bufs if available.
> 
>>>       buf_data = io_buffers_map_alloc(ctx, nr_args);
>>>       if (IS_ERR(buf_data))
>>>           return PTR_ERR(buf_data);
>>> +    ctx->buf_data = buf_data;
>>
>> Wanted to write that there is missing
>> `if (ctx->user_bufs) return -EBUSY`
>>
>> but apparently it was moved into io_buffers_map_alloc().
>> I'd really prefer to have it here.
> 
> Moved it back.
> 
>>> +static int io_attach_buf_data(struct io_ring_ctx *ctx,
>>> +                  struct io_uring_params *p)
>>> +{
>>> +    struct io_ring_ctx *ctx_attach;
>>> +    struct fd f;
>>> +
>>> +    f = fdget(p->wq_fd);
>>> +    if (!f.file)
>>> +        return -EBADF;
>>> +    if (f.file->f_op != &io_uring_fops) {
>>> +        fdput(f);
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    ctx_attach = f.file->private_data;
>>> +    if (!ctx_attach->buf_data) {
>>
>> It looks racy. What prevents it from being deleted while we're
>> working on it, e.g. by io_sqe_buffers_unregister?
> 
> I think the premise here is that buffer sharing happens between trusted 
> and coordinated processes.  If I understand your concern correctly, then 
> if the sharing process unregisters its buffers after having shared them, 
> than that process is acting improperly.  The race could lead to a failed 
> attach but that would be expected and reasonable I would think?  What do 
> you think should happen in this case?
> 
>>
>>> +        fdput(f);
>>> +        return -EINVAL;
>>> +    }
>>> +    ctx->buf_data = ctx_attach->buf_data;
>>
>> Before updates, etc. (e.g. __io_sqe_buffers_update()) were synchronised
>> by uring_lock, now it's modified concurrently, that looks to be really
>> racy.
> 
> Racy from the attaching process perspective you mean?
> 
>>
>>> +
>>> +    percpu_ref_get(&ctx->buf_data->refs);
>>
>> Ok, now the original io_uring instance will wait until the attached
>> once get rid of their references. That's a versatile ground to have
>> in kernel deadlocks.
>>
>> task1: uring1 = create()
>> task2: uring2 = create()
>> task1: uring3 = create(share=uring2);
>> task2: uring4 = create(share=uring1);
>>
>> task1: io_sqe_buffers_unregister(uring1)
>> task2: io_sqe_buffers_unregister(uring2)
>>
>> If I skimmed through the code right, that should hang unkillably.
> 
> So we need a way to enforce that a process can only have one role, 
> sharing or attaching? But I'm not what the best way to do that.  Is this 
> an issue for other resource sharing, work queues or polling thread?
> 

The intended use case for buffer registration is:

- a group of processes attach a shmem segment
- one process registers the buffers in the shmem segment and shares it
- other processes attach that registration

For this case, it seems that there is really no need to wait for the 
attached processes to get rid of the their references since the shmem 
segment (and thus the registered buffers) will persist anyway until the 
last attached process goes away.  So the last unregister could quiesce 
all references and get rid of the shared buf_data.

I'm not sure how useful the non-shmem use case would be anyway.

Would it makes sense to restrict the scope of this feature?



  reply	other threads:[~2021-01-07  0:53 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-07 22:15 [PATCH v2 00/13] io_uring: buffer registration enhancements Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 01/13] io_uring: modularize io_sqe_buffer_register Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 02/13] io_uring: modularize io_sqe_buffers_register Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 03/13] io_uring: generalize fixed file functionality Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 04/13] io_uring: rename fixed_file variables to fixed_rsrc Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 05/13] io_uring: separate ref_list from fixed_rsrc_data Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 06/13] io_uring: generalize fixed_file_ref_node functionality Bijan Mottahedeh
2020-12-16 14:53   ` Pavel Begunkov
2020-12-18 18:06     ` Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 07/13] io_uring: add rsrc_ref locking routines Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 08/13] io_uring: implement fixed buffers registration similar to fixed files Bijan Mottahedeh
2020-12-16 14:58   ` Pavel Begunkov
2020-12-18 18:06     ` Bijan Mottahedeh
2020-12-16 14:59   ` Pavel Begunkov
2020-12-07 22:15 ` [PATCH v2 09/13] io_uring: create common fixed_rsrc_ref_node handling routines Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 10/13] io_uring: generalize files_update functionlity to rsrc_update Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 11/13] io_uring: support buffer registration updates Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 12/13] io_uring: create common fixed_rsrc_data allocation routines Bijan Mottahedeh
2020-12-07 22:15 ` [PATCH v2 13/13] io_uring: support buffer registration sharing Bijan Mottahedeh
2020-12-16 15:29   ` Pavel Begunkov
2020-12-18 18:06     ` Bijan Mottahedeh
2021-01-07  0:50       ` Bijan Mottahedeh [this message]
2021-01-11  5:19         ` Pavel Begunkov
2021-01-12 21:50           ` Bijan Mottahedeh
2020-12-14 19:09 ` [PATCH v2 00/13] io_uring: buffer registration enhancements Bijan Mottahedeh
2020-12-14 19:29   ` Jens Axboe
2020-12-14 19:43     ` Bijan Mottahedeh
2020-12-14 19:47       ` Jens Axboe
2020-12-14 20:59     ` Pavel Begunkov
2020-12-18 18:06       ` Bijan Mottahedeh
2020-12-16 15:34 ` Pavel Begunkov
2020-12-18 18:06   ` Bijan Mottahedeh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox