On Mon, Nov 18, 2024 at 04:59:22PM +0000, Pavel Begunkov wrote: > On 11/18/24 12:50, Christoph Hellwig wrote: > > On Sat, Nov 16, 2024 at 12:32:25AM +0000, Pavel Begunkov wrote: > > > We can also reuse your idea from your previous iterations and > > > use the bitmap to list all attributes. Then preamble and > > > the explicit attr_type field are not needed, type checking > > > in the loop is removed and packing is better. And just > > > by looking at the map we can calculate the size of the > > > array and remove all size checks in the loop. > > > > Can we please stop overdesigning the f**k out of this? Really, > > Please stop it, it doesn't add weight to your argument. The design > requirement has never changed, at least not during this patchset > iterations. > > > either we're fine using the space in the extended SQE, or > > we're fine using a separate opcode, or if we really have to just > > make it uring_cmd. But stop making thing being extensible for > > the sake of being extensible. > > It's asked to be extendible because there is a good chance it'll need to > be extended, and no, I'm not suggesting anyone to implement the entire > thing, only PI bits is fine. > > And no, it doesn't have to be "this or that" while there are other > options suggested for consideration. And the problem with the SQE128 > option is not even about SQE128 but how it's placed inside, i.e. > at a fixed spot. > > Do we have technical arguments against the direction in the last > suggestion? It's extendible and _very_ simple. The entire (generic) > handling for the bitmask approach for this set would be sth like: > > struct sqe { > u64 attr_type_mask; > u64 attr_ptr; > }; > if (sqe->attr_type_mask) { > if (sqe->attr_type_mask != TYPE_PI) > return -EINVAL; > > struct uapi_pi_structure pi; > copy_from_user(&pi, sqe->attr_ptr, sizeof(pi)); > hanlde_pi(&pi); > } > > And the user side: > > struct uapi_pi_structure pi = { ... }; > sqe->attr_ptr = π > sqe->attr_type_mask = TYPE_PI; > How about using this, but also have the ability to keep PI inline. Attributes added down the line can take one of these options: 1. If available space in SQE/SQE128 is sufficient for keeping attribute fields, one can choose to keep them inline and introduce a TYPE_XYZ_INLINE attribute flag. 2. If the available space is not sufficient, pass the attribute information as pointer and introduce a TYPE_XYZ attribute flag. 3. One can choose to support a attribute via both pointer and inline scheme. The pointer scheme can help with scenarios where user wants to avoid SQE128 for whatever reasons (e.g. mixed workload). Something like this: // uapi/.../io_uring.h struct sqe { ... u64 attr_type_mask; u64 attr_ptr; }; // kernel space if (sqe->attr_type_mask) { struct uapi_pi_struct pi, *piptr; if ((sqe->attr_type_mask & TYPE_PI_INLINE) && (sqe->attr_type_mask & TYPE_PI)) return -EINVAL; /* inline PI case */ if (sqe->attr_type_mask & TYPE_PI_INLINE) { if (!(ctx->flags & IORING_SETUP_SQE128)) return -EINVAL; piptr = (sqe + 1); } else if (sqe->attr_type_mask & TYPE_PI) { /* indirect PI case */ copy_from_user(&pi, sqe->attr_ptr, sizeof(pi)); piptr = π } handle_pi(piptr); } And the user side: // send PI using pointer: struct uapi_pi_structure pi = { ... }; sqe->attr_ptr = π sqe->attr_type_mask = TYPE_PI; // send PI inline: /* setup a big-sqe ring */ struct uapi_pi_structure *pi = (sqe+1); prepare_pi(pi); sqe->attr_type_mask = TYPE_PI_INLINE;