public inbox for [email protected]
 help / color / mirror / Atom feed
* Re: [PATCH 11/17] block: factor out helper for bio allocation from cache
       [not found]     ` <[email protected]>
@ 2022-03-10 12:25       ` Kanchan Joshi
  0 siblings, 0 replies; 6+ messages in thread
From: Kanchan Joshi @ 2022-03-10 12:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kanchan Joshi, Jens Axboe, Keith Busch, Pavel Begunkov, io-uring,
	linux-nvme, linux-block, sbates, logang, Pankaj Raghav,
	Javier González, Luis Chamberlain, Adam Manzanares,
	Anuj Gupta

On Thu, Mar 10, 2022 at 2:05 PM Christoph Hellwig <[email protected]> wrote:
>
> On Tue, Mar 08, 2022 at 08:50:59PM +0530, Kanchan Joshi wrote:
> > +struct bio *bio_alloc_kiocb(struct kiocb *kiocb, unsigned short nr_vecs,
> > +                         struct bio_set *bs)
> > +{
> > +     if (!(kiocb->ki_flags & IOCB_ALLOC_CACHE))
> > +             return bio_alloc_bioset(GFP_KERNEL, nr_vecs, bs);
> > +
> > +     return bio_from_cache(nr_vecs, bs);
> > +}
> >  EXPORT_SYMBOL_GPL(bio_alloc_kiocb);
>
> If we go down this route we might want to just kill the bio_alloc_kiocb
> wrapper.

Fine, will kill that in v2.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 10/17] block: wire-up support for plugging
       [not found]     ` <[email protected]>
@ 2022-03-10 12:40       ` Kanchan Joshi
  0 siblings, 0 replies; 6+ messages in thread
From: Kanchan Joshi @ 2022-03-10 12:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kanchan Joshi, Jens Axboe, Keith Busch, Pavel Begunkov, io-uring,
	linux-nvme, linux-block, sbates, logang, Pankaj Raghav,
	Javier González, Luis Chamberlain, Adam Manzanares,
	Anuj Gupta

On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <[email protected]> wrote:
>
> On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
> > From: Jens Axboe <[email protected]>
> >
> > Add support to use plugging if it is enabled, else use default path.
>
> The subject and this comment don't really explain what is done, and
> also don't mention at all why it is done.

Missed out, will fix up. But plugging gave a very good hike to IOPS.
Especially while comparing this with io-uring's block-io path that
keeps .plug enabled. Patch 9 (that enables .plug for uring-cmd) and
this goes hand in hand.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 17/17] nvme: enable non-inline passthru commands
       [not found]       ` <CA+1E3rLaQstG8LWUyJrbK5Qz+AnNpOnAyoK-7H5foFm67BJeFA@mail.gmail.com>
@ 2022-03-10 14:19         ` Christoph Hellwig
       [not found]           ` <CA+1E3rL3Q2noHW-cD20SZyo9EqbzjF54F6TgZoUMMuZGkhkqnw@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2022-03-10 14:19 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: Christoph Hellwig, Kanchan Joshi, Jens Axboe, Keith Busch,
	Pavel Begunkov, io-uring, linux-nvme, linux-block, sbates, logang,
	Pankaj Raghav, Javier González, Luis Chamberlain,
	Adam Manzanares, Anuj Gupta

On Thu, Mar 10, 2022 at 05:20:13PM +0530, Kanchan Joshi wrote:
> In sync ioctl, we always update this result field by doing put_user on
> completion.
> For async ioctl, since command is inside the the sqe, its lifetime is
> only upto submission. SQE may get reused post submission, leaving no
> way to update the "result" field on completion. Had this field been a
> pointer, we could have saved this on submission and updated on
> completion. But that would require redesigning this structure and
> adding newer ioctl in nvme.

Why would it required adding an ioctl to nvme?  The whole io_uring
async_cmd infrastructure is completely independent from ioctls.

> Coming back, even though sync-ioctl alway updates this result to
> user-space, only a few nvme io commands (e.g. zone-append, copy,
> zone-mgmt-send) can return this additional result (spec-wise).
> Therefore in nvme, when we are dealing with inline-sqe commands from
> io_uring, we never attempt to update the result. And since we don't
> update the result, we limit support to only read/write passthru
> commands. And fail any other command during submission itself (Patch
> 2).

Yikes.  That is outright horrible.  passthrough needs to be command
agnostic and future proof to any newly added nvme command.

> > Overly long line.
> 
> Under 100, but sure, can fold it under 80.

You can only use 100 sparingly if it makes the code more readable.  Which
I know is fuzzy, and in practice never does.  Certainly not in nvme and
block code.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 17/17] nvme: enable non-inline passthru commands
       [not found]           ` <CA+1E3rL3Q2noHW-cD20SZyo9EqbzjF54F6TgZoUMMuZGkhkqnw@mail.gmail.com>
@ 2022-03-11  6:27             ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2022-03-11  6:27 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: Christoph Hellwig, Kanchan Joshi, Jens Axboe, Keith Busch,
	Pavel Begunkov, io-uring, linux-nvme, linux-block, sbates, logang,
	Pankaj Raghav, Javier González, Luis Chamberlain,
	Adam Manzanares, Anuj Gupta

On Fri, Mar 11, 2022 at 12:13:24AM +0530, Kanchan Joshi wrote:
> Problem is, the inline facility does not go very well with this
> particular nvme-passthru ioctl (NVME_IOCTL_IO64_CMD).

And it doesn't have to, because there is absolutely no need to reuse
the existing structures!  Quite to the contrary, trying to reuse the
structure and opcode makes things confusing as hell.

> And that's because this ioctl requires additional "__u64 result;" to
> be updated within "struct nvme_passthru_cmd64".
> To update that during completion, we need, at the least, the result
> field to be a pointer "__u64 result_ptr" inside the struct
> nvme_passthru_cmd64.
> Do you see that is possible without adding a new passthru ioctl in nvme?

We don't need a new passthrough ioctl in nvme.  We need to decouple the
uring cmd properly.  And properly in this case means not to add a
result pointer, but to drop the result from the _input_ structure
entirely, and instead optionally support a larger CQ entry that contains
it, just like the first patch does for the SQ.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 05/17] nvme: wire-up support for async-passthru on char-device.
       [not found]   ` <[email protected]>
@ 2022-03-11  7:01     ` Christoph Hellwig
  2022-03-22 15:18     ` Clay Mayers
  1 sibling, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2022-03-11  7:01 UTC (permalink / raw)
  To: Kanchan Joshi
  Cc: axboe, hch, kbusch, asml.silence, io-uring, linux-nvme,
	linux-block, sbates, logang, pankydev8, javier, mcgrof,
	a.manzanares, joshiiitr, anuj20.g

On Tue, Mar 08, 2022 at 08:50:53PM +0530, Kanchan Joshi wrote:
> +/*
> + * This overlays struct io_uring_cmd pdu.
> + * Expect build errors if this grows larger than that.
> + */
> +struct nvme_uring_cmd_pdu {
> +	u32 meta_len;
> +	union {
> +		struct bio *bio;
> +		struct request *req;
> +	};
> +	void *meta; /* kernel-resident buffer */
> +	void __user *meta_buffer;
> +} __packed;

Why is this marked __packed?

In general I'd be much more happy if the meta elelements were a
io_uring-level feature handled outside the driver and typesafe in
struct io_uring_cmd, with just a normal private data pointer for the
actual user, which would remove all the crazy casting.

> +static void nvme_end_async_pt(struct request *req, blk_status_t err)
> +{
> +	struct io_uring_cmd *ioucmd = req->end_io_data;
> +	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
> +	/* extract bio before reusing the same field for request */
> +	struct bio *bio = pdu->bio;
> +
> +	pdu->req = req;
> +	req->bio = bio;
> +	/* this takes care of setting up task-work */
> +	io_uring_cmd_complete_in_task(ioucmd, nvme_pt_task_cb);

This is a bit silly.  First we defer the actual request I/O completion
from the block layer to a different CPU or softirq and then we have
another callback here.  I think it would be much more useful if we
could find a way to enhance blk_mq_complete_request so that it could
directly complet in a given task.  That would also be really nice for
say normal synchronous direct I/O.

> +	if (ioucmd) { /* async dispatch */
> +		if (cmd->common.opcode == nvme_cmd_write ||
> +				cmd->common.opcode == nvme_cmd_read) {

No we can't just check for specific commands in the passthrough handler.

> +			nvme_setup_uring_cmd_data(req, ioucmd, meta, meta_buffer,
> +					meta_len);
> +			blk_execute_rq_nowait(req, 0, nvme_end_async_pt);
> +			return 0;
> +		} else {
> +			/* support only read and write for now. */
> +			ret = -EINVAL;
> +			goto out_meta;
> +		}

Pleae always handle error in the first branch and don't bother with an
else after a goto or return.

> +static int nvme_ns_async_ioctl(struct nvme_ns *ns, struct io_uring_cmd *ioucmd)
> +{
> +	int ret;
> +
> +	BUILD_BUG_ON(sizeof(struct nvme_uring_cmd_pdu) > sizeof(ioucmd->pdu));
> +
> +	switch (ioucmd->cmd_op) {
> +	case NVME_IOCTL_IO64_CMD:
> +		ret = nvme_user_cmd64(ns->ctrl, ns, NULL, ioucmd);
> +		break;
> +	default:
> +		ret = -ENOTTY;
> +	}
> +
> +	if (ret >= 0)
> +		ret = -EIOCBQUEUED;

That's a weird way to handle the returns.  Just return -EIOCBQUEUED
directly from the handler (which as said before should be split from
the ioctl handler anyway).


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH 05/17] nvme: wire-up support for async-passthru on char-device.
       [not found]   ` <[email protected]>
  2022-03-11  7:01     ` [PATCH 05/17] nvme: wire-up support for async-passthru on char-device Christoph Hellwig
@ 2022-03-22 15:18     ` Clay Mayers
  1 sibling, 0 replies; 6+ messages in thread
From: Clay Mayers @ 2022-03-22 15:18 UTC (permalink / raw)
  To: Kanchan Joshi, [email protected], [email protected], [email protected],
	[email protected]
  Cc: [email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected]

> From: Kanchan Joshi
> Sent: Tuesday, March 8, 2022 7:21 AM
> To: [email protected]; [email protected]; [email protected];
> [email protected]
> Cc: [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]
> Subject: [PATCH 05/17] nvme: wire-up support for async-passthru on char-
> device.
> 


<snip>
> +static void nvme_pt_task_cb(struct io_uring_cmd *ioucmd)
> +{
> +	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
> +	struct request *req = pdu->req;
> +	int status;
> +	struct bio *bio = req->bio;
> +
> +	if (nvme_req(req)->flags & NVME_REQ_CANCELLED)
> +		status = -EINTR;
> +	else
> +		status = nvme_req(req)->status;
> +
> +	/* we can free request */
> +	blk_mq_free_request(req);
> +	blk_rq_unmap_user(bio);
> +
> +	if (!status && pdu->meta_buffer) {
> +		if (copy_to_user(pdu->meta_buffer, pdu->meta, pdu-
> >meta_len))

This copy is incorrectly called for writes.

> +			status = -EFAULT;
> +	}
> +	kfree(pdu->meta);
> +
> +	io_uring_cmd_done(ioucmd, status);
> +}
</snip>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-03-22 15:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <[email protected]>
     [not found] ` <CGME20220308152716epcas5p3d38d2372c184259f1a10c969f7e4396f@epcas5p3.samsung.com>
     [not found]   ` <[email protected]>
     [not found]     ` <[email protected]>
2022-03-10 12:25       ` [PATCH 11/17] block: factor out helper for bio allocation from cache Kanchan Joshi
     [not found] ` <CGME20220308152714epcas5p4c5a0d16512fd7054c9a713ee28ede492@epcas5p4.samsung.com>
     [not found]   ` <[email protected]>
     [not found]     ` <[email protected]>
2022-03-10 12:40       ` [PATCH 10/17] block: wire-up support for plugging Kanchan Joshi
     [not found] ` <CGME20220308152729epcas5p17e82d59c68076eb46b5ef658619d65e3@epcas5p1.samsung.com>
     [not found]   ` <[email protected]>
     [not found]     ` <[email protected]>
     [not found]       ` <CA+1E3rLaQstG8LWUyJrbK5Qz+AnNpOnAyoK-7H5foFm67BJeFA@mail.gmail.com>
2022-03-10 14:19         ` [PATCH 17/17] nvme: enable non-inline passthru commands Christoph Hellwig
     [not found]           ` <CA+1E3rL3Q2noHW-cD20SZyo9EqbzjF54F6TgZoUMMuZGkhkqnw@mail.gmail.com>
2022-03-11  6:27             ` Christoph Hellwig
     [not found] ` <CGME20220308152702epcas5p1eb1880e024ac8b9531c85a82f31a4e78@epcas5p1.samsung.com>
     [not found]   ` <[email protected]>
2022-03-11  7:01     ` [PATCH 05/17] nvme: wire-up support for async-passthru on char-device Christoph Hellwig
2022-03-22 15:18     ` Clay Mayers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox