* Re: [PATCH 06/13] fuse: Add an interval ring stop worker/monitor
  [not found] ` <CAJfpeguvCNUEbcy6VQzVJeNOsnNqfDS=LyRaGvSiDTGerB+iuw@mail.gmail.com>
@ 2023-03-23 13:26 ` Ming Lei
  [not found]   ` <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread
From: Ming Lei @ 2023-03-23 13:26 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Bernd Schubert, [email protected], Dharmendra Singh,
      Amir Goldstein, [email protected], Aleksandr Mikhalitsyn,
      io-uring, Jens Axboe

On Thu, Mar 23, 2023 at 01:35:24PM +0100, Miklos Szeredi wrote:
> On Thu, 23 Mar 2023 at 12:04, Bernd Schubert <[email protected]> wrote:
> >
> > Thanks for looking at these patches!
> >
> > I'm adding in Ming Lei, as I had taken several ideas from ublk. I guess
> > I should also explain in the commit messages and code why it is
> > done that way.
> >
> > On 3/23/23 11:27, Miklos Szeredi wrote:
> > > On Tue, 21 Mar 2023 at 02:11, Bernd Schubert <[email protected]> wrote:
> > >>
> > >> This adds a delayed work queue that runs in intervals
> > >> to check and to stop the ring if needed. Fuse connection
> > >> abort now waits for this worker to complete.
> > >
> > > This seems like a hack. Can you explain what the problem is?
> > >
> > > The first thing I notice is that you store a reference to the task
> > > that initiated the ring creation. This already looks fishy, as the
> > > ring could well survive the task (thread) that created it, no?
> >
> > You mean the currently ongoing work, where the daemon can be restarted?
> > Daemon restart will need some work with ring communication, I will take
> > care of that once we have agreed on an approach. [Also added Aleksandr
> > to Cc.]
> >
> > fuse_uring_stop_mon() checks if the daemon process is exiting and
> > looks at fc->ring.daemon->flags & PF_EXITING - this is what the process
> > reference is for.
>
> Okay, so you are saying that the lifetime of the ring is bound to the
> lifetime of the thread that created it?
>
> Why is that?

Cc Jens and the io_uring list.

For ublk:

1) it is an MQ device, so it is natural to map each queue onto a pthread/uring

2) the io_uring context is invisible to the driver; we don't know when it is
destructed, so we bind the io_uring context to the queue/pthread, because we
have to complete all uring commands before the io_uring context exits.

The uring cmd usage for ublk/fuse is special and unique: it works like a poll
request that is sent to the device beforehand and completed only when the
driver has something incoming that needs userspace to handle - but ublk/fuse
may never have anything that needs userspace to look at.

If io_uring could provide an API for registering an exit callback, things
would be easier for ublk/fuse. However, we would still need to know the exact
io_uring context associated with our commands, so either more io_uring
implementation details have to be exposed to the driver, or proper APIs have
to be provided.

> It's much more common to bind the lifetime of an object to that of an
> open file. io_uring_setup() will do that, for example.
>
> It's much easier to hook into the destruction of an open file than
> into the destruction of a process (as you've observed). And the way
> you do it is even more confusing, as the ring is destroyed not when the
> process is destroyed but when a specific thread is destroyed, making
> this a thread-specific behavior that is probably best avoided.
>
> So the obvious solution would be to destroy the ring(s) in
> fuse_dev_release(). Why wouldn't that work?
If the ring(s) here means io_uring: io_uring is used for submitting requests
on multiple files, so its lifetime can't be bound to a single file; also,
io_uring is invisible to the driver.

thanks,
Ming

^ permalink raw reply	[flat|nested] 4+ messages in thread
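A rough userspace sketch of the poll-like uring command flow described above,
from the daemon side: the command is submitted up front and only completes
once the driver has work for userspace. The device path and the
FUSE_URING_CMD_FETCH value are placeholders for illustration, not the
interface defined by this patch series.

/* Rough sketch of the "poll-like" uring command flow, from the daemon side.
 * FUSE_URING_CMD_FETCH and /dev/fuse are placeholders - the real command
 * codes and setup come from the patch series and are not shown here. */
#include <fcntl.h>
#include <liburing.h>
#include <stdio.h>
#include <string.h>

#define FUSE_URING_CMD_FETCH	0x01	/* hypothetical cmd_op value */

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	int fd = open("/dev/fuse", O_RDWR);

	if (fd < 0 || io_uring_queue_init(8, &ring, 0) < 0)
		return 1;

	/* Sent to the device beforehand; it completes only when the driver
	 * has a request that needs userspace handling - possibly never. */
	sqe = io_uring_get_sqe(&ring);
	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = IORING_OP_URING_CMD;
	sqe->fd = fd;
	sqe->cmd_op = FUSE_URING_CMD_FETCH;
	io_uring_sqe_set_data64(sqe, 0);	/* queue/entry id */
	io_uring_submit(&ring);

	/* The daemon now blocks until the driver completes the command;
	 * this is exactly the lifetime problem discussed in this thread. */
	if (io_uring_wait_cqe(&ring, &cqe) == 0) {
		printf("driver has work for us, res=%d\n", cqe->res);
		io_uring_cqe_seen(&ring, cqe);
	}

	io_uring_queue_exit(&ring);
	return 0;
}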
[parent not found: <[email protected]>]
* Re: [PATCH 06/13] fuse: Add an interval ring stop worker/monitor
  [not found] ` <[email protected]>
@ 2023-03-23 20:51 ` Bernd Schubert
  2023-03-27 13:22   ` Pavel Begunkov
  0 siblings, 1 reply; 4+ messages in thread
From: Bernd Schubert @ 2023-03-23 20:51 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: [email protected], Dharmendra Singh, Amir Goldstein,
      [email protected], Ming Lei, Aleksandr Mikhalitsyn,
      [email protected], Jens Axboe

On 3/23/23 14:18, Bernd Schubert wrote:
> On 3/23/23 13:35, Miklos Szeredi wrote:
>> On Thu, 23 Mar 2023 at 12:04, Bernd Schubert <[email protected]> wrote:
>>>
>>> Thanks for looking at these patches!
>>>
>>> I'm adding in Ming Lei, as I had taken several ideas from ublk. I guess
>>> I should also explain in the commit messages and code why it is
>>> done that way.
>>>
>>> On 3/23/23 11:27, Miklos Szeredi wrote:
>>>> On Tue, 21 Mar 2023 at 02:11, Bernd Schubert <[email protected]> wrote:
>>>>>
>>>>> This adds a delayed work queue that runs in intervals
>>>>> to check and to stop the ring if needed. Fuse connection
>>>>> abort now waits for this worker to complete.
>>>>
>>>> This seems like a hack. Can you explain what the problem is?
>>>>
>>>> The first thing I notice is that you store a reference to the task
>>>> that initiated the ring creation. This already looks fishy, as the
>>>> ring could well survive the task (thread) that created it, no?
>>>
>>> You mean the currently ongoing work, where the daemon can be restarted?
>>> Daemon restart will need some work with ring communication, I will take
>>> care of that once we have agreed on an approach. [Also added Aleksandr
>>> to Cc.]
>>>
>>> fuse_uring_stop_mon() checks if the daemon process is exiting and
>>> looks at fc->ring.daemon->flags & PF_EXITING - this is what the process
>>> reference is for.
>>
>> Okay, so you are saying that the lifetime of the ring is bound to the
>> lifetime of the thread that created it?
>>
>> Why is that?
>>
>> It's much more common to bind the lifetime of an object to that of an
>> open file. io_uring_setup() will do that, for example.
>>
>> It's much easier to hook into the destruction of an open file than
>> into the destruction of a process (as you've observed). And the way
>> you do it is even more confusing, as the ring is destroyed not when the
>> process is destroyed but when a specific thread is destroyed, making
>> this a thread-specific behavior that is probably best avoided.
>>
>> So the obvious solution would be to destroy the ring(s) in
>> fuse_dev_release(). Why wouldn't that work?
>>
>
> I _think_ I had tried it at the beginning, ran into issues, and then
> switched to the ublk approach. Going to try again now.
>

Found the reason why I complete SQEs when the daemon stops - on the daemon
side I have

ret = io_uring_wait_cqe(&queue->ring, &cqe);

and that hangs when you stop the user side with SIGTERM/SIGINT. Maybe that
could be solved with io_uring_wait_cqe_timeout(), but would that really be a
good solution? We would now have CPU activity in intervals on the daemon side
for no good reason - the more often it wakes up, the faster SIGTERM/SIGINT
takes effect. So ideally it should be the uring side that stops waiting when
a signal is received.

Thanks,
Bernd

^ permalink raw reply	[flat|nested] 4+ messages in thread
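A minimal sketch of the io_uring_wait_cqe_timeout() workaround mentioned
above, assuming a plain signal handler that sets a stop flag; it shows the
periodic wakeups that make this approach unattractive. The fuse request
handling itself is omitted.

/* Minimal sketch of the timeout-based workaround: poll for CQEs with
 * io_uring_wait_cqe_timeout() and re-check a flag set by the signal handler.
 * Works, but wakes up every second just to notice SIGTERM/SIGINT sooner. */
#include <errno.h>
#include <liburing.h>
#include <signal.h>

static volatile sig_atomic_t stopping;

static void on_signal(int sig)
{
	(void)sig;
	stopping = 1;
}

static int daemon_loop(struct io_uring *ring)
{
	struct io_uring_cqe *cqe;
	struct __kernel_timespec ts = { .tv_sec = 1, .tv_nsec = 0 };

	while (!stopping) {
		int ret = io_uring_wait_cqe_timeout(ring, &cqe, &ts);

		if (ret == -ETIME || ret == -EINTR)
			continue;		/* periodic wakeup, nothing to do */
		if (ret < 0)
			return ret;
		/* ... handle the fuse request carried by this CQE ... */
		io_uring_cqe_seen(ring, cqe);
	}
	return 0;
}

int main(void)
{
	struct io_uring ring;
	int ret;

	if (io_uring_queue_init(8, &ring, 0) < 0)
		return 1;
	signal(SIGTERM, on_signal);
	signal(SIGINT, on_signal);
	/* ... the real daemon would submit its fetch commands here ... */
	ret = daemon_loop(&ring);
	io_uring_queue_exit(&ring);
	return ret < 0;
}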
* Re: [PATCH 06/13] fuse: Add an interval ring stop worker/monitor
  2023-03-23 20:51 ` Bernd Schubert
@ 2023-03-27 13:22   ` Pavel Begunkov
  2023-03-27 14:02     ` Bernd Schubert
  0 siblings, 1 reply; 4+ messages in thread
From: Pavel Begunkov @ 2023-03-27 13:22 UTC (permalink / raw)
  To: Bernd Schubert, Miklos Szeredi
  Cc: [email protected], Dharmendra Singh, Amir Goldstein,
      [email protected], Ming Lei, Aleksandr Mikhalitsyn,
      [email protected], Jens Axboe

On 3/23/23 20:51, Bernd Schubert wrote:
> On 3/23/23 14:18, Bernd Schubert wrote:
>> On 3/23/23 13:35, Miklos Szeredi wrote:
>>> On Thu, 23 Mar 2023 at 12:04, Bernd Schubert <[email protected]> wrote:
[...]
> Found the reason why I complete SQEs when the daemon stops - on the daemon
> side I have
>
> ret = io_uring_wait_cqe(&queue->ring, &cqe);
>
> and that hangs when you stop the user side with SIGTERM/SIGINT. Maybe that
> could be solved with io_uring_wait_cqe_timeout(), but would that really be
> a good solution?

It can be some sort of an eventfd triggered from the signal handler
and waited upon by an io_uring poll/read request. Or maybe signalfd.

> We would now have CPU activity in intervals on the daemon side for no
> good reason - the more often it wakes up, the faster SIGTERM/SIGINT takes
> effect. So ideally it should be the uring side that stops waiting when a
> signal is received.

FWIW, io_uring (i.e. the kernel side) will stop waiting if there are pending
signals, and we'd need to check that liburing honours it, e.g. doesn't retry
waiting.

--
Pavel Begunkov

^ permalink raw reply	[flat|nested] 4+ messages in thread
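A sketch of the eventfd variant suggested here: the signal handler writes to
an eventfd whose read request sits in the same ring as the fuse commands, so
its completion wakes io_uring_wait_cqe() without any periodic polling. The
STOP_TAG value and the surrounding setup are made up for the example.

/* Sketch of the eventfd approach: a read on the eventfd is queued next to
 * the fuse commands, and the signal handler completes it by writing to the
 * fd. write() is async-signal-safe, so this is legal in a signal handler. */
#include <errno.h>
#include <liburing.h>
#include <signal.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define STOP_TAG 0xdeadULL	/* arbitrary user_data marking the stop request */

static int stop_fd = -1;

static void on_signal(int sig)
{
	uint64_t one = 1;

	(void)sig;
	(void)write(stop_fd, &one, sizeof(one));
}

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	uint64_t val;

	stop_fd = eventfd(0, EFD_CLOEXEC);
	if (stop_fd < 0 || io_uring_queue_init(8, &ring, 0) < 0)
		return 1;
	signal(SIGTERM, on_signal);
	signal(SIGINT, on_signal);

	/* The "stop" request: completes only once a signal arrives. */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_read(sqe, stop_fd, &val, sizeof(val), 0);
	io_uring_sqe_set_data64(sqe, STOP_TAG);
	io_uring_submit(&ring);

	/* ... the real daemon would also submit its fuse commands here ... */

	for (;;) {
		int ret = io_uring_wait_cqe(&ring, &cqe);

		if (ret == -EINTR)	/* kernel stops waiting on pending signals */
			continue;
		if (ret < 0)
			break;
		if (io_uring_cqe_get_data64(cqe) == STOP_TAG) {
			io_uring_cqe_seen(&ring, cqe);
			break;		/* clean shutdown path */
		}
		/* ... handle a fuse request completion ... */
		io_uring_cqe_seen(&ring, cqe);
	}

	io_uring_queue_exit(&ring);
	return 0;
}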
* Re: [PATCH 06/13] fuse: Add an interval ring stop worker/monitor
  2023-03-27 13:22 ` Pavel Begunkov
@ 2023-03-27 14:02   ` Bernd Schubert
  0 siblings, 0 replies; 4+ messages in thread
From: Bernd Schubert @ 2023-03-27 14:02 UTC (permalink / raw)
  To: Pavel Begunkov, Miklos Szeredi
  Cc: [email protected], Dharmendra Singh, Amir Goldstein,
      [email protected], Ming Lei, Aleksandr Mikhalitsyn,
      [email protected], Jens Axboe

On 3/27/23 15:22, Pavel Begunkov wrote:
> On 3/23/23 20:51, Bernd Schubert wrote:
>> On 3/23/23 14:18, Bernd Schubert wrote:
>>> On 3/23/23 13:35, Miklos Szeredi wrote:
>>>> On Thu, 23 Mar 2023 at 12:04, Bernd Schubert <[email protected]> wrote:
> [...]
>> Found the reason why I complete SQEs when the daemon stops - on the daemon
>> side I have
>>
>> ret = io_uring_wait_cqe(&queue->ring, &cqe);
>>
>> and that hangs when you stop the user side with SIGTERM/SIGINT. Maybe that
>> could be solved with io_uring_wait_cqe_timeout(), but would that really be
>> a good solution?
>
> It can be some sort of an eventfd triggered from the signal handler
> and waited upon by an io_uring poll/read request. Or maybe signalfd.
>
>> We would now have CPU activity in intervals on the daemon side for no
>> good reason - the more often it wakes up, the faster SIGTERM/SIGINT takes
>> effect. So ideally it should be the uring side that stops waiting when a
>> signal is received.
>
> FWIW, io_uring (i.e. the kernel side) will stop waiting if there are
> pending signals, and we'd need to check that liburing honours it, e.g.
> doesn't retry waiting.
>

I'm going to check where and why it hangs; busy with something else today -
by tomorrow I should know what happens.

Thanks,
Bernd

^ permalink raw reply	[flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-03-27 14:02 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
     [not found] <[email protected]>
     [not found] ` <[email protected]>
     [not found]   ` <CAJfpegs6z6pvepUx=3zfAYqisumri=2N-_A-nsYHQd62AQRahA@mail.gmail.com>
     [not found]     ` <[email protected]>
     [not found]       ` <CAJfpeguvCNUEbcy6VQzVJeNOsnNqfDS=LyRaGvSiDTGerB+iuw@mail.gmail.com>
2023-03-23 13:26         ` [PATCH 06/13] fuse: Add an interval ring stop worker/monitor Ming Lei
     [not found]           ` <[email protected]>
2023-03-23 20:51             ` Bernd Schubert
2023-03-27 13:22               ` Pavel Begunkov
2023-03-27 14:02                 ` Bernd Schubert