From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-qt1-f172.google.com (mail-qt1-f172.google.com [209.85.160.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7A3CB3FF1; Tue, 19 Nov 2024 02:02:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.172 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731981758; cv=none; b=aJNSsWArrzp0F8rXSRFi0PqcrvwPPUINT+vNQhJW2LWBIwdcu74Pn/EMo46Qk8ME1x9JhsVdqrY5WriZ0UrsLo6j1pMiYGGDQ2WCD2YrRuf7cuS8FQouwhCqkYMJu7RLao6zKYd7HVYoezVhoFqYxVRCVVo/mQUA8+GL/I/GbSA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731981758; c=relaxed/simple; bh=IRTjbqzZKvhc+aYkEaVA4syPDyMZSzr+N955Ab3ijeI=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=NBLScwujD4hkoRjRn6wXph757EAn/AMwOxdpf5V8rejOkfE6mCQaBXr/rA88WXDwZk3pOFFPGb31J0VK60Nctp/f/hc1xPxlNZEvZJeAoqueZ8nEpkTmtULwI7gQBP5+Ksu7U8q6bAuzM4fFngfes1MCUXHZbWwbW5HOadBnQJk= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ARAvz6/U; arc=none smtp.client-ip=209.85.160.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ARAvz6/U" Received: by mail-qt1-f172.google.com with SMTP id d75a77b69052e-460c2418e37so30607071cf.0; Mon, 18 Nov 2024 18:02:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731981755; x=1732586555; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=YTsX9FLKt1Yl9wgbTorRdkM0TWe8BdbNSz0bAqGWV6k=; b=ARAvz6/U/QQKxcxono4pBgHq6wPvE9adpEwutgaDpAKO0hkGN5FEWYJIiKWL+wUjoG uXPrOSsyzmHsm13AQVeSR3VNU0blrwD2C+ryXwXMY/Ah+5VWmZAWmr8zn697xjo/AQus ltROsPUAPXR1IHtX4AgmGczksxTUIV3MyBhK7rM0tXU1rXSjIoskBJjuKN0K0FPTy3z6 M/gYIK45dJsOUdlqPCwlg/i0mRx6Aw/Yxf9eeYcAbOjTq4N+efrgEjNiz0iH7aiu2+Ny MrnB+mskAwrVoY3Rz+9iQ8rEjZMzHjeAkDBYy7Q34Jk4X6RNGIeyqjJaJ5AWrrX4oAHJ PbgA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731981755; x=1732586555; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YTsX9FLKt1Yl9wgbTorRdkM0TWe8BdbNSz0bAqGWV6k=; b=D+5zvp+KtSGWbRPJq3XX+Oyv9eqnPE27MGY4f/9FIM6EfeP4u0qT0ztI1ZeTWzylS8 Cgzr1EFsFWKauD7pT70HpiCMQDxNZl6NFdKKiCFjtFZ9XyayFr+7hVg+d2zVhiN/ommD M0+5fkYUsrZMab0TLKprpizWQjEfXGlufMWY4TpI+fW7QFMv96/BENlWKHD284PwhZ7F +K7B2WvQv78G9DutHkpgs+57onBPOWoK9MkbUtn6F6tUibuFswflz+kbMSfvV6tq8QEs Q4Vyn7+bviSjdY5GqqwlkkU2/h9PTsGYivsplQTn7aFQpDmF6NCuPyyyHJKCtMylu01q 2HZA== X-Forwarded-Encrypted: i=1; AJvYcCV5mmy7JOwEmcLedpiat8yRtDr6gIW2g+74DW1/VD0wgQd3v1vcfQHhiLGQKvadwgsh2AYOlNDQDQ==@vger.kernel.org, AJvYcCWQtjTT6dDgPJnbAFGWadXh28XrP0Zn4m5P0RHqSsOBdyvUeSlkxfP77LQ20sdJZdgW9fybFan63AMHzj2/lg==@vger.kernel.org X-Gm-Message-State: AOJu0YwfMg9eevVpKGSCK+0r36Pp5tDH6ItF/A14mkzm1Lw7bJlp1p0s wnQR7rZoUkhQm3ol5hb1I3MnI6iQRxQuN5MrbKhn8TO2nqy5Uy1No/zh4Dosfe63NbpAGWiFQnK 8wOTTjGvI7zSA6nuyaBu1GkLX0ME= X-Google-Smtp-Source: AGHT+IGe/EluwGH0TjXEtZ01gJK+1CVUJQAdbWQZSPXiQUm7SG5oRfWqv/JN8EeMCa8Py0Z21SSeiBYImSB6EEUhAes= X-Received: by 2002:a05:622a:6119:b0:462:ae89:4695 with SMTP id d75a77b69052e-46363e2db8bmr194661971cf.30.1731981755166; Mon, 18 Nov 2024 18:02:35 -0800 (PST) Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20241107-fuse-uring-for-6-10-rfc4-v5-0-e8660a991499@ddn.com> <20241107-fuse-uring-for-6-10-rfc4-v5-15-e8660a991499@ddn.com> <19af894d-d5ac-4fcf-8fa1-b387c354c669@fastmail.fm> In-Reply-To: <19af894d-d5ac-4fcf-8fa1-b387c354c669@fastmail.fm> From: Joanne Koong Date: Mon, 18 Nov 2024 18:02:24 -0800 Message-ID: Subject: Re: [PATCH RFC v5 15/16] fuse: {io-uring} Prevent mount point hang on fuse-server termination To: Bernd Schubert Cc: Bernd Schubert , Miklos Szeredi , Jens Axboe , Pavel Begunkov , linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org, Josef Bacik , Amir Goldstein , Ming Lei , David Wei , bernd@bsbernd.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, Nov 18, 2024 at 3:47=E2=80=AFPM Bernd Schubert wrote: > > On 11/19/24 00:30, Joanne Koong wrote: > > On Thu, Nov 7, 2024 at 9:04=E2=80=AFAM Bernd Schubert wrote: > >> > >> When the fuse-server terminates while the fuse-client or kernel > >> still has queued URING_CMDs, these commands retain references > >> to the struct file used by the fuse connection. This prevents > >> fuse_dev_release() from being invoked, resulting in a hung mount > >> point. > >> > >> This patch addresses the issue by making queued URING_CMDs > >> cancelable, allowing fuse_dev_release() to proceed as expected > >> and preventing the mount point from hanging. > >> > >> Signed-off-by: Bernd Schubert > >> --- > >> fs/fuse/dev_uring.c | 76 ++++++++++++++++++++++++++++++++++++++++++++= ++++----- > >> 1 file changed, 70 insertions(+), 6 deletions(-) > >> > >> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c > >> index 6af515458695ccb2e32cc8c62c45471e6710c15f..b465da41c42c47eaf69f09= bab1423061bc8fcc68 100644 > >> --- a/fs/fuse/dev_uring.c > >> +++ b/fs/fuse/dev_uring.c > >> @@ -23,6 +23,7 @@ MODULE_PARM_DESC(enable_uring, > >> > >> struct fuse_uring_cmd_pdu { > >> struct fuse_ring_ent *ring_ent; > >> + struct fuse_ring_queue *queue; > >> }; > >> > >> /* > >> @@ -382,6 +383,61 @@ void fuse_uring_stop_queues(struct fuse_ring *rin= g) > >> } > >> } > >> > >> +/* > >> + * Handle IO_URING_F_CANCEL, typically should come on daemon terminat= ion > >> + */ > >> +static void fuse_uring_cancel(struct io_uring_cmd *cmd, > >> + unsigned int issue_flags, struct fuse_co= nn *fc) > >> +{ > >> + struct fuse_uring_cmd_pdu *pdu =3D (struct fuse_uring_cmd_pdu = *)cmd->pdu; > >> + struct fuse_ring_queue *queue =3D pdu->queue; > >> + struct fuse_ring_ent *ent; > >> + bool found =3D false; > >> + bool need_cmd_done =3D false; > >> + > >> + spin_lock(&queue->lock); > >> + > >> + /* XXX: This is cumbersome for large queues. */ > >> + list_for_each_entry(ent, &queue->ent_avail_queue, list) { > >> + if (pdu->ring_ent =3D=3D ent) { > >> + found =3D true; > >> + break; > >> + } > >> + } > > > > Do we have to check that the entry is on the ent_avail_queue, or can > > we assume that if the ent->state is FRRS_WAIT, the only queue it'll be > > on is the ent_avail_queue? I see only one case where this isn't true, > > for teardown in fuse_uring_stop_list_entries() - if we had a > > workaround for this, eg having some teardown state signifying that > > io_uring_cmd_done() needs to be called on the cmd and clearing > > FRRS_WAIT, then we could get rid of iteration through ent_avail_queue > > for every cancelled cmd. > > > I'm scared that we would run into races - I don't want to access memory > pointed to by pdu->ring_ent, before I'm not sure it is on the list. Oh, I was seeing that we mark the cmd as cancellable (eg in fuse_uring_prepare_cancel()) only after the ring_ent is moved to the ent_avail_queue (in fuse_uring_ent_avail()) and that this is done in the scope of the queue->lock, so we would only call into fuse_uring_cancel() when the ring_ent is on the list. Could there still be a race condition here? Alternatively, inspired by your "bool valid;" idea below, maybe another solution would be having a bit in "struct fuse_ring_ent" tracking if io_uring_cmd_done() needs to be called on it? This is fairly unimportant though - this part could always be optimized in a future patchset if you think it needs to be optimized, but was just curious if these would work. Thanks, Joanne > Remember the long discussion Miklos and I had about the tiny 'tag' > variable and finding requests using existing hash lists [0] ? > Personally I would prefer an array of > > struct queue_entries { > struct fuse_ring_ent *ring_ent; > bool valid; > } > > > in struct fuse_ring_queue { > ... > struct queue_entries *entries[] > } > > And that array would only get freed on queue destruction. Besides > avoiding hash lists, it would also allow to use 'valid' to know if > we can access the ring_entry and then check the state. > > Thanks, > Bernd > > > [0] https://lore.kernel.org/linux-fsdevel/CAJfpegu_UQ1BNp0UDHeOZFWwUoXbJ_= LP4W=3Do+UX6MB3DsJbH8g@mail.gmail.com/T/#t