Date: Thu, 20 Apr 2023 16:28:17 +0100
Subject: Re: [PATCHSET 0/6] Enable NO_OFFLOAD support
To: Jens Axboe, io-uring@vger.kernel.org
Cc: luhongfei@vivo.com
References: <20230419162552.576489-1-axboe@kernel.dk>
 <1f57b637-e0b5-2954-fa34-ff2672f55787@gmail.com>
 <3a273417-762c-da28-b918-e79eae0dc3f4@kernel.dk>
From: Pavel Begunkov
In-Reply-To: <3a273417-762c-da28-b918-e79eae0dc3f4@kernel.dk>

On 4/20/23 16:08, Jens Axboe wrote:
> On 4/19/23 6:43 PM, Pavel Begunkov wrote:
>> On 4/19/23 17:25, Jens Axboe wrote:
>>> Hi,
>>>
>>> This series enables support for forcing no-offload for requests that
>>> would otherwise have been punted to io-wq. In essence, it bypasses
>>> the normal non-blocking issue in favor of just letting the issue
>>> block. This is only done for requests that would otherwise have hit
>>> io-wq in the offload path; anything pollable will still do
>>> non-blocking issue. See patch 3 for details.
>>
>> That's shooting ourselves in the foot.
>>
>> 1) It has never been easier to lock up userspace. Userspace might be
>> able to deal with simple cases like read(pipe) + write(pipe), though
>> even that, in a complex enough framework, would cause debugging and
>> associated headaches.
>>
>> Now let's assume that userspace submits nvme passthrough requests,
>> exhausts the tags, and a request is left waiting there. To make
>> progress, one of the previous requests has to complete, but completing
>> it only queues task work, which will never be run with DEFER_TASKRUN.
>>
>> It's not enough for userspace to be careful: with DEFER_TASKRUN there
>> will always be a chance of getting locked up.
>>
>> 2) It's not limited to the requests we're submitting; it also affects
>> already queued async requests. Inline submission holds uring_lock, so
>> if an io-wq thread needs to grab a registered file for a request, it
>> will io_ring_submit_lock() and wait until the submission ends. The
>> same goes for provided buffers and some other cases.
>>
>> Even task exit will actively try to grab the lock.
>
> One thing I pondered was making the inline submissions similar to
> io-wq submissions - eg don't hold uring_lock over them. To make that
> useful, I suspect we'd want to prep all SQ entries upfront, and then
> drop the lock for submission.

That would require changing the completion caches (ctx->submit_state),
either by allowing multiple of them or by limiting, through some other
means, to a single inline submitter. It would also probably bring
request refcounting back, and DEFER_TASKRUN would probably need to
retake the lock for execution, unless there are magic tricks around it.
Not an easy task if we don't want to hurt performance.

> We'd also want to make this mutually exclusive with IOPOLL, obviously.
> It doesn't make any sense to do for IOPOLL anyway, but it needs to be
> explicitly disallowed.

-- 
Pavel Begunkov
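
To make the read(pipe) + write(pipe) lockup described above concrete,
here is a minimal userspace sketch of the scenario. It assumes the
series exposes the behaviour through a setup flag, referred to here as
IORING_SETUP_NO_OFFLOAD (left commented out, since it is not in
released headers); everything else is standard liburing.

/*
 * Sketch of the pipe self-deadlock: with blocking inline issue, the
 * read below never gets punted to io-wq, so io_uring_submit() blocks
 * before the write that would satisfy it is ever issued.
 */
#include <liburing.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_params p;
	int fds[2];
	char buf[16];

	if (pipe(fds))
		return 1;

	memset(&p, 0, sizeof(p));
	/* DEFER_TASKRUN (requires SINGLE_ISSUER), as in the discussion above */
	p.flags = IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
	/* p.flags |= IORING_SETUP_NO_OFFLOAD;   assumed flag from this series */
	if (io_uring_queue_init_params(8, &ring, &p))
		return 1;

	/* SQE 1: read from the still-empty pipe; would normally go to io-wq */
	struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
	io_uring_prep_read(sqe, fds[0], buf, sizeof(buf), 0);

	/* SQE 2: write to the same pipe, the only thing that can satisfy SQE 1 */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_write(sqe, fds[1], "x", 1, 0);

	/*
	 * With no-offload enabled, the read is issued blocking and inline,
	 * so this call never returns to issue the write: the task has
	 * deadlocked on its own ring.
	 */
	io_uring_submit(&ring);

	io_uring_queue_exit(&ring);
	return 0;
}

With the assumed flag left out, the program completes normally because
the read is punted to io-wq; that difference is the behavioural change
being debated in this thread.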