Date: Thu, 12 Sep 2024 17:38:36 +0100
X-Mailing-List: io-uring@vger.kernel.org
Subject: Re: [PATCH v4 8/8] block: implement async write zero pages command
To: Christoph Hellwig
Cc: io-uring@vger.kernel.org, Jens Axboe, Conrad Meyer, linux-block@vger.kernel.org, linux-mm@kvack.org
From: Pavel Begunkov

On 9/12/24 10:26, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 09:10:34PM +0100, Pavel Begunkov wrote:
>> If we expect any error handling from the user space at all (we do),
>> it'll have to be asynchronous, it's async commands and io_uring.
>> Asking the user to reissue a command in some form is normal.
>
> The point is that pretty much all other errors are fatal, while this
> is a "not supported" for which we have a guaranteed-to-work kernel

Yes, and there will be an error indicating that it's not
supported, just like it'll return an error if io_uring
commands are not supported by a given kernel.

> fallback.  Kicking it off requires a bit of work, but I'd rather have
> that in one place rather than applications that work on some hardware
> and not others.

There is nothing new in features that might be unsupported,
because of hardware or otherwise; it's giving control to the
userspace.

>> That's a shame, I agree, which is why I call it "presumably" faster,
>> but that actually gives more reasons why you might want this cmd
>> separately from write zeroes, considering the user might know
>> its hardware and the kernel doesn't try to choose which approach
>> is faster.
>
> But the kernel is the right place to make that decision, even if we
> aren't very smart about it right now.
> Fanning that out to every
> single application is a bad idea.

Apart that it will never happen

>> Users who know more about hw and e.g. prefer writes with 0 page as
>> per above. Users with lots of devices who care about pcie / memory
>> bandwidth, there are enough of those, they might want to do
>> something different like adjusting algorithms and throttling.
>> Better/easier testing, though of lesser importance.
>>
>> Those I made up just now on the spot, but the reporter did
>> specifically ask about some way to differentiate fallbacks.
>
> Well, an optional nofallback flag would be in line with how we do
> that.  Do you have the original report to share somewhere?

Following with another flag "please do fallback", at which
point it doesn't make any sense when that can be done in
userspace.

--
Pavel Begunkov
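
Not part of the original message: a minimal sketch of the userspace-side policy debated above, where the caller gets a "not supported" error and decides for itself whether to fall back to writing zero-filled buffers. Everything named here is an assumption for illustration. `zero_range`, `offloaded_zero` and `allow_fallback` are hypothetical, the offloaded command is modelled as a plain callable, and the thread does not say which errno is returned, so `EOPNOTSUPP` is a guess.

```python
import errno
import os

ZERO_CHUNK = 64 * 1024  # size of the zero-filled buffer used by the fallback


def zero_range(fd, offset, length, offloaded_zero, allow_fallback=True):
    """Zero [offset, offset + length) and report which path did the work.

    offloaded_zero is a hypothetical callable standing in for the async
    command under discussion; it is assumed to raise
    OSError(errno.EOPNOTSUPP) when the device or kernel cannot do it.
    """
    try:
        offloaded_zero(fd, offset, length)
        return "offload"
    except OSError as exc:
        # Anything other than "not supported" is treated as fatal, and a
        # caller that asked for no fallback gets the error back as well.
        if exc.errno != errno.EOPNOTSUPP or not allow_fallback:
            raise

    # Userspace fallback: write zero-filled buffers explicitly.
    zeros = bytes(ZERO_CHUNK)
    done = 0
    while done < length:
        step = min(ZERO_CHUNK, length - done)
        done += os.pwrite(fd, zeros[:step], offset + done)
    return "fallback"
```

Returning which path ran is one way to let a caller "differentiate fallbacks", for example to throttle or adjust its algorithm when the slower path is in use. A kernel-side fallback with an optional nofallback flag would move the same branch into the kernel instead.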