From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-il1-f178.google.com (mail-il1-f178.google.com [209.85.166.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C0D64E573 for ; Mon, 15 Sep 2025 18:41:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.178 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757961690; cv=none; b=bpLlsJbzxaBoQMYA+2N9ZEMea/sNJMHnmi3hN5gqIyhI60oPLd/pyMftzMqOEPGeYHxKy6B7D5WwAWhP3OWsusr8r9x9g9mYinRfV96zKd3rWNshzv8sg9Pr2xFCBP5SvtQoONBb1z3UZcTiWE12/iy0L0lKSOceUBj8ccl3/3A= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757961690; c=relaxed/simple; bh=m+gQ3dajKNM4kPuUVbi2LpaZ7KdqYsad9CRgy6wYe28=; h=Message-ID:Date:MIME-Version:Subject:To:References:From: In-Reply-To:Content-Type; b=UUSZfMZ1z27okc/PzpaIGD/8ZIByZKa0iMHN1hagsVfFFR+nqi4iHNgyXWilTu+hX6xD7Gcku3zXayaJhVHz/I/aD71ifUy6ZvbH8x/apevO594H5Wh/qQXIWqzHXr87zx5+8BoYQbZjn/jXIZNvBym3VWxbtbJd25Xnhl7DjKg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=USAGgIRa; arc=none smtp.client-ip=209.85.166.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="USAGgIRa" Received: by mail-il1-f178.google.com with SMTP id e9e14a558f8ab-40809f7ffc7so21906345ab.3 for ; Mon, 15 Sep 2025 11:41:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1757961684; x=1758566484; darn=vger.kernel.org; h=content-transfer-encoding:in-reply-to:from:content-language :references:to:subject:user-agent:mime-version:date:message-id:from :to:cc:subject:date:message-id:reply-to; bh=73BHzxlGQ2oAWsvNFV/kvYu1Qi+79ancLpPmgGlkXOA=; b=USAGgIRaUeXI9Tff4sx0KZLb830Tl/DUsFNhnrocKbhbfeXxZ71FEIS5ZA5/2Rk+FU gHOCW6pnJSZ0qSrSviu25vTi+fMIDyGoL+fGSopLevrtlusII9jVAw1sewjubWhWV3jf v0jEbmMGLUPDuFNts/ucO1UyceG6rb0MTm+HyGfK/JDHv9DelzyFu3YDyLoiEAgvUmLy clSBDht+GXOiV1bhTrmb1g/96E/sf5u88M2FLsiHhXZyjqhHmQU1kIpQERGlzHydrwCy 6w0PMY8LxnAO2xiEsGmYOKuWgf6EybJzbyywkqbCSIGQNW65vN3aP84TfdPXadpp1dKM 6s0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757961684; x=1758566484; h=content-transfer-encoding:in-reply-to:from:content-language :references:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=73BHzxlGQ2oAWsvNFV/kvYu1Qi+79ancLpPmgGlkXOA=; b=r8AvDDZp5+afSNlc1H2ENS6I9eyjzAUzJfayVzC95SBoAH1RCDQ4V9Jm0vmsDok38J bqxMFR0AMWq8O0hwJtsaYeKxVEB8CG0iR1Ldfn9aW2euk/sc8+q/p4Ac3gBz4qJuWFEl xWzGcBJFqpg8b6jGi4oAd2dutf9L1Is4nefqTP6qPiv587ZrzvycXUXDltQse497dg/4 f8k/y+JPtZYsoNKO9jdJE4UkhWvlivu6CReWEYm3J+5g81UzZK5qDUqE3kLQpRkP116x FxHDU5dB0oRUFYYp1q6fVYXauzOGlzsqiDV0Q4phDlwO0rbAKmcNJJXqTgexrwEO1sFx lYJw== X-Forwarded-Encrypted: i=1; AJvYcCUnYTyryTk6ll0BFlMGPyrxFm4I7TZ3EKUVQ4TRlXX9E9yIewZ3f8HfEuENM9qSiJCkUkG3pA8cxA==@vger.kernel.org X-Gm-Message-State: AOJu0YxX+fy3dcvzXeiefSmjaFLiI9eGiP1D+Ii6kWgopBu0HZWKDZ3v IiLpNXKVH22Ft4I4wbHyPy5ld4KNVY3xNLHfB5bflNElYNRE52id5BUT8Lt1kFHFdb3zxO0XfQD EsoQc X-Gm-Gg: ASbGncs7K/GQ/24KALB5Hg06tOGn7ipo/YRXXA4YcGXhDjD2+iTKSnOm2CLVXgHt9xr +/sEOxS09+e+pbmvnCPguG45QYqfpp8oys1miy2zvAT2/oUDtvV2tLfcymSkP1pG9TeTq+ANdgG YnDKwZ7oM2ZtsPXD8P6hpSF4wuZcjWi9f6Y7pmp2aqOGlvECESIhuslp2KJpIvGjDPkPLYTBPpm /tXokswunNpNRD1iSH5pzI2sYMsRJgRRmX/nbH1SYwtV5n+TvPOs2ECrm4tmor4qbueI4UN204w WZ21pT8wCV4WWxX4lVBHCX2BTZHDQyMiHF2LYANHxp785VRlGe/wg3imwT6pPxdAFHXhASEtxQD plrl3DazlaJNDhwkePx4e862R3E7dONmExyeN1zId X-Google-Smtp-Source: AGHT+IEyELtb67xaPJYzhM17YEB6GjuJOfuSmifYaXXyrwqPBJ8BsUyKD1Xkew1fNZP1UXyVXL41HQ== X-Received: by 2002:a05:6e02:1909:b0:415:2b3d:651a with SMTP id e9e14a558f8ab-4209d40fef6mr128107275ab.7.1757961684519; Mon, 15 Sep 2025 11:41:24 -0700 (PDT) Received: from [192.168.1.116] ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id e9e14a558f8ab-423fdd050f3sm27283825ab.38.2025.09.15.11.41.23 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 15 Sep 2025 11:41:24 -0700 (PDT) Message-ID: <3acf3cdc-8ace-42e6-a8a8-974442d98092@kernel.dk> Date: Mon, 15 Sep 2025 12:41:23 -0600 Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH for-next] io_uring/query: check for loops in in_query() To: Pavel Begunkov , io-uring References: <2a4811de-1e35-428d-8784-a162c0e4ea8f@kernel.dk> <6e347e14-9549-4025-bc99-d184f8244446@gmail.com> Content-Language: en-US From: Jens Axboe In-Reply-To: <6e347e14-9549-4025-bc99-d184f8244446@gmail.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit On 9/11/25 5:40 AM, Pavel Begunkov wrote: > On 9/11/25 10:02, Pavel Begunkov wrote: >> On 9/11/25 01:13, Jens Axboe wrote: >>> io_query() loops over query items that the application or liburing >>> passes in. But it has no checking for a max number of items, or if a >>> loop could be present. If someone were to do: >>> >>> struct io_uring_query_hdr hdr1, hdr2, hdr3; >>> >>> hdr3.next_entry = &hdr1; >>> hdr2.next_entry = &hdr3; >>> hdr1.next_entry = &hdr2; >>> >>> io_uring_register(fd, IORING_REGISTER_QUERY, &hdr1, 0); >>> >>> then it'll happily loop forever and process hdr1 -> hdr2 -> hdr3 and >>> then loop back to hdr1. >>> >>> Add a max cap for these kinds of cases, which is arbitrarily set to >>> 1024 as well. Since there's now a cap, it seems that it would be saner >>> to have this interface return the number of items processed. Eg 0..N >>> for success, and < 0 for an error. Then if someone does need to query >>> more than the supported number of items, they can do so iteratively. >> >> That worsens usability. The user would have to know / count how >> many entries there was in the first place, retry, and do all >> handling. It'll be better to: >> >> if (nr > (1U << 20)) >> return -ERANGE; >> if (fatal_signal_pending()) >> return -EINTR; >> ... >> return 0; >> >> >> 1M should be high enough for future proofing and to protect from >> mildly insane users (and would still be fast enough). I also had >> cond_resched() in some version, but apparently it got lost as >> well. > > Tested the diff below, works well enough. 1M breaks out after a > second even in a very underpowered VM. Honestly I'm not sure which of the two I dislike more, I think both are not great in terms of API. In practice, nobody is going to ask for 1000 entries. In practice, people will do one at the time. At the same time, I do like having the ability to process multiple in one syscall, even if it doesn't _really_ matter. Normally for interfaces like that, returning number of processed is the right approach. Eg when you get a signal or run into an error, you know where that happened. At the same time, it's also a pain in the butt to use for an application if it did to hundreds of then. But let's be real, it will not. It'll do a a handful at most, and then it's pretty clear where to continue. The only real error here would be -EINTR, as anything would be the applications fault because it's dumb or malicious, hence the only thing it'd do is submit the whole thing again. It's not like it's going to say "oh I got 2, which is less than the 5, let me restart at 3". But it now might have to, because it doesn't know what the error is. Anyway, that's a long winded way of saying I kind of hate needing any kind of limit, but at least with returning the number of entries processed, we can make the limit low and meaningful rather than some random high number "which is surely enough for everyone" with the sole idea behind that being that we need to be able to detect loops. And even if the interface is idempotent, it's still kind of silly to need to redo everything in case of a potentially valid error like an -EINVAL. Are we perhaps better off just killing this linking and just doing single items at the time? That nicely avoids any issues related to this, makes the 0/-ERROR interface sane, and makes the app interface simpler too. The only downside is needing a few more syscalls at probe time, which is not something worth optimizing for imho. -- Jens Axboe