Subject: Re: [PATCH RFC v2] io_uring: limit inflight IO
From: Pavel Begunkov
To: Jens Axboe, io-uring@vger.kernel.org
References: <6945baa5-7539-1d67-fcb5-5642e81994a8@gmail.com>
Message-ID: <614016e6-9e94-0b26-1652-8ce140952d28@gmail.com>
Date: Sat, 9 Nov 2019 15:28:33 +0300
In-Reply-To: <6945baa5-7539-1d67-fcb5-5642e81994a8@gmail.com>

On 11/9/2019 2:01 PM, Pavel Begunkov wrote:
> On 09/11/2019 00:10, Jens Axboe wrote:
>> Here's a modified version for discussion. Instead of sizing this on the
>> specific ring, just size it based on the max allowable CQ ring size. I
>> think this should be safer, and won't risk breaking existing use cases
>> out there.
>>
>> The reasoning here is that we already limit the ring sizes globally;
>> they are under ulimit memlock protection. With this on top, we have
>> some sort of safeguard for the system as a whole as well, whereas
>> before we had none. Even a small ring size can keep queuing IO.
>>
>> Let me know what you guys think...
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 29ea1106132d..0d8c3b1612af 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -737,6 +737,25 @@ static struct io_kiocb *io_get_fallback_req(struct io_ring_ctx *ctx)
>>  	return NULL;
>>  }
>>  
>> +static bool io_req_over_limit(struct io_ring_ctx *ctx)
>> +{
>> +	unsigned inflight;
>> +
>> +	/*
>> +	 * This doesn't need to be super precise, so only check every once
>> +	 * in a while.
>> +	 */
>> +	if (ctx->cached_sq_head & ctx->sq_mask)
>> +		return false;
>> +
>> +	/*
>> +	 * Use 2x the max CQ ring size
>> +	 */
>> +	inflight = ctx->cached_sq_head -
>> +			(ctx->cached_cq_tail + atomic_read(&ctx->cached_cq_overflow));
>> +	return inflight >= 2 * IORING_MAX_CQ_ENTRIES;
>> +}
> 
> ctx->cached_cq_tail is protected by ctx->completion_lock and can be
> changed asynchronously. That's an unsynchronised access, so formally
> it (probably) breaks the memory model.
> 
> Stale values shouldn't be a problem here, but anyway.
> 
I took a glance, and it seems access to cached_cq_tail is already done
without synchronisation in other places. Am I missing something?
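
Just to illustrate what I mean (an untested sketch on top of your diff,
assuming we keep the lockless check): the racy read could at least be
annotated, with a matching WRITE_ONCE() wherever cached_cq_tail is
updated under ->completion_lock:

static bool io_req_over_limit(struct io_ring_ctx *ctx)
{
	unsigned inflight;

	/* as before, only check once per SQ ring's worth of submissions */
	if (ctx->cached_sq_head & ctx->sq_mask)
		return false;

	/*
	 * Lockless peek at the CQ tail: annotate the read so the data
	 * race is explicit and intentional.
	 */
	inflight = ctx->cached_sq_head -
			(READ_ONCE(ctx->cached_cq_tail) +
			 atomic_read(&ctx->cached_cq_overflow));
	return inflight >= 2 * IORING_MAX_CQ_ENTRIES;
}
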
> 
>> +
>>  static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
>>  				   struct io_submit_state *state)
>>  {
>> @@ -744,9 +763,11 @@ static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
>>  	struct io_kiocb *req;
>>  
>>  	if (!percpu_ref_tryget(&ctx->refs))
>> -		return NULL;
>> +		return ERR_PTR(-ENXIO);
>>  
>>  	if (!state) {
>> +		if (unlikely(io_req_over_limit(ctx)))
>> +			goto out_limit;
>>  		req = kmem_cache_alloc(req_cachep, gfp);
>>  		if (unlikely(!req))
>>  			goto fallback;
>> @@ -754,6 +775,8 @@ static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
>>  		size_t sz;
>>  		int ret;
>>  
>> +		if (unlikely(io_req_over_limit(ctx)))
>> +			goto out_limit;
>>  		sz = min_t(size_t, state->ios_left, ARRAY_SIZE(state->reqs));
>>  		ret = kmem_cache_alloc_bulk(req_cachep, gfp, sz, state->reqs);
>>  
>> @@ -789,8 +812,9 @@ static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
>>  	req = io_get_fallback_req(ctx);
>>  	if (req)
>>  		goto got_it;
>> +out_limit:
>>  	percpu_ref_put(&ctx->refs);
>> -	return NULL;
>> +	return ERR_PTR(-EBUSY);
>>  }
>>  
>>  static void io_free_req_many(struct io_ring_ctx *ctx, void **reqs, int *nr)
>> @@ -3016,9 +3040,9 @@ static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr,
>>  		unsigned int sqe_flags;
>>  
>>  		req = io_get_req(ctx, statep);
>> -		if (unlikely(!req)) {
>> +		if (unlikely(IS_ERR(req))) {
>>  			if (!submitted)
>> -				submitted = -EAGAIN;
>> +				submitted = PTR_ERR(req);
>>  			break;
>>  		}
>>  		if (!io_get_sqring(ctx, &req->submit)) {
>> @@ -3039,8 +3063,10 @@ static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr,
>>  		if (link && (sqe_flags & IOSQE_IO_DRAIN)) {
>>  			if (!shadow_req) {
>>  				shadow_req = io_get_req(ctx, NULL);
>> -				if (unlikely(!shadow_req))
>> +				if (unlikely(IS_ERR(shadow_req))) {
>> +					shadow_req = NULL;
>>  					goto out;
>> +				}
>>  				shadow_req->flags |= (REQ_F_IO_DRAIN | REQ_F_SHADOW_DRAIN);
>>  				refcount_dec(&shadow_req->refs);
>>  			}
>> 
>
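
On the userspace side, assuming the -EBUSY from io_submit_sqes() really
does propagate out of io_uring_enter() as in the patch, I'd expect
submitters to back off by reaping completions and retrying. A rough
sketch with liburing (submit_with_backpressure is just a made-up helper
name):

/* Retry submission when the kernel reports too much inflight IO. */
static int submit_with_backpressure(struct io_uring *ring)
{
	struct io_uring_cqe *cqe;
	int ret;

	for (;;) {
		ret = io_uring_submit(ring);
		if (ret != -EBUSY)
			return ret;	/* submitted count, or another error */

		/* Over the inflight limit: drain one completion, then retry. */
		ret = io_uring_wait_cqe(ring, &cqe);
		if (ret < 0)
			return ret;
		io_uring_cqe_seen(ring, cqe);
	}
}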