From: Jens Axboe
Date: Mon, 10 Aug 2020 19:25:11 -0600
Subject: Re: [PATCH 2/2] io_uring: use TWA_SIGNAL for task_work if the task isn't running
To: Jann Horn
Cc: Peter Zijlstra, io-uring, stable, Josef, Oleg Nesterov

On 8/10/20 4:41 PM, Jann Horn wrote:
> On Tue, Aug 11, 2020 at 12:01 AM Jens Axboe wrote:
>> On 8/10/20 3:28 PM, Jens Axboe wrote:
>>> On 8/10/20 3:26 PM, Jann Horn wrote:
>>>> On Mon, Aug 10, 2020 at 11:12 PM Jens Axboe wrote:
>>>>> On 8/10/20 3:10 PM, Peter Zijlstra wrote:
>>>>>> On Mon, Aug 10, 2020 at 03:06:49PM -0600, Jens Axboe wrote:
>>>>>>
>>>>>>> should work as far as I can tell, but I don't even know if there's a
>>>>>>> reliable way to do task_in_kernel().
>>>>>>
>>>>>> Only on NOHZ_FULL, and tracking that is one of the things that makes it
>>>>>> so horribly expensive.
>>>>>
>>>>> Probably no other way than to bite the bullet and just use TWA_SIGNAL
>>>>> unconditionally...
>>>>
>>>> Why are you trying to avoid using TWA_SIGNAL? Is there a specific part
>>>> of handling it that's particularly slow?
>>>
>>> Not particularly slow, but it's definitely heavier than TWA_RESUME. And
>>> as we're driving any pollable async IO through this, just trying to
>>> ensure it's as light as possible.
>>>
>>> It's not a functional thing, just efficiency.
>>
>> Ran some quick testing in a vm, which is worst case for this kind of
>> thing as any kind of mucking with interrupts is really slow. And the hit
>> is substantial. Though with the below, we're basically at parity again.
>> Just for discussion...
>>
>>
>> diff --git a/kernel/task_work.c b/kernel/task_work.c
>> index 5c0848ca1287..ea2c683c8563 100644
>> --- a/kernel/task_work.c
>> +++ b/kernel/task_work.c
>> @@ -42,7 +42,8 @@ task_work_add(struct task_struct *task, struct callback_head *work, int notify)
>>  		set_notify_resume(task);
>>  		break;
>>  	case TWA_SIGNAL:
>> -		if (lock_task_sighand(task, &flags)) {
>> +		if (!(task->jobctl & JOBCTL_TASK_WORK) &&
>> +		    lock_task_sighand(task, &flags)) {
>>  			task->jobctl |= JOBCTL_TASK_WORK;
>>  			signal_wake_up(task, 0);
>>  			unlock_task_sighand(task, &flags);
>
> I think that should work in theory, but if you want to be able to do a
> proper unlocked read of task->jobctl here, then I think you'd have to
> use READ_ONCE() here and make all existing writes to ->jobctl use
> WRITE_ONCE().
>
> Also, I think that to make this work, stuff like get_signal() will
> need to use memory barriers to ensure that reads from ->task_works are
> ordered after ->jobctl has been cleared - ideally written such that on
> the fastpath, the memory barrier doesn't execute.

I wonder if it's possible to just make it safe for the io_uring case,
since a bigger change would make this performance regression persistent
in this release... Would still require the split add/notification patch,
but that one is trivial.

-- 
Jens Axboe
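
For illustration only, and not part of the patch in this thread: a minimal
userspace sketch of the ordering concern raised in the quoted review, written
with C11 atomics rather than kernel primitives. Every name here (struct task,
struct work, add_work(), run_pending(), FLAG_WORK_PENDING) is a hypothetical
stand-in, and sequentially consistent atomics are assumed in place of
READ_ONCE()/WRITE_ONCE() plus explicit barriers. The producer publishes the
work item before its unlocked read of the flag; the consumer clears the flag
before it reads the work list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_WORK_PENDING 1u            /* stand-in for JOBCTL_TASK_WORK */

struct work {
	struct work *next;
	void (*fn)(struct work *);
	bool done;
};

struct task {
	_Atomic(struct work *) works;   /* stand-in for ->task_works */
	atomic_uint flags;              /* stand-in for ->jobctl */
	atomic_flag lock;               /* stand-in for the sighand lock */
	unsigned int wakeups;           /* slow-path notifications, under lock */
};

static void task_init(struct task *t)
{
	atomic_init(&t->works, (struct work *)NULL);
	atomic_init(&t->flags, 0u);
	atomic_flag_clear(&t->lock);
	t->wakeups = 0;
}

static void task_lock(struct task *t)
{
	while (atomic_flag_test_and_set(&t->lock))
		;                       /* spin until the lock is free */
}

static void task_unlock(struct task *t)
{
	atomic_flag_clear(&t->lock);
}

/* Sample callback: records that the item ran. */
static void mark_done(struct work *w)
{
	w->done = true;
}

/* Queue work; returns true if the locked notification path was taken. */
static bool add_work(struct task *t, struct work *w)
{
	struct work *head;

	assert(w != NULL);

	/* Publish the item first. The seq_cst CAS orders it before the
	 * flag read below. */
	head = atomic_load(&t->works);
	do {
		w->next = head;
	} while (!atomic_compare_exchange_weak(&t->works, &head, w));

	/* Unlocked fast path: if the flag is still set, the consumer has
	 * not cleared it yet and will see this item when it does. */
	if (atomic_load(&t->flags) & FLAG_WORK_PENDING)
		return false;

	task_lock(t);
	atomic_fetch_or(&t->flags, FLAG_WORK_PENDING);
	t->wakeups++;                   /* models the wakeup */
	task_unlock(t);
	return true;
}

/* Consumer: clear the flag, then take the list. Returns items run. */
static unsigned int run_pending(struct task *t)
{
	struct work *w, *next;
	unsigned int n = 0;

	task_lock(t);
	atomic_fetch_and(&t->flags, ~FLAG_WORK_PENDING);
	task_unlock(t);

	/* seq_cst exchange: this read cannot move before the clear above. */
	w = atomic_exchange(&t->works, (struct work *)NULL);
	for (; w; w = next) {
		next = w->next;
		w->fn(w);
		n++;
	}
	return n;
}
```

Calling add_work() twice before run_pending() takes the lock only once; the
second call returns false from the unlocked check. This sketch pays for a
full barrier on every path, so it does not attempt the barrier-free fastpath
suggested in the review.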