Subject: Re: [PATCH RFC 0/9] nxt propagation + locking optimisation
To: Pavel Begunkov, io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
From: Jens Axboe
Message-ID: <2f974d5e-8aa4-5b12-d70e-668384b4a94c@kernel.dk>
Date: Mon, 2 Mar 2020 07:39:08 -0700

On 3/1/20 1:33 PM, Pavel Begunkov wrote:
> On 01/03/2020 22:14, Jens Axboe wrote:
>> On 3/1/20 9:18 AM, Pavel Begunkov wrote:
>>> There are several independent parts in the patchset, but bundled
>>> to make a point.
>>> 1-2: random stuff that is implicitly used later.
>>> 3-5: restore @nxt propagation
>>> 6-8: optimise locking in io_worker_handle_work()
>>> 9: optimise io_uring refcounting
>>>
>>> The next propagation bits are done similarly as it was before, but
>>> - nxt stealing is now at top-level, but not hidden in handlers
>>> - ensure there is no with REQ_F_DONT_STEAL_NEXT
>>>
>>> [6-8] is the reason to dismiss the previous @nxt propagation approach,
>>> I didn't find a good way to do the same. Even though it looked
>>> clearer and without new flag.
>>>
>>> Performance tested it with link-of-nops + IOSQE_ASYNC:
>>>
>>> link size: 100
>>> orig: 501 (ns per nop)
>>> 0-8: 446
>>> 0-9: 416
>>>
>>> link size: 10
>>> orig: 826
>>> 0-8: 776
>>> 0-9: 756
>>
>> This looks nice, I'll take a closer look tomorrow or later today. Seems
>> that at least patch 2 should go into 5.6 however, so may make sense to
>> order the series like that.
>
> It's the first one modifying io-wq.c, so should be fine to pick from
> the middle as is.

Yep, just did.

>> BTW, Andres brought up a good point, and that's hashed file write works.
>> Generally they complete super fast (just copying into the page cache),
>> which means that that worker will be hammering the wq lock a lot. Since
>> work N+1 can't make any progress before N completes (since that's how
>> hashed work works), we should pull a bigger batch of these work items
>> instead of just one at a time. I think that'd potentially make a huge
>> difference for the performance of buffered writes.
>
> Flashed the same thought. It should be easy enough for hashed
> requests. Though, general batching would make us think about
> fairness, work stealing, etc.

There's only the one list anyway, so the work is going to be processed
in order to begin with. Hence I don't think there's a lot of fairness to
be worried about here, we're just going to be processing the existing
work in the same order, but more efficiently.
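
A minimal sketch of the batching idea, under stated assumptions: struct work,
struct work_queue, wq_enqueue() and wq_grab_batch() are hypothetical names
made up for illustration, not the io-wq code. It detaches the leading run of
work hashed to one key from a single in-order list, so that a lock protecting
the list would be taken once per batch rather than once per item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued work item (not the io-wq structs). */
struct work {
	struct work *next;
	bool hashed;		/* item must be serialised against its key */
	unsigned int key;	/* hash key, only meaningful if hashed is set */
};

/* Hypothetical single pending list, consumed strictly in order. */
struct work_queue {
	struct work *head;
	struct work *tail;
};

/* Append one item at the tail of the pending list. */
static void wq_enqueue(struct work_queue *wq, struct work *w)
{
	w->next = NULL;
	if (wq->tail)
		wq->tail->next = w;
	else
		wq->head = w;
	wq->tail = w;
}

/*
 * Detach a batch from the front of the list. Real code would call this
 * with the queue lock held, once per batch.
 *
 * - Non-hashed head: the batch is that single item.
 * - Hashed head: the batch is the leading run of items hashed to the
 *   same key, stopping at the first non-hashed item or the first item
 *   hashed to a different key.
 *
 * Returns the NULL-terminated batch and stores its length in *nr.
 */
static struct work *wq_grab_batch(struct work_queue *wq, size_t *nr)
{
	struct work *first = wq->head;
	struct work *last = first;

	assert(nr != NULL);
	*nr = 0;
	if (!first)
		return NULL;

	*nr = 1;
	if (first->hashed) {
		while (last->next && last->next->hashed &&
		       last->next->key == first->key) {
			last = last->next;
			(*nr)++;
		}
	}

	/* Unlink [first, last] from the pending list. */
	wq->head = last->next;
	if (!wq->head)
		wq->tail = NULL;
	last->next = NULL;
	return first;
}
```

Because the walk stops at the first item that does not match, the relative
order of the remaining list is untouched, and whatever follows the batch is
left on the list for another worker to pick up.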

We should be getting both better throughput and fairness if we remove
all items hashed to the same key for that one worker, only stopping if
we encounter a non-hashed work or work hashed to a different key. Because
if we do, if any of that hashed work ever needs to sleep, the next
independent work can proceed in a different worker.

> BTW, what's the point of hashing only heads of a link? Sounds like it
> can lead to the write-write collisions, which it tries to avoid.

Yeah, the linked items should be hashed as well, not sure why that isn't
done.

>> Just throwing it out there, since you're working in that space anyway
>> and the rewards will be much larger.
>
> I will take a look, but not sure when, I yet have some hunches myself.

Thanks!

-- 
Jens Axboe