Date: Mon, 5 Sep 2022 10:12:28 +0200
From: Michal Hocko
To: Suren Baghdasaryan
Cc: Kent Overstreet, Mel Gorman, Peter Zijlstra, Andrew Morton,
    Vlastimil Babka, Johannes Weiner, Roman Gushchin, Davidlohr Bueso,
    Matthew Wilcox, "Liam R. Howlett", David Vernet, Juri Lelli,
    Laurent Dufour, Peter Xu, David Hildenbrand, Jens Axboe,
    mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org,
    changbin.du@intel.com, ytcoode@gmail.com, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Benjamin Segall,
    Daniel Bristot de Oliveira, Valentin Schneider, Christopher Lameter,
    Pekka Enberg, Joonsoo Kim, 42.hyeyoo@gmail.com, Alexander Potapenko,
    Marco Elver, Dmitry Vyukov, Shakeel Butt, Muchun Song, arnd@arndb.de,
    jbaron@akamai.com, David Rientjes, Minchan Kim, Kalesh Singh,
    kernel-team, linux-mm, iommu@lists.linux.dev,
    kasan-dev@googlegroups.com, io-uring@vger.kernel.org,
    linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org,
    linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, LKML
Subject: Re: [RFC PATCH 00/30] Code tagging framework and applications
References: <20220830214919.53220-1-surenb@google.com>
    <20220831084230.3ti3vitrzhzsu3fs@moria.home.lan>
    <20220831101948.f3etturccmp5ovkl@suse.de>
    <20220831190154.qdlsxfamans3ya5j@moria.home.lan>
X-Mailing-List: io-uring@vger.kernel.org

On Sun 04-09-22 18:32:58, Suren Baghdasaryan wrote:
> On Thu, Sep 1, 2022 at 12:15 PM Michal Hocko wrote:
[...]
> > Yes, tracking back the call trace would be really needed. The question
> > is whether this is really prohibitively expensive. How much overhead are
> > we talking about? There is no free lunch here, really.
> > You either have the overhead during runtime when the feature is used
> > or on the source code level for all the future development (with a
> > maze of macros and wrappers).
>
> As promised, I profiled a simple code that repeatedly makes 10
> allocations/frees in a loop and measured overheads of code tagging,
> call stack capturing and tracing+BPF for page and slab allocations.
> Summary:
>
> Page allocations (overheads are compared to get_free_pages() duration):
> 6.8% Codetag counter manipulations (__lazy_percpu_counter_add + __alloc_tag_add)
> 8.8% lookup_page_ext
> 1237% call stack capture
> 139% tracepoint with attached empty BPF program

Yes, I am not surprised that the call stack capturing is really
expensive compared to the allocator fast path (which is really highly
optimized and I suspect that with a 10 allocation/free loop you mostly
get your memory from the pcp lists). Is this overhead still _that_
visible for somewhat less micro-optimized workloads which have to take
slow paths as well?

Also what kind of stack unwinder is configured (I guess ORC)? This is
not my area but from what I remember the unwinder overhead varies
between ORC and FP.

And just to make it clear. I do realize that an overhead from the stack
unwinding is unavoidable. And code tagging would logically have lower
overhead as it performs much less work. But the main point is whether
our existing stack unwinding approach is really prohibitively expensive
to be used for debugging purposes on production systems. I might
misremember but I recall people having bigger concerns with page_owner
memory footprint than the actual stack unwinder overhead.
-- 
Michal Hocko
SUSE Labs
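
To put the quoted percentages in perspective, here is a rough arithmetic
sketch. The four overhead figures are taken from the summary quoted above;
everything else is an assumption made for illustration only. In particular,
diluted_overhead() assumes the absolute cost of an instrumentation step stays
constant while the allocation itself becomes slower (for example on a slow
path), and the slowdown factor passed to it is hypothetical, not a
measurement.

```python
# Overheads quoted in the thread, as a percentage of get_free_pages() duration.
overheads_pct = {
    "codetag counters": 6.8,
    "lookup_page_ext": 8.8,
    "call stack capture": 1237.0,
    "tracepoint + empty BPF": 139.0,
}


def slowdown(overhead_pct):
    """Total cost relative to the bare allocation (1.0 means no overhead)."""
    return 1.0 + overhead_pct / 100.0


def relative_cost(a, b):
    """How many times larger the overhead of step a is than that of step b."""
    return overheads_pct[a] / overheads_pct[b]


def diluted_overhead(overhead_pct, alloc_slowdown):
    """Relative overhead when the allocation is alloc_slowdown times slower.

    Assumption (not from the thread): the absolute cost of the
    instrumentation does not change with the allocation path taken.
    """
    return overhead_pct / alloc_slowdown


if __name__ == "__main__":
    for name, pct in overheads_pct.items():
        print(f"{name:24s} {pct:7.1f}%  {slowdown(pct):6.2f}x baseline")

    ratio = relative_cost("call stack capture", "codetag counters")
    print(f"stack capture vs codetag counters: about {ratio:.0f}x")

    # Hypothetical: an allocation that is 10 times slower than the fast path.
    pct = diluted_overhead(overheads_pct["call stack capture"], 10.0)
    print(f"stack capture overhead on a 10x slower allocation: {pct:.1f}%")
```

Under that constant-cost assumption the relative overhead shrinks in
proportion to how much slower the allocation is, which is the reason for
asking about workloads that leave the fast path.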