References: <563138b5-7073-74bc-f0c5-b2bad6277e87@gmail.com> <486c92d0-0f2e-bd61-1ab8-302524af5e08@gmail.com>
From: Andy Lutomirski
Date: Mon, 21 Sep 2020 17:58:20 -0700
Subject: Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
To: Pavel Begunkov
Cc: Andy Lutomirski, Arnd Bergmann, Christoph Hellwig, Al Viro,
    Andrew Morton, Jens Axboe, David Howells, linux-arm-kernel, X86 ML,
    LKML, "open list:MIPS", Parisc List, linuxppc-dev, linux-s390,
    sparclinux, linux-block, Linux SCSI List, Linux FS Devel, linux-aio,
    io-uring@vger.kernel.org, linux-arch, Linux-MM, Network Development,
    keyrings@vger.kernel.org, LSM List
X-Mailing-List: io-uring@vger.kernel.org

On Mon, Sep 21, 2020 at 5:24 PM Pavel Begunkov wrote:
>
>
>
> On 22/09/2020 02:51, Andy Lutomirski wrote:
> > On Mon, Sep 21, 2020 at 9:15 AM Pavel Begunkov wrote:
> >>
> >> On 21/09/2020 19:10, Pavel Begunkov wrote:
> >>> On 20/09/2020 01:22, Andy Lutomirski wrote:
> >>>>
> >>>>> On Sep 19, 2020, at 2:16 PM, Arnd Bergmann wrote:
> >>>>>
> >>>>> On Sat, Sep 19, 2020 at 6:21 PM Andy Lutomirski wrote:
> >>>>>>> On Fri, Sep 18, 2020 at 8:16 AM Christoph Hellwig wrote:
> >>>>>>> On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
> >>>>>>>> Said that, why not provide a variant that would take an explicit
> >>>>>>>> "is it compat" argument and use it there?  And have the normal
> >>>>>>>> one pass in_compat_syscall() to that...
> >>>>>>>
> >>>>>>> That would help to not introduce a regression with this series yes.
> >>>>>>> But it wouldn't fix existing bugs when io_uring is used to access
> >>>>>>> read or write methods that use in_compat_syscall().  One example that
> >>>>>>> I recently ran into is drivers/scsi/sg.c.
> >>>>>
> >>>>> Ah, so reading /dev/input/event* would suffer from the same issue,
> >>>>> and that one would in fact be broken by your patch in the hypothetical
> >>>>> case that someone tried to use io_uring to read /dev/input/event on x32...
> >>>>>
> >>>>> For reference, I checked the socket timestamp handling that has a
> >>>>> number of corner cases with time32/time64 formats in compat mode,
> >>>>> but none of those appear to be affected by the problem.
> >>>>>
> >>>>>> Aside from the potentially nasty use of per-task variables, one thing
> >>>>>> I don't like about PF_FORCE_COMPAT is that it's one-way.  If we're
> >>>>>> going to have a generic mechanism for this, shouldn't we allow a full
> >>>>>> override of the syscall arch instead of just allowing forcing compat
> >>>>>> so that a compat syscall can do a non-compat operation?
> >>>>>
> >>>>> The only reason it's needed here is that the caller is in a kernel
> >>>>> thread rather than a system call. Are there any possible scenarios
> >>>>> where one would actually need the opposite?
> >>>>>
> >>>>
> >>>> I can certainly imagine needing to force x32 mode from a kernel thread.
> >>>>
> >>>> As for the other direction: what exactly are the desired bitness/arch
> >>>> semantics of io_uring?  Is the operation bitness chosen by the io_uring
> >>>> creation or by the io_uring_enter() bitness?
> >>>
> >>> It's rather the second one. Even though AFAIR it wasn't discussed
> >>> specifically, that's how it works now (_partially_).
> >>
> >> Double checked -- I'm wrong, that's the former one. Most of it is based
> >> on a flag that was set at creation.
> >>
> >
> > Could we get away with making io_uring_enter() return -EINVAL (or
> > maybe -ENOTTY?) if you try to do it with bitness that doesn't match
> > the io_uring?  And disable SQPOLL in compat mode?
>
> Something like below. If PF_FORCE_COMPAT or any other solution
> doesn't land by then, I'll take a look whether other io_uring
> syscalls need similar checks, etc.
>
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 0458f02d4ca8..aab20785fa9a 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -8671,6 +8671,10 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
>  	if (ctx->flags & IORING_SETUP_R_DISABLED)
>  		goto out;
>
> +	ret = -EINVAL;
> +	if (ctx->compat != in_compat_syscall())
> +		goto out;
> +

This seems entirely reasonable to me.  Sharing an io_uring ring
between programs with different ABIs seems a bit nutty.

>  	/*
>  	 * For SQ polling, the thread will do all submissions and completions.
>  	 * Just return the requested submit count, and wake the thread if
> @@ -9006,6 +9010,10 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p,
>  	if (ret)
>  		goto err;
>
> +	ret = -EINVAL;
> +	if (ctx->compat)
> +		goto err;
> +

I may be looking at a different kernel than you, but aren't you
preventing creating an io_uring regardless of whether SQPOLL is
requested?

>  	/* Only gets the ring fd, doesn't install it in the file table */
>  	fd = io_uring_get_fd(ctx, &file);
>  	if (fd < 0) {
> --
> Pavel Begunkov
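
To make the two questions above concrete, here is a rough userspace
model of the checks, not kernel code and not a proposed patch. All of
the names (model_ctx, MODEL_SETUP_SQPOLL, the model_* functions) are
made up for illustration, and the SQPOLL-only variant is only my
reading of "disable SQPOLL in compat mode", so treat it as an
assumption:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the setup flag; not the real uapi definition. */
#define MODEL_SETUP_SQPOLL (1u << 1)

/* Minimal model of the ring context: only the fields discussed here. */
struct model_ctx {
	bool compat;		/* ABI recorded when the ring was created */
	unsigned int flags;	/* setup flags requested at creation */
};

/* Enter-time check: reject a caller whose ABI differs from the ring's. */
int model_enter_check(const struct model_ctx *ctx, bool caller_compat)
{
	if (ctx->compat != caller_compat)
		return -EINVAL;
	return 0;
}

/* Create-time check as written in the quoted hunk: rejects any compat ring. */
int model_create_check_as_posted(const struct model_ctx *ctx)
{
	if (ctx->compat)
		return -EINVAL;
	return 0;
}

/* Create-time check narrowed to compat rings that also ask for SQPOLL. */
int model_create_check_sqpoll_only(const struct model_ctx *ctx)
{
	if (ctx->compat && (ctx->flags & MODEL_SETUP_SQPOLL))
		return -EINVAL;
	return 0;
}

/* Small self-check showing where the two create-time variants differ. */
void model_selftest(void)
{
	struct model_ctx compat_plain = { .compat = true, .flags = 0 };

	/* Posted hunk refuses a compat ring even without SQPOLL... */
	assert(model_create_check_as_posted(&compat_plain) == -EINVAL);
	/* ...while the narrowed variant lets it through. */
	assert(model_create_check_sqpoll_only(&compat_plain) == 0);
}
```

The only case where the two create-time variants disagree is a compat
ring without SQPOLL, which is exactly the case I was asking about.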