Am 24.01.20 um 17:58 schrieb Jens Axboe: > On 1/24/20 3:38 AM, Stefan Metzmacher wrote: >> Am 22.01.20 um 10:26 schrieb Greg Kroah-Hartman: >>> From: Jens Axboe >>> >>> commit 44d282796f81eb1debc1d7cb53245b4cb3214cb5 upstream. >>> >>> If the credentials or the mm doesn't match, don't allow the task to >>> submit anything on behalf of this ring. The task that owns the ring can >>> pass the file descriptor to another task, but we don't want to allow >>> that task to submit an SQE that then assumes the ring mm and creds if >>> it needs to go async. >>> >>> Cc: stable@vger.kernel.org >>> Suggested-by: Stefan Metzmacher >>> Signed-off-by: Jens Axboe >>> Signed-off-by: Greg Kroah-Hartman >>> >>> >>> --- >>> fs/io_uring.c | 6 ++++++ >>> 1 file changed, 6 insertions(+) >>> >>> --- a/fs/io_uring.c >>> +++ b/fs/io_uring.c >>> @@ -3716,6 +3716,12 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned >>> wake_up(&ctx->sqo_wait); >>> submitted = to_submit; >>> } else if (to_submit) { >>> + if (current->mm != ctx->sqo_mm || >>> + current_cred() != ctx->creds) { >>> + ret = -EPERM; >>> + goto out; >>> + } >>> + >> >> I thought about this a bit more. >> >> I'm not sure if this is actually to restrictive, >> because it means applications like Samba won't >> be able to use io-uring at all. >> >> As even if current_cred() and ctx->creds describe the same >> set of uid,gids the != won't ever match again and >> makes the whole ring unuseable. >> >> I'm not sure about what the best short term solution could be... >> >> 1. May just doing the check for path based operations? >> and fail individual requests with EPERM. >> >> 2. Or force REQ_F_FORCE_ASYNC for path based operations, >> so that they're always executed from within the workqueue >> with were ctx->creds is active. >> >> 3. Or (as proposed earlier) do the override_creds/revert_creds dance >> (and similar for mm) if needed. >> >> To summaries the problem again: >> >> For path based operations like: >> - IORING_OP_CONNECT (maybe also - IORING_OP_ACCEPT???) >> - IORING_OP_SEND*, IORING_OP_RECV* on DGRAM sockets >> - IORING_OP_OPENAT, IORING_OP_STATX, IORING_OP_OPENAT2 >> it's important under which current_cred they are called. >> >> Are IORING_OP_MADVISE, IORING_OP_FADVISE and IORING_OP_FALLOCATE >> are only bound to the credentials of the passed fd they operate on? >> >> The current assumption is that the io_uring_setup() syscall captures >> the current_cred() to ctx->cred and all operations on the ring >> are executed under the context of ctx->cred. >> Therefore all helper threads do the override_creds/revert_creds dance. > > But it doesn't - we're expecting them to match, and with this change, > we assert that it's the case or return -EPERM. > >> But the possible non-blocking line execution of operations in >> the io_uring_enter() syscall doesn't do the override_creds/revert_creds >> dance and execute the operations under current_cred(). >> >> This means it's random depending on filled cached under what >> credentials an operation is executed. >> >> In order to prevent security problems the current patch is enough, >> but as outlined above it will make io-uring complete unuseable >> for applications using any syscall that changes current_cred(). >> >> Change 1. would be a little bit better, but still not really useful. >> >> I'd actually prefer solution 3. as it's still possible to make >> use of non-blocking operations, while the security is the >> same as solution 2. > > For your situation, we need to extend it anyway, and provide a way > to swap between personalities. So yeah it won't work as-is for your > use case, but we can work on making that the case. That's only for the OPENAT2 case, which we might want to use in future, but there's a lot of work required to have async opens in Samba. But I have a experimental module that, just use READV, WRITEV and FSYNC with io-uring in order to avoid our userspace helper threads. And that won't work anymore with the change as Samba change current_cred() very often switch between (at least) 2 identities root and the user. That will change the pointer of current_cred() each time. I mean I could work around the check by using IORING_SETUP_SQPOLL, but I'd like to avoid that. metze