* Re: [PATCH] fs/io_uring.c: fix null ptr deference in io_send_recvmsg()
  From: Pavel Begunkov @ 2020-08-04 13:18 UTC
  To: Liu Yong, axboe; +Cc: linux-block, io-uring

On 04/08/2020 15:56, Liu Yong wrote:
> In io_send_recvmsg(), there is no check for req->file.
> A user can change the opcode from IORING_OP_NOP to IORING_OP_SENDMSG
> through a race after io_req_set_file().

After sqe->opcode is read and copied in io_init_req(), only the
in-kernel req->opcode is used. Also, io_init_req() should check for a
NULL req->file, so this shouldn't happen afterwards.

Do you have a reproducer? What kernel version did you use?

> This vulnerability will leak sensitive kernel information.
>
> [352693.910110] BUG: kernel NULL pointer dereference, address: 0000000000000028
> [352693.910112] #PF: supervisor read access in kernel mode
> [352693.910112] #PF: error_code(0x0000) - not-present page
> [352693.910113] PGD 8000000251396067 P4D 8000000251396067 PUD 1d64ba067 PMD 0
> [352693.910115] Oops: 0000 [#3] SMP PTI
> [352693.910116] CPU: 11 PID: 303132 Comm: test Tainted: G      D
> [352693.910117] Hardware name: Dell Inc. OptiPlex 3060/0T0MHW, BIOS 1.4.2 06/11/2019
> [352693.910120] RIP: 0010:sock_from_file+0x9/0x30
> [352693.910122] RSP: 0018:ffffc0a5084cfc50 EFLAGS: 00010246
> [352693.910122] RAX: ffff9f6ee284d000 RBX: ffff9f6bd3675000 RCX: ffffffff8b111340
> [352693.910123] RDX: 0000000000000001 RSI: ffffc0a5084cfc64 RDI: 0000000000000000
> [352693.910124] RBP: ffffc0a5084cfc50 R08: 0000000000000000 R09: ffff9f6ee51a9200
> [352693.910124] R10: ffff9f6ee284d200 R11: 0000000000000000 R12: ffff9f6ee51a9200
> [352693.910125] R13: 0000000000000001 R14: ffffffff8b111340 R15: ffff9f6ee284d000
> [352693.910126] FS:  00000000016d7880(0000) GS:ffff9f6eee2c0000(0000) knlGS:0000000000000000
> [352693.910126] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [352693.910127] CR2: 0000000000000028 CR3: 000000041fb4a005 CR4: 00000000003626e0
> [352693.910127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [352693.910128] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [352693.910128] Call Trace:
> [352693.910132]  io_send_recvmsg+0x49/0x170
> [352693.910134]  ? __switch_to_asm+0x34/0x70
> [352693.910135]  __io_submit_sqe+0x45e/0x8e0
> [352693.910136]  ? __switch_to_asm+0x34/0x70
> [352693.910137]  ? __switch_to_asm+0x40/0x70
> [352693.910138]  ? __switch_to_asm+0x34/0x70
> [352693.910138]  ? __switch_to_asm+0x40/0x70
> [352693.910139]  ? __switch_to_asm+0x34/0x70
> [352693.910140]  ? __switch_to_asm+0x40/0x70
> [352693.910141]  ? __switch_to_asm+0x34/0x70
> [352693.910142]  ? __switch_to_asm+0x40/0x70
> [352693.910143]  ? __switch_to_asm+0x34/0x70
> [352693.910144]  ? __switch_to_asm+0x34/0x70
> [352693.910145]  __io_queue_sqe+0x23/0x230
> [352693.910146]  io_queue_sqe+0x7a/0x90
> [352693.910148]  io_submit_sqe+0x23d/0x330
> [352693.910149]  io_ring_submit+0xca/0x200
> [352693.910150]  ? do_nanosleep+0xad/0x160
> [352693.910151]  ? hrtimer_init_sleeper+0x2c/0x90
> [352693.910152]  ? hrtimer_nanosleep+0xc2/0x1a0
> [352693.910154]  __x64_sys_io_uring_enter+0x1e4/0x2c0
> [352693.910156]  do_syscall_64+0x57/0x190
> [352693.910157]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Signed-off-by: Liu Yong <[email protected]>
> ---
>  fs/io_uring.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index e0200406765c..0a26100b8260 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1675,6 +1675,9 @@ static int io_send_recvmsg(struct io_kiocb *req, const struct io_uring_sqe *sqe,
>  	if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
>  		return -EINVAL;
>
> +	if (!req->file)
> +		return -EBADF;
> +
>  	sock = sock_from_file(req->file, &ret);
>  	if (sock) {
>  		struct user_msghdr __user *msg;

--
Pavel Begunkov
* Re: [PATCH] fs/io_uring.c: fix null ptr deference in io_send_recvmsg()
  From: Jens Axboe @ 2020-08-04 13:27 UTC
  To: Pavel Begunkov, Liu Yong; +Cc: linux-block, io-uring

On 8/4/20 7:18 AM, Pavel Begunkov wrote:
> On 04/08/2020 15:56, Liu Yong wrote:
>> In io_send_recvmsg(), there is no check for req->file.
>> A user can change the opcode from IORING_OP_NOP to IORING_OP_SENDMSG
>> through a race after io_req_set_file().
>
> After sqe->opcode is read and copied in io_init_req(), only the
> in-kernel req->opcode is used. Also, io_init_req() should check for a
> NULL req->file, so this shouldn't happen afterwards.
>
> Do you have a reproducer? What kernel version did you use?

Was looking at this too, and I'm guessing this is some 5.4-based kernel.
Unfortunately the oops doesn't include that information.

--
Jens Axboe
* Re: [PATCH] fs/io_uring.c: fix null ptr deference in io_send_recvmsg()
  From: Jens Axboe @ 2020-08-04 17:15 UTC
  To: xiao lin; +Cc: Pavel Begunkov, [email protected], io-uring

On 8/4/20 11:02 AM, xiao lin wrote:
> On Tuesday, August 4, 2020, Jens Axboe <[email protected]> wrote:
>
>     On 8/4/20 7:18 AM, Pavel Begunkov wrote:
>     > On 04/08/2020 15:56, Liu Yong wrote:
>     >> In io_send_recvmsg(), there is no check for req->file.
>     >> A user can change the opcode from IORING_OP_NOP to IORING_OP_SENDMSG
>     >> through a race after io_req_set_file().
>     >
>     > After sqe->opcode is read and copied in io_init_req(), only the
>     > in-kernel req->opcode is used. Also, io_init_req() should check for a
>     > NULL req->file, so this shouldn't happen afterwards.
>     >
>     > Do you have a reproducer? What kernel version did you use?
>
>     Was looking at this too, and I'm guessing this is some 5.4-based kernel.
>     Unfortunately the oops doesn't include that information.
>
> Sorry, I forgot to mention that the kernel version I am using is 5.4.55.

I think there are two options here:

1) Backport the series that ensured we only read those important bits once
2) Make s->sqe a full sqe, and memcpy it in

--
Jens Axboe
* Re: [PATCH] fs/io_uring.c: fix null ptr deference in io_send_recvmsg()
  From: Jens Axboe @ 2020-08-04 21:55 UTC
  To: xiao lin; +Cc: Pavel Begunkov, [email protected], io-uring

On 8/4/20 11:15 AM, Jens Axboe wrote:
> On 8/4/20 11:02 AM, xiao lin wrote:
>> Sorry, I forgot to mention that the kernel version I am using is 5.4.55.
>
> I think there are two options here:
>
> 1) Backport the series that ensured we only read those important bits once
> 2) Make s->sqe a full sqe, and memcpy it in

Something like this should close the ->opcode re-read for 5.4-stable.

diff --git a/fs/io_uring.c b/fs/io_uring.c
index e0200406765c..8bb5e19b7c3c 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -279,6 +279,7 @@ struct sqe_submit {
 	bool has_user;
 	bool needs_lock;
 	bool needs_fixed_file;
+	u8 opcode;
 };

 /*
@@ -505,7 +506,7 @@ static inline void io_queue_async_work(struct io_ring_ctx *ctx,
 	int rw = 0;

 	if (req->submit.sqe) {
-		switch (req->submit.sqe->opcode) {
+		switch (req->submit.opcode) {
 		case IORING_OP_WRITEV:
 		case IORING_OP_WRITE_FIXED:
 			rw = !(req->rw.ki_flags & IOCB_DIRECT);
@@ -1254,23 +1255,15 @@ static int io_import_fixed(struct io_ring_ctx *ctx, int rw,
 }

 static ssize_t io_import_iovec(struct io_ring_ctx *ctx, int rw,
-			       const struct sqe_submit *s, struct iovec **iovec,
+			       struct io_kiocb *req, struct iovec **iovec,
 			       struct iov_iter *iter)
 {
-	const struct io_uring_sqe *sqe = s->sqe;
+	const struct io_uring_sqe *sqe = req->submit.sqe;
 	void __user *buf = u64_to_user_ptr(READ_ONCE(sqe->addr));
 	size_t sqe_len = READ_ONCE(sqe->len);
 	u8 opcode;

-	/*
-	 * We're reading ->opcode for the second time, but the first read
-	 * doesn't care whether it's _FIXED or not, so it doesn't matter
-	 * whether ->opcode changes concurrently. The first read does care
-	 * about whether it is a READ or a WRITE, so we don't trust this read
-	 * for that purpose and instead let the caller pass in the read/write
-	 * flag.
-	 */
-	opcode = READ_ONCE(sqe->opcode);
+	opcode = req->submit.opcode;
 	if (opcode == IORING_OP_READ_FIXED ||
 	    opcode == IORING_OP_WRITE_FIXED) {
 		ssize_t ret = io_import_fixed(ctx, rw, sqe, iter);
@@ -1278,7 +1271,7 @@ static ssize_t io_import_iovec(struct io_ring_ctx *ctx, int rw,
 		return ret;
 	}

-	if (!s->has_user)
+	if (!req->submit.has_user)
 		return -EFAULT;

 #ifdef CONFIG_COMPAT
@@ -1425,7 +1418,7 @@ static int io_read(struct io_kiocb *req, const struct sqe_submit *s,
 	if (unlikely(!(file->f_mode & FMODE_READ)))
 		return -EBADF;

-	ret = io_import_iovec(req->ctx, READ, s, &iovec, &iter);
+	ret = io_import_iovec(req->ctx, READ, req, &iovec, &iter);
 	if (ret < 0)
 		return ret;

@@ -1490,7 +1483,7 @@ static int io_write(struct io_kiocb *req, const struct sqe_submit *s,
 	if (unlikely(!(file->f_mode & FMODE_WRITE)))
 		return -EBADF;

-	ret = io_import_iovec(req->ctx, WRITE, s, &iovec, &iter);
+	ret = io_import_iovec(req->ctx, WRITE, req, &iovec, &iter);
 	if (ret < 0)
 		return ret;

@@ -2109,15 +2102,14 @@ static int io_req_defer(struct io_ring_ctx *ctx, struct io_kiocb *req,
 static int __io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 			   const struct sqe_submit *s, bool force_nonblock)
 {
-	int ret, opcode;
+	int ret;

 	req->user_data = READ_ONCE(s->sqe->user_data);

 	if (unlikely(s->index >= ctx->sq_entries))
 		return -EINVAL;

-	opcode = READ_ONCE(s->sqe->opcode);
-	switch (opcode) {
+	switch (req->submit.opcode) {
 	case IORING_OP_NOP:
 		ret = io_nop(req, req->user_data);
 		break;
@@ -2181,10 +2173,10 @@ static int __io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	return 0;
 }

-static struct async_list *io_async_list_from_sqe(struct io_ring_ctx *ctx,
-						 const struct io_uring_sqe *sqe)
+static struct async_list *io_async_list_from_req(struct io_ring_ctx *ctx,
+						 struct io_kiocb *req)
 {
-	switch (sqe->opcode) {
+	switch (req->submit.opcode) {
 	case IORING_OP_READV:
 	case IORING_OP_READ_FIXED:
 		return &ctx->pending_async[READ];
@@ -2196,12 +2188,10 @@ static struct async_list *io_async_list_from_sqe(struct io_ring_ctx *ctx,
 	}
 }

-static inline bool io_sqe_needs_user(const struct io_uring_sqe *sqe)
+static inline bool io_req_needs_user(struct io_kiocb *req)
 {
-	u8 opcode = READ_ONCE(sqe->opcode);
-
-	return !(opcode == IORING_OP_READ_FIXED ||
-		 opcode == IORING_OP_WRITE_FIXED);
+	return !(req->submit.opcode == IORING_OP_READ_FIXED ||
+		 req->submit.opcode == IORING_OP_WRITE_FIXED);
 }

 static void io_sq_wq_submit_work(struct work_struct *work)
@@ -2217,7 +2207,7 @@ static void io_sq_wq_submit_work(struct work_struct *work)
 	int ret;

 	old_cred = override_creds(ctx->creds);
-	async_list = io_async_list_from_sqe(ctx, req->submit.sqe);
+	async_list = io_async_list_from_req(ctx, req);

 	allow_kernel_signal(SIGINT);
restart:
@@ -2239,7 +2229,7 @@ static void io_sq_wq_submit_work(struct work_struct *work)
 		}

 		ret = 0;
-		if (io_sqe_needs_user(sqe) && !cur_mm) {
+		if (io_req_needs_user(req) && !cur_mm) {
 			if (!mmget_not_zero(ctx->sqo_mm)) {
 				ret = -EFAULT;
 			} else {
@@ -2387,11 +2377,9 @@ static bool io_add_to_prev_work(struct async_list *list, struct io_kiocb *req)
 	return ret;
 }

-static bool io_op_needs_file(const struct io_uring_sqe *sqe)
+static bool io_op_needs_file(struct io_kiocb *req)
 {
-	int op = READ_ONCE(sqe->opcode);
-
-	switch (op) {
+	switch (req->submit.opcode) {
 	case IORING_OP_NOP:
 	case IORING_OP_POLL_REMOVE:
 	case IORING_OP_TIMEOUT:
@@ -2419,7 +2407,7 @@ static int io_req_set_file(struct io_ring_ctx *ctx, const struct sqe_submit *s,
 	 */
 	req->sequence = s->sequence;

-	if (!io_op_needs_file(s->sqe))
+	if (!io_op_needs_file(req))
 		return 0;

 	if (flags & IOSQE_FIXED_FILE) {
@@ -2460,7 +2448,7 @@ static int __io_queue_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,

 		s->sqe = sqe_copy;
 		memcpy(&req->submit, s, sizeof(*s));
-		list = io_async_list_from_sqe(ctx, s->sqe);
+		list = io_async_list_from_req(ctx, req);
 		if (!io_add_to_prev_work(list, req)) {
 			if (list)
 				atomic_inc(&list->cnt);
@@ -2582,7 +2570,7 @@ static void io_submit_sqe(struct io_ring_ctx *ctx, struct sqe_submit *s,
 	req->user_data = s->sqe->user_data;

 #if defined(CONFIG_NET)
-	switch (READ_ONCE(s->sqe->opcode)) {
+	switch (req->submit.opcode) {
 	case IORING_OP_SENDMSG:
 	case IORING_OP_RECVMSG:
 		spin_lock(&current->fs->lock);
@@ -2697,6 +2685,7 @@ static bool io_get_sqring(struct io_ring_ctx *ctx, struct sqe_submit *s)
 	if (head < ctx->sq_entries) {
 		s->index = head;
 		s->sqe = &ctx->sq_sqes[head];
+		s->opcode = READ_ONCE(s->sqe->opcode);
 		s->sequence = ctx->cached_sq_head;
 		ctx->cached_sq_head++;
 		return true;

--
Jens Axboe
* Re: [PATCH] fs/io_uring.c: fix null ptr deference in io_send_recvmsg()
  From: Liu Yong @ 2020-08-05 3:40 UTC
  To: Jens Axboe; +Cc: Pavel Begunkov, [email protected], io-uring

On Tue, Aug 04, 2020 at 03:55:16PM -0600, Jens Axboe wrote:
> On 8/4/20 11:15 AM, Jens Axboe wrote:
>> On 8/4/20 11:02 AM, xiao lin wrote:
>> [snip]
>>
>>> Sorry, I forgot to mention that the kernel version I am using is 5.4.55.
>>
>> I think there are two options here:
>>
>> 1) Backport the series that ensured we only read those important bits once
>> 2) Make s->sqe a full sqe, and memcpy it in
>
> Something like this should close the ->opcode re-read for 5.4-stable.
>
> [5.4-stable patch quoted in full; snipped]

I think this patch solves these problems at the root cause.
So, should I submit this commit, or you?
* Re: [PATCH] fs/io_uring.c: fix null ptr deference in io_send_recvmsg()
  From: Jens Axboe @ 2020-08-05 4:10 UTC
  To: Liu Yong; +Cc: Pavel Begunkov, [email protected], io-uring

On 8/4/20 9:40 PM, Liu Yong wrote:
> On Tue, Aug 04, 2020 at 03:55:16PM -0600, Jens Axboe wrote:
>> [snip]
>>
>> Something like this should close the ->opcode re-read for 5.4-stable.
>>
>> [5.4-stable patch quoted in full; snipped]
>
> I think this patch solves these problems at the root cause.
> So, should I submit this commit, or you?

Thanks for testing, I'll add your tested-by. Probably best if I do it,
since it's going to 5.4-stable only and it's not from upstream.

--
Jens Axboe