* [PATCH 1/9] compat.h: fix a spelling error in <linux/compat.h>
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
@ 2020-09-25 4:51 ` Christoph Hellwig
2020-09-25 4:51 ` [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c Christoph Hellwig
` (8 subsequent siblings)
9 siblings, 0 replies; 66+ messages in thread
From: Christoph Hellwig @ 2020-09-25 4:51 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module
There is no compat_sys_readv64v2 syscall, only a compat_sys_preadv64v2
one.
Signed-off-by: Christoph Hellwig <[email protected]>
---
include/linux/compat.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/compat.h b/include/linux/compat.h
index b354ce58966e2d..654c1ec36671a4 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -812,7 +812,7 @@ asmlinkage ssize_t compat_sys_pwritev2(compat_ulong_t fd,
const struct compat_iovec __user *vec,
compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
-asmlinkage long compat_sys_readv64v2(unsigned long fd,
+asmlinkage long compat_sys_preadv64v2(unsigned long fd,
const struct compat_iovec __user *vec,
unsigned long vlen, loff_t pos, rwf_t flags);
#endif
--
2.28.0
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
2020-09-25 4:51 ` [PATCH 1/9] compat.h: fix a spelling error in <linux/compat.h> Christoph Hellwig
@ 2020-09-25 4:51 ` Christoph Hellwig
2020-10-21 16:13 ` Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c" Greg KH
2020-09-25 4:51 ` [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec Christoph Hellwig
` (7 subsequent siblings)
9 siblings, 1 reply; 66+ messages in thread
From: Christoph Hellwig @ 2020-09-25 4:51 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module, David Laight,
David Laight
From: David Laight <[email protected]>
This lets the compiler inline it into import_iovec() generating
much better code.
Signed-off-by: David Laight <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
---
fs/read_write.c | 179 ------------------------------------------------
lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 176 insertions(+), 179 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 5db58b8c78d0dd..e5e891a88442ef 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -752,185 +752,6 @@ static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
return ret;
}
-/**
- * rw_copy_check_uvector() - Copy an array of &struct iovec from userspace
- * into the kernel and check that it is valid.
- *
- * @type: One of %CHECK_IOVEC_ONLY, %READ, or %WRITE.
- * @uvector: Pointer to the userspace array.
- * @nr_segs: Number of elements in userspace array.
- * @fast_segs: Number of elements in @fast_pointer.
- * @fast_pointer: Pointer to (usually small on-stack) kernel array.
- * @ret_pointer: (output parameter) Pointer to a variable that will point to
- * either @fast_pointer, a newly allocated kernel array, or NULL,
- * depending on which array was used.
- *
- * This function copies an array of &struct iovec of @nr_segs from
- * userspace into the kernel and checks that each element is valid (e.g.
- * it does not point to a kernel address or cause overflow by being too
- * large, etc.).
- *
- * As an optimization, the caller may provide a pointer to a small
- * on-stack array in @fast_pointer, typically %UIO_FASTIOV elements long
- * (the size of this array, or 0 if unused, should be given in @fast_segs).
- *
- * @ret_pointer will always point to the array that was used, so the
- * caller must take care not to call kfree() on it e.g. in case the
- * @fast_pointer array was used and it was allocated on the stack.
- *
- * Return: The total number of bytes covered by the iovec array on success
- * or a negative error code on error.
- */
-ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
- unsigned long nr_segs, unsigned long fast_segs,
- struct iovec *fast_pointer,
- struct iovec **ret_pointer)
-{
- unsigned long seg;
- ssize_t ret;
- struct iovec *iov = fast_pointer;
-
- /*
- * SuS says "The readv() function *may* fail if the iovcnt argument
- * was less than or equal to 0, or greater than {IOV_MAX}. Linux has
- * traditionally returned zero for zero segments, so...
- */
- if (nr_segs == 0) {
- ret = 0;
- goto out;
- }
-
- /*
- * First get the "struct iovec" from user memory and
- * verify all the pointers
- */
- if (nr_segs > UIO_MAXIOV) {
- ret = -EINVAL;
- goto out;
- }
- if (nr_segs > fast_segs) {
- iov = kmalloc_array(nr_segs, sizeof(struct iovec), GFP_KERNEL);
- if (iov == NULL) {
- ret = -ENOMEM;
- goto out;
- }
- }
- if (copy_from_user(iov, uvector, nr_segs*sizeof(*uvector))) {
- ret = -EFAULT;
- goto out;
- }
-
- /*
- * According to the Single Unix Specification we should return EINVAL
- * if an element length is < 0 when cast to ssize_t or if the
- * total length would overflow the ssize_t return value of the
- * system call.
- *
- * Linux caps all read/write calls to MAX_RW_COUNT, and avoids the
- * overflow case.
- */
- ret = 0;
- for (seg = 0; seg < nr_segs; seg++) {
- void __user *buf = iov[seg].iov_base;
- ssize_t len = (ssize_t)iov[seg].iov_len;
-
- /* see if we we're about to use an invalid len or if
- * it's about to overflow ssize_t */
- if (len < 0) {
- ret = -EINVAL;
- goto out;
- }
- if (type >= 0
- && unlikely(!access_ok(buf, len))) {
- ret = -EFAULT;
- goto out;
- }
- if (len > MAX_RW_COUNT - ret) {
- len = MAX_RW_COUNT - ret;
- iov[seg].iov_len = len;
- }
- ret += len;
- }
-out:
- *ret_pointer = iov;
- return ret;
-}
-
-#ifdef CONFIG_COMPAT
-ssize_t compat_rw_copy_check_uvector(int type,
- const struct compat_iovec __user *uvector, unsigned long nr_segs,
- unsigned long fast_segs, struct iovec *fast_pointer,
- struct iovec **ret_pointer)
-{
- compat_ssize_t tot_len;
- struct iovec *iov = *ret_pointer = fast_pointer;
- ssize_t ret = 0;
- int seg;
-
- /*
- * SuS says "The readv() function *may* fail if the iovcnt argument
- * was less than or equal to 0, or greater than {IOV_MAX}. Linux has
- * traditionally returned zero for zero segments, so...
- */
- if (nr_segs == 0)
- goto out;
-
- ret = -EINVAL;
- if (nr_segs > UIO_MAXIOV)
- goto out;
- if (nr_segs > fast_segs) {
- ret = -ENOMEM;
- iov = kmalloc_array(nr_segs, sizeof(struct iovec), GFP_KERNEL);
- if (iov == NULL)
- goto out;
- }
- *ret_pointer = iov;
-
- ret = -EFAULT;
- if (!access_ok(uvector, nr_segs*sizeof(*uvector)))
- goto out;
-
- /*
- * Single unix specification:
- * We should -EINVAL if an element length is not >= 0 and fitting an
- * ssize_t.
- *
- * In Linux, the total length is limited to MAX_RW_COUNT, there is
- * no overflow possibility.
- */
- tot_len = 0;
- ret = -EINVAL;
- for (seg = 0; seg < nr_segs; seg++) {
- compat_uptr_t buf;
- compat_ssize_t len;
-
- if (__get_user(len, &uvector->iov_len) ||
- __get_user(buf, &uvector->iov_base)) {
- ret = -EFAULT;
- goto out;
- }
- if (len < 0) /* size_t not fitting in compat_ssize_t .. */
- goto out;
- if (type >= 0 &&
- !access_ok(compat_ptr(buf), len)) {
- ret = -EFAULT;
- goto out;
- }
- if (len > MAX_RW_COUNT - tot_len)
- len = MAX_RW_COUNT - tot_len;
- tot_len += len;
- iov->iov_base = compat_ptr(buf);
- iov->iov_len = (compat_size_t) len;
- uvector++;
- iov++;
- }
- ret = tot_len;
-
-out:
- return ret;
-}
-#endif
-
static ssize_t do_iter_read(struct file *file, struct iov_iter *iter,
loff_t *pos, rwf_t flags)
{
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 5e40786c8f1232..ccea9db3f72be8 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1650,6 +1650,109 @@ const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
}
EXPORT_SYMBOL(dup_iter);
+/**
+ * rw_copy_check_uvector() - Copy an array of &struct iovec from userspace
+ * into the kernel and check that it is valid.
+ *
+ * @type: One of %CHECK_IOVEC_ONLY, %READ, or %WRITE.
+ * @uvector: Pointer to the userspace array.
+ * @nr_segs: Number of elements in userspace array.
+ * @fast_segs: Number of elements in @fast_pointer.
+ * @fast_pointer: Pointer to (usually small on-stack) kernel array.
+ * @ret_pointer: (output parameter) Pointer to a variable that will point to
+ * either @fast_pointer, a newly allocated kernel array, or NULL,
+ * depending on which array was used.
+ *
+ * This function copies an array of &struct iovec of @nr_segs from
+ * userspace into the kernel and checks that each element is valid (e.g.
+ * it does not point to a kernel address or cause overflow by being too
+ * large, etc.).
+ *
+ * As an optimization, the caller may provide a pointer to a small
+ * on-stack array in @fast_pointer, typically %UIO_FASTIOV elements long
+ * (the size of this array, or 0 if unused, should be given in @fast_segs).
+ *
+ * @ret_pointer will always point to the array that was used, so the
+ * caller must take care not to call kfree() on it e.g. in case the
+ * @fast_pointer array was used and it was allocated on the stack.
+ *
+ * Return: The total number of bytes covered by the iovec array on success
+ * or a negative error code on error.
+ */
+ssize_t rw_copy_check_uvector(int type, const struct iovec __user *uvector,
+ unsigned long nr_segs, unsigned long fast_segs,
+ struct iovec *fast_pointer, struct iovec **ret_pointer)
+{
+ unsigned long seg;
+ ssize_t ret;
+ struct iovec *iov = fast_pointer;
+
+ /*
+ * SuS says "The readv() function *may* fail if the iovcnt argument
+ * was less than or equal to 0, or greater than {IOV_MAX}. Linux has
+ * traditionally returned zero for zero segments, so...
+ */
+ if (nr_segs == 0) {
+ ret = 0;
+ goto out;
+ }
+
+ /*
+ * First get the "struct iovec" from user memory and
+ * verify all the pointers
+ */
+ if (nr_segs > UIO_MAXIOV) {
+ ret = -EINVAL;
+ goto out;
+ }
+ if (nr_segs > fast_segs) {
+ iov = kmalloc_array(nr_segs, sizeof(struct iovec), GFP_KERNEL);
+ if (iov == NULL) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ }
+ if (copy_from_user(iov, uvector, nr_segs*sizeof(*uvector))) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ /*
+ * According to the Single Unix Specification we should return EINVAL
+ * if an element length is < 0 when cast to ssize_t or if the
+ * total length would overflow the ssize_t return value of the
+ * system call.
+ *
+ * Linux caps all read/write calls to MAX_RW_COUNT, and avoids the
+ * overflow case.
+ */
+ ret = 0;
+ for (seg = 0; seg < nr_segs; seg++) {
+ void __user *buf = iov[seg].iov_base;
+ ssize_t len = (ssize_t)iov[seg].iov_len;
+
+ /* see if we we're about to use an invalid len or if
+ * it's about to overflow ssize_t */
+ if (len < 0) {
+ ret = -EINVAL;
+ goto out;
+ }
+ if (type >= 0
+ && unlikely(!access_ok(buf, len))) {
+ ret = -EFAULT;
+ goto out;
+ }
+ if (len > MAX_RW_COUNT - ret) {
+ len = MAX_RW_COUNT - ret;
+ iov[seg].iov_len = len;
+ }
+ ret += len;
+ }
+out:
+ *ret_pointer = iov;
+ return ret;
+}
+
/**
* import_iovec() - Copy an array of &struct iovec from userspace
* into the kernel, check that it is valid, and initialize a new
@@ -1695,6 +1798,79 @@ EXPORT_SYMBOL(import_iovec);
#ifdef CONFIG_COMPAT
#include <linux/compat.h>
+ssize_t compat_rw_copy_check_uvector(int type,
+ const struct compat_iovec __user *uvector,
+ unsigned long nr_segs, unsigned long fast_segs,
+ struct iovec *fast_pointer, struct iovec **ret_pointer)
+{
+ compat_ssize_t tot_len;
+ struct iovec *iov = *ret_pointer = fast_pointer;
+ ssize_t ret = 0;
+ int seg;
+
+ /*
+ * SuS says "The readv() function *may* fail if the iovcnt argument
+ * was less than or equal to 0, or greater than {IOV_MAX}. Linux has
+ * traditionally returned zero for zero segments, so...
+ */
+ if (nr_segs == 0)
+ goto out;
+
+ ret = -EINVAL;
+ if (nr_segs > UIO_MAXIOV)
+ goto out;
+ if (nr_segs > fast_segs) {
+ ret = -ENOMEM;
+ iov = kmalloc_array(nr_segs, sizeof(struct iovec), GFP_KERNEL);
+ if (iov == NULL)
+ goto out;
+ }
+ *ret_pointer = iov;
+
+ ret = -EFAULT;
+ if (!access_ok(uvector, nr_segs*sizeof(*uvector)))
+ goto out;
+
+ /*
+ * Single unix specification:
+ * We should -EINVAL if an element length is not >= 0 and fitting an
+ * ssize_t.
+ *
+ * In Linux, the total length is limited to MAX_RW_COUNT, there is
+ * no overflow possibility.
+ */
+ tot_len = 0;
+ ret = -EINVAL;
+ for (seg = 0; seg < nr_segs; seg++) {
+ compat_uptr_t buf;
+ compat_ssize_t len;
+
+ if (__get_user(len, &uvector->iov_len) ||
+ __get_user(buf, &uvector->iov_base)) {
+ ret = -EFAULT;
+ goto out;
+ }
+ if (len < 0) /* size_t not fitting in compat_ssize_t .. */
+ goto out;
+ if (type >= 0 &&
+ !access_ok(compat_ptr(buf), len)) {
+ ret = -EFAULT;
+ goto out;
+ }
+ if (len > MAX_RW_COUNT - tot_len)
+ len = MAX_RW_COUNT - tot_len;
+ tot_len += len;
+ iov->iov_base = compat_ptr(buf);
+ iov->iov_len = (compat_size_t) len;
+ uvector++;
+ iov++;
+ }
+ ret = tot_len;
+
+out:
+ return ret;
+}
+
ssize_t compat_import_iovec(int type,
const struct compat_iovec __user * uvector,
unsigned nr_segs, unsigned fast_segs,
--
2.28.0
^ permalink raw reply related [flat|nested] 66+ messages in thread
* Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-09-25 4:51 ` [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c Christoph Hellwig
@ 2020-10-21 16:13 ` Greg KH
2020-10-21 20:59 ` David Laight
2020-10-21 23:39 ` Al Viro
0 siblings, 2 replies; 66+ messages in thread
From: Greg KH @ 2020-10-21 16:13 UTC (permalink / raw)
To: Christoph Hellwig, kernel-team
Cc: Alexander Viro, Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, David Laight, linux-arm-kernel, linux-kernel,
linux-mips, linux-parisc, linuxppc-dev, linux-s390, sparclinux,
linux-block, linux-scsi, linux-fsdevel, linux-aio, io-uring,
linux-arch, linux-mm, netdev, keyrings, linux-security-module
On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> From: David Laight <[email protected]>
>
> This lets the compiler inline it into import_iovec() generating
> much better code.
>
> Signed-off-by: David Laight <[email protected]>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> fs/read_write.c | 179 ------------------------------------------------
> lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 176 insertions(+), 179 deletions(-)
Strangely, this commit causes a regression in Linus's tree right now.
I can't really figure out what the regression is, only that this commit
triggers a "large Android system binary" from working properly. There's
no kernel log messages anywhere, and I don't have any way to strace the
thing in the testing framework, so any hints that people can provide
would be most appreciated.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-21 16:13 ` Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c" Greg KH
@ 2020-10-21 20:59 ` David Laight
2020-10-21 23:39 ` Al Viro
1 sibling, 0 replies; 66+ messages in thread
From: David Laight @ 2020-10-21 20:59 UTC (permalink / raw)
To: 'Greg KH', Christoph Hellwig, [email protected]
Cc: Alexander Viro, Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: Greg KH
> Sent: 21 October 2020 17:13
>
> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> > From: David Laight <[email protected]>
> >
> > This lets the compiler inline it into import_iovec() generating
> > much better code.
> >
> > Signed-off-by: David Laight <[email protected]>
> > Signed-off-by: Christoph Hellwig <[email protected]>
> > ---
> > fs/read_write.c | 179 ------------------------------------------------
> > lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 176 insertions(+), 179 deletions(-)
>
> Strangely, this commit causes a regression in Linus's tree right now.
>
> I can't really figure out what the regression is, only that this commit
> triggers a "large Android system binary" from working properly. There's
> no kernel log messages anywhere, and I don't have any way to strace the
> thing in the testing framework, so any hints that people can provide
> would be most appreciated.
My original commit just moved the function source from one file to another.
So it is odd that it makes any difference.
I don't even know if it gets inlined by Christoph's actual patch.
(I have another patch that depended on it that I need to resubmit.)
Some of the other changes from Christoph's same patch set might
make a difference though.
Might be worth forcing it to be not inlined - so it is no change.
Or try adding a kernel log to import_iovec() or the associated
copy failing.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-21 16:13 ` Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c" Greg KH
2020-10-21 20:59 ` David Laight
@ 2020-10-21 23:39 ` Al Viro
2020-10-22 8:26 ` Greg KH
1 sibling, 1 reply; 66+ messages in thread
From: Al Viro @ 2020-10-21 23:39 UTC (permalink / raw)
To: Greg KH
Cc: Christoph Hellwig, kernel-team, Andrew Morton, Jens Axboe,
Arnd Bergmann, David Howells, David Laight, linux-arm-kernel,
linux-kernel, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
sparclinux, linux-block, linux-scsi, linux-fsdevel, linux-aio,
io-uring, linux-arch, linux-mm, netdev, keyrings,
linux-security-module
On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> > From: David Laight <[email protected]>
> >
> > This lets the compiler inline it into import_iovec() generating
> > much better code.
> >
> > Signed-off-by: David Laight <[email protected]>
> > Signed-off-by: Christoph Hellwig <[email protected]>
> > ---
> > fs/read_write.c | 179 ------------------------------------------------
> > lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 176 insertions(+), 179 deletions(-)
>
> Strangely, this commit causes a regression in Linus's tree right now.
>
> I can't really figure out what the regression is, only that this commit
> triggers a "large Android system binary" from working properly. There's
> no kernel log messages anywhere, and I don't have any way to strace the
> thing in the testing framework, so any hints that people can provide
> would be most appreciated.
It's a pure move - modulo changed line breaks in the argument lists
the functions involved are identical before and after that (just checked
that directly, by checking out the trees before and after, extracting two
functions in question from fs/read_write.c and lib/iov_iter.c (before and
after, resp.) and checking the diff between those.
How certain is your bisection?
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-21 23:39 ` Al Viro
@ 2020-10-22 8:26 ` Greg KH
2020-10-22 8:35 ` David Hildenbrand
0 siblings, 1 reply; 66+ messages in thread
From: Greg KH @ 2020-10-22 8:26 UTC (permalink / raw)
To: Al Viro, Nick Desaulniers
Cc: Christoph Hellwig, kernel-team, Andrew Morton, Jens Axboe,
Arnd Bergmann, David Howells, David Laight, linux-arm-kernel,
linux-kernel, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
sparclinux, linux-block, linux-scsi, linux-fsdevel, linux-aio,
io-uring, linux-arch, linux-mm, netdev, keyrings,
linux-security-module
On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
> > On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> > > From: David Laight <[email protected]>
> > >
> > > This lets the compiler inline it into import_iovec() generating
> > > much better code.
> > >
> > > Signed-off-by: David Laight <[email protected]>
> > > Signed-off-by: Christoph Hellwig <[email protected]>
> > > ---
> > > fs/read_write.c | 179 ------------------------------------------------
> > > lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
> > > 2 files changed, 176 insertions(+), 179 deletions(-)
> >
> > Strangely, this commit causes a regression in Linus's tree right now.
> >
> > I can't really figure out what the regression is, only that this commit
> > triggers a "large Android system binary" from working properly. There's
> > no kernel log messages anywhere, and I don't have any way to strace the
> > thing in the testing framework, so any hints that people can provide
> > would be most appreciated.
>
> It's a pure move - modulo changed line breaks in the argument lists
> the functions involved are identical before and after that (just checked
> that directly, by checking out the trees before and after, extracting two
> functions in question from fs/read_write.c and lib/iov_iter.c (before and
> after, resp.) and checking the diff between those.
>
> How certain is your bisection?
The bisection is very reproducable.
But, this looks now to be a compiler bug. I'm using the latest version
of clang and if I put "noinline" at the front of the function,
everything works.
Nick, any ideas here as to who I should report this to?
I'll work on a fixup patch for the Android kernel tree to see if I can
work around it there, but others will hit this in Linus's tree sooner or
later...
thanks,
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 8:26 ` Greg KH
@ 2020-10-22 8:35 ` David Hildenbrand
2020-10-22 8:40 ` David Laight
0 siblings, 1 reply; 66+ messages in thread
From: David Hildenbrand @ 2020-10-22 8:35 UTC (permalink / raw)
To: Greg KH, Al Viro, Nick Desaulniers
Cc: Christoph Hellwig, kernel-team, Andrew Morton, Jens Axboe,
Arnd Bergmann, David Howells, David Laight, linux-arm-kernel,
linux-kernel, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
sparclinux, linux-block, linux-scsi, linux-fsdevel, linux-aio,
io-uring, linux-arch, linux-mm, netdev, keyrings,
linux-security-module
On 22.10.20 10:26, Greg KH wrote:
> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
>> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
>>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
>>>> From: David Laight <[email protected]>
>>>>
>>>> This lets the compiler inline it into import_iovec() generating
>>>> much better code.
>>>>
>>>> Signed-off-by: David Laight <[email protected]>
>>>> Signed-off-by: Christoph Hellwig <[email protected]>
>>>> ---
>>>> fs/read_write.c | 179 ------------------------------------------------
>>>> lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 176 insertions(+), 179 deletions(-)
>>>
>>> Strangely, this commit causes a regression in Linus's tree right now.
>>>
>>> I can't really figure out what the regression is, only that this commit
>>> triggers a "large Android system binary" from working properly. There's
>>> no kernel log messages anywhere, and I don't have any way to strace the
>>> thing in the testing framework, so any hints that people can provide
>>> would be most appreciated.
>>
>> It's a pure move - modulo changed line breaks in the argument lists
>> the functions involved are identical before and after that (just checked
>> that directly, by checking out the trees before and after, extracting two
>> functions in question from fs/read_write.c and lib/iov_iter.c (before and
>> after, resp.) and checking the diff between those.
>>
>> How certain is your bisection?
>
> The bisection is very reproducable.
>
> But, this looks now to be a compiler bug. I'm using the latest version
> of clang and if I put "noinline" at the front of the function,
> everything works.
Well, the compiler can do more invasive optimizations when inlining. If
you have buggy code that relies on some unspecified behavior, inlining
can change the behavior ... but going over that code, there isn't too
much action going on. At least nothing screamed at me.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 8:35 ` David Hildenbrand
@ 2020-10-22 8:40 ` David Laight
2020-10-22 8:48 ` David Hildenbrand
0 siblings, 1 reply; 66+ messages in thread
From: David Laight @ 2020-10-22 8:40 UTC (permalink / raw)
To: 'David Hildenbrand', Greg KH, Al Viro, Nick Desaulniers
Cc: Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: David Hildenbrand
> Sent: 22 October 2020 09:35
>
> On 22.10.20 10:26, Greg KH wrote:
> > On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
> >> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
> >>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> >>>> From: David Laight <[email protected]>
> >>>>
> >>>> This lets the compiler inline it into import_iovec() generating
> >>>> much better code.
> >>>>
> >>>> Signed-off-by: David Laight <[email protected]>
> >>>> Signed-off-by: Christoph Hellwig <[email protected]>
> >>>> ---
> >>>> fs/read_write.c | 179 ------------------------------------------------
> >>>> lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
> >>>> 2 files changed, 176 insertions(+), 179 deletions(-)
> >>>
> >>> Strangely, this commit causes a regression in Linus's tree right now.
> >>>
> >>> I can't really figure out what the regression is, only that this commit
> >>> triggers a "large Android system binary" from working properly. There's
> >>> no kernel log messages anywhere, and I don't have any way to strace the
> >>> thing in the testing framework, so any hints that people can provide
> >>> would be most appreciated.
> >>
> >> It's a pure move - modulo changed line breaks in the argument lists
> >> the functions involved are identical before and after that (just checked
> >> that directly, by checking out the trees before and after, extracting two
> >> functions in question from fs/read_write.c and lib/iov_iter.c (before and
> >> after, resp.) and checking the diff between those.
> >>
> >> How certain is your bisection?
> >
> > The bisection is very reproducable.
> >
> > But, this looks now to be a compiler bug. I'm using the latest version
> > of clang and if I put "noinline" at the front of the function,
> > everything works.
>
> Well, the compiler can do more invasive optimizations when inlining. If
> you have buggy code that relies on some unspecified behavior, inlining
> can change the behavior ... but going over that code, there isn't too
> much action going on. At least nothing screamed at me.
Apart from all the optimisations that get rid off the 'pass be reference'
parameters and strange conditional tests.
Plenty of scope for the compiler getting it wrong.
But nothing even vaguely illegal.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 8:40 ` David Laight
@ 2020-10-22 8:48 ` David Hildenbrand
2020-10-22 9:01 ` Greg KH
2020-10-22 9:02 ` David Laight
0 siblings, 2 replies; 66+ messages in thread
From: David Hildenbrand @ 2020-10-22 8:48 UTC (permalink / raw)
To: David Laight, Greg KH, Al Viro, Nick Desaulniers
Cc: Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On 22.10.20 10:40, David Laight wrote:
> From: David Hildenbrand
>> Sent: 22 October 2020 09:35
>>
>> On 22.10.20 10:26, Greg KH wrote:
>>> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
>>>> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
>>>>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
>>>>>> From: David Laight <[email protected]>
>>>>>>
>>>>>> This lets the compiler inline it into import_iovec() generating
>>>>>> much better code.
>>>>>>
>>>>>> Signed-off-by: David Laight <[email protected]>
>>>>>> Signed-off-by: Christoph Hellwig <[email protected]>
>>>>>> ---
>>>>>> fs/read_write.c | 179 ------------------------------------------------
>>>>>> lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
>>>>>> 2 files changed, 176 insertions(+), 179 deletions(-)
>>>>>
>>>>> Strangely, this commit causes a regression in Linus's tree right now.
>>>>>
>>>>> I can't really figure out what the regression is, only that this commit
>>>>> triggers a "large Android system binary" from working properly. There's
>>>>> no kernel log messages anywhere, and I don't have any way to strace the
>>>>> thing in the testing framework, so any hints that people can provide
>>>>> would be most appreciated.
>>>>
>>>> It's a pure move - modulo changed line breaks in the argument lists
>>>> the functions involved are identical before and after that (just checked
>>>> that directly, by checking out the trees before and after, extracting two
>>>> functions in question from fs/read_write.c and lib/iov_iter.c (before and
>>>> after, resp.) and checking the diff between those.
>>>>
>>>> How certain is your bisection?
>>>
>>> The bisection is very reproducable.
>>>
>>> But, this looks now to be a compiler bug. I'm using the latest version
>>> of clang and if I put "noinline" at the front of the function,
>>> everything works.
>>
>> Well, the compiler can do more invasive optimizations when inlining. If
>> you have buggy code that relies on some unspecified behavior, inlining
>> can change the behavior ... but going over that code, there isn't too
>> much action going on. At least nothing screamed at me.
>
> Apart from all the optimisations that get rid off the 'pass be reference'
> parameters and strange conditional tests.
> Plenty of scope for the compiler getting it wrong.
> But nothing even vaguely illegal.
Not the first time that people blame the compiler to then figure out
that something else is wrong ... but maybe this time is different :)
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 8:48 ` David Hildenbrand
@ 2020-10-22 9:01 ` Greg KH
2020-10-22 9:06 ` David Laight
2020-10-22 9:19 ` David Hildenbrand
2020-10-22 9:02 ` David Laight
1 sibling, 2 replies; 66+ messages in thread
From: Greg KH @ 2020-10-22 9:01 UTC (permalink / raw)
To: David Hildenbrand
Cc: David Laight, Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 10:48:59AM +0200, David Hildenbrand wrote:
> On 22.10.20 10:40, David Laight wrote:
> > From: David Hildenbrand
> >> Sent: 22 October 2020 09:35
> >>
> >> On 22.10.20 10:26, Greg KH wrote:
> >>> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
> >>>> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
> >>>>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> >>>>>> From: David Laight <[email protected]>
> >>>>>>
> >>>>>> This lets the compiler inline it into import_iovec() generating
> >>>>>> much better code.
> >>>>>>
> >>>>>> Signed-off-by: David Laight <[email protected]>
> >>>>>> Signed-off-by: Christoph Hellwig <[email protected]>
> >>>>>> ---
> >>>>>> fs/read_write.c | 179 ------------------------------------------------
> >>>>>> lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>> 2 files changed, 176 insertions(+), 179 deletions(-)
> >>>>>
> >>>>> Strangely, this commit causes a regression in Linus's tree right now.
> >>>>>
> >>>>> I can't really figure out what the regression is, only that this commit
> >>>>> triggers a "large Android system binary" from working properly. There's
> >>>>> no kernel log messages anywhere, and I don't have any way to strace the
> >>>>> thing in the testing framework, so any hints that people can provide
> >>>>> would be most appreciated.
> >>>>
> >>>> It's a pure move - modulo changed line breaks in the argument lists
> >>>> the functions involved are identical before and after that (just checked
> >>>> that directly, by checking out the trees before and after, extracting two
> >>>> functions in question from fs/read_write.c and lib/iov_iter.c (before and
> >>>> after, resp.) and checking the diff between those.
> >>>>
> >>>> How certain is your bisection?
> >>>
> >>> The bisection is very reproducable.
> >>>
> >>> But, this looks now to be a compiler bug. I'm using the latest version
> >>> of clang and if I put "noinline" at the front of the function,
> >>> everything works.
> >>
> >> Well, the compiler can do more invasive optimizations when inlining. If
> >> you have buggy code that relies on some unspecified behavior, inlining
> >> can change the behavior ... but going over that code, there isn't too
> >> much action going on. At least nothing screamed at me.
> >
> > Apart from all the optimisations that get rid off the 'pass be reference'
> > parameters and strange conditional tests.
> > Plenty of scope for the compiler getting it wrong.
> > But nothing even vaguely illegal.
>
> Not the first time that people blame the compiler to then figure out
> that something else is wrong ... but maybe this time is different :)
I agree, I hate to blame the compiler, that's almost never the real
problem, but this one sure "feels" like it.
I'm running some more tests, trying to narrow things down as just adding
a "noinline" to the function that got moved here doesn't work on Linus's
tree at the moment because the function was split into multiple
functions.
Give me a few hours...
thanks,
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 9:01 ` Greg KH
@ 2020-10-22 9:06 ` David Laight
2020-10-22 9:19 ` David Hildenbrand
1 sibling, 0 replies; 66+ messages in thread
From: David Laight @ 2020-10-22 9:06 UTC (permalink / raw)
To: 'Greg KH', David Hildenbrand
Cc: Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: Greg KH
> Sent: 22 October 2020 10:02
...
> I'm running some more tests, trying to narrow things down as just adding
> a "noinline" to the function that got moved here doesn't work on Linus's
> tree at the moment because the function was split into multiple
> functions.
I was going to look at that once rc2 is in - and the kernel is
likely to work.
I suspect the split version doesn't get inlined the same way?
Which leaves the horrid argument conversion the inline got
rid of back there again.
Which all rather begs the question of why the compiler doesn't
generate the expected code.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 9:01 ` Greg KH
2020-10-22 9:06 ` David Laight
@ 2020-10-22 9:19 ` David Hildenbrand
2020-10-22 9:25 ` David Hildenbrand
2020-10-22 9:28 ` David Laight
1 sibling, 2 replies; 66+ messages in thread
From: David Hildenbrand @ 2020-10-22 9:19 UTC (permalink / raw)
To: Greg KH
Cc: David Laight, Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On 22.10.20 11:01, Greg KH wrote:
> On Thu, Oct 22, 2020 at 10:48:59AM +0200, David Hildenbrand wrote:
>> On 22.10.20 10:40, David Laight wrote:
>>> From: David Hildenbrand
>>>> Sent: 22 October 2020 09:35
>>>>
>>>> On 22.10.20 10:26, Greg KH wrote:
>>>>> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
>>>>>> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
>>>>>>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
>>>>>>>> From: David Laight <[email protected]>
>>>>>>>>
>>>>>>>> This lets the compiler inline it into import_iovec() generating
>>>>>>>> much better code.
>>>>>>>>
>>>>>>>> Signed-off-by: David Laight <[email protected]>
>>>>>>>> Signed-off-by: Christoph Hellwig <[email protected]>
>>>>>>>> ---
>>>>>>>> fs/read_write.c | 179 ------------------------------------------------
>>>>>>>> lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>> 2 files changed, 176 insertions(+), 179 deletions(-)
>>>>>>>
>>>>>>> Strangely, this commit causes a regression in Linus's tree right now.
>>>>>>>
>>>>>>> I can't really figure out what the regression is, only that this commit
>>>>>>> triggers a "large Android system binary" from working properly. There's
>>>>>>> no kernel log messages anywhere, and I don't have any way to strace the
>>>>>>> thing in the testing framework, so any hints that people can provide
>>>>>>> would be most appreciated.
>>>>>>
>>>>>> It's a pure move - modulo changed line breaks in the argument lists
>>>>>> the functions involved are identical before and after that (just checked
>>>>>> that directly, by checking out the trees before and after, extracting two
>>>>>> functions in question from fs/read_write.c and lib/iov_iter.c (before and
>>>>>> after, resp.) and checking the diff between those.
>>>>>>
>>>>>> How certain is your bisection?
>>>>>
>>>>> The bisection is very reproducable.
>>>>>
>>>>> But, this looks now to be a compiler bug. I'm using the latest version
>>>>> of clang and if I put "noinline" at the front of the function,
>>>>> everything works.
>>>>
>>>> Well, the compiler can do more invasive optimizations when inlining. If
>>>> you have buggy code that relies on some unspecified behavior, inlining
>>>> can change the behavior ... but going over that code, there isn't too
>>>> much action going on. At least nothing screamed at me.
>>>
>>> Apart from all the optimisations that get rid off the 'pass be reference'
>>> parameters and strange conditional tests.
>>> Plenty of scope for the compiler getting it wrong.
>>> But nothing even vaguely illegal.
>>
>> Not the first time that people blame the compiler to then figure out
>> that something else is wrong ... but maybe this time is different :)
>
> I agree, I hate to blame the compiler, that's almost never the real
> problem, but this one sure "feels" like it.
>
> I'm running some more tests, trying to narrow things down as just adding
> a "noinline" to the function that got moved here doesn't work on Linus's
> tree at the moment because the function was split into multiple
> functions.
>
> Give me a few hours...
I might be wrong but
a) import_iovec() uses:
- unsigned nr_segs -> int
- unsigned fast_segs -> int
b) rw_copy_check_uvector() uses:
- unsigned long nr_segs -> long
- unsigned long fast_seg -> long
So when calling rw_copy_check_uvector(), we have to zero-extend the
registers used for passing the arguments. That's definitely done when
calling the function explicitly. Maybe when inlining something is messed up?
Just a thought ...
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 9:19 ` David Hildenbrand
@ 2020-10-22 9:25 ` David Hildenbrand
2020-10-22 9:32 ` David Laight
2020-10-22 9:28 ` David Laight
1 sibling, 1 reply; 66+ messages in thread
From: David Hildenbrand @ 2020-10-22 9:25 UTC (permalink / raw)
To: Greg KH
Cc: David Laight, Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On 22.10.20 11:19, David Hildenbrand wrote:
> On 22.10.20 11:01, Greg KH wrote:
>> On Thu, Oct 22, 2020 at 10:48:59AM +0200, David Hildenbrand wrote:
>>> On 22.10.20 10:40, David Laight wrote:
>>>> From: David Hildenbrand
>>>>> Sent: 22 October 2020 09:35
>>>>>
>>>>> On 22.10.20 10:26, Greg KH wrote:
>>>>>> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
>>>>>>> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
>>>>>>>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
>>>>>>>>> From: David Laight <[email protected]>
>>>>>>>>>
>>>>>>>>> This lets the compiler inline it into import_iovec() generating
>>>>>>>>> much better code.
>>>>>>>>>
>>>>>>>>> Signed-off-by: David Laight <[email protected]>
>>>>>>>>> Signed-off-by: Christoph Hellwig <[email protected]>
>>>>>>>>> ---
>>>>>>>>> fs/read_write.c | 179 ------------------------------------------------
>>>>>>>>> lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>> 2 files changed, 176 insertions(+), 179 deletions(-)
>>>>>>>>
>>>>>>>> Strangely, this commit causes a regression in Linus's tree right now.
>>>>>>>>
>>>>>>>> I can't really figure out what the regression is, only that this commit
>>>>>>>> triggers a "large Android system binary" from working properly. There's
>>>>>>>> no kernel log messages anywhere, and I don't have any way to strace the
>>>>>>>> thing in the testing framework, so any hints that people can provide
>>>>>>>> would be most appreciated.
>>>>>>>
>>>>>>> It's a pure move - modulo changed line breaks in the argument lists
>>>>>>> the functions involved are identical before and after that (just checked
>>>>>>> that directly, by checking out the trees before and after, extracting two
>>>>>>> functions in question from fs/read_write.c and lib/iov_iter.c (before and
>>>>>>> after, resp.) and checking the diff between those.
>>>>>>>
>>>>>>> How certain is your bisection?
>>>>>>
>>>>>> The bisection is very reproducable.
>>>>>>
>>>>>> But, this looks now to be a compiler bug. I'm using the latest version
>>>>>> of clang and if I put "noinline" at the front of the function,
>>>>>> everything works.
>>>>>
>>>>> Well, the compiler can do more invasive optimizations when inlining. If
>>>>> you have buggy code that relies on some unspecified behavior, inlining
>>>>> can change the behavior ... but going over that code, there isn't too
>>>>> much action going on. At least nothing screamed at me.
>>>>
>>>> Apart from all the optimisations that get rid off the 'pass be reference'
>>>> parameters and strange conditional tests.
>>>> Plenty of scope for the compiler getting it wrong.
>>>> But nothing even vaguely illegal.
>>>
>>> Not the first time that people blame the compiler to then figure out
>>> that something else is wrong ... but maybe this time is different :)
>>
>> I agree, I hate to blame the compiler, that's almost never the real
>> problem, but this one sure "feels" like it.
>>
>> I'm running some more tests, trying to narrow things down as just adding
>> a "noinline" to the function that got moved here doesn't work on Linus's
>> tree at the moment because the function was split into multiple
>> functions.
>>
>> Give me a few hours...
>
> I might be wrong but
>
> a) import_iovec() uses:
> - unsigned nr_segs -> int
> - unsigned fast_segs -> int
> b) rw_copy_check_uvector() uses:
> - unsigned long nr_segs -> long
> - unsigned long fast_seg -> long
>
> So when calling rw_copy_check_uvector(), we have to zero-extend the
> registers used for passing the arguments. That's definitely done when
> calling the function explicitly. Maybe when inlining something is messed up?
>
> Just a thought ...
>
... especially because I recall that clang and gcc behave slightly
differently:
https://github.com/hjl-tools/x86-psABI/issues/2
"Function args are different: narrow types are sign or zero extended to
32 bits, depending on their type. clang depends on this for incoming
args, but gcc doesn't make that assumption. But both compilers do it
when calling, so gcc code can call clang code.
The upper 32 bits of registers are always undefined garbage for types
smaller than 64 bits."
Again, just a thought.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 9:25 ` David Hildenbrand
@ 2020-10-22 9:32 ` David Laight
2020-10-22 9:36 ` David Hildenbrand
0 siblings, 1 reply; 66+ messages in thread
From: David Laight @ 2020-10-22 9:32 UTC (permalink / raw)
To: 'David Hildenbrand', Greg KH
Cc: Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: David Hildenbrand
> Sent: 22 October 2020 10:25
...
> ... especially because I recall that clang and gcc behave slightly
> differently:
>
> https://github.com/hjl-tools/x86-psABI/issues/2
>
> "Function args are different: narrow types are sign or zero extended to
> 32 bits, depending on their type. clang depends on this for incoming
> args, but gcc doesn't make that assumption. But both compilers do it
> when calling, so gcc code can call clang code.
It really is best to use 'int' (or even 'long') for all numeric
arguments (and results) regardless of the domain of the value.
Related, I've always worried about 'bool'....
> The upper 32 bits of registers are always undefined garbage for types
> smaller than 64 bits."
On x86-64 the high bits are zeroed by all 32bit loads.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 9:32 ` David Laight
@ 2020-10-22 9:36 ` David Hildenbrand
2020-10-22 10:48 ` Greg KH
2020-10-22 13:23 ` Christoph Hellwig
0 siblings, 2 replies; 66+ messages in thread
From: David Hildenbrand @ 2020-10-22 9:36 UTC (permalink / raw)
To: David Laight, Greg KH
Cc: Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On 22.10.20 11:32, David Laight wrote:
> From: David Hildenbrand
>> Sent: 22 October 2020 10:25
> ...
>> ... especially because I recall that clang and gcc behave slightly
>> differently:
>>
>> https://github.com/hjl-tools/x86-psABI/issues/2
>>
>> "Function args are different: narrow types are sign or zero extended to
>> 32 bits, depending on their type. clang depends on this for incoming
>> args, but gcc doesn't make that assumption. But both compilers do it
>> when calling, so gcc code can call clang code.
>
> It really is best to use 'int' (or even 'long') for all numeric
> arguments (and results) regardless of the domain of the value.
>
> Related, I've always worried about 'bool'....
>
>> The upper 32 bits of registers are always undefined garbage for types
>> smaller than 64 bits."
>
> On x86-64 the high bits are zeroed by all 32bit loads.
Yeah, but does not help here.
My thinking: if the compiler that calls import_iovec() has garbage in
the upper 32 bit
a) gcc will zero it out and not rely on it being zero.
b) clang will not zero it out, assuming it is zero.
But
a) will zero it out when calling the !inlined variant
b) clang will zero it out when calling the !inlined variant
When inlining, b) strikes. We access garbage. That would mean that we
have calling code that's not generated by clang/gcc IIUC.
We can test easily by changing the parameters instead of adding an "inline".
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 9:36 ` David Hildenbrand
@ 2020-10-22 10:48 ` Greg KH
2020-10-22 12:18 ` Greg KH
2020-10-22 13:23 ` Christoph Hellwig
1 sibling, 1 reply; 66+ messages in thread
From: Greg KH @ 2020-10-22 10:48 UTC (permalink / raw)
To: David Hildenbrand
Cc: David Laight, Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> On 22.10.20 11:32, David Laight wrote:
> > From: David Hildenbrand
> >> Sent: 22 October 2020 10:25
> > ...
> >> ... especially because I recall that clang and gcc behave slightly
> >> differently:
> >>
> >> https://github.com/hjl-tools/x86-psABI/issues/2
> >>
> >> "Function args are different: narrow types are sign or zero extended to
> >> 32 bits, depending on their type. clang depends on this for incoming
> >> args, but gcc doesn't make that assumption. But both compilers do it
> >> when calling, so gcc code can call clang code.
> >
> > It really is best to use 'int' (or even 'long') for all numeric
> > arguments (and results) regardless of the domain of the value.
> >
> > Related, I've always worried about 'bool'....
> >
> >> The upper 32 bits of registers are always undefined garbage for types
> >> smaller than 64 bits."
> >
> > On x86-64 the high bits are zeroed by all 32bit loads.
>
> Yeah, but does not help here.
>
>
> My thinking: if the compiler that calls import_iovec() has garbage in
> the upper 32 bit
>
> a) gcc will zero it out and not rely on it being zero.
> b) clang will not zero it out, assuming it is zero.
>
> But
>
> a) will zero it out when calling the !inlined variant
> b) clang will zero it out when calling the !inlined variant
>
> When inlining, b) strikes. We access garbage. That would mean that we
> have calling code that's not generated by clang/gcc IIUC.
>
> We can test easily by changing the parameters instead of adding an "inline".
Let me try that as well, as I seem to have a good reproducer, but it
takes a while to run...
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 10:48 ` Greg KH
@ 2020-10-22 12:18 ` Greg KH
2020-10-22 12:42 ` David Hildenbrand
0 siblings, 1 reply; 66+ messages in thread
From: Greg KH @ 2020-10-22 12:18 UTC (permalink / raw)
To: David Hildenbrand
Cc: David Laight, Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > On 22.10.20 11:32, David Laight wrote:
> > > From: David Hildenbrand
> > >> Sent: 22 October 2020 10:25
> > > ...
> > >> ... especially because I recall that clang and gcc behave slightly
> > >> differently:
> > >>
> > >> https://github.com/hjl-tools/x86-psABI/issues/2
> > >>
> > >> "Function args are different: narrow types are sign or zero extended to
> > >> 32 bits, depending on their type. clang depends on this for incoming
> > >> args, but gcc doesn't make that assumption. But both compilers do it
> > >> when calling, so gcc code can call clang code.
> > >
> > > It really is best to use 'int' (or even 'long') for all numeric
> > > arguments (and results) regardless of the domain of the value.
> > >
> > > Related, I've always worried about 'bool'....
> > >
> > >> The upper 32 bits of registers are always undefined garbage for types
> > >> smaller than 64 bits."
> > >
> > > On x86-64 the high bits are zeroed by all 32bit loads.
> >
> > Yeah, but does not help here.
> >
> >
> > My thinking: if the compiler that calls import_iovec() has garbage in
> > the upper 32 bit
> >
> > a) gcc will zero it out and not rely on it being zero.
> > b) clang will not zero it out, assuming it is zero.
> >
> > But
> >
> > a) will zero it out when calling the !inlined variant
> > b) clang will zero it out when calling the !inlined variant
> >
> > When inlining, b) strikes. We access garbage. That would mean that we
> > have calling code that's not generated by clang/gcc IIUC.
> >
> > We can test easily by changing the parameters instead of adding an "inline".
>
> Let me try that as well, as I seem to have a good reproducer, but it
> takes a while to run...
Ok, that didn't work.
And I can't seem to "fix" this by adding noinline to patches further
along in the patch series (because this commit's function is no longer
present due to later patches.)
Will keep digging...
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 12:18 ` Greg KH
@ 2020-10-22 12:42 ` David Hildenbrand
2020-10-22 12:57 ` Greg KH
0 siblings, 1 reply; 66+ messages in thread
From: David Hildenbrand @ 2020-10-22 12:42 UTC (permalink / raw)
To: Greg KH
Cc: David Laight, Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On 22.10.20 14:18, Greg KH wrote:
> On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
>> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
>>> On 22.10.20 11:32, David Laight wrote:
>>>> From: David Hildenbrand
>>>>> Sent: 22 October 2020 10:25
>>>> ...
>>>>> ... especially because I recall that clang and gcc behave slightly
>>>>> differently:
>>>>>
>>>>> https://github.com/hjl-tools/x86-psABI/issues/2
>>>>>
>>>>> "Function args are different: narrow types are sign or zero extended to
>>>>> 32 bits, depending on their type. clang depends on this for incoming
>>>>> args, but gcc doesn't make that assumption. But both compilers do it
>>>>> when calling, so gcc code can call clang code.
>>>>
>>>> It really is best to use 'int' (or even 'long') for all numeric
>>>> arguments (and results) regardless of the domain of the value.
>>>>
>>>> Related, I've always worried about 'bool'....
>>>>
>>>>> The upper 32 bits of registers are always undefined garbage for types
>>>>> smaller than 64 bits."
>>>>
>>>> On x86-64 the high bits are zeroed by all 32bit loads.
>>>
>>> Yeah, but does not help here.
>>>
>>>
>>> My thinking: if the compiler that calls import_iovec() has garbage in
>>> the upper 32 bit
>>>
>>> a) gcc will zero it out and not rely on it being zero.
>>> b) clang will not zero it out, assuming it is zero.
>>>
>>> But
>>>
>>> a) will zero it out when calling the !inlined variant
>>> b) clang will zero it out when calling the !inlined variant
>>>
>>> When inlining, b) strikes. We access garbage. That would mean that we
>>> have calling code that's not generated by clang/gcc IIUC.
>>>
>>> We can test easily by changing the parameters instead of adding an "inline".
>>
>> Let me try that as well, as I seem to have a good reproducer, but it
>> takes a while to run...
>
> Ok, that didn't work.
>
> And I can't seem to "fix" this by adding noinline to patches further
> along in the patch series (because this commit's function is no longer
> present due to later patches.)
We might have the same issues with iovec_from_user() and friends now.
>
> Will keep digging...
>
> greg k-h
>
Might be worth to give this a try, just to see if it's related to
garbage in upper 32 bit and the way clang is handling it (might be a BUG
in clang, though):
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 72d88566694e..7527298c6b56 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -267,7 +267,7 @@ size_t hash_and_copy_to_iter(const void *addr,
size_t bytes, void *hashp,
struct iov_iter *i);
struct iovec *iovec_from_user(const struct iovec __user *uvector,
- unsigned long nr_segs, unsigned long fast_segs,
+ unsigned nr_segs, unsigned fast_segs,
struct iovec *fast_iov, bool compat);
ssize_t import_iovec(int type, const struct iovec __user *uvec,
unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 1635111c5bd2..58417f1916dc 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1652,7 +1652,7 @@ const void *dup_iter(struct iov_iter *new, struct
iov_iter *old, gfp_t flags)
EXPORT_SYMBOL(dup_iter);
static int copy_compat_iovec_from_user(struct iovec *iov,
- const struct iovec __user *uvec, unsigned long nr_segs)
+ const struct iovec __user *uvec, unsigned nr_segs)
{
const struct compat_iovec __user *uiov =
(const struct compat_iovec __user *)uvec;
@@ -1684,7 +1684,7 @@ static int copy_compat_iovec_from_user(struct
iovec *iov,
}
static int copy_iovec_from_user(struct iovec *iov,
- const struct iovec __user *uvec, unsigned long nr_segs)
+ const struct iovec __user *uvec, unsigned nr_segs)
{
unsigned long seg;
@@ -1699,7 +1699,7 @@ static int copy_iovec_from_user(struct iovec *iov,
}
struct iovec *iovec_from_user(const struct iovec __user *uvec,
- unsigned long nr_segs, unsigned long fast_segs,
+ unsigned nr_segs, unsigned fast_segs,
struct iovec *fast_iov, bool compat)
{
struct iovec *iov = fast_iov;
@@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
iovec __user *uvec,
struct iov_iter *i, bool compat)
{
ssize_t total_len = 0;
- unsigned long seg;
+ unsigned seg;
struct iovec *iov;
iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
--
Thanks,
David / dhildenb
^ permalink raw reply related [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 12:42 ` David Hildenbrand
@ 2020-10-22 12:57 ` Greg KH
2020-10-22 13:50 ` Greg KH
0 siblings, 1 reply; 66+ messages in thread
From: Greg KH @ 2020-10-22 12:57 UTC (permalink / raw)
To: David Hildenbrand
Cc: David Laight, Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:
> On 22.10.20 14:18, Greg KH wrote:
> > On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
> >> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> >>> On 22.10.20 11:32, David Laight wrote:
> >>>> From: David Hildenbrand
> >>>>> Sent: 22 October 2020 10:25
> >>>> ...
> >>>>> ... especially because I recall that clang and gcc behave slightly
> >>>>> differently:
> >>>>>
> >>>>> https://github.com/hjl-tools/x86-psABI/issues/2
> >>>>>
> >>>>> "Function args are different: narrow types are sign or zero extended to
> >>>>> 32 bits, depending on their type. clang depends on this for incoming
> >>>>> args, but gcc doesn't make that assumption. But both compilers do it
> >>>>> when calling, so gcc code can call clang code.
> >>>>
> >>>> It really is best to use 'int' (or even 'long') for all numeric
> >>>> arguments (and results) regardless of the domain of the value.
> >>>>
> >>>> Related, I've always worried about 'bool'....
> >>>>
> >>>>> The upper 32 bits of registers are always undefined garbage for types
> >>>>> smaller than 64 bits."
> >>>>
> >>>> On x86-64 the high bits are zeroed by all 32bit loads.
> >>>
> >>> Yeah, but does not help here.
> >>>
> >>>
> >>> My thinking: if the compiler that calls import_iovec() has garbage in
> >>> the upper 32 bit
> >>>
> >>> a) gcc will zero it out and not rely on it being zero.
> >>> b) clang will not zero it out, assuming it is zero.
> >>>
> >>> But
> >>>
> >>> a) will zero it out when calling the !inlined variant
> >>> b) clang will zero it out when calling the !inlined variant
> >>>
> >>> When inlining, b) strikes. We access garbage. That would mean that we
> >>> have calling code that's not generated by clang/gcc IIUC.
> >>>
> >>> We can test easily by changing the parameters instead of adding an "inline".
> >>
> >> Let me try that as well, as I seem to have a good reproducer, but it
> >> takes a while to run...
> >
> > Ok, that didn't work.
> >
> > And I can't seem to "fix" this by adding noinline to patches further
> > along in the patch series (because this commit's function is no longer
> > present due to later patches.)
>
> We might have the same issues with iovec_from_user() and friends now.
>
> >
> > Will keep digging...
> >
> > greg k-h
> >
>
>
> Might be worth to give this a try, just to see if it's related to
> garbage in upper 32 bit and the way clang is handling it (might be a BUG
> in clang, though):
>
>
> diff --git a/include/linux/uio.h b/include/linux/uio.h
> index 72d88566694e..7527298c6b56 100644
> --- a/include/linux/uio.h
> +++ b/include/linux/uio.h
> @@ -267,7 +267,7 @@ size_t hash_and_copy_to_iter(const void *addr,
> size_t bytes, void *hashp,
> struct iov_iter *i);
>
> struct iovec *iovec_from_user(const struct iovec __user *uvector,
> - unsigned long nr_segs, unsigned long fast_segs,
> + unsigned nr_segs, unsigned fast_segs,
> struct iovec *fast_iov, bool compat);
> ssize_t import_iovec(int type, const struct iovec __user *uvec,
> unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 1635111c5bd2..58417f1916dc 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -1652,7 +1652,7 @@ const void *dup_iter(struct iov_iter *new, struct
> iov_iter *old, gfp_t flags)
> EXPORT_SYMBOL(dup_iter);
>
> static int copy_compat_iovec_from_user(struct iovec *iov,
> - const struct iovec __user *uvec, unsigned long nr_segs)
> + const struct iovec __user *uvec, unsigned nr_segs)
> {
> const struct compat_iovec __user *uiov =
> (const struct compat_iovec __user *)uvec;
> @@ -1684,7 +1684,7 @@ static int copy_compat_iovec_from_user(struct
> iovec *iov,
> }
>
> static int copy_iovec_from_user(struct iovec *iov,
> - const struct iovec __user *uvec, unsigned long nr_segs)
> + const struct iovec __user *uvec, unsigned nr_segs)
> {
> unsigned long seg;
>
> @@ -1699,7 +1699,7 @@ static int copy_iovec_from_user(struct iovec *iov,
> }
>
> struct iovec *iovec_from_user(const struct iovec __user *uvec,
> - unsigned long nr_segs, unsigned long fast_segs,
> + unsigned nr_segs, unsigned fast_segs,
> struct iovec *fast_iov, bool compat)
> {
> struct iovec *iov = fast_iov;
> @@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
> iovec __user *uvec,
> struct iov_iter *i, bool compat)
> {
> ssize_t total_len = 0;
> - unsigned long seg;
> + unsigned seg;
> struct iovec *iov;
>
> iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
>
Ah, I tested the other way around, making everything "unsigned long"
instead. Will go try this too, as other tests are still running...
thanks,
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 12:57 ` Greg KH
@ 2020-10-22 13:50 ` Greg KH
[not found] ` <CAK8P3a1B7OVdyzW0-97JwzZiwp0D0fnSfyete16QTvPp_1m07A@mail.gmail.com>
2020-10-23 12:46 ` David Laight
0 siblings, 2 replies; 66+ messages in thread
From: Greg KH @ 2020-10-22 13:50 UTC (permalink / raw)
To: David Hildenbrand
Cc: David Laight, Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 02:57:59PM +0200, Greg KH wrote:
> On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:
> > On 22.10.20 14:18, Greg KH wrote:
> > > On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
> > >> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > >>> On 22.10.20 11:32, David Laight wrote:
> > >>>> From: David Hildenbrand
> > >>>>> Sent: 22 October 2020 10:25
> > >>>> ...
> > >>>>> ... especially because I recall that clang and gcc behave slightly
> > >>>>> differently:
> > >>>>>
> > >>>>> https://github.com/hjl-tools/x86-psABI/issues/2
> > >>>>>
> > >>>>> "Function args are different: narrow types are sign or zero extended to
> > >>>>> 32 bits, depending on their type. clang depends on this for incoming
> > >>>>> args, but gcc doesn't make that assumption. But both compilers do it
> > >>>>> when calling, so gcc code can call clang code.
> > >>>>
> > >>>> It really is best to use 'int' (or even 'long') for all numeric
> > >>>> arguments (and results) regardless of the domain of the value.
> > >>>>
> > >>>> Related, I've always worried about 'bool'....
> > >>>>
> > >>>>> The upper 32 bits of registers are always undefined garbage for types
> > >>>>> smaller than 64 bits."
> > >>>>
> > >>>> On x86-64 the high bits are zeroed by all 32bit loads.
> > >>>
> > >>> Yeah, but does not help here.
> > >>>
> > >>>
> > >>> My thinking: if the compiler that calls import_iovec() has garbage in
> > >>> the upper 32 bit
> > >>>
> > >>> a) gcc will zero it out and not rely on it being zero.
> > >>> b) clang will not zero it out, assuming it is zero.
> > >>>
> > >>> But
> > >>>
> > >>> a) will zero it out when calling the !inlined variant
> > >>> b) clang will zero it out when calling the !inlined variant
> > >>>
> > >>> When inlining, b) strikes. We access garbage. That would mean that we
> > >>> have calling code that's not generated by clang/gcc IIUC.
> > >>>
> > >>> We can test easily by changing the parameters instead of adding an "inline".
> > >>
> > >> Let me try that as well, as I seem to have a good reproducer, but it
> > >> takes a while to run...
> > >
> > > Ok, that didn't work.
> > >
> > > And I can't seem to "fix" this by adding noinline to patches further
> > > along in the patch series (because this commit's function is no longer
> > > present due to later patches.)
> >
> > We might have the same issues with iovec_from_user() and friends now.
> >
> > >
> > > Will keep digging...
> > >
> > > greg k-h
> > >
> >
> >
> > Might be worth to give this a try, just to see if it's related to
> > garbage in upper 32 bit and the way clang is handling it (might be a BUG
> > in clang, though):
> >
> >
> > diff --git a/include/linux/uio.h b/include/linux/uio.h
> > index 72d88566694e..7527298c6b56 100644
> > --- a/include/linux/uio.h
> > +++ b/include/linux/uio.h
> > @@ -267,7 +267,7 @@ size_t hash_and_copy_to_iter(const void *addr,
> > size_t bytes, void *hashp,
> > struct iov_iter *i);
> >
> > struct iovec *iovec_from_user(const struct iovec __user *uvector,
> > - unsigned long nr_segs, unsigned long fast_segs,
> > + unsigned nr_segs, unsigned fast_segs,
> > struct iovec *fast_iov, bool compat);
> > ssize_t import_iovec(int type, const struct iovec __user *uvec,
> > unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
> > diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> > index 1635111c5bd2..58417f1916dc 100644
> > --- a/lib/iov_iter.c
> > +++ b/lib/iov_iter.c
> > @@ -1652,7 +1652,7 @@ const void *dup_iter(struct iov_iter *new, struct
> > iov_iter *old, gfp_t flags)
> > EXPORT_SYMBOL(dup_iter);
> >
> > static int copy_compat_iovec_from_user(struct iovec *iov,
> > - const struct iovec __user *uvec, unsigned long nr_segs)
> > + const struct iovec __user *uvec, unsigned nr_segs)
> > {
> > const struct compat_iovec __user *uiov =
> > (const struct compat_iovec __user *)uvec;
> > @@ -1684,7 +1684,7 @@ static int copy_compat_iovec_from_user(struct
> > iovec *iov,
> > }
> >
> > static int copy_iovec_from_user(struct iovec *iov,
> > - const struct iovec __user *uvec, unsigned long nr_segs)
> > + const struct iovec __user *uvec, unsigned nr_segs)
> > {
> > unsigned long seg;
> >
> > @@ -1699,7 +1699,7 @@ static int copy_iovec_from_user(struct iovec *iov,
> > }
> >
> > struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > - unsigned long nr_segs, unsigned long fast_segs,
> > + unsigned nr_segs, unsigned fast_segs,
> > struct iovec *fast_iov, bool compat)
> > {
> > struct iovec *iov = fast_iov;
> > @@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
> > iovec __user *uvec,
> > struct iov_iter *i, bool compat)
> > {
> > ssize_t total_len = 0;
> > - unsigned long seg;
> > + unsigned seg;
> > struct iovec *iov;
> >
> > iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
> >
>
> Ah, I tested the other way around, making everything "unsigned long"
> instead. Will go try this too, as other tests are still running...
Ok, no, this didn't work either.
Nick, I think I need some compiler help here. Any ideas?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
[parent not found: <CAK8P3a1B7OVdyzW0-97JwzZiwp0D0fnSfyete16QTvPp_1m07A@mail.gmail.com>]
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
[not found] ` <CAK8P3a1B7OVdyzW0-97JwzZiwp0D0fnSfyete16QTvPp_1m07A@mail.gmail.com>
@ 2020-10-22 14:40 ` Greg KH
2020-10-22 16:15 ` David Laight
0 siblings, 1 reply; 66+ messages in thread
From: Greg KH @ 2020-10-22 14:40 UTC (permalink / raw)
To: Arnd Bergmann
Cc: David Hildenbrand, David Laight, Al Viro, Nick Desaulniers,
Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 04:28:20PM +0200, Arnd Bergmann wrote:
> On Thu, Oct 22, 2020 at 3:50 PM Greg KH <[email protected]> wrote:
> > On Thu, Oct 22, 2020 at 02:57:59PM +0200, Greg KH wrote:
> > > On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:
>
> > > > struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > > > - unsigned long nr_segs, unsigned long fast_segs,
> > > > + unsigned nr_segs, unsigned fast_segs,
> > > > struct iovec *fast_iov, bool compat)
> > > > {
> > > > struct iovec *iov = fast_iov;
> > > > @@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
> > > > iovec __user *uvec,
> > > > struct iov_iter *i, bool compat)
> > > > {
> > > > ssize_t total_len = 0;
> > > > - unsigned long seg;
> > > > + unsigned seg;
> > > > struct iovec *iov;
> > > >
> > > > iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
> > > >
> > >
> > > Ah, I tested the other way around, making everything "unsigned long"
> > > instead. Will go try this too, as other tests are still running...
> >
> > Ok, no, this didn't work either.
> >
> > Nick, I think I need some compiler help here. Any ideas?
>
> I don't think the patch above would reliably clear the upper bits if they
> contain garbage.
>
> If the integer extension is the problem, the way I'd try it is to make the
> function take an 'unsigned long' and then explictly mask the upper
> bits with
>
> seg = lower_32_bits(seg);
>
> Can you attach the iov_iter.s files from the broken build, plus the
> one with 'noinline' for comparison? Maybe something can be seen
> in there.
I don't know how to extract the .s files easily from the AOSP build
system, I'll look into that. I'm also now testing by downgrading to an
older version of clang (10 instead of 11), to see if that matters at all
or not...
thanks,
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 14:40 ` Greg KH
@ 2020-10-22 16:15 ` David Laight
0 siblings, 0 replies; 66+ messages in thread
From: David Laight @ 2020-10-22 16:15 UTC (permalink / raw)
To: 'Greg KH', Arnd Bergmann
Cc: David Hildenbrand, Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: Greg KH
> Sent: 22 October 2020 15:40
>
> On Thu, Oct 22, 2020 at 04:28:20PM +0200, Arnd Bergmann wrote:
...
> > Can you attach the iov_iter.s files from the broken build, plus the
> > one with 'noinline' for comparison? Maybe something can be seen
> > in there.
>
> I don't know how to extract the .s files easily from the AOSP build
> system, I'll look into that. I'm also now testing by downgrading to an
> older version of clang (10 instead of 11), to see if that matters at all
> or not...
Back from a day out - after it stopped raining.
Trying to use up leave before the end of the year.
Can you use objdump on the kernel binary itself and cut out
the single function?
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 13:50 ` Greg KH
[not found] ` <CAK8P3a1B7OVdyzW0-97JwzZiwp0D0fnSfyete16QTvPp_1m07A@mail.gmail.com>
@ 2020-10-23 12:46 ` David Laight
2020-10-23 13:09 ` David Hildenbrand
[not found] ` <CAK8P3a1n+b8hOMhNQSDzgic03dyXbmpccfTJ3C1bGKvzsgMXbg@mail.gmail.com>
1 sibling, 2 replies; 66+ messages in thread
From: David Laight @ 2020-10-23 12:46 UTC (permalink / raw)
To: 'Greg KH', David Hildenbrand
Cc: Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: Greg KH <[email protected]>
> Sent: 22 October 2020 14:51
I've rammed the code into godbolt.
https://godbolt.org/z/9v5PPW
Definitely a clang bug.
Search for [wx]24 in the clang output.
nr_segs comes in as w2 and the initial bound checks are done on w2.
w24 is loaded from w2 - I don't believe this changes the high bits.
There are no references to w24, just x24.
So the kmalloc_array() is passed 'huge' and will fail.
The iov_iter_init also gets the 64bit value.
Note that the gcc code has a sign-extend copy of w2.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 12:46 ` David Laight
@ 2020-10-23 13:09 ` David Hildenbrand
2020-10-23 14:33 ` David Hildenbrand
2020-10-23 17:58 ` Al Viro
[not found] ` <CAK8P3a1n+b8hOMhNQSDzgic03dyXbmpccfTJ3C1bGKvzsgMXbg@mail.gmail.com>
1 sibling, 2 replies; 66+ messages in thread
From: David Hildenbrand @ 2020-10-23 13:09 UTC (permalink / raw)
To: David Laight, 'Greg KH'
Cc: Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On 23.10.20 14:46, David Laight wrote:
> From: Greg KH <[email protected]>
>> Sent: 22 October 2020 14:51
>
> I've rammed the code into godbolt.
>
> https://godbolt.org/z/9v5PPW
>
> Definitely a clang bug.
>
> Search for [wx]24 in the clang output.
> nr_segs comes in as w2 and the initial bound checks are done on w2.
> w24 is loaded from w2 - I don't believe this changes the high bits.
> There are no references to w24, just x24.
> So the kmalloc_array() is passed 'huge' and will fail.
> The iov_iter_init also gets the 64bit value.
>
> Note that the gcc code has a sign-extend copy of w2.
Do we have a result from using "unsigned long" in the base function and
explicitly masking of the high bits? That should definitely work.
Now, I am not a compiler expert, but as I already cited, at least on
x86-64 clang expects that the high bits were cleared by the caller - in
contrast to gcc. I suspect it's the same on arm64, but again, I am no
compiler expert.
If what I said and cites for x86-64 is correct, if the function expects
an "unsigned int", it will happily use 64bit operations without further
checks where valid when assuming high bits are zero. That's why even
converting everything to "unsigned int" as proposed by me won't work on
clang - it assumes high bits are zero (as indicated by Nick).
As I am neither a compiler experts (did I mention that already? ;) ) nor
an arm64 experts, I can't tell if this is a compiler BUG or not.
Main issue seems to be garbage in high bits as originally suggested by me.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 13:09 ` David Hildenbrand
@ 2020-10-23 14:33 ` David Hildenbrand
2020-10-23 14:39 ` David Laight
2020-10-23 17:58 ` Al Viro
1 sibling, 1 reply; 66+ messages in thread
From: David Hildenbrand @ 2020-10-23 14:33 UTC (permalink / raw)
To: David Laight, 'Greg KH'
Cc: Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On 23.10.20 15:09, David Hildenbrand wrote:
> On 23.10.20 14:46, David Laight wrote:
>> From: Greg KH <[email protected]>
>>> Sent: 22 October 2020 14:51
>>
>> I've rammed the code into godbolt.
>>
>> https://godbolt.org/z/9v5PPW
>>
>> Definitely a clang bug.
>>
>> Search for [wx]24 in the clang output.
>> nr_segs comes in as w2 and the initial bound checks are done on w2.
>> w24 is loaded from w2 - I don't believe this changes the high bits.
>> There are no references to w24, just x24.
>> So the kmalloc_array() is passed 'huge' and will fail.
>> The iov_iter_init also gets the 64bit value.
>>
>> Note that the gcc code has a sign-extend copy of w2.
>
> Do we have a result from using "unsigned long" in the base function and
> explicitly masking of the high bits? That should definitely work.
>
> Now, I am not a compiler expert, but as I already cited, at least on
> x86-64 clang expects that the high bits were cleared by the caller - in
> contrast to gcc. I suspect it's the same on arm64, but again, I am no
> compiler expert.
>
> If what I said and cites for x86-64 is correct, if the function expects
> an "unsigned int", it will happily use 64bit operations without further
> checks where valid when assuming high bits are zero. That's why even
> converting everything to "unsigned int" as proposed by me won't work on
> clang - it assumes high bits are zero (as indicated by Nick).
>
> As I am neither a compiler experts (did I mention that already? ;) ) nor
> an arm64 experts, I can't tell if this is a compiler BUG or not.
>
I just checked against upstream code generated by clang 10 and it
properly discards the upper 32bit via a mov w23 w2.
So at least clang 10 indeed properly assumes we could have garbage and
masks it off.
Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
behaves differently.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 14:33 ` David Hildenbrand
@ 2020-10-23 14:39 ` David Laight
2020-10-23 14:47 ` 'Greg KH'
0 siblings, 1 reply; 66+ messages in thread
From: David Laight @ 2020-10-23 14:39 UTC (permalink / raw)
To: 'David Hildenbrand', 'Greg KH'
Cc: Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: David Hildenbrand
> Sent: 23 October 2020 15:33
...
> I just checked against upstream code generated by clang 10 and it
> properly discards the upper 32bit via a mov w23 w2.
>
> So at least clang 10 indeed properly assumes we could have garbage and
> masks it off.
>
> Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> behaves differently.
We'll need the disassembly from a failing kernel image.
It isn't that big to hand annotate.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 14:39 ` David Laight
@ 2020-10-23 14:47 ` 'Greg KH'
2020-10-23 16:33 ` David Hildenbrand
2020-11-02 9:06 ` David Laight
0 siblings, 2 replies; 66+ messages in thread
From: 'Greg KH' @ 2020-10-23 14:47 UTC (permalink / raw)
To: David Laight
Cc: 'David Hildenbrand', Al Viro, Nick Desaulniers,
Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Fri, Oct 23, 2020 at 02:39:24PM +0000, David Laight wrote:
> From: David Hildenbrand
> > Sent: 23 October 2020 15:33
> ...
> > I just checked against upstream code generated by clang 10 and it
> > properly discards the upper 32bit via a mov w23 w2.
> >
> > So at least clang 10 indeed properly assumes we could have garbage and
> > masks it off.
> >
> > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > behaves differently.
>
> We'll need the disassembly from a failing kernel image.
> It isn't that big to hand annotate.
I've worked around the merge at the moment in the android tree, but it
is still quite reproducable, and will try to get a .o file to
disassemble on Monday or so...
thanks,
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 14:47 ` 'Greg KH'
@ 2020-10-23 16:33 ` David Hildenbrand
2020-11-02 9:06 ` David Laight
1 sibling, 0 replies; 66+ messages in thread
From: David Hildenbrand @ 2020-10-23 16:33 UTC (permalink / raw)
To: 'Greg KH', David Laight
Cc: Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On 23.10.20 16:47, 'Greg KH' wrote:
> On Fri, Oct 23, 2020 at 02:39:24PM +0000, David Laight wrote:
>> From: David Hildenbrand
>>> Sent: 23 October 2020 15:33
>> ...
>>> I just checked against upstream code generated by clang 10 and it
>>> properly discards the upper 32bit via a mov w23 w2.
>>>
>>> So at least clang 10 indeed properly assumes we could have garbage and
>>> masks it off.
>>>
>>> Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
>>> behaves differently.
>>
>> We'll need the disassembly from a failing kernel image.
>> It isn't that big to hand annotate.
>
> I've worked around the merge at the moment in the android tree, but it
> is still quite reproducable, and will try to get a .o file to
> disassemble on Monday or so...
I just compiled pre and post fb041b598997d63c0f7d7305dfae70046bf66fe1 with
clang version 11.0.0 (Fedora 11.0.0-0.2.rc1.fc33)
for aarch64 with defconfig and extracted import_iovec and
rw_copy_check_uvector (skipping the compat things)
Pre fb041b598997d63c0f7d7305dfae70046bf66fe1 import_iovec
-> https://pastebin.com/LtnYMLJt
Post fb041b598997d63c0f7d7305dfae70046bf66fe1 import_iovec
-> https://pastebin.com/BWPmXrAf
Pre fb041b598997d63c0f7d7305dfae70046bf66fe1 rw_copy_check_uvector
-> https://pastebin.com/4nSBYRbf
Post fb041b598997d63c0f7d7305dfae70046bf66fe1 rw_copy_check_uvector
-> https://pastebin.com/hPtEgaEW
I'm only able to spot minor differences ... less gets inlined than I
would have expected. But there are some smaller differences.
Maybe someone wants to have a look before we have object files as used
by Greg ...
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 14:47 ` 'Greg KH'
2020-10-23 16:33 ` David Hildenbrand
@ 2020-11-02 9:06 ` David Laight
2020-11-02 13:52 ` 'Greg KH'
1 sibling, 1 reply; 66+ messages in thread
From: David Laight @ 2020-11-02 9:06 UTC (permalink / raw)
To: 'Greg KH'
Cc: 'David Hildenbrand', Al Viro, Nick Desaulniers,
Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: 'Greg KH'
> Sent: 23 October 2020 15:47
>
> On Fri, Oct 23, 2020 at 02:39:24PM +0000, David Laight wrote:
> > From: David Hildenbrand
> > > Sent: 23 October 2020 15:33
> > ...
> > > I just checked against upstream code generated by clang 10 and it
> > > properly discards the upper 32bit via a mov w23 w2.
> > >
> > > So at least clang 10 indeed properly assumes we could have garbage and
> > > masks it off.
> > >
> > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > > behaves differently.
> >
> > We'll need the disassembly from a failing kernel image.
> > It isn't that big to hand annotate.
>
> I've worked around the merge at the moment in the android tree, but it
> is still quite reproducable, and will try to get a .o file to
> disassemble on Monday or so...
Did this get properly resolved?
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-11-02 9:06 ` David Laight
@ 2020-11-02 13:52 ` 'Greg KH'
2020-11-02 18:23 ` David Laight
0 siblings, 1 reply; 66+ messages in thread
From: 'Greg KH' @ 2020-11-02 13:52 UTC (permalink / raw)
To: David Laight
Cc: 'David Hildenbrand', Al Viro, Nick Desaulniers,
Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Mon, Nov 02, 2020 at 09:06:38AM +0000, David Laight wrote:
> From: 'Greg KH'
> > Sent: 23 October 2020 15:47
> >
> > On Fri, Oct 23, 2020 at 02:39:24PM +0000, David Laight wrote:
> > > From: David Hildenbrand
> > > > Sent: 23 October 2020 15:33
> > > ...
> > > > I just checked against upstream code generated by clang 10 and it
> > > > properly discards the upper 32bit via a mov w23 w2.
> > > >
> > > > So at least clang 10 indeed properly assumes we could have garbage and
> > > > masks it off.
> > > >
> > > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > > > behaves differently.
> > >
> > > We'll need the disassembly from a failing kernel image.
> > > It isn't that big to hand annotate.
> >
> > I've worked around the merge at the moment in the android tree, but it
> > is still quite reproducable, and will try to get a .o file to
> > disassemble on Monday or so...
>
> Did this get properly resolved?
For some reason, 5.10-rc2 fixed all of this up. I backed out all of the
patches I had to revert to get 5.10-rc1 to work properly, and then did
the merge and all is well.
It must have been something to do with the compat changes in this same
area that went in after 5.10-rc1, and something got reorganized in the
files somehow. I really do not know, and at the moment, don't have the
time to track it down anymore. So for now, I'd say it's all good, sorry
for the noise.
greg k-h
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-11-02 13:52 ` 'Greg KH'
@ 2020-11-02 18:23 ` David Laight
0 siblings, 0 replies; 66+ messages in thread
From: David Laight @ 2020-11-02 18:23 UTC (permalink / raw)
To: 'Greg KH'
Cc: 'David Hildenbrand', Al Viro, Nick Desaulniers,
Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: 'Greg KH'
> Sent: 02 November 2020 13:52
>
> On Mon, Nov 02, 2020 at 09:06:38AM +0000, David Laight wrote:
> > From: 'Greg KH'
> > > Sent: 23 October 2020 15:47
> > >
> > > On Fri, Oct 23, 2020 at 02:39:24PM +0000, David Laight wrote:
> > > > From: David Hildenbrand
> > > > > Sent: 23 October 2020 15:33
> > > > ...
> > > > > I just checked against upstream code generated by clang 10 and it
> > > > > properly discards the upper 32bit via a mov w23 w2.
> > > > >
> > > > > So at least clang 10 indeed properly assumes we could have garbage and
> > > > > masks it off.
> > > > >
> > > > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > > > > behaves differently.
> > > >
> > > > We'll need the disassembly from a failing kernel image.
> > > > It isn't that big to hand annotate.
> > >
> > > I've worked around the merge at the moment in the android tree, but it
> > > is still quite reproducable, and will try to get a .o file to
> > > disassemble on Monday or so...
> >
> > Did this get properly resolved?
>
> For some reason, 5.10-rc2 fixed all of this up. I backed out all of the
> patches I had to revert to get 5.10-rc1 to work properly, and then did
> the merge and all is well.
>
> It must have been something to do with the compat changes in this same
> area that went in after 5.10-rc1, and something got reorganized in the
> files somehow. I really do not know, and at the moment, don't have the
> time to track it down anymore. So for now, I'd say it's all good, sorry
> for the noise.
Hopefully it won't appear again.
Saved me spending a day off reading arm64 assembler.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 13:09 ` David Hildenbrand
2020-10-23 14:33 ` David Hildenbrand
@ 2020-10-23 17:58 ` Al Viro
2020-10-23 18:27 ` Segher Boessenkool
1 sibling, 1 reply; 66+ messages in thread
From: Al Viro @ 2020-10-23 17:58 UTC (permalink / raw)
To: David Hildenbrand
Cc: David Laight, 'Greg KH', Nick Desaulniers,
Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> Now, I am not a compiler expert, but as I already cited, at least on
> x86-64 clang expects that the high bits were cleared by the caller - in
> contrast to gcc. I suspect it's the same on arm64, but again, I am no
> compiler expert.
>
> If what I said and cites for x86-64 is correct, if the function expects
> an "unsigned int", it will happily use 64bit operations without further
> checks where valid when assuming high bits are zero. That's why even
> converting everything to "unsigned int" as proposed by me won't work on
> clang - it assumes high bits are zero (as indicated by Nick).
>
> As I am neither a compiler experts (did I mention that already? ;) ) nor
> an arm64 experts, I can't tell if this is a compiler BUG or not.
On arm64 when callee expects a 32bit argument, the caller is *not* responsible
for clearing the upper half of 64bit register used to pass the value - it only
needs to store the actual value into the lower half. The callee must consider
the contents of the upper half of that register as undefined. See AAPCS64 (e.g.
https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
); AFAICS, the relevant bit is
"Unlike in the 32-bit AAPCS, named integral values must be narrowed by
the callee rather than the caller."
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 17:58 ` Al Viro
@ 2020-10-23 18:27 ` Segher Boessenkool
2020-10-23 21:28 ` David Laight
0 siblings, 1 reply; 66+ messages in thread
From: Segher Boessenkool @ 2020-10-23 18:27 UTC (permalink / raw)
To: Al Viro
Cc: David Hildenbrand, [email protected],
[email protected], David Howells, [email protected],
[email protected], [email protected],
Christoph Hellwig, [email protected],
[email protected], [email protected],
[email protected], Arnd Bergmann,
[email protected], [email protected],
[email protected], Jens Axboe,
[email protected], 'Greg KH', Nick Desaulniers,
[email protected],
[email protected], David Laight,
[email protected], [email protected],
Andrew Morton, [email protected]
On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
>
> > Now, I am not a compiler expert, but as I already cited, at least on
> > x86-64 clang expects that the high bits were cleared by the caller - in
> > contrast to gcc. I suspect it's the same on arm64, but again, I am no
> > compiler expert.
> >
> > If what I said and cites for x86-64 is correct, if the function expects
> > an "unsigned int", it will happily use 64bit operations without further
> > checks where valid when assuming high bits are zero. That's why even
> > converting everything to "unsigned int" as proposed by me won't work on
> > clang - it assumes high bits are zero (as indicated by Nick).
> >
> > As I am neither a compiler experts (did I mention that already? ;) ) nor
> > an arm64 experts, I can't tell if this is a compiler BUG or not.
>
> On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> for clearing the upper half of 64bit register used to pass the value - it only
> needs to store the actual value into the lower half. The callee must consider
> the contents of the upper half of that register as undefined. See AAPCS64 (e.g.
> https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> ); AFAICS, the relevant bit is
> "Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> the callee rather than the caller."
Or the formal rule:
C.9 If the argument is an Integral or Pointer Type, the size of the
argument is less than or equal to 8 bytes and the NGRN is less
than 8, the argument is copied to the least significant bits in
x[NGRN]. The NGRN is incremented by one. The argument has now
been allocated.
Segher
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 18:27 ` Segher Boessenkool
@ 2020-10-23 21:28 ` David Laight
2020-10-24 17:29 ` Segher Boessenkool
0 siblings, 1 reply; 66+ messages in thread
From: David Laight @ 2020-10-23 21:28 UTC (permalink / raw)
To: 'Segher Boessenkool', Al Viro
Cc: David Hildenbrand, [email protected],
[email protected], David Howells, [email protected],
[email protected], [email protected],
Christoph Hellwig, [email protected],
[email protected], [email protected],
[email protected], Arnd Bergmann,
[email protected], [email protected],
[email protected], Jens Axboe,
[email protected], 'Greg KH', Nick Desaulniers,
[email protected],
[email protected], [email protected],
[email protected], Andrew Morton,
[email protected]
From: Segher Boessenkool
> Sent: 23 October 2020 19:27
>
> On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> > On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> >
> > > Now, I am not a compiler expert, but as I already cited, at least on
> > > x86-64 clang expects that the high bits were cleared by the caller - in
> > > contrast to gcc. I suspect it's the same on arm64, but again, I am no
> > > compiler expert.
> > >
> > > If what I said and cites for x86-64 is correct, if the function expects
> > > an "unsigned int", it will happily use 64bit operations without further
> > > checks where valid when assuming high bits are zero. That's why even
> > > converting everything to "unsigned int" as proposed by me won't work on
> > > clang - it assumes high bits are zero (as indicated by Nick).
> > >
> > > As I am neither a compiler experts (did I mention that already? ;) ) nor
> > > an arm64 experts, I can't tell if this is a compiler BUG or not.
> >
> > On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> > for clearing the upper half of 64bit register used to pass the value - it only
> > needs to store the actual value into the lower half. The callee must consider
> > the contents of the upper half of that register as undefined. See AAPCS64 (e.g.
> > https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> > ); AFAICS, the relevant bit is
> > "Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> > the callee rather than the caller."
>
> Or the formal rule:
>
> C.9 If the argument is an Integral or Pointer Type, the size of the
> argument is less than or equal to 8 bytes and the NGRN is less
> than 8, the argument is copied to the least significant bits in
> x[NGRN]. The NGRN is incremented by one. The argument has now
> been allocated.
So, in essence, if the value is in a 64bit register the calling
code is independent of the actual type of the formal parameter.
Clearly a value might need explicit widening.
I've found a copy of the 64 bit arm instruction set.
Unfortunately it is alpha sorted and repetitive so shows none
of the symmetry and makes things difficult to find.
But, contrary to what someone suggested most register writes
(eg from arithmetic) seem to zero/extend the high bits.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-23 21:28 ` David Laight
@ 2020-10-24 17:29 ` Segher Boessenkool
2020-10-24 21:12 ` David Laight
0 siblings, 1 reply; 66+ messages in thread
From: Segher Boessenkool @ 2020-10-24 17:29 UTC (permalink / raw)
To: David Laight
Cc: Al Viro, David Hildenbrand, [email protected],
[email protected], David Howells, [email protected],
[email protected], [email protected],
Christoph Hellwig, [email protected],
[email protected], [email protected],
[email protected], Arnd Bergmann,
[email protected], [email protected],
[email protected], Jens Axboe,
[email protected], 'Greg KH', Nick Desaulniers,
[email protected],
[email protected], [email protected],
[email protected], Andrew Morton,
[email protected]
On Fri, Oct 23, 2020 at 09:28:59PM +0000, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 23 October 2020 19:27
> > On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> > > On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> > > On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> > > for clearing the upper half of 64bit register used to pass the value - it only
> > > needs to store the actual value into the lower half. The callee must consider
> > > the contents of the upper half of that register as undefined. See AAPCS64 (e.g.
> > > https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> > > ); AFAICS, the relevant bit is
> > > "Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> > > the callee rather than the caller."
> >
> > Or the formal rule:
> >
> > C.9 If the argument is an Integral or Pointer Type, the size of the
> > argument is less than or equal to 8 bytes and the NGRN is less
> > than 8, the argument is copied to the least significant bits in
> > x[NGRN]. The NGRN is incremented by one. The argument has now
> > been allocated.
>
> So, in essence, if the value is in a 64bit register the calling
> code is independent of the actual type of the formal parameter.
> Clearly a value might need explicit widening.
No, this says that if you pass a 32-bit integer in a 64-bit register,
then the top 32 bits of that register hold an undefined value.
> I've found a copy of the 64 bit arm instruction set.
> Unfortunately it is alpha sorted and repetitive so shows none
> of the symmetry and makes things difficult to find.
All of this is ABI, not ISA. Look at the AAPCS64 pointed to above.
> But, contrary to what someone suggested most register writes
> (eg from arithmetic) seem to zero/extend the high bits.
Everything that writes a "w" does, yes. But that has nothing to do with
the parameter passing rules, that is ABI. It just means that very often
a 32-bit integer will be passed zero-extended in a 64-bit register, but
that is just luck (or not, it makes finding bugs harder ;-) )
Segher
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-24 17:29 ` Segher Boessenkool
@ 2020-10-24 21:12 ` David Laight
0 siblings, 0 replies; 66+ messages in thread
From: David Laight @ 2020-10-24 21:12 UTC (permalink / raw)
To: 'Segher Boessenkool'
Cc: Al Viro, David Hildenbrand, [email protected],
[email protected], David Howells, [email protected],
[email protected], [email protected],
Christoph Hellwig, [email protected],
[email protected], [email protected],
[email protected], Arnd Bergmann,
[email protected], [email protected],
[email protected], Jens Axboe,
[email protected], 'Greg KH', Nick Desaulniers,
[email protected],
[email protected], [email protected],
[email protected], Andrew Morton,
[email protected]
From: Segher Boessenkool
> Sent: 24 October 2020 18:29
>
> On Fri, Oct 23, 2020 at 09:28:59PM +0000, David Laight wrote:
> > From: Segher Boessenkool
> > > Sent: 23 October 2020 19:27
> > > On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> > > > On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> > > > On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> > > > for clearing the upper half of 64bit register used to pass the value - it only
> > > > needs to store the actual value into the lower half. The callee must consider
> > > > the contents of the upper half of that register as undefined. See AAPCS64 (e.g.
> > > > https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> > > > ); AFAICS, the relevant bit is
> > > > "Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> > > > the callee rather than the caller."
> > >
> > > Or the formal rule:
> > >
> > > C.9 If the argument is an Integral or Pointer Type, the size of the
> > > argument is less than or equal to 8 bytes and the NGRN is less
> > > than 8, the argument is copied to the least significant bits in
> > > x[NGRN]. The NGRN is incremented by one. The argument has now
> > > been allocated.
> >
> > So, in essence, if the value is in a 64bit register the calling
> > code is independent of the actual type of the formal parameter.
> > Clearly a value might need explicit widening.
>
> No, this says that if you pass a 32-bit integer in a 64-bit register,
> then the top 32 bits of that register hold an undefined value.
That's sort of what I meant.
The 'normal' junk in the hight bits will there because the variable
in the calling code is wider.
> > I've found a copy of the 64 bit arm instruction set.
> > Unfortunately it is alpha sorted and repetitive so shows none
> > of the symmetry and makes things difficult to find.
>
> All of this is ABI, not ISA. Look at the AAPCS64 pointed to above.
>
> > But, contrary to what someone suggested most register writes
> > (eg from arithmetic) seem to zero/extend the high bits.
>
> Everything that writes a "w" does, yes. But that has nothing to do with
> the parameter passing rules, that is ABI. It just means that very often
> a 32-bit integer will be passed zero-extended in a 64-bit register, but
> that is just luck (or not, it makes finding bugs harder ;-) )
Working out why the code is wrong is more of an ISA issue than an ABI one.
It may be an ABI one, but the analysis is ISA.
I've written a lot of asm over the years - decoding compiler generated
asm isn't that hard.
At least ARM doesn't have annulled delay slots.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
[parent not found: <CAK8P3a1n+b8hOMhNQSDzgic03dyXbmpccfTJ3C1bGKvzsgMXbg@mail.gmail.com>]
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
[not found] ` <CAK8P3a1n+b8hOMhNQSDzgic03dyXbmpccfTJ3C1bGKvzsgMXbg@mail.gmail.com>
@ 2020-10-23 13:28 ` David Laight
0 siblings, 0 replies; 66+ messages in thread
From: David Laight @ 2020-10-23 13:28 UTC (permalink / raw)
To: 'Arnd Bergmann'
Cc: Greg KH, David Hildenbrand, Al Viro, Nick Desaulniers,
Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: Arnd Bergmann
> Sent: 23 October 2020 14:23
>
> On Fri, Oct 23, 2020 at 2:46 PM David Laight <[email protected]> wrote:
> >
> > From: Greg KH <[email protected]>
> > > Sent: 22 October 2020 14:51
> >
> > I've rammed the code into godbolt.
> >
> > https://godbolt.org/z/9v5PPW
> >
> > Definitely a clang bug.
> >
> > Search for [wx]24 in the clang output.
> > nr_segs comes in as w2 and the initial bound checks are done on w2.
> > w24 is loaded from w2 - I don't believe this changes the high bits.
>
> You believe wrong, "mov w24, w2" is a zero-extending operation.
Ah ok, but gcc uses utxw for the same task.
I guess they could be the same opcode.
Last time I wrote ARM thumb didn't really exist - never mind 64bit
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 9:36 ` David Hildenbrand
2020-10-22 10:48 ` Greg KH
@ 2020-10-22 13:23 ` Christoph Hellwig
2020-10-22 16:35 ` David Laight
1 sibling, 1 reply; 66+ messages in thread
From: Christoph Hellwig @ 2020-10-22 13:23 UTC (permalink / raw)
To: David Hildenbrand
Cc: David Laight, Greg KH, Al Viro, Nick Desaulniers,
Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> My thinking: if the compiler that calls import_iovec() has garbage in
> the upper 32 bit
>
> a) gcc will zero it out and not rely on it being zero.
> b) clang will not zero it out, assuming it is zero.
>
> But
>
> a) will zero it out when calling the !inlined variant
> b) clang will zero it out when calling the !inlined variant
>
> When inlining, b) strikes. We access garbage. That would mean that we
> have calling code that's not generated by clang/gcc IIUC.
Most callchains of import_iovec start with the assembly syscall wrappers.
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 13:23 ` Christoph Hellwig
@ 2020-10-22 16:35 ` David Laight
2020-10-22 16:40 ` Matthew Wilcox
2020-10-22 17:54 ` Nick Desaulniers
0 siblings, 2 replies; 66+ messages in thread
From: David Laight @ 2020-10-22 16:35 UTC (permalink / raw)
To: 'Christoph Hellwig', David Hildenbrand
Cc: Greg KH, Al Viro, Nick Desaulniers, [email protected],
Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: Christoph Hellwig
> Sent: 22 October 2020 14:24
>
> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > My thinking: if the compiler that calls import_iovec() has garbage in
> > the upper 32 bit
> >
> > a) gcc will zero it out and not rely on it being zero.
> > b) clang will not zero it out, assuming it is zero.
> >
> > But
> >
> > a) will zero it out when calling the !inlined variant
> > b) clang will zero it out when calling the !inlined variant
> >
> > When inlining, b) strikes. We access garbage. That would mean that we
> > have calling code that's not generated by clang/gcc IIUC.
>
> Most callchains of import_iovec start with the assembly syscall wrappers.
Wait...
readv(2) defines:
ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
But the syscall is defined as:
SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
unsigned long, vlen)
{
return do_readv(fd, vec, vlen, 0);
}
I'm guessing that nothing actually masks the high bits that come
from an application that is compiled with clang?
The vlen is 'unsigned long' through the first few calls.
So unless there is a non-inlined function than takes vlen
as 'int' the high garbage bits from userspace are kept.
Which makes it a bug in the kernel C syscall wrappers.
They need to explicitly mask the high bits of 32bit
arguments on arm64 but not x86-64.
What does the ARM EABI say about register parameters?
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 16:35 ` David Laight
@ 2020-10-22 16:40 ` Matthew Wilcox
2020-10-22 16:50 ` David Laight
` (2 more replies)
2020-10-22 17:54 ` Nick Desaulniers
1 sibling, 3 replies; 66+ messages in thread
From: Matthew Wilcox @ 2020-10-22 16:40 UTC (permalink / raw)
To: David Laight
Cc: 'Christoph Hellwig', David Hildenbrand, Greg KH, Al Viro,
Nick Desaulniers, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> Wait...
> readv(2) defines:
> ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
It doesn't really matter what the manpage says. What does the AOSP
libc header say?
> But the syscall is defined as:
>
> SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> unsigned long, vlen)
> {
> return do_readv(fd, vec, vlen, 0);
> }
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 16:40 ` Matthew Wilcox
@ 2020-10-22 16:50 ` David Laight
2020-10-22 17:00 ` Nick Desaulniers
2020-10-22 18:19 ` Al Viro
2 siblings, 0 replies; 66+ messages in thread
From: David Laight @ 2020-10-22 16:50 UTC (permalink / raw)
To: 'Matthew Wilcox'
Cc: 'Christoph Hellwig', David Hildenbrand, Greg KH, Al Viro,
Nick Desaulniers, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: Matthew Wilcox <[email protected]>
> Sent: 22 October 2020 17:41
>
> On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > Wait...
> > readv(2) defines:
> > ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
>
> It doesn't really matter what the manpage says. What does the AOSP
> libc header say?
The only copy I can find is:
/usr/include/x86_64-linux-gnu/sys/uio.h:extern ssize_t readv (int __fd, const struct iovec *__iovec, int __count)
/usr/include/x86_64-linux-gnu/sys/uio.h- __wur;
and not surprisingly agrees.
POSIX and/or TOG will (more or less) mandate the prototype.
> > But the syscall is defined as:
> >
> > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> > unsigned long, vlen)
> > {
> > return do_readv(fd, vec, vlen, 0);
> > }
I wonder when the high bits of 'fd' get zapped?
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 16:40 ` Matthew Wilcox
2020-10-22 16:50 ` David Laight
@ 2020-10-22 17:00 ` Nick Desaulniers
2020-10-22 20:59 ` Eric Biggers
2020-10-22 18:19 ` Al Viro
2 siblings, 1 reply; 66+ messages in thread
From: Nick Desaulniers @ 2020-10-22 17:00 UTC (permalink / raw)
To: Matthew Wilcox
Cc: David Laight, Christoph Hellwig, David Hildenbrand, Greg KH,
Al Viro, [email protected], Andrew Morton, Jens Axboe,
Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 9:40 AM Matthew Wilcox <[email protected]> wrote:
>
> On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > Wait...
> > readv(2) defines:
> > ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
>
> It doesn't really matter what the manpage says. What does the AOSP
> libc header say?
Same: https://android.googlesource.com/platform/bionic/+/refs/heads/master/libc/include/sys/uio.h#38
Theoretically someone could bypass libc to make a system call, right?
>
> > But the syscall is defined as:
> >
> > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> > unsigned long, vlen)
> > {
> > return do_readv(fd, vec, vlen, 0);
> > }
>
--
Thanks,
~Nick Desaulniers
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 17:00 ` Nick Desaulniers
@ 2020-10-22 20:59 ` Eric Biggers
2020-10-22 21:28 ` Al Viro
0 siblings, 1 reply; 66+ messages in thread
From: Eric Biggers @ 2020-10-22 20:59 UTC (permalink / raw)
To: Nick Desaulniers
Cc: Matthew Wilcox, David Laight, Christoph Hellwig,
David Hildenbrand, Greg KH, Al Viro, [email protected],
Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 10:00:44AM -0700, Nick Desaulniers wrote:
> On Thu, Oct 22, 2020 at 9:40 AM Matthew Wilcox <[email protected]> wrote:
> >
> > On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > > Wait...
> > > readv(2) defines:
> > > ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
> >
> > It doesn't really matter what the manpage says. What does the AOSP
> > libc header say?
>
> Same: https://android.googlesource.com/platform/bionic/+/refs/heads/master/libc/include/sys/uio.h#38
>
> Theoretically someone could bypass libc to make a system call, right?
>
> >
> > > But the syscall is defined as:
> > >
> > > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> > > unsigned long, vlen)
> > > {
> > > return do_readv(fd, vec, vlen, 0);
> > > }
> >
>
FWIW, glibc makes the readv() syscall assuming that fd and vlen are 'int' as
well. So this problem isn't specific to Android's libc.
From objdump -d /lib/x86_64-linux-gnu/libc.so.6:
00000000000f4db0 <readv@@GLIBC_2.2.5>:
f4db0: 64 8b 04 25 18 00 00 mov %fs:0x18,%eax
f4db7: 00
f4db8: 85 c0 test %eax,%eax
f4dba: 75 14 jne f4dd0 <readv@@GLIBC_2.2.5+0x20>
f4dbc: b8 13 00 00 00 mov $0x13,%eax
f4dc1: 0f 05 syscall
...
There's some code for pthread cancellation, but no zeroing of the upper half of
the fd and vlen arguments, which are in %edi and %edx respectively. But the
glibc function prototype uses 'int' for them, not 'unsigned long'
'ssize_t readv(int fd, const struct iovec *iov, int iovcnt);'.
So the high halves of the fd and iovcnt registers can contain garbage. Or at
least that's what gcc (9.3.0) and clang (9.0.1) assume; they both compile the
following
void g(unsigned int x);
void f(unsigned long x)
{
g(x);
}
into f() making a tail call to g(), without zeroing the top half of %rdi.
Also note the following program succeeds on Linux 5.9 on x86_64. On kernels
that have this bug, it should fail. (I couldn't get it to actually fail, so it
must depend on the compiler and/or the kernel config...)
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>
int main()
{
int fd = open("/dev/zero", O_RDONLY);
char buf[1000];
struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
long ret;
ret = syscall(__NR_readv, fd, &iov, 0x100000001);
if (ret < 0)
perror("readv failed");
else
printf("read %ld bytes\n", ret);
}
I think the right fix is to change the readv() (and writev(), etc.) syscalls to
take 'unsigned int' rather than 'unsigned long', as that is what the users are
assuming...
- Eric
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 20:59 ` Eric Biggers
@ 2020-10-22 21:28 ` Al Viro
0 siblings, 0 replies; 66+ messages in thread
From: Al Viro @ 2020-10-22 21:28 UTC (permalink / raw)
To: Eric Biggers
Cc: Linus Torvalds, Nick Desaulniers, Matthew Wilcox, David Laight,
Christoph Hellwig, David Hildenbrand, Greg KH,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 01:59:32PM -0700, Eric Biggers wrote:
> Also note the following program succeeds on Linux 5.9 on x86_64. On kernels
> that have this bug, it should fail. (I couldn't get it to actually fail, so it
> must depend on the compiler and/or the kernel config...)
It doesn't. See https://www.spinics.net/lists/linux-scsi/msg147836.html for
discussion of that mess.
ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
struct iov_iter iter;
ssize_t ret;
ret = import_iovec(READ, vec, vlen, ARRAY_SIZE(iovstack), &iov, &iter);
if (ret >= 0) {
ret = do_iter_read(file, &iter, pos, flags);
kfree(iov);
}
return ret;
}
and import_iovec() takes unsigned int as the third argument, so it *will*
truncate to 32 bits, no matter what. Has done so since 0504c074b546
"switch {compat_,}do_readv_writev() to {compat_,}import_iovec()" back in
March 2015. Yes, it was an incompatible userland ABI change, even though
nothing that used glibc/uclibc/dietlibc would've noticed.
Better yet, up until 2.1.90pre1 passing a 64bit value as the _first_ argument
of readv(2) used to fail with -EBADF if it was too large; at that point it
started to get quietly truncated to 32bit first. And again, no libc users
would've noticed (neither would anything except deliberate regression test
looking for that specific behaviour).
Note that we also have process_madvise(2) with size_t for vlen (huh? It's
a number of array elements, not an object size) and process_vm_{read,write}v(2),
that have unsigned long for the same thing. And the last two *are* using
the same unsigned long from glibc POV.
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 16:40 ` Matthew Wilcox
2020-10-22 16:50 ` David Laight
2020-10-22 17:00 ` Nick Desaulniers
@ 2020-10-22 18:19 ` Al Viro
2 siblings, 0 replies; 66+ messages in thread
From: Al Viro @ 2020-10-22 18:19 UTC (permalink / raw)
To: Matthew Wilcox
Cc: David Laight, 'Christoph Hellwig', David Hildenbrand,
Greg KH, Nick Desaulniers, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
On Thu, Oct 22, 2020 at 05:40:40PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > Wait...
> > readv(2) defines:
> > ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
>
> It doesn't really matter what the manpage says. What does the AOSP
> libc header say?
FWIW, see https://www.spinics.net/lists/linux-scsi/msg147836.html and
subthread from there on...
^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 16:35 ` David Laight
2020-10-22 16:40 ` Matthew Wilcox
@ 2020-10-22 17:54 ` Nick Desaulniers
[not found] ` <CAK8P3a3LjG+ZvmQrkb9zpgov8xBkQQWrkHBPgjfYSqBKGrwT4w@mail.gmail.com>
1 sibling, 1 reply; 66+ messages in thread
From: Nick Desaulniers @ 2020-10-22 17:54 UTC (permalink / raw)
To: David Laight
Cc: Christoph Hellwig, David Hildenbrand, Greg KH, Al Viro,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
[-- Attachment #1: Type: text/plain, Size: 2086 bytes --]
On Thu, Oct 22, 2020 at 9:35 AM David Laight <[email protected]> wrote:
>
> From: Christoph Hellwig
> > Sent: 22 October 2020 14:24
> >
> > On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > > My thinking: if the compiler that calls import_iovec() has garbage in
> > > the upper 32 bit
> > >
> > > a) gcc will zero it out and not rely on it being zero.
> > > b) clang will not zero it out, assuming it is zero.
> > >
> > > But
> > >
> > > a) will zero it out when calling the !inlined variant
> > > b) clang will zero it out when calling the !inlined variant
> > >
> > > When inlining, b) strikes. We access garbage. That would mean that we
> > > have calling code that's not generated by clang/gcc IIUC.
> >
> > Most callchains of import_iovec start with the assembly syscall wrappers.
>
> Wait...
> readv(2) defines:
> ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
>
> But the syscall is defined as:
>
> SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> unsigned long, vlen)
> {
> return do_readv(fd, vec, vlen, 0);
> }
>
> I'm guessing that nothing actually masks the high bits that come
> from an application that is compiled with clang?
>
> The vlen is 'unsigned long' through the first few calls.
> So unless there is a non-inlined function than takes vlen
> as 'int' the high garbage bits from userspace are kept.
Yeah, that's likely a bug: https://godbolt.org/z/KfsPKs
>
> Which makes it a bug in the kernel C syscall wrappers.
> They need to explicitly mask the high bits of 32bit
> arguments on arm64 but not x86-64.
Why not x86-64? Wouldn't it be *any* LP64 ISA?
Attaching a patch that uses the proper width, but I'm pretty sure
there's still a signedness issue . Greg, would you mind running this
through the wringer?
>
> What does the ARM EABI say about register parameters?
AAPCS is the ABI for 64b ARM, IIUC, which is the ISA GKH is reporting
the problem against. IIUC, EABI is one of the 32b ABIs. aarch64 is
LP64 just like x86_64.
--
Thanks,
~Nick Desaulniers
[-- Attachment #2: 0001-fs-fix-up-type-confusion-in-readv-writev.patch --]
[-- Type: application/octet-stream, Size: 4630 bytes --]
From aae26b13ffb9e38bb46b8c85985761b5f196b6f6 Mon Sep 17 00:00:00 2001
From: Nick Desaulniers <[email protected]>
Date: Thu, 22 Oct 2020 10:23:47 -0700
Subject: [PATCH] fs: fix up type confusion in readv/writev
The syscall interface doesn't match up with the interface libc is using
or that's defined in the manual pages.
ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
ssize_t writev(int fd, const struct iovec *iov, int iovcnt);
The kernel was defining `iovcnt` as `unsigned long` which is a problem
when userspace understands this to be `int`.
(There's still likely a signedness bug here, but use the proper widths
that import_iovec() expects.)
Signed-off-by: Nick Desaulniers <[email protected]>
---
fs/read_write.c | 10 +++++-----
fs/splice.c | 2 +-
include/linux/fs.h | 2 +-
lib/iov_iter.c | 4 ++--
4 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 19f5c4bf75aa..b858f39a4475 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -890,7 +890,7 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
EXPORT_SYMBOL(vfs_iter_write);
ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
- unsigned long vlen, loff_t *pos, rwf_t flags)
+ unsigned int vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -907,7 +907,7 @@ ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
}
static ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
- unsigned long vlen, loff_t *pos, rwf_t flags)
+ unsigned int vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -925,7 +925,7 @@ static ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
}
static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, rwf_t flags)
+ unsigned int vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
@@ -1025,13 +1025,13 @@ static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
}
SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
- unsigned long, vlen)
+ unsigned int, vlen)
{
return do_readv(fd, vec, vlen, 0);
}
SYSCALL_DEFINE3(writev, unsigned long, fd, const struct iovec __user *, vec,
- unsigned long, vlen)
+ unsigned int, vlen)
{
return do_writev(fd, vec, vlen, 0);
}
diff --git a/fs/splice.c b/fs/splice.c
index 70cc52af780b..7508eccfa143 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -342,7 +342,7 @@ const struct pipe_buf_operations nosteal_pipe_buf_ops = {
EXPORT_SYMBOL(nosteal_pipe_buf_ops);
static ssize_t kernel_readv(struct file *file, const struct kvec *vec,
- unsigned long vlen, loff_t offset)
+ unsigned int vlen, loff_t offset)
{
mm_segment_t old_fs;
loff_t pos = offset;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c4ae9cafbbba..211bce5e6e60 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1895,7 +1895,7 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
- unsigned long, loff_t *, rwf_t);
+ unsigned int, loff_t *, rwf_t);
extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
loff_t, size_t, unsigned int);
extern ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 1635111c5bd2..ded9d9c4eb28 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1734,7 +1734,7 @@ struct iovec *iovec_from_user(const struct iovec __user *uvec,
}
ssize_t __import_iovec(int type, const struct iovec __user *uvec,
- unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
+ unsigned int nr_segs, unsigned int fast_segs, struct iovec **iovp,
struct iov_iter *i, bool compat)
{
ssize_t total_len = 0;
@@ -1803,7 +1803,7 @@ ssize_t __import_iovec(int type, const struct iovec __user *uvec,
* Return: Negative error code on error, bytes imported on success
*/
ssize_t import_iovec(int type, const struct iovec __user *uvec,
- unsigned nr_segs, unsigned fast_segs,
+ unsigned int nr_segs, unsigned int fast_segs,
struct iovec **iovp, struct iov_iter *i)
{
return __import_iovec(type, uvec, nr_segs, fast_segs, iovp, i,
--
2.29.0.rc1.297.gfa9743e501-goog
^ permalink raw reply related [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 9:19 ` David Hildenbrand
2020-10-22 9:25 ` David Hildenbrand
@ 2020-10-22 9:28 ` David Laight
1 sibling, 0 replies; 66+ messages in thread
From: David Laight @ 2020-10-22 9:28 UTC (permalink / raw)
To: 'David Hildenbrand', Greg KH
Cc: Al Viro, Nick Desaulniers, Christoph Hellwig,
[email protected], Andrew Morton, Jens Axboe, Arnd Bergmann,
David Howells, [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: David Hildenbrand
> Sent: 22 October 2020 10:19
>
> On 22.10.20 11:01, Greg KH wrote:
> > On Thu, Oct 22, 2020 at 10:48:59AM +0200, David Hildenbrand wrote:
> >> On 22.10.20 10:40, David Laight wrote:
> >>> From: David Hildenbrand
> >>>> Sent: 22 October 2020 09:35
> >>>>
> >>>> On 22.10.20 10:26, Greg KH wrote:
> >>>>> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
> >>>>>> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
> >>>>>>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> >>>>>>>> From: David Laight <[email protected]>
> >>>>>>>>
> >>>>>>>> This lets the compiler inline it into import_iovec() generating
> >>>>>>>> much better code.
> >>>>>>>>
> >>>>>>>> Signed-off-by: David Laight <[email protected]>
> >>>>>>>> Signed-off-by: Christoph Hellwig <[email protected]>
> >>>>>>>> ---
> >>>>>>>> fs/read_write.c | 179 ------------------------------------------------
> >>>>>>>> lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>>> 2 files changed, 176 insertions(+), 179 deletions(-)
> >>>>>>>
> >>>>>>> Strangely, this commit causes a regression in Linus's tree right now.
> >>>>>>>
> >>>>>>> I can't really figure out what the regression is, only that this commit
> >>>>>>> triggers a "large Android system binary" from working properly. There's
> >>>>>>> no kernel log messages anywhere, and I don't have any way to strace the
> >>>>>>> thing in the testing framework, so any hints that people can provide
> >>>>>>> would be most appreciated.
> >>>>>>
> >>>>>> It's a pure move - modulo changed line breaks in the argument lists
> >>>>>> the functions involved are identical before and after that (just checked
> >>>>>> that directly, by checking out the trees before and after, extracting two
> >>>>>> functions in question from fs/read_write.c and lib/iov_iter.c (before and
> >>>>>> after, resp.) and checking the diff between those.
> >>>>>>
> >>>>>> How certain is your bisection?
> >>>>>
> >>>>> The bisection is very reproducable.
> >>>>>
> >>>>> But, this looks now to be a compiler bug. I'm using the latest version
> >>>>> of clang and if I put "noinline" at the front of the function,
> >>>>> everything works.
> >>>>
> >>>> Well, the compiler can do more invasive optimizations when inlining. If
> >>>> you have buggy code that relies on some unspecified behavior, inlining
> >>>> can change the behavior ... but going over that code, there isn't too
> >>>> much action going on. At least nothing screamed at me.
> >>>
> >>> Apart from all the optimisations that get rid off the 'pass be reference'
> >>> parameters and strange conditional tests.
> >>> Plenty of scope for the compiler getting it wrong.
> >>> But nothing even vaguely illegal.
> >>
> >> Not the first time that people blame the compiler to then figure out
> >> that something else is wrong ... but maybe this time is different :)
> >
> > I agree, I hate to blame the compiler, that's almost never the real
> > problem, but this one sure "feels" like it.
> >
> > I'm running some more tests, trying to narrow things down as just adding
> > a "noinline" to the function that got moved here doesn't work on Linus's
> > tree at the moment because the function was split into multiple
> > functions.
> >
> > Give me a few hours...
>
> I might be wrong but
>
> a) import_iovec() uses:
> - unsigned nr_segs -> int
> - unsigned fast_segs -> int
> b) rw_copy_check_uvector() uses:
> - unsigned long nr_segs -> long
> - unsigned long fast_seg -> long
>
> So when calling rw_copy_check_uvector(), we have to zero-extend the
> registers used for passing the arguments. That's definitely done when
> calling the function explicitly. Maybe when inlining something is messed up?
That's also not needed on x86-64 - the high bits get cleared by 32bit writes.
But, IIRC, arm64 leaves them unchanged or undefined.
I guessing that every array access uses a *(Rx + Ry) addressing
mode. So indexing an array even with 'unsigned int' requires
an explicit zero-extend on arm64?
(x86-64 ends up with an explicit sign extend when indexing an
array with 'signed int'.)
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
2020-10-22 8:48 ` David Hildenbrand
2020-10-22 9:01 ` Greg KH
@ 2020-10-22 9:02 ` David Laight
1 sibling, 0 replies; 66+ messages in thread
From: David Laight @ 2020-10-22 9:02 UTC (permalink / raw)
To: 'David Hildenbrand', Greg KH, Al Viro, Nick Desaulniers
Cc: Christoph Hellwig, [email protected], Andrew Morton,
Jens Axboe, Arnd Bergmann, David Howells,
[email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
From: David Hildenbrand
> Sent: 22 October 2020 09:49
...
> >>> But, this looks now to be a compiler bug. I'm using the latest version
> >>> of clang and if I put "noinline" at the front of the function,
> >>> everything works.
> >>
> >> Well, the compiler can do more invasive optimizations when inlining. If
> >> you have buggy code that relies on some unspecified behavior, inlining
> >> can change the behavior ... but going over that code, there isn't too
> >> much action going on. At least nothing screamed at me.
> >
> > Apart from all the optimisations that get rid off the 'pass be reference'
> > parameters and strange conditional tests.
> > Plenty of scope for the compiler getting it wrong.
> > But nothing even vaguely illegal.
>
> Not the first time that people blame the compiler to then figure out
> that something else is wrong ... but maybe this time is different :)
Usually down to missing asm 'memory' constraints...
Need to read the obj file to see what the compiler did.
The code must be 'approximately right' or nothing would run.
So I'd guess it has to do with > 8 fragments.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 66+ messages in thread
* [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
2020-09-25 4:51 ` [PATCH 1/9] compat.h: fix a spelling error in <linux/compat.h> Christoph Hellwig
2020-09-25 4:51 ` [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c Christoph Hellwig
@ 2020-09-25 4:51 ` Christoph Hellwig
2020-09-25 4:51 ` [PATCH 4/9] iov_iter: transparently handle compat iovecs in import_iovec Christoph Hellwig
` (6 subsequent siblings)
9 siblings, 0 replies; 66+ messages in thread
From: Christoph Hellwig @ 2020-09-25 4:51 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module
Split rw_copy_check_uvector into two new helpers with more sensible
calling conventions:
- iovec_from_user copies a iovec from userspace either into the provided
stack buffer if it fits, or allocates a new buffer for it. Returns
the actually used iovec. It also verifies that iov_len does fit a
signed type, and handles compat iovecs if the compat flag is set.
- __import_iovec consolidates the native and compat versions of
import_iovec. It calls iovec_from_user, then validates each iovec
actually points to user addresses, and ensures the total length
doesn't overflow.
This has two major implications:
- the access_process_vm case loses the total lenght checking, which
wasn't required anyway, given that each call receives two iovecs
for the local and remote side of the operation, and it verifies
the total length on the local side already.
- instead of a single loop there now are two loops over the iovecs.
Given that the iovecs are cache hot this doesn't make a major
difference
Signed-off-by: Christoph Hellwig <[email protected]>
---
include/linux/compat.h | 6 -
include/linux/fs.h | 13 --
include/linux/uio.h | 12 +-
lib/iov_iter.c | 300 ++++++++++++++++-------------------------
mm/process_vm_access.c | 34 +++--
5 files changed, 138 insertions(+), 227 deletions(-)
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 654c1ec36671a4..b930de791ff16b 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -451,12 +451,6 @@ extern long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
struct epoll_event; /* fortunately, this one is fixed-layout */
-extern ssize_t compat_rw_copy_check_uvector(int type,
- const struct compat_iovec __user *uvector,
- unsigned long nr_segs,
- unsigned long fast_segs, struct iovec *fast_pointer,
- struct iovec **ret_pointer);
-
extern void __user *compat_alloc_user_space(unsigned long len);
int compat_restore_altstack(const compat_stack_t __user *uss);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7519ae003a082c..e69b45b6cc7b5f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -178,14 +178,6 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
/* File supports async buffered reads */
#define FMODE_BUF_RASYNC ((__force fmode_t)0x40000000)
-/*
- * Flag for rw_copy_check_uvector and compat_rw_copy_check_uvector
- * that indicates that they should check the contents of the iovec are
- * valid, but not check the memory that the iovec elements
- * points too.
- */
-#define CHECK_IOVEC_ONLY -1
-
/*
* Attribute flags. These should be or-ed together to figure out what
* has been changed!
@@ -1887,11 +1879,6 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
return file->f_op->mmap(file, vma);
}
-ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
- unsigned long nr_segs, unsigned long fast_segs,
- struct iovec *fast_pointer,
- struct iovec **ret_pointer);
-
extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 3835a8a8e9eae0..92c11fe41c6228 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -266,9 +266,15 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum, struct
size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
struct iov_iter *i);
-ssize_t import_iovec(int type, const struct iovec __user * uvector,
- unsigned nr_segs, unsigned fast_segs,
- struct iovec **iov, struct iov_iter *i);
+struct iovec *iovec_from_user(const struct iovec __user *uvector,
+ unsigned long nr_segs, unsigned long fast_segs,
+ struct iovec *fast_iov, bool compat);
+ssize_t import_iovec(int type, const struct iovec __user *uvec,
+ unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
+ struct iov_iter *i);
+ssize_t __import_iovec(int type, const struct iovec __user *uvec,
+ unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
+ struct iov_iter *i, bool compat);
#ifdef CONFIG_COMPAT
struct compat_iovec;
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index ccea9db3f72be8..d5d8afe31fca16 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -7,6 +7,7 @@
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/splice.h>
+#include <linux/compat.h>
#include <net/checksum.h>
#include <linux/scatterlist.h>
#include <linux/instrumented.h>
@@ -1650,107 +1651,133 @@ const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
}
EXPORT_SYMBOL(dup_iter);
-/**
- * rw_copy_check_uvector() - Copy an array of &struct iovec from userspace
- * into the kernel and check that it is valid.
- *
- * @type: One of %CHECK_IOVEC_ONLY, %READ, or %WRITE.
- * @uvector: Pointer to the userspace array.
- * @nr_segs: Number of elements in userspace array.
- * @fast_segs: Number of elements in @fast_pointer.
- * @fast_pointer: Pointer to (usually small on-stack) kernel array.
- * @ret_pointer: (output parameter) Pointer to a variable that will point to
- * either @fast_pointer, a newly allocated kernel array, or NULL,
- * depending on which array was used.
- *
- * This function copies an array of &struct iovec of @nr_segs from
- * userspace into the kernel and checks that each element is valid (e.g.
- * it does not point to a kernel address or cause overflow by being too
- * large, etc.).
- *
- * As an optimization, the caller may provide a pointer to a small
- * on-stack array in @fast_pointer, typically %UIO_FASTIOV elements long
- * (the size of this array, or 0 if unused, should be given in @fast_segs).
- *
- * @ret_pointer will always point to the array that was used, so the
- * caller must take care not to call kfree() on it e.g. in case the
- * @fast_pointer array was used and it was allocated on the stack.
- *
- * Return: The total number of bytes covered by the iovec array on success
- * or a negative error code on error.
- */
-ssize_t rw_copy_check_uvector(int type, const struct iovec __user *uvector,
- unsigned long nr_segs, unsigned long fast_segs,
- struct iovec *fast_pointer, struct iovec **ret_pointer)
+static int copy_compat_iovec_from_user(struct iovec *iov,
+ const struct iovec __user *uvec, unsigned long nr_segs)
+{
+ const struct compat_iovec __user *uiov =
+ (const struct compat_iovec __user *)uvec;
+ int ret = -EFAULT, i;
+
+ if (!user_access_begin(uvec, nr_segs * sizeof(*uvec)))
+ return -EFAULT;
+
+ for (i = 0; i < nr_segs; i++) {
+ compat_uptr_t buf;
+ compat_ssize_t len;
+
+ unsafe_get_user(len, &uiov[i].iov_len, uaccess_end);
+ unsafe_get_user(buf, &uiov[i].iov_base, uaccess_end);
+
+ /* check for compat_size_t not fitting in compat_ssize_t .. */
+ if (len < 0) {
+ ret = -EINVAL;
+ goto uaccess_end;
+ }
+ iov[i].iov_base = compat_ptr(buf);
+ iov[i].iov_len = len;
+ }
+
+ ret = 0;
+uaccess_end:
+ user_access_end();
+ return ret;
+}
+
+static int copy_iovec_from_user(struct iovec *iov,
+ const struct iovec __user *uvec, unsigned long nr_segs)
{
unsigned long seg;
- ssize_t ret;
- struct iovec *iov = fast_pointer;
- /*
- * SuS says "The readv() function *may* fail if the iovcnt argument
- * was less than or equal to 0, or greater than {IOV_MAX}. Linux has
- * traditionally returned zero for zero segments, so...
- */
- if (nr_segs == 0) {
- ret = 0;
- goto out;
+ if (copy_from_user(iov, uvec, nr_segs * sizeof(*uvec)))
+ return -EFAULT;
+ for (seg = 0; seg < nr_segs; seg++) {
+ if ((ssize_t)iov[seg].iov_len < 0)
+ return -EINVAL;
}
+ return 0;
+}
+
+struct iovec *iovec_from_user(const struct iovec __user *uvec,
+ unsigned long nr_segs, unsigned long fast_segs,
+ struct iovec *fast_iov, bool compat)
+{
+ struct iovec *iov = fast_iov;
+ int ret;
+
/*
- * First get the "struct iovec" from user memory and
- * verify all the pointers
+ * SuS says "The readv() function *may* fail if the iovcnt argument was
+ * less than or equal to 0, or greater than {IOV_MAX}. Linux has
+ * traditionally returned zero for zero segments, so...
*/
- if (nr_segs > UIO_MAXIOV) {
- ret = -EINVAL;
- goto out;
- }
+ if (nr_segs == 0)
+ return iov;
+ if (nr_segs > UIO_MAXIOV)
+ return ERR_PTR(-EINVAL);
if (nr_segs > fast_segs) {
iov = kmalloc_array(nr_segs, sizeof(struct iovec), GFP_KERNEL);
- if (iov == NULL) {
- ret = -ENOMEM;
- goto out;
- }
+ if (!iov)
+ return ERR_PTR(-ENOMEM);
}
- if (copy_from_user(iov, uvector, nr_segs*sizeof(*uvector))) {
- ret = -EFAULT;
- goto out;
+
+ if (compat)
+ ret = copy_compat_iovec_from_user(iov, uvec, nr_segs);
+ else
+ ret = copy_iovec_from_user(iov, uvec, nr_segs);
+ if (ret) {
+ if (iov != fast_iov)
+ kfree(iov);
+ return ERR_PTR(ret);
+ }
+
+ return iov;
+}
+
+ssize_t __import_iovec(int type, const struct iovec __user *uvec,
+ unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
+ struct iov_iter *i, bool compat)
+{
+ ssize_t total_len = 0;
+ unsigned long seg;
+ struct iovec *iov;
+
+ iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
+ if (IS_ERR(iov)) {
+ *iovp = NULL;
+ return PTR_ERR(iov);
}
/*
- * According to the Single Unix Specification we should return EINVAL
- * if an element length is < 0 when cast to ssize_t or if the
- * total length would overflow the ssize_t return value of the
- * system call.
+ * According to the Single Unix Specification we should return EINVAL if
+ * an element length is < 0 when cast to ssize_t or if the total length
+ * would overflow the ssize_t return value of the system call.
*
* Linux caps all read/write calls to MAX_RW_COUNT, and avoids the
* overflow case.
*/
- ret = 0;
for (seg = 0; seg < nr_segs; seg++) {
- void __user *buf = iov[seg].iov_base;
ssize_t len = (ssize_t)iov[seg].iov_len;
- /* see if we we're about to use an invalid len or if
- * it's about to overflow ssize_t */
- if (len < 0) {
- ret = -EINVAL;
- goto out;
+ if (!access_ok(iov[seg].iov_base, len)) {
+ if (iov != *iovp)
+ kfree(iov);
+ *iovp = NULL;
+ return -EFAULT;
}
- if (type >= 0
- && unlikely(!access_ok(buf, len))) {
- ret = -EFAULT;
- goto out;
- }
- if (len > MAX_RW_COUNT - ret) {
- len = MAX_RW_COUNT - ret;
+
+ if (len > MAX_RW_COUNT - total_len) {
+ len = MAX_RW_COUNT - total_len;
iov[seg].iov_len = len;
}
- ret += len;
+ total_len += len;
}
-out:
- *ret_pointer = iov;
- return ret;
+
+ iov_iter_init(i, type, iov, nr_segs, total_len);
+ if (iov == *iovp)
+ *iovp = NULL;
+ else
+ *iovp = iov;
+ return total_len;
}
/**
@@ -1759,10 +1786,10 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user *uvector,
* &struct iov_iter iterator to access it.
*
* @type: One of %READ or %WRITE.
- * @uvector: Pointer to the userspace array.
+ * @uvec: Pointer to the userspace array.
* @nr_segs: Number of elements in userspace array.
* @fast_segs: Number of elements in @iov.
- * @iov: (input and output parameter) Pointer to pointer to (usually small
+ * @iovp: (input and output parameter) Pointer to pointer to (usually small
* on-stack) kernel array.
* @i: Pointer to iterator that will be initialized on success.
*
@@ -1775,120 +1802,21 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user *uvector,
*
* Return: Negative error code on error, bytes imported on success
*/
-ssize_t import_iovec(int type, const struct iovec __user * uvector,
+ssize_t import_iovec(int type, const struct iovec __user *uvec,
unsigned nr_segs, unsigned fast_segs,
- struct iovec **iov, struct iov_iter *i)
+ struct iovec **iovp, struct iov_iter *i)
{
- ssize_t n;
- struct iovec *p;
- n = rw_copy_check_uvector(type, uvector, nr_segs, fast_segs,
- *iov, &p);
- if (n < 0) {
- if (p != *iov)
- kfree(p);
- *iov = NULL;
- return n;
- }
- iov_iter_init(i, type, p, nr_segs, n);
- *iov = p == *iov ? NULL : p;
- return n;
+ return __import_iovec(type, uvec, nr_segs, fast_segs, iovp, i, false);
}
EXPORT_SYMBOL(import_iovec);
#ifdef CONFIG_COMPAT
-#include <linux/compat.h>
-
-ssize_t compat_rw_copy_check_uvector(int type,
- const struct compat_iovec __user *uvector,
- unsigned long nr_segs, unsigned long fast_segs,
- struct iovec *fast_pointer, struct iovec **ret_pointer)
-{
- compat_ssize_t tot_len;
- struct iovec *iov = *ret_pointer = fast_pointer;
- ssize_t ret = 0;
- int seg;
-
- /*
- * SuS says "The readv() function *may* fail if the iovcnt argument
- * was less than or equal to 0, or greater than {IOV_MAX}. Linux has
- * traditionally returned zero for zero segments, so...
- */
- if (nr_segs == 0)
- goto out;
-
- ret = -EINVAL;
- if (nr_segs > UIO_MAXIOV)
- goto out;
- if (nr_segs > fast_segs) {
- ret = -ENOMEM;
- iov = kmalloc_array(nr_segs, sizeof(struct iovec), GFP_KERNEL);
- if (iov == NULL)
- goto out;
- }
- *ret_pointer = iov;
-
- ret = -EFAULT;
- if (!access_ok(uvector, nr_segs*sizeof(*uvector)))
- goto out;
-
- /*
- * Single unix specification:
- * We should -EINVAL if an element length is not >= 0 and fitting an
- * ssize_t.
- *
- * In Linux, the total length is limited to MAX_RW_COUNT, there is
- * no overflow possibility.
- */
- tot_len = 0;
- ret = -EINVAL;
- for (seg = 0; seg < nr_segs; seg++) {
- compat_uptr_t buf;
- compat_ssize_t len;
-
- if (__get_user(len, &uvector->iov_len) ||
- __get_user(buf, &uvector->iov_base)) {
- ret = -EFAULT;
- goto out;
- }
- if (len < 0) /* size_t not fitting in compat_ssize_t .. */
- goto out;
- if (type >= 0 &&
- !access_ok(compat_ptr(buf), len)) {
- ret = -EFAULT;
- goto out;
- }
- if (len > MAX_RW_COUNT - tot_len)
- len = MAX_RW_COUNT - tot_len;
- tot_len += len;
- iov->iov_base = compat_ptr(buf);
- iov->iov_len = (compat_size_t) len;
- uvector++;
- iov++;
- }
- ret = tot_len;
-
-out:
- return ret;
-}
-
-ssize_t compat_import_iovec(int type,
- const struct compat_iovec __user * uvector,
- unsigned nr_segs, unsigned fast_segs,
- struct iovec **iov, struct iov_iter *i)
+ssize_t compat_import_iovec(int type, const struct compat_iovec __user *uvec,
+ unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
+ struct iov_iter *i)
{
- ssize_t n;
- struct iovec *p;
- n = compat_rw_copy_check_uvector(type, uvector, nr_segs, fast_segs,
- *iov, &p);
- if (n < 0) {
- if (p != *iov)
- kfree(p);
- *iov = NULL;
- return n;
- }
- iov_iter_init(i, type, p, nr_segs, n);
- *iov = p == *iov ? NULL : p;
- return n;
+ return __import_iovec(type, (const struct iovec __user *)uvec, nr_segs,
+ fast_segs, iovp, i, true);
}
EXPORT_SYMBOL(compat_import_iovec);
#endif
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index 29c052099affdc..5e728c20c2bead 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -276,20 +276,17 @@ static ssize_t process_vm_rw(pid_t pid,
if (rc < 0)
return rc;
if (!iov_iter_count(&iter))
- goto free_iovecs;
-
- rc = rw_copy_check_uvector(CHECK_IOVEC_ONLY, rvec, riovcnt, UIO_FASTIOV,
- iovstack_r, &iov_r);
- if (rc <= 0)
- goto free_iovecs;
-
+ goto free_iov_l;
+ iov_r = iovec_from_user(rvec, riovcnt, UIO_FASTIOV, iovstack_r, false);
+ if (IS_ERR(iov_r)) {
+ rc = PTR_ERR(iov_r);
+ goto free_iov_l;
+ }
rc = process_vm_rw_core(pid, &iter, iov_r, riovcnt, flags, vm_write);
-
-free_iovecs:
if (iov_r != iovstack_r)
kfree(iov_r);
+free_iov_l:
kfree(iov_l);
-
return rc;
}
@@ -333,18 +330,17 @@ compat_process_vm_rw(compat_pid_t pid,
if (rc < 0)
return rc;
if (!iov_iter_count(&iter))
- goto free_iovecs;
- rc = compat_rw_copy_check_uvector(CHECK_IOVEC_ONLY, rvec, riovcnt,
- UIO_FASTIOV, iovstack_r,
- &iov_r);
- if (rc <= 0)
- goto free_iovecs;
-
+ goto free_iov_l;
+ iov_r = iovec_from_user((const struct iovec __user *)rvec, riovcnt,
+ UIO_FASTIOV, iovstack_r, true);
+ if (IS_ERR(iov_r)) {
+ rc = PTR_ERR(iov_r);
+ goto free_iov_l;
+ }
rc = process_vm_rw_core(pid, &iter, iov_r, riovcnt, flags, vm_write);
-
-free_iovecs:
if (iov_r != iovstack_r)
kfree(iov_r);
+free_iov_l:
kfree(iov_l);
return rc;
}
--
2.28.0
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH 4/9] iov_iter: transparently handle compat iovecs in import_iovec
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
` (2 preceding siblings ...)
2020-09-25 4:51 ` [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec Christoph Hellwig
@ 2020-09-25 4:51 ` Christoph Hellwig
2020-09-25 4:51 ` [PATCH 5/9] fs: remove various compat readv/writev helpers Christoph Hellwig
` (5 subsequent siblings)
9 siblings, 0 replies; 66+ messages in thread
From: Christoph Hellwig @ 2020-09-25 4:51 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module
Use in compat_syscall to import either native or the compat iovecs, and
remove the now superflous compat_import_iovec.
This removes the need for special compat logic in most callers, and
the remaining ones can still be simplified by using __import_iovec
with a bool compat parameter.
Signed-off-by: Christoph Hellwig <[email protected]>
---
block/scsi_ioctl.c | 12 ++----------
drivers/scsi/sg.c | 9 +--------
fs/aio.c | 8 ++------
fs/io_uring.c | 20 ++++++++------------
fs/read_write.c | 6 ++++--
fs/splice.c | 2 +-
include/linux/uio.h | 8 --------
lib/iov_iter.c | 14 ++------------
mm/process_vm_access.c | 3 ++-
net/compat.c | 4 ++--
security/keys/compat.c | 5 ++---
11 files changed, 26 insertions(+), 65 deletions(-)
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index ef722f04f88a93..e08df86866ee5d 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -333,16 +333,8 @@ static int sg_io(struct request_queue *q, struct gendisk *bd_disk,
struct iov_iter i;
struct iovec *iov = NULL;
-#ifdef CONFIG_COMPAT
- if (in_compat_syscall())
- ret = compat_import_iovec(rq_data_dir(rq),
- hdr->dxferp, hdr->iovec_count,
- 0, &iov, &i);
- else
-#endif
- ret = import_iovec(rq_data_dir(rq),
- hdr->dxferp, hdr->iovec_count,
- 0, &iov, &i);
+ ret = import_iovec(rq_data_dir(rq), hdr->dxferp,
+ hdr->iovec_count, 0, &iov, &i);
if (ret < 0)
goto out_free_cdb;
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 20472aaaf630a4..bfa8d77322d732 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1820,14 +1820,7 @@ sg_start_req(Sg_request *srp, unsigned char *cmd)
struct iovec *iov = NULL;
struct iov_iter i;
-#ifdef CONFIG_COMPAT
- if (in_compat_syscall())
- res = compat_import_iovec(rw, hp->dxferp, iov_count,
- 0, &iov, &i);
- else
-#endif
- res = import_iovec(rw, hp->dxferp, iov_count,
- 0, &iov, &i);
+ res = import_iovec(rw, hp->dxferp, iov_count, 0, &iov, &i);
if (res < 0)
return res;
diff --git a/fs/aio.c b/fs/aio.c
index d5ec303855669d..c45c20d875388c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1489,12 +1489,8 @@ static ssize_t aio_setup_rw(int rw, const struct iocb *iocb,
*iovec = NULL;
return ret;
}
-#ifdef CONFIG_COMPAT
- if (compat)
- return compat_import_iovec(rw, buf, len, UIO_FASTIOV, iovec,
- iter);
-#endif
- return import_iovec(rw, buf, len, UIO_FASTIOV, iovec, iter);
+
+ return __import_iovec(rw, buf, len, UIO_FASTIOV, iovec, iter, compat);
}
static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8b426aa29668cb..8c27dc28da182a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2852,13 +2852,8 @@ static ssize_t __io_import_iovec(int rw, struct io_kiocb *req,
return ret;
}
-#ifdef CONFIG_COMPAT
- if (req->ctx->compat)
- return compat_import_iovec(rw, buf, sqe_len, UIO_FASTIOV,
- iovec, iter);
-#endif
-
- return import_iovec(rw, buf, sqe_len, UIO_FASTIOV, iovec, iter);
+ return __import_iovec(rw, buf, sqe_len, UIO_FASTIOV, iovec, iter,
+ req->ctx->compat);
}
static ssize_t io_import_iovec(int rw, struct io_kiocb *req,
@@ -4200,8 +4195,9 @@ static int __io_recvmsg_copy_hdr(struct io_kiocb *req,
sr->len);
iomsg->iov = NULL;
} else {
- ret = import_iovec(READ, uiov, iov_len, UIO_FASTIOV,
- &iomsg->iov, &iomsg->msg.msg_iter);
+ ret = __import_iovec(READ, uiov, iov_len, UIO_FASTIOV,
+ &iomsg->iov, &iomsg->msg.msg_iter,
+ false);
if (ret > 0)
ret = 0;
}
@@ -4241,9 +4237,9 @@ static int __io_compat_recvmsg_copy_hdr(struct io_kiocb *req,
sr->len = iomsg->iov[0].iov_len;
iomsg->iov = NULL;
} else {
- ret = compat_import_iovec(READ, uiov, len, UIO_FASTIOV,
- &iomsg->iov,
- &iomsg->msg.msg_iter);
+ ret = __import_iovec(READ, (struct iovec __user *)uiov, len,
+ UIO_FASTIOV, &iomsg->iov,
+ &iomsg->msg.msg_iter, true);
if (ret < 0)
return ret;
}
diff --git a/fs/read_write.c b/fs/read_write.c
index e5e891a88442ef..0a68037580b455 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1078,7 +1078,8 @@ static size_t compat_readv(struct file *file,
struct iov_iter iter;
ssize_t ret;
- ret = compat_import_iovec(READ, vec, vlen, UIO_FASTIOV, &iov, &iter);
+ ret = import_iovec(READ, (const struct iovec __user *)vec, vlen,
+ UIO_FASTIOV, &iov, &iter);
if (ret >= 0) {
ret = do_iter_read(file, &iter, pos, flags);
kfree(iov);
@@ -1186,7 +1187,8 @@ static size_t compat_writev(struct file *file,
struct iov_iter iter;
ssize_t ret;
- ret = compat_import_iovec(WRITE, vec, vlen, UIO_FASTIOV, &iov, &iter);
+ ret = import_iovec(WRITE, (const struct iovec __user *)vec, vlen,
+ UIO_FASTIOV, &iov, &iter);
if (ret >= 0) {
file_start_write(file);
ret = do_iter_write(file, &iter, pos, flags);
diff --git a/fs/splice.c b/fs/splice.c
index d7c8a7c4db07ff..132d42b9871f9b 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1387,7 +1387,7 @@ COMPAT_SYSCALL_DEFINE4(vmsplice, int, fd, const struct compat_iovec __user *, io
if (error)
return error;
- error = compat_import_iovec(type, iov32, nr_segs,
+ error = import_iovec(type, (struct iovec __user *)iov32, nr_segs,
ARRAY_SIZE(iovstack), &iov, &iter);
if (error >= 0) {
error = do_vmsplice(f.file, &iter, flags);
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 92c11fe41c6228..daedc61ad3706e 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -275,14 +275,6 @@ ssize_t import_iovec(int type, const struct iovec __user *uvec,
ssize_t __import_iovec(int type, const struct iovec __user *uvec,
unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
struct iov_iter *i, bool compat);
-
-#ifdef CONFIG_COMPAT
-struct compat_iovec;
-ssize_t compat_import_iovec(int type, const struct compat_iovec __user * uvector,
- unsigned nr_segs, unsigned fast_segs,
- struct iovec **iov, struct iov_iter *i);
-#endif
-
int import_single_range(int type, void __user *buf, size_t len,
struct iovec *iov, struct iov_iter *i);
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d5d8afe31fca16..8c51e1b03814a3 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1806,21 +1806,11 @@ ssize_t import_iovec(int type, const struct iovec __user *uvec,
unsigned nr_segs, unsigned fast_segs,
struct iovec **iovp, struct iov_iter *i)
{
- return __import_iovec(type, uvec, nr_segs, fast_segs, iovp, i, false);
+ return __import_iovec(type, uvec, nr_segs, fast_segs, iovp, i,
+ in_compat_syscall());
}
EXPORT_SYMBOL(import_iovec);
-#ifdef CONFIG_COMPAT
-ssize_t compat_import_iovec(int type, const struct compat_iovec __user *uvec,
- unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
- struct iov_iter *i)
-{
- return __import_iovec(type, (const struct iovec __user *)uvec, nr_segs,
- fast_segs, iovp, i, true);
-}
-EXPORT_SYMBOL(compat_import_iovec);
-#endif
-
int import_single_range(int rw, void __user *buf, size_t len,
struct iovec *iov, struct iov_iter *i)
{
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index 5e728c20c2bead..3f2156aab44263 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -326,7 +326,8 @@ compat_process_vm_rw(compat_pid_t pid,
if (flags != 0)
return -EINVAL;
- rc = compat_import_iovec(dir, lvec, liovcnt, UIO_FASTIOV, &iov_l, &iter);
+ rc = import_iovec(dir, (const struct iovec __user *)lvec, liovcnt,
+ UIO_FASTIOV, &iov_l, &iter);
if (rc < 0)
return rc;
if (!iov_iter_count(&iter))
diff --git a/net/compat.c b/net/compat.c
index 95ce707a30a31d..ddd15af3a2837b 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -98,8 +98,8 @@ int get_compat_msghdr(struct msghdr *kmsg,
if (err)
return err;
- err = compat_import_iovec(save_addr ? READ : WRITE, compat_ptr(ptr),
- len, UIO_FASTIOV, iov, &kmsg->msg_iter);
+ err = import_iovec(save_addr ? READ : WRITE, compat_ptr(ptr), len,
+ UIO_FASTIOV, iov, &kmsg->msg_iter);
return err < 0 ? err : 0;
}
diff --git a/security/keys/compat.c b/security/keys/compat.c
index 6ee9d8f6a4a5bb..7ae531db031cf8 100644
--- a/security/keys/compat.c
+++ b/security/keys/compat.c
@@ -33,9 +33,8 @@ static long compat_keyctl_instantiate_key_iov(
if (!_payload_iov)
ioc = 0;
- ret = compat_import_iovec(WRITE, _payload_iov, ioc,
- ARRAY_SIZE(iovstack), &iov,
- &from);
+ ret = import_iovec(WRITE, (const struct iovec __user *)_payload_iov,
+ ioc, ARRAY_SIZE(iovstack), &iov, &from);
if (ret < 0)
return ret;
--
2.28.0
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH 5/9] fs: remove various compat readv/writev helpers
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
` (3 preceding siblings ...)
2020-09-25 4:51 ` [PATCH 4/9] iov_iter: transparently handle compat iovecs in import_iovec Christoph Hellwig
@ 2020-09-25 4:51 ` Christoph Hellwig
2020-09-25 4:51 ` [PATCH 6/9] fs: remove the compat readv/writev syscalls Christoph Hellwig
` (4 subsequent siblings)
9 siblings, 0 replies; 66+ messages in thread
From: Christoph Hellwig @ 2020-09-25 4:51 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module
Now that import_iovec handles compat iovecs as well, all the duplicated
code in the compat readv/writev helpers is not needed. Remove them
and switch the compat syscall handlers to use the native helpers.
Signed-off-by: Christoph Hellwig <[email protected]>
---
fs/read_write.c | 179 +++++++----------------------------------
include/linux/compat.h | 20 ++---
2 files changed, 40 insertions(+), 159 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 0a68037580b455..eab427b7cc0a3f 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1068,226 +1068,107 @@ SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
return do_pwritev(fd, vec, vlen, pos, flags);
}
+/*
+ * Various compat syscalls. Note that they all pretend to take a native
+ * iovec - import_iovec will properly treat those as compat_iovecs based on
+ * in_compat_syscall().
+ */
#ifdef CONFIG_COMPAT
-static size_t compat_readv(struct file *file,
- const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t *pos, rwf_t flags)
-{
- struct iovec iovstack[UIO_FASTIOV];
- struct iovec *iov = iovstack;
- struct iov_iter iter;
- ssize_t ret;
-
- ret = import_iovec(READ, (const struct iovec __user *)vec, vlen,
- UIO_FASTIOV, &iov, &iter);
- if (ret >= 0) {
- ret = do_iter_read(file, &iter, pos, flags);
- kfree(iov);
- }
- if (ret > 0)
- add_rchar(current, ret);
- inc_syscr(current);
- return ret;
-}
-
-static size_t do_compat_readv(compat_ulong_t fd,
- const struct compat_iovec __user *vec,
- compat_ulong_t vlen, rwf_t flags)
-{
- struct fd f = fdget_pos(fd);
- ssize_t ret;
- loff_t pos;
-
- if (!f.file)
- return -EBADF;
- pos = f.file->f_pos;
- ret = compat_readv(f.file, vec, vlen, &pos, flags);
- if (ret >= 0)
- f.file->f_pos = pos;
- fdput_pos(f);
- return ret;
-
-}
-
COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
- const struct compat_iovec __user *,vec,
+ const struct iovec __user *, vec,
compat_ulong_t, vlen)
{
- return do_compat_readv(fd, vec, vlen, 0);
-}
-
-static long do_compat_preadv64(unsigned long fd,
- const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t pos, rwf_t flags)
-{
- struct fd f;
- ssize_t ret;
-
- if (pos < 0)
- return -EINVAL;
- f = fdget(fd);
- if (!f.file)
- return -EBADF;
- ret = -ESPIPE;
- if (f.file->f_mode & FMODE_PREAD)
- ret = compat_readv(f.file, vec, vlen, &pos, flags);
- fdput(f);
- return ret;
+ return do_readv(fd, vec, vlen, 0);
}
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd,
- const struct compat_iovec __user *,vec,
+ const struct iovec __user *, vec,
unsigned long, vlen, loff_t, pos)
{
- return do_compat_preadv64(fd, vec, vlen, pos, 0);
+ return do_preadv(fd, vec, vlen, pos, 0);
}
#endif
COMPAT_SYSCALL_DEFINE5(preadv, compat_ulong_t, fd,
- const struct compat_iovec __user *,vec,
+ const struct iovec __user *, vec,
compat_ulong_t, vlen, u32, pos_low, u32, pos_high)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
- return do_compat_preadv64(fd, vec, vlen, pos, 0);
+ return do_preadv(fd, vec, vlen, pos, 0);
}
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
- const struct compat_iovec __user *,vec,
+ const struct iovec __user *, vec,
unsigned long, vlen, loff_t, pos, rwf_t, flags)
{
if (pos == -1)
- return do_compat_readv(fd, vec, vlen, flags);
-
- return do_compat_preadv64(fd, vec, vlen, pos, flags);
+ return do_readv(fd, vec, vlen, flags);
+ return do_preadv(fd, vec, vlen, pos, flags);
}
#endif
COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
- const struct compat_iovec __user *,vec,
+ const struct iovec __user *, vec,
compat_ulong_t, vlen, u32, pos_low, u32, pos_high,
rwf_t, flags)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
if (pos == -1)
- return do_compat_readv(fd, vec, vlen, flags);
-
- return do_compat_preadv64(fd, vec, vlen, pos, flags);
-}
-
-static size_t compat_writev(struct file *file,
- const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t *pos, rwf_t flags)
-{
- struct iovec iovstack[UIO_FASTIOV];
- struct iovec *iov = iovstack;
- struct iov_iter iter;
- ssize_t ret;
-
- ret = import_iovec(WRITE, (const struct iovec __user *)vec, vlen,
- UIO_FASTIOV, &iov, &iter);
- if (ret >= 0) {
- file_start_write(file);
- ret = do_iter_write(file, &iter, pos, flags);
- file_end_write(file);
- kfree(iov);
- }
- if (ret > 0)
- add_wchar(current, ret);
- inc_syscw(current);
- return ret;
-}
-
-static size_t do_compat_writev(compat_ulong_t fd,
- const struct compat_iovec __user* vec,
- compat_ulong_t vlen, rwf_t flags)
-{
- struct fd f = fdget_pos(fd);
- ssize_t ret;
- loff_t pos;
-
- if (!f.file)
- return -EBADF;
- pos = f.file->f_pos;
- ret = compat_writev(f.file, vec, vlen, &pos, flags);
- if (ret >= 0)
- f.file->f_pos = pos;
- fdput_pos(f);
- return ret;
+ return do_readv(fd, vec, vlen, flags);
+ return do_preadv(fd, vec, vlen, pos, flags);
}
COMPAT_SYSCALL_DEFINE3(writev, compat_ulong_t, fd,
- const struct compat_iovec __user *, vec,
+ const struct iovec __user *, vec,
compat_ulong_t, vlen)
{
- return do_compat_writev(fd, vec, vlen, 0);
-}
-
-static long do_compat_pwritev64(unsigned long fd,
- const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t pos, rwf_t flags)
-{
- struct fd f;
- ssize_t ret;
-
- if (pos < 0)
- return -EINVAL;
- f = fdget(fd);
- if (!f.file)
- return -EBADF;
- ret = -ESPIPE;
- if (f.file->f_mode & FMODE_PWRITE)
- ret = compat_writev(f.file, vec, vlen, &pos, flags);
- fdput(f);
- return ret;
+ return do_writev(fd, vec, vlen, 0);
}
#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64
COMPAT_SYSCALL_DEFINE4(pwritev64, unsigned long, fd,
- const struct compat_iovec __user *,vec,
+ const struct iovec __user *, vec,
unsigned long, vlen, loff_t, pos)
{
- return do_compat_pwritev64(fd, vec, vlen, pos, 0);
+ return do_pwritev(fd, vec, vlen, pos, 0);
}
#endif
COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd,
- const struct compat_iovec __user *,vec,
+ const struct iovec __user *,vec,
compat_ulong_t, vlen, u32, pos_low, u32, pos_high)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
- return do_compat_pwritev64(fd, vec, vlen, pos, 0);
+ return do_pwritev(fd, vec, vlen, pos, 0);
}
#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64V2
COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
- const struct compat_iovec __user *,vec,
+ const struct iovec __user *, vec,
unsigned long, vlen, loff_t, pos, rwf_t, flags)
{
if (pos == -1)
- return do_compat_writev(fd, vec, vlen, flags);
-
- return do_compat_pwritev64(fd, vec, vlen, pos, flags);
+ return do_writev(fd, vec, vlen, flags);
+ return do_pwritev(fd, vec, vlen, pos, flags);
}
#endif
COMPAT_SYSCALL_DEFINE6(pwritev2, compat_ulong_t, fd,
- const struct compat_iovec __user *,vec,
+ const struct iovec __user *,vec,
compat_ulong_t, vlen, u32, pos_low, u32, pos_high, rwf_t, flags)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
if (pos == -1)
- return do_compat_writev(fd, vec, vlen, flags);
-
- return do_compat_pwritev64(fd, vec, vlen, pos, flags);
+ return do_writev(fd, vec, vlen, flags);
+ return do_pwritev(fd, vec, vlen, pos, flags);
}
-
-#endif
+#endif /* CONFIG_COMPAT */
static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
size_t count, loff_t max)
diff --git a/include/linux/compat.h b/include/linux/compat.h
index b930de791ff16b..306ea7e1172d8d 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -546,25 +546,25 @@ asmlinkage long compat_sys_getdents(unsigned int fd,
/* fs/read_write.c */
asmlinkage long compat_sys_lseek(unsigned int, compat_off_t, unsigned int);
asmlinkage ssize_t compat_sys_readv(compat_ulong_t fd,
- const struct compat_iovec __user *vec, compat_ulong_t vlen);
+ const struct iovec __user *vec, compat_ulong_t vlen);
asmlinkage ssize_t compat_sys_writev(compat_ulong_t fd,
- const struct compat_iovec __user *vec, compat_ulong_t vlen);
+ const struct iovec __user *vec, compat_ulong_t vlen);
/* No generic prototype for pread64 and pwrite64 */
asmlinkage ssize_t compat_sys_preadv(compat_ulong_t fd,
- const struct compat_iovec __user *vec,
+ const struct iovec __user *vec,
compat_ulong_t vlen, u32 pos_low, u32 pos_high);
asmlinkage ssize_t compat_sys_pwritev(compat_ulong_t fd,
- const struct compat_iovec __user *vec,
+ const struct iovec __user *vec,
compat_ulong_t vlen, u32 pos_low, u32 pos_high);
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
asmlinkage long compat_sys_preadv64(unsigned long fd,
- const struct compat_iovec __user *vec,
+ const struct iovec __user *vec,
unsigned long vlen, loff_t pos);
#endif
#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64
asmlinkage long compat_sys_pwritev64(unsigned long fd,
- const struct compat_iovec __user *vec,
+ const struct iovec __user *vec,
unsigned long vlen, loff_t pos);
#endif
@@ -800,20 +800,20 @@ asmlinkage long compat_sys_execveat(int dfd, const char __user *filename,
const compat_uptr_t __user *argv,
const compat_uptr_t __user *envp, int flags);
asmlinkage ssize_t compat_sys_preadv2(compat_ulong_t fd,
- const struct compat_iovec __user *vec,
+ const struct iovec __user *vec,
compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
asmlinkage ssize_t compat_sys_pwritev2(compat_ulong_t fd,
- const struct compat_iovec __user *vec,
+ const struct iovec __user *vec,
compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
asmlinkage long compat_sys_preadv64v2(unsigned long fd,
- const struct compat_iovec __user *vec,
+ const struct iovec __user *vec,
unsigned long vlen, loff_t pos, rwf_t flags);
#endif
#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64V2
asmlinkage long compat_sys_pwritev64v2(unsigned long fd,
- const struct compat_iovec __user *vec,
+ const struct iovec __user *vec,
unsigned long vlen, loff_t pos, rwf_t flags);
#endif
--
2.28.0
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH 6/9] fs: remove the compat readv/writev syscalls
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
` (4 preceding siblings ...)
2020-09-25 4:51 ` [PATCH 5/9] fs: remove various compat readv/writev helpers Christoph Hellwig
@ 2020-09-25 4:51 ` Christoph Hellwig
2020-09-25 4:51 ` [PATCH 7/9] fs: remove compat_sys_vmsplice Christoph Hellwig
` (3 subsequent siblings)
9 siblings, 0 replies; 66+ messages in thread
From: Christoph Hellwig @ 2020-09-25 4:51 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module
Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/arm64/include/asm/unistd32.h | 4 ++--
arch/mips/kernel/syscalls/syscall_n32.tbl | 4 ++--
arch/mips/kernel/syscalls/syscall_o32.tbl | 4 ++--
arch/parisc/kernel/syscalls/syscall.tbl | 4 ++--
arch/powerpc/kernel/syscalls/syscall.tbl | 4 ++--
arch/s390/kernel/syscalls/syscall.tbl | 4 ++--
arch/sparc/kernel/syscalls/syscall.tbl | 4 ++--
arch/x86/entry/syscall_x32.c | 2 ++
arch/x86/entry/syscalls/syscall_32.tbl | 4 ++--
arch/x86/entry/syscalls/syscall_64.tbl | 4 ++--
fs/read_write.c | 14 --------------
include/linux/compat.h | 4 ----
include/uapi/asm-generic/unistd.h | 4 ++--
tools/include/uapi/asm-generic/unistd.h | 4 ++--
tools/perf/arch/powerpc/entry/syscalls/syscall.tbl | 4 ++--
tools/perf/arch/s390/entry/syscalls/syscall.tbl | 4 ++--
tools/perf/arch/x86/entry/syscalls/syscall_64.tbl | 4 ++--
17 files changed, 30 insertions(+), 46 deletions(-)
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9d5..4a236493dca5b9 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -301,9 +301,9 @@ __SYSCALL(__NR_flock, sys_flock)
#define __NR_msync 144
__SYSCALL(__NR_msync, sys_msync)
#define __NR_readv 145
-__SYSCALL(__NR_readv, compat_sys_readv)
+__SYSCALL(__NR_readv, sys_readv)
#define __NR_writev 146
-__SYSCALL(__NR_writev, compat_sys_writev)
+__SYSCALL(__NR_writev, sys_writev)
#define __NR_getsid 147
__SYSCALL(__NR_getsid, sys_getsid)
#define __NR_fdatasync 148
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index f9df9edb67a407..c99a92646f8ee9 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -25,8 +25,8 @@
15 n32 ioctl compat_sys_ioctl
16 n32 pread64 sys_pread64
17 n32 pwrite64 sys_pwrite64
-18 n32 readv compat_sys_readv
-19 n32 writev compat_sys_writev
+18 n32 readv sys_readv
+19 n32 writev sys_writev
20 n32 access sys_access
21 n32 pipe sysm_pipe
22 n32 _newselect compat_sys_select
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 195b43cf27c848..075064d10661bf 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -156,8 +156,8 @@
142 o32 _newselect sys_select compat_sys_select
143 o32 flock sys_flock
144 o32 msync sys_msync
-145 o32 readv sys_readv compat_sys_readv
-146 o32 writev sys_writev compat_sys_writev
+145 o32 readv sys_readv
+146 o32 writev sys_writev
147 o32 cacheflush sys_cacheflush
148 o32 cachectl sys_cachectl
149 o32 sysmips __sys_sysmips
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index def64d221cd4fb..192abde0001d9d 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -159,8 +159,8 @@
142 common _newselect sys_select compat_sys_select
143 common flock sys_flock
144 common msync sys_msync
-145 common readv sys_readv compat_sys_readv
-146 common writev sys_writev compat_sys_writev
+145 common readv sys_readv
+146 common writev sys_writev
147 common getsid sys_getsid
148 common fdatasync sys_fdatasync
149 common _sysctl sys_ni_syscall
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index c2d737ff2e7bec..6f1e2ecf0edad9 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -193,8 +193,8 @@
142 common _newselect sys_select compat_sys_select
143 common flock sys_flock
144 common msync sys_msync
-145 common readv sys_readv compat_sys_readv
-146 common writev sys_writev compat_sys_writev
+145 common readv sys_readv
+146 common writev sys_writev
147 common getsid sys_getsid
148 common fdatasync sys_fdatasync
149 nospu _sysctl sys_ni_syscall
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 10456bc936fb09..6101cf2e004cb4 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -134,8 +134,8 @@
142 64 select sys_select -
143 common flock sys_flock sys_flock
144 common msync sys_msync sys_msync
-145 common readv sys_readv compat_sys_readv
-146 common writev sys_writev compat_sys_writev
+145 common readv sys_readv sys_readv
+146 common writev sys_writev sys_writev
147 common getsid sys_getsid sys_getsid
148 common fdatasync sys_fdatasync sys_fdatasync
149 common _sysctl - -
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 4af114e84f2022..a87ddb282ab16f 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -149,8 +149,8 @@
117 common getrusage sys_getrusage compat_sys_getrusage
118 common getsockopt sys_getsockopt sys_getsockopt
119 common getcwd sys_getcwd
-120 common readv sys_readv compat_sys_readv
-121 common writev sys_writev compat_sys_writev
+120 common readv sys_readv
+121 common writev sys_writev
122 common settimeofday sys_settimeofday compat_sys_settimeofday
123 32 fchown sys_fchown16
123 64 fchown sys_fchown
diff --git a/arch/x86/entry/syscall_x32.c b/arch/x86/entry/syscall_x32.c
index 1583831f61a9df..aa321444a41f63 100644
--- a/arch/x86/entry/syscall_x32.c
+++ b/arch/x86/entry/syscall_x32.c
@@ -12,6 +12,8 @@
* Reuse the 64-bit entry points for the x32 versions that occupy different
* slots in the syscall table.
*/
+#define __x32_sys_readv __x64_sys_readv
+#define __x32_sys_writev __x64_sys_writev
#define __x32_sys_getsockopt __x64_sys_getsockopt
#define __x32_sys_setsockopt __x64_sys_setsockopt
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 9d11028736661b..54ab4beb517f25 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -156,8 +156,8 @@
142 i386 _newselect sys_select compat_sys_select
143 i386 flock sys_flock
144 i386 msync sys_msync
-145 i386 readv sys_readv compat_sys_readv
-146 i386 writev sys_writev compat_sys_writev
+145 i386 readv sys_readv
+146 i386 writev sys_writev
147 i386 getsid sys_getsid
148 i386 fdatasync sys_fdatasync
149 i386 _sysctl sys_ni_syscall
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index f30d6ae9a6883c..b1e59957c5c51c 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -371,8 +371,8 @@
512 x32 rt_sigaction compat_sys_rt_sigaction
513 x32 rt_sigreturn compat_sys_x32_rt_sigreturn
514 x32 ioctl compat_sys_ioctl
-515 x32 readv compat_sys_readv
-516 x32 writev compat_sys_writev
+515 x32 readv sys_readv
+516 x32 writev sys_writev
517 x32 recvfrom compat_sys_recvfrom
518 x32 sendmsg compat_sys_sendmsg
519 x32 recvmsg compat_sys_recvmsg
diff --git a/fs/read_write.c b/fs/read_write.c
index eab427b7cc0a3f..6c13f744c34a38 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1074,13 +1074,6 @@ SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
* in_compat_syscall().
*/
#ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
- const struct iovec __user *, vec,
- compat_ulong_t, vlen)
-{
- return do_readv(fd, vec, vlen, 0);
-}
-
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd,
const struct iovec __user *, vec,
@@ -1122,13 +1115,6 @@ COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
return do_preadv(fd, vec, vlen, pos, flags);
}
-COMPAT_SYSCALL_DEFINE3(writev, compat_ulong_t, fd,
- const struct iovec __user *, vec,
- compat_ulong_t, vlen)
-{
- return do_writev(fd, vec, vlen, 0);
-}
-
#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64
COMPAT_SYSCALL_DEFINE4(pwritev64, unsigned long, fd,
const struct iovec __user *, vec,
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 306ea7e1172d8d..0f1620988267e6 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -545,10 +545,6 @@ asmlinkage long compat_sys_getdents(unsigned int fd,
/* fs/read_write.c */
asmlinkage long compat_sys_lseek(unsigned int, compat_off_t, unsigned int);
-asmlinkage ssize_t compat_sys_readv(compat_ulong_t fd,
- const struct iovec __user *vec, compat_ulong_t vlen);
-asmlinkage ssize_t compat_sys_writev(compat_ulong_t fd,
- const struct iovec __user *vec, compat_ulong_t vlen);
/* No generic prototype for pread64 and pwrite64 */
asmlinkage ssize_t compat_sys_preadv(compat_ulong_t fd,
const struct iovec __user *vec,
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 995b36c2ea7d8a..211c9eacbda6eb 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -207,9 +207,9 @@ __SYSCALL(__NR_read, sys_read)
#define __NR_write 64
__SYSCALL(__NR_write, sys_write)
#define __NR_readv 65
-__SC_COMP(__NR_readv, sys_readv, compat_sys_readv)
+__SC_COMP(__NR_readv, sys_readv, sys_readv)
#define __NR_writev 66
-__SC_COMP(__NR_writev, sys_writev, compat_sys_writev)
+__SC_COMP(__NR_writev, sys_writev, sys_writev)
#define __NR_pread64 67
__SC_COMP(__NR_pread64, sys_pread64, compat_sys_pread64)
#define __NR_pwrite64 68
diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 995b36c2ea7d8a..211c9eacbda6eb 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -207,9 +207,9 @@ __SYSCALL(__NR_read, sys_read)
#define __NR_write 64
__SYSCALL(__NR_write, sys_write)
#define __NR_readv 65
-__SC_COMP(__NR_readv, sys_readv, compat_sys_readv)
+__SC_COMP(__NR_readv, sys_readv, sys_readv)
#define __NR_writev 66
-__SC_COMP(__NR_writev, sys_writev, compat_sys_writev)
+__SC_COMP(__NR_writev, sys_writev, sys_writev)
#define __NR_pread64 67
__SC_COMP(__NR_pread64, sys_pread64, compat_sys_pread64)
#define __NR_pwrite64 68
diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
index 3ca6fe057a0b1f..46be68029587f9 100644
--- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
@@ -189,8 +189,8 @@
142 common _newselect sys_select compat_sys_select
143 common flock sys_flock
144 common msync sys_msync
-145 common readv sys_readv compat_sys_readv
-146 common writev sys_writev compat_sys_writev
+145 common readv sys_readv
+146 common writev sys_writev
147 common getsid sys_getsid
148 common fdatasync sys_fdatasync
149 nospu _sysctl sys_ni_syscall
diff --git a/tools/perf/arch/s390/entry/syscalls/syscall.tbl b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
index 6a0bbea225db0d..fb5e61ce9d5838 100644
--- a/tools/perf/arch/s390/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
@@ -134,8 +134,8 @@
142 64 select sys_select -
143 common flock sys_flock sys_flock
144 common msync sys_msync compat_sys_msync
-145 common readv sys_readv compat_sys_readv
-146 common writev sys_writev compat_sys_writev
+145 common readv sys_readv
+146 common writev sys_writev
147 common getsid sys_getsid sys_getsid
148 common fdatasync sys_fdatasync sys_fdatasync
149 common _sysctl - -
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index f30d6ae9a6883c..b1e59957c5c51c 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -371,8 +371,8 @@
512 x32 rt_sigaction compat_sys_rt_sigaction
513 x32 rt_sigreturn compat_sys_x32_rt_sigreturn
514 x32 ioctl compat_sys_ioctl
-515 x32 readv compat_sys_readv
-516 x32 writev compat_sys_writev
+515 x32 readv sys_readv
+516 x32 writev sys_writev
517 x32 recvfrom compat_sys_recvfrom
518 x32 sendmsg compat_sys_sendmsg
519 x32 recvmsg compat_sys_recvmsg
--
2.28.0
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH 7/9] fs: remove compat_sys_vmsplice
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
` (5 preceding siblings ...)
2020-09-25 4:51 ` [PATCH 6/9] fs: remove the compat readv/writev syscalls Christoph Hellwig
@ 2020-09-25 4:51 ` Christoph Hellwig
2020-09-25 4:51 ` [PATCH 8/9] mm: remove compat_process_vm_{readv,writev} Christoph Hellwig
` (2 subsequent siblings)
9 siblings, 0 replies; 66+ messages in thread
From: Christoph Hellwig @ 2020-09-25 4:51 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module
Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/arm64/include/asm/unistd32.h | 2 +-
arch/mips/kernel/syscalls/syscall_n32.tbl | 2 +-
arch/mips/kernel/syscalls/syscall_o32.tbl | 2 +-
arch/parisc/kernel/syscalls/syscall.tbl | 2 +-
arch/powerpc/kernel/syscalls/syscall.tbl | 2 +-
arch/s390/kernel/syscalls/syscall.tbl | 2 +-
arch/sparc/kernel/syscalls/syscall.tbl | 2 +-
arch/x86/entry/syscall_x32.c | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 2 +-
arch/x86/entry/syscalls/syscall_64.tbl | 2 +-
fs/splice.c | 57 +++++--------------
include/linux/compat.h | 4 --
include/uapi/asm-generic/unistd.h | 2 +-
tools/include/uapi/asm-generic/unistd.h | 2 +-
.../arch/powerpc/entry/syscalls/syscall.tbl | 2 +-
.../perf/arch/s390/entry/syscalls/syscall.tbl | 2 +-
.../arch/x86/entry/syscalls/syscall_64.tbl | 2 +-
17 files changed, 28 insertions(+), 62 deletions(-)
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 4a236493dca5b9..11dfae3a8563bd 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -697,7 +697,7 @@ __SYSCALL(__NR_sync_file_range2, compat_sys_aarch32_sync_file_range2)
#define __NR_tee 342
__SYSCALL(__NR_tee, sys_tee)
#define __NR_vmsplice 343
-__SYSCALL(__NR_vmsplice, compat_sys_vmsplice)
+__SYSCALL(__NR_vmsplice, sys_vmsplice)
#define __NR_move_pages 344
__SYSCALL(__NR_move_pages, compat_sys_move_pages)
#define __NR_getcpu 345
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index c99a92646f8ee9..5a39d4de0ac85b 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -278,7 +278,7 @@
267 n32 splice sys_splice
268 n32 sync_file_range sys_sync_file_range
269 n32 tee sys_tee
-270 n32 vmsplice compat_sys_vmsplice
+270 n32 vmsplice sys_vmsplice
271 n32 move_pages compat_sys_move_pages
272 n32 set_robust_list compat_sys_set_robust_list
273 n32 get_robust_list compat_sys_get_robust_list
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 075064d10661bf..136efc6b8c5444 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -318,7 +318,7 @@
304 o32 splice sys_splice
305 o32 sync_file_range sys_sync_file_range sys32_sync_file_range
306 o32 tee sys_tee
-307 o32 vmsplice sys_vmsplice compat_sys_vmsplice
+307 o32 vmsplice sys_vmsplice
308 o32 move_pages sys_move_pages compat_sys_move_pages
309 o32 set_robust_list sys_set_robust_list compat_sys_set_robust_list
310 o32 get_robust_list sys_get_robust_list compat_sys_get_robust_list
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 192abde0001d9d..a9e184192caedd 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -330,7 +330,7 @@
292 32 sync_file_range parisc_sync_file_range
292 64 sync_file_range sys_sync_file_range
293 common tee sys_tee
-294 common vmsplice sys_vmsplice compat_sys_vmsplice
+294 common vmsplice sys_vmsplice
295 common move_pages sys_move_pages compat_sys_move_pages
296 common getcpu sys_getcpu
297 common epoll_pwait sys_epoll_pwait compat_sys_epoll_pwait
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 6f1e2ecf0edad9..0d4985919ca34d 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -369,7 +369,7 @@
282 common unshare sys_unshare
283 common splice sys_splice
284 common tee sys_tee
-285 common vmsplice sys_vmsplice compat_sys_vmsplice
+285 common vmsplice sys_vmsplice
286 common openat sys_openat compat_sys_openat
287 common mkdirat sys_mkdirat
288 common mknodat sys_mknodat
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 6101cf2e004cb4..b5495a42814bd1 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -316,7 +316,7 @@
306 common splice sys_splice sys_splice
307 common sync_file_range sys_sync_file_range compat_sys_s390_sync_file_range
308 common tee sys_tee sys_tee
-309 common vmsplice sys_vmsplice compat_sys_vmsplice
+309 common vmsplice sys_vmsplice sys_vmsplice
310 common move_pages sys_move_pages compat_sys_move_pages
311 common getcpu sys_getcpu sys_getcpu
312 common epoll_pwait sys_epoll_pwait compat_sys_epoll_pwait
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index a87ddb282ab16f..f1810c1a35caa5 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -38,7 +38,7 @@
23 64 setuid sys_setuid
24 32 getuid sys_getuid16
24 64 getuid sys_getuid
-25 common vmsplice sys_vmsplice compat_sys_vmsplice
+25 common vmsplice sys_vmsplice
26 common ptrace sys_ptrace compat_sys_ptrace
27 common alarm sys_alarm
28 common sigaltstack sys_sigaltstack compat_sys_sigaltstack
diff --git a/arch/x86/entry/syscall_x32.c b/arch/x86/entry/syscall_x32.c
index aa321444a41f63..a4840b9d50ad14 100644
--- a/arch/x86/entry/syscall_x32.c
+++ b/arch/x86/entry/syscall_x32.c
@@ -16,6 +16,7 @@
#define __x32_sys_writev __x64_sys_writev
#define __x32_sys_getsockopt __x64_sys_getsockopt
#define __x32_sys_setsockopt __x64_sys_setsockopt
+#define __x32_sys_vmsplice __x64_sys_vmsplice
#define __SYSCALL_64(nr, sym)
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 54ab4beb517f25..0fb2f172581e51 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -327,7 +327,7 @@
313 i386 splice sys_splice
314 i386 sync_file_range sys_ia32_sync_file_range
315 i386 tee sys_tee
-316 i386 vmsplice sys_vmsplice compat_sys_vmsplice
+316 i386 vmsplice sys_vmsplice
317 i386 move_pages sys_move_pages compat_sys_move_pages
318 i386 getcpu sys_getcpu
319 i386 epoll_pwait sys_epoll_pwait
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index b1e59957c5c51c..642af919183de4 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -388,7 +388,7 @@
529 x32 waitid compat_sys_waitid
530 x32 set_robust_list compat_sys_set_robust_list
531 x32 get_robust_list compat_sys_get_robust_list
-532 x32 vmsplice compat_sys_vmsplice
+532 x32 vmsplice sys_vmsplice
533 x32 move_pages compat_sys_move_pages
534 x32 preadv compat_sys_preadv64
535 x32 pwritev compat_sys_pwritev64
diff --git a/fs/splice.c b/fs/splice.c
index 132d42b9871f9b..18d84544030b39 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -33,7 +33,6 @@
#include <linux/security.h>
#include <linux/gfp.h>
#include <linux/socket.h>
-#include <linux/compat.h>
#include <linux/sched/signal.h>
#include "internal.h"
@@ -1332,20 +1331,6 @@ static int vmsplice_type(struct fd f, int *type)
* Currently we punt and implement it as a normal copy, see pipe_to_user().
*
*/
-static long do_vmsplice(struct file *f, struct iov_iter *iter, unsigned int flags)
-{
- if (unlikely(flags & ~SPLICE_F_ALL))
- return -EINVAL;
-
- if (!iov_iter_count(iter))
- return 0;
-
- if (iov_iter_rw(iter) == WRITE)
- return vmsplice_to_pipe(f, iter, flags);
- else
- return vmsplice_to_user(f, iter, flags);
-}
-
SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, uiov,
unsigned long, nr_segs, unsigned int, flags)
{
@@ -1356,6 +1341,9 @@ SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, uiov,
struct fd f;
int type;
+ if (unlikely(flags & ~SPLICE_F_ALL))
+ return -EINVAL;
+
f = fdget(fd);
error = vmsplice_type(f, &type);
if (error)
@@ -1363,40 +1351,21 @@ SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, uiov,
error = import_iovec(type, uiov, nr_segs,
ARRAY_SIZE(iovstack), &iov, &iter);
- if (error >= 0) {
- error = do_vmsplice(f.file, &iter, flags);
- kfree(iov);
- }
- fdput(f);
- return error;
-}
+ if (error < 0)
+ goto out_fdput;
-#ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE4(vmsplice, int, fd, const struct compat_iovec __user *, iov32,
- unsigned int, nr_segs, unsigned int, flags)
-{
- struct iovec iovstack[UIO_FASTIOV];
- struct iovec *iov = iovstack;
- struct iov_iter iter;
- ssize_t error;
- struct fd f;
- int type;
-
- f = fdget(fd);
- error = vmsplice_type(f, &type);
- if (error)
- return error;
+ if (!iov_iter_count(&iter))
+ error = 0;
+ else if (iov_iter_rw(&iter) == WRITE)
+ error = vmsplice_to_pipe(f.file, &iter, flags);
+ else
+ error = vmsplice_to_user(f.file, &iter, flags);
- error = import_iovec(type, (struct iovec __user *)iov32, nr_segs,
- ARRAY_SIZE(iovstack), &iov, &iter);
- if (error >= 0) {
- error = do_vmsplice(f.file, &iter, flags);
- kfree(iov);
- }
+ kfree(iov);
+out_fdput:
fdput(f);
return error;
}
-#endif
SYSCALL_DEFINE6(splice, int, fd_in, loff_t __user *, off_in,
int, fd_out, loff_t __user *, off_out,
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 0f1620988267e6..9e8aa148651455 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -597,10 +597,6 @@ asmlinkage long compat_sys_signalfd4(int ufd,
const compat_sigset_t __user *sigmask,
compat_size_t sigsetsize, int flags);
-/* fs/splice.c */
-asmlinkage long compat_sys_vmsplice(int fd, const struct compat_iovec __user *,
- unsigned int nr_segs, unsigned int flags);
-
/* fs/stat.c */
asmlinkage long compat_sys_newfstatat(unsigned int dfd,
const char __user *filename,
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 211c9eacbda6eb..f2dcb0d5703014 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -237,7 +237,7 @@ __SC_COMP(__NR_signalfd4, sys_signalfd4, compat_sys_signalfd4)
/* fs/splice.c */
#define __NR_vmsplice 75
-__SC_COMP(__NR_vmsplice, sys_vmsplice, compat_sys_vmsplice)
+__SYSCALL(__NR_vmsplice, sys_vmsplice)
#define __NR_splice 76
__SYSCALL(__NR_splice, sys_splice)
#define __NR_tee 77
diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 211c9eacbda6eb..f2dcb0d5703014 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -237,7 +237,7 @@ __SC_COMP(__NR_signalfd4, sys_signalfd4, compat_sys_signalfd4)
/* fs/splice.c */
#define __NR_vmsplice 75
-__SC_COMP(__NR_vmsplice, sys_vmsplice, compat_sys_vmsplice)
+__SYSCALL(__NR_vmsplice, sys_vmsplice)
#define __NR_splice 76
__SYSCALL(__NR_splice, sys_splice)
#define __NR_tee 77
diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
index 46be68029587f9..26f0347c15118b 100644
--- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
@@ -363,7 +363,7 @@
282 common unshare sys_unshare
283 common splice sys_splice
284 common tee sys_tee
-285 common vmsplice sys_vmsplice compat_sys_vmsplice
+285 common vmsplice sys_vmsplice
286 common openat sys_openat compat_sys_openat
287 common mkdirat sys_mkdirat
288 common mknodat sys_mknodat
diff --git a/tools/perf/arch/s390/entry/syscalls/syscall.tbl b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
index fb5e61ce9d5838..02ad81f69bb7e3 100644
--- a/tools/perf/arch/s390/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
@@ -316,7 +316,7 @@
306 common splice sys_splice compat_sys_splice
307 common sync_file_range sys_sync_file_range compat_sys_s390_sync_file_range
308 common tee sys_tee compat_sys_tee
-309 common vmsplice sys_vmsplice compat_sys_vmsplice
+309 common vmsplice sys_vmsplice sys_vmsplice
310 common move_pages sys_move_pages compat_sys_move_pages
311 common getcpu sys_getcpu compat_sys_getcpu
312 common epoll_pwait sys_epoll_pwait compat_sys_epoll_pwait
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index b1e59957c5c51c..642af919183de4 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -388,7 +388,7 @@
529 x32 waitid compat_sys_waitid
530 x32 set_robust_list compat_sys_set_robust_list
531 x32 get_robust_list compat_sys_get_robust_list
-532 x32 vmsplice compat_sys_vmsplice
+532 x32 vmsplice sys_vmsplice
533 x32 move_pages compat_sys_move_pages
534 x32 preadv compat_sys_preadv64
535 x32 pwritev compat_sys_pwritev64
--
2.28.0
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH 8/9] mm: remove compat_process_vm_{readv,writev}
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
` (6 preceding siblings ...)
2020-09-25 4:51 ` [PATCH 7/9] fs: remove compat_sys_vmsplice Christoph Hellwig
@ 2020-09-25 4:51 ` Christoph Hellwig
2020-09-25 4:51 ` [PATCH 9/9] security/keys: remove compat_keyctl_instantiate_key_iov Christoph Hellwig
2020-09-25 15:23 ` let import_iovec deal with compat_iovecs as well v4 Al Viro
9 siblings, 0 replies; 66+ messages in thread
From: Christoph Hellwig @ 2020-09-25 4:51 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module
Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <[email protected]>
---
arch/arm64/include/asm/unistd32.h | 4 +-
arch/mips/kernel/syscalls/syscall_n32.tbl | 4 +-
arch/mips/kernel/syscalls/syscall_o32.tbl | 4 +-
arch/parisc/kernel/syscalls/syscall.tbl | 4 +-
arch/powerpc/kernel/syscalls/syscall.tbl | 4 +-
arch/s390/kernel/syscalls/syscall.tbl | 4 +-
arch/sparc/kernel/syscalls/syscall.tbl | 4 +-
arch/x86/entry/syscall_x32.c | 2 +
arch/x86/entry/syscalls/syscall_32.tbl | 4 +-
arch/x86/entry/syscalls/syscall_64.tbl | 4 +-
include/linux/compat.h | 8 ---
include/uapi/asm-generic/unistd.h | 6 +-
mm/process_vm_access.c | 69 -------------------
tools/include/uapi/asm-generic/unistd.h | 6 +-
.../arch/powerpc/entry/syscalls/syscall.tbl | 4 +-
.../perf/arch/s390/entry/syscalls/syscall.tbl | 4 +-
.../arch/x86/entry/syscalls/syscall_64.tbl | 4 +-
17 files changed, 30 insertions(+), 109 deletions(-)
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 11dfae3a8563bd..0c280a05f699bf 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -763,9 +763,9 @@ __SYSCALL(__NR_sendmmsg, compat_sys_sendmmsg)
#define __NR_setns 375
__SYSCALL(__NR_setns, sys_setns)
#define __NR_process_vm_readv 376
-__SYSCALL(__NR_process_vm_readv, compat_sys_process_vm_readv)
+__SYSCALL(__NR_process_vm_readv, sys_process_vm_readv)
#define __NR_process_vm_writev 377
-__SYSCALL(__NR_process_vm_writev, compat_sys_process_vm_writev)
+__SYSCALL(__NR_process_vm_writev, sys_process_vm_writev)
#define __NR_kcmp 378
__SYSCALL(__NR_kcmp, sys_kcmp)
#define __NR_finit_module 379
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 5a39d4de0ac85b..0bc2e0fcf1ee56 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -317,8 +317,8 @@
306 n32 syncfs sys_syncfs
307 n32 sendmmsg compat_sys_sendmmsg
308 n32 setns sys_setns
-309 n32 process_vm_readv compat_sys_process_vm_readv
-310 n32 process_vm_writev compat_sys_process_vm_writev
+309 n32 process_vm_readv sys_process_vm_readv
+310 n32 process_vm_writev sys_process_vm_writev
311 n32 kcmp sys_kcmp
312 n32 finit_module sys_finit_module
313 n32 sched_setattr sys_sched_setattr
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 136efc6b8c5444..b408c13b934296 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -356,8 +356,8 @@
342 o32 syncfs sys_syncfs
343 o32 sendmmsg sys_sendmmsg compat_sys_sendmmsg
344 o32 setns sys_setns
-345 o32 process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-346 o32 process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
+345 o32 process_vm_readv sys_process_vm_readv
+346 o32 process_vm_writev sys_process_vm_writev
347 o32 kcmp sys_kcmp
348 o32 finit_module sys_finit_module
349 o32 sched_setattr sys_sched_setattr
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index a9e184192caedd..2015a5124b78ad 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -372,8 +372,8 @@
327 common syncfs sys_syncfs
328 common setns sys_setns
329 common sendmmsg sys_sendmmsg compat_sys_sendmmsg
-330 common process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-331 common process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
+330 common process_vm_readv sys_process_vm_readv
+331 common process_vm_writev sys_process_vm_writev
332 common kcmp sys_kcmp
333 common finit_module sys_finit_module
334 common sched_setattr sys_sched_setattr
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 0d4985919ca34d..66a472aa635d3f 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -449,8 +449,8 @@
348 common syncfs sys_syncfs
349 common sendmmsg sys_sendmmsg compat_sys_sendmmsg
350 common setns sys_setns
-351 nospu process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-352 nospu process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
+351 nospu process_vm_readv sys_process_vm_readv
+352 nospu process_vm_writev sys_process_vm_writev
353 nospu finit_module sys_finit_module
354 nospu kcmp sys_kcmp
355 common sched_setattr sys_sched_setattr
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index b5495a42814bd1..7485867a490bb2 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -347,8 +347,8 @@
337 common clock_adjtime sys_clock_adjtime sys_clock_adjtime32
338 common syncfs sys_syncfs sys_syncfs
339 common setns sys_setns sys_setns
-340 common process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-341 common process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
+340 common process_vm_readv sys_process_vm_readv sys_process_vm_readv
+341 common process_vm_writev sys_process_vm_writev sys_process_vm_writev
342 common s390_runtime_instr sys_s390_runtime_instr sys_s390_runtime_instr
343 common kcmp sys_kcmp sys_kcmp
344 common finit_module sys_finit_module sys_finit_module
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index f1810c1a35caa5..4a9365b2e340b2 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -406,8 +406,8 @@
335 common syncfs sys_syncfs
336 common sendmmsg sys_sendmmsg compat_sys_sendmmsg
337 common setns sys_setns
-338 common process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-339 common process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
+338 common process_vm_readv sys_process_vm_readv
+339 common process_vm_writev sys_process_vm_writev
340 32 kern_features sys_ni_syscall sys_kern_features
340 64 kern_features sys_kern_features
341 common kcmp sys_kcmp
diff --git a/arch/x86/entry/syscall_x32.c b/arch/x86/entry/syscall_x32.c
index a4840b9d50ad14..f2fe0a33bcfdd5 100644
--- a/arch/x86/entry/syscall_x32.c
+++ b/arch/x86/entry/syscall_x32.c
@@ -17,6 +17,8 @@
#define __x32_sys_getsockopt __x64_sys_getsockopt
#define __x32_sys_setsockopt __x64_sys_setsockopt
#define __x32_sys_vmsplice __x64_sys_vmsplice
+#define __x32_sys_process_vm_readv __x64_sys_process_vm_readv
+#define __x32_sys_process_vm_writev __x64_sys_process_vm_writev
#define __SYSCALL_64(nr, sym)
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 0fb2f172581e51..5fbe10ad8a23fc 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -358,8 +358,8 @@
344 i386 syncfs sys_syncfs
345 i386 sendmmsg sys_sendmmsg compat_sys_sendmmsg
346 i386 setns sys_setns
-347 i386 process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-348 i386 process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
+347 i386 process_vm_readv sys_process_vm_readv
+348 i386 process_vm_writev sys_process_vm_writev
349 i386 kcmp sys_kcmp
350 i386 finit_module sys_finit_module
351 i386 sched_setattr sys_sched_setattr
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 642af919183de4..347809649ba28f 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -395,8 +395,8 @@
536 x32 rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
537 x32 recvmmsg compat_sys_recvmmsg_time64
538 x32 sendmmsg compat_sys_sendmmsg
-539 x32 process_vm_readv compat_sys_process_vm_readv
-540 x32 process_vm_writev compat_sys_process_vm_writev
+539 x32 process_vm_readv sys_process_vm_readv
+540 x32 process_vm_writev sys_process_vm_writev
541 x32 setsockopt sys_setsockopt
542 x32 getsockopt sys_getsockopt
543 x32 io_setup compat_sys_io_setup
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 9e8aa148651455..2ae4e4bc22c0a4 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -780,14 +780,6 @@ asmlinkage long compat_sys_open_by_handle_at(int mountdirfd,
int flags);
asmlinkage long compat_sys_sendmmsg(int fd, struct compat_mmsghdr __user *mmsg,
unsigned vlen, unsigned int flags);
-asmlinkage ssize_t compat_sys_process_vm_readv(compat_pid_t pid,
- const struct compat_iovec __user *lvec,
- compat_ulong_t liovcnt, const struct compat_iovec __user *rvec,
- compat_ulong_t riovcnt, compat_ulong_t flags);
-asmlinkage ssize_t compat_sys_process_vm_writev(compat_pid_t pid,
- const struct compat_iovec __user *lvec,
- compat_ulong_t liovcnt, const struct compat_iovec __user *rvec,
- compat_ulong_t riovcnt, compat_ulong_t flags);
asmlinkage long compat_sys_execveat(int dfd, const char __user *filename,
const compat_uptr_t __user *argv,
const compat_uptr_t __user *envp, int flags);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index f2dcb0d5703014..c1dfe99c9c3f70 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -727,11 +727,9 @@ __SYSCALL(__NR_setns, sys_setns)
#define __NR_sendmmsg 269
__SC_COMP(__NR_sendmmsg, sys_sendmmsg, compat_sys_sendmmsg)
#define __NR_process_vm_readv 270
-__SC_COMP(__NR_process_vm_readv, sys_process_vm_readv, \
- compat_sys_process_vm_readv)
+__SYSCALL(__NR_process_vm_readv, sys_process_vm_readv)
#define __NR_process_vm_writev 271
-__SC_COMP(__NR_process_vm_writev, sys_process_vm_writev, \
- compat_sys_process_vm_writev)
+__SYSCALL(__NR_process_vm_writev, sys_process_vm_writev)
#define __NR_kcmp 272
__SYSCALL(__NR_kcmp, sys_kcmp)
#define __NR_finit_module 273
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index 3f2156aab44263..fd12da80b6f27b 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -14,10 +14,6 @@
#include <linux/slab.h>
#include <linux/syscalls.h>
-#ifdef CONFIG_COMPAT
-#include <linux/compat.h>
-#endif
-
/**
* process_vm_rw_pages - read/write pages from task specified
* @pages: array of pointers to pages we want to copy
@@ -304,68 +300,3 @@ SYSCALL_DEFINE6(process_vm_writev, pid_t, pid,
{
return process_vm_rw(pid, lvec, liovcnt, rvec, riovcnt, flags, 1);
}
-
-#ifdef CONFIG_COMPAT
-
-static ssize_t
-compat_process_vm_rw(compat_pid_t pid,
- const struct compat_iovec __user *lvec,
- unsigned long liovcnt,
- const struct compat_iovec __user *rvec,
- unsigned long riovcnt,
- unsigned long flags, int vm_write)
-{
- struct iovec iovstack_l[UIO_FASTIOV];
- struct iovec iovstack_r[UIO_FASTIOV];
- struct iovec *iov_l = iovstack_l;
- struct iovec *iov_r = iovstack_r;
- struct iov_iter iter;
- ssize_t rc = -EFAULT;
- int dir = vm_write ? WRITE : READ;
-
- if (flags != 0)
- return -EINVAL;
-
- rc = import_iovec(dir, (const struct iovec __user *)lvec, liovcnt,
- UIO_FASTIOV, &iov_l, &iter);
- if (rc < 0)
- return rc;
- if (!iov_iter_count(&iter))
- goto free_iov_l;
- iov_r = iovec_from_user((const struct iovec __user *)rvec, riovcnt,
- UIO_FASTIOV, iovstack_r, true);
- if (IS_ERR(iov_r)) {
- rc = PTR_ERR(iov_r);
- goto free_iov_l;
- }
- rc = process_vm_rw_core(pid, &iter, iov_r, riovcnt, flags, vm_write);
- if (iov_r != iovstack_r)
- kfree(iov_r);
-free_iov_l:
- kfree(iov_l);
- return rc;
-}
-
-COMPAT_SYSCALL_DEFINE6(process_vm_readv, compat_pid_t, pid,
- const struct compat_iovec __user *, lvec,
- compat_ulong_t, liovcnt,
- const struct compat_iovec __user *, rvec,
- compat_ulong_t, riovcnt,
- compat_ulong_t, flags)
-{
- return compat_process_vm_rw(pid, lvec, liovcnt, rvec,
- riovcnt, flags, 0);
-}
-
-COMPAT_SYSCALL_DEFINE6(process_vm_writev, compat_pid_t, pid,
- const struct compat_iovec __user *, lvec,
- compat_ulong_t, liovcnt,
- const struct compat_iovec __user *, rvec,
- compat_ulong_t, riovcnt,
- compat_ulong_t, flags)
-{
- return compat_process_vm_rw(pid, lvec, liovcnt, rvec,
- riovcnt, flags, 1);
-}
-
-#endif
diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index f2dcb0d5703014..c1dfe99c9c3f70 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -727,11 +727,9 @@ __SYSCALL(__NR_setns, sys_setns)
#define __NR_sendmmsg 269
__SC_COMP(__NR_sendmmsg, sys_sendmmsg, compat_sys_sendmmsg)
#define __NR_process_vm_readv 270
-__SC_COMP(__NR_process_vm_readv, sys_process_vm_readv, \
- compat_sys_process_vm_readv)
+__SYSCALL(__NR_process_vm_readv, sys_process_vm_readv)
#define __NR_process_vm_writev 271
-__SC_COMP(__NR_process_vm_writev, sys_process_vm_writev, \
- compat_sys_process_vm_writev)
+__SYSCALL(__NR_process_vm_writev, sys_process_vm_writev)
#define __NR_kcmp 272
__SYSCALL(__NR_kcmp, sys_kcmp)
#define __NR_finit_module 273
diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
index 26f0347c15118b..a188f053cbf90a 100644
--- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
@@ -443,8 +443,8 @@
348 common syncfs sys_syncfs
349 common sendmmsg sys_sendmmsg compat_sys_sendmmsg
350 common setns sys_setns
-351 nospu process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-352 nospu process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
+351 nospu process_vm_readv sys_process_vm_readv
+352 nospu process_vm_writev sys_process_vm_writev
353 nospu finit_module sys_finit_module
354 nospu kcmp sys_kcmp
355 common sched_setattr sys_sched_setattr
diff --git a/tools/perf/arch/s390/entry/syscalls/syscall.tbl b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
index 02ad81f69bb7e3..c44c83032c3a04 100644
--- a/tools/perf/arch/s390/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
@@ -347,8 +347,8 @@
337 common clock_adjtime sys_clock_adjtime compat_sys_clock_adjtime
338 common syncfs sys_syncfs sys_syncfs
339 common setns sys_setns sys_setns
-340 common process_vm_readv sys_process_vm_readv compat_sys_process_vm_readv
-341 common process_vm_writev sys_process_vm_writev compat_sys_process_vm_writev
+340 common process_vm_readv sys_process_vm_readv sys_process_vm_readv
+341 common process_vm_writev sys_process_vm_writev sys_process_vm_writev
342 common s390_runtime_instr sys_s390_runtime_instr sys_s390_runtime_instr
343 common kcmp sys_kcmp compat_sys_kcmp
344 common finit_module sys_finit_module compat_sys_finit_module
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index 642af919183de4..347809649ba28f 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -395,8 +395,8 @@
536 x32 rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
537 x32 recvmmsg compat_sys_recvmmsg_time64
538 x32 sendmmsg compat_sys_sendmmsg
-539 x32 process_vm_readv compat_sys_process_vm_readv
-540 x32 process_vm_writev compat_sys_process_vm_writev
+539 x32 process_vm_readv sys_process_vm_readv
+540 x32 process_vm_writev sys_process_vm_writev
541 x32 setsockopt sys_setsockopt
542 x32 getsockopt sys_getsockopt
543 x32 io_setup compat_sys_io_setup
--
2.28.0
^ permalink raw reply related [flat|nested] 66+ messages in thread
* [PATCH 9/9] security/keys: remove compat_keyctl_instantiate_key_iov
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
` (7 preceding siblings ...)
2020-09-25 4:51 ` [PATCH 8/9] mm: remove compat_process_vm_{readv,writev} Christoph Hellwig
@ 2020-09-25 4:51 ` Christoph Hellwig
2020-09-25 15:23 ` let import_iovec deal with compat_iovecs as well v4 Al Viro
9 siblings, 0 replies; 66+ messages in thread
From: Christoph Hellwig @ 2020-09-25 4:51 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module
Now that import_iovec handles compat iovecs, the native version of
keyctl_instantiate_key_iov can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <[email protected]>
---
security/keys/compat.c | 36 ++----------------------------------
security/keys/internal.h | 5 -----
security/keys/keyctl.c | 2 +-
3 files changed, 3 insertions(+), 40 deletions(-)
diff --git a/security/keys/compat.c b/security/keys/compat.c
index 7ae531db031cf8..1545efdca56227 100644
--- a/security/keys/compat.c
+++ b/security/keys/compat.c
@@ -11,38 +11,6 @@
#include <linux/slab.h>
#include "internal.h"
-/*
- * Instantiate a key with the specified compatibility multipart payload and
- * link the key into the destination keyring if one is given.
- *
- * The caller must have the appropriate instantiation permit set for this to
- * work (see keyctl_assume_authority). No other permissions are required.
- *
- * If successful, 0 will be returned.
- */
-static long compat_keyctl_instantiate_key_iov(
- key_serial_t id,
- const struct compat_iovec __user *_payload_iov,
- unsigned ioc,
- key_serial_t ringid)
-{
- struct iovec iovstack[UIO_FASTIOV], *iov = iovstack;
- struct iov_iter from;
- long ret;
-
- if (!_payload_iov)
- ioc = 0;
-
- ret = import_iovec(WRITE, (const struct iovec __user *)_payload_iov,
- ioc, ARRAY_SIZE(iovstack), &iov, &from);
- if (ret < 0)
- return ret;
-
- ret = keyctl_instantiate_key_common(id, &from, ringid);
- kfree(iov);
- return ret;
-}
-
/*
* The key control system call, 32-bit compatibility version for 64-bit archs
*/
@@ -113,8 +81,8 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option,
return keyctl_reject_key(arg2, arg3, arg4, arg5);
case KEYCTL_INSTANTIATE_IOV:
- return compat_keyctl_instantiate_key_iov(
- arg2, compat_ptr(arg3), arg4, arg5);
+ return keyctl_instantiate_key_iov(arg2, compat_ptr(arg3), arg4,
+ arg5);
case KEYCTL_INVALIDATE:
return keyctl_invalidate_key(arg2);
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 338a526cbfa516..9b9cf3b6fcbb4d 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -262,11 +262,6 @@ extern long keyctl_instantiate_key_iov(key_serial_t,
const struct iovec __user *,
unsigned, key_serial_t);
extern long keyctl_invalidate_key(key_serial_t);
-
-struct iov_iter;
-extern long keyctl_instantiate_key_common(key_serial_t,
- struct iov_iter *,
- key_serial_t);
extern long keyctl_restrict_keyring(key_serial_t id,
const char __user *_type,
const char __user *_restriction);
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 9febd37a168fd0..e26bbccda7ccee 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1164,7 +1164,7 @@ static int keyctl_change_reqkey_auth(struct key *key)
*
* If successful, 0 will be returned.
*/
-long keyctl_instantiate_key_common(key_serial_t id,
+static long keyctl_instantiate_key_common(key_serial_t id,
struct iov_iter *from,
key_serial_t ringid)
{
--
2.28.0
^ permalink raw reply related [flat|nested] 66+ messages in thread
* Re: let import_iovec deal with compat_iovecs as well v4
2020-09-25 4:51 let import_iovec deal with compat_iovecs as well v4 Christoph Hellwig
` (8 preceding siblings ...)
2020-09-25 4:51 ` [PATCH 9/9] security/keys: remove compat_keyctl_instantiate_key_iov Christoph Hellwig
@ 2020-09-25 15:23 ` Al Viro
9 siblings, 0 replies; 66+ messages in thread
From: Al Viro @ 2020-09-25 15:23 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Andrew Morton, Jens Axboe, Arnd Bergmann, David Howells,
David Laight, linux-arm-kernel, linux-kernel, linux-mips,
linux-parisc, linuxppc-dev, linux-s390, sparclinux, linux-block,
linux-scsi, linux-fsdevel, linux-aio, io-uring, linux-arch,
linux-mm, netdev, keyrings, linux-security-module
On Fri, Sep 25, 2020 at 06:51:37AM +0200, Christoph Hellwig wrote:
> Hi Al,
>
> this series changes import_iovec to transparently deal with compat iovec
> structures, and then cleanups up a lot of code dupliation.
OK, I can live with that. Applied, let's see if it passes smoke tests
into -next it goes.
^ permalink raw reply [flat|nested] 66+ messages in thread