public inbox for [email protected]
* liburing test results on hppa
@ 2023-02-16 23:00 John David Anglin
  2023-02-16 23:12 ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: John David Anglin @ 2023-02-16 23:00 UTC (permalink / raw)
  To: linux-parisc; +Cc: io-uring, Jens Axboe, Helge Deller

Here are liburing test results on hppa:

Running test 232c93d07b74.t 5 sec [5]
Running test 35fa71a030ca.t 5 sec [5]
Running test 500f9fbadef8.t 25 sec [25]
Running test 7ad0e4b2f83c.t 1 sec [1]
Running test 8a9973408177.t 0 sec [0]
Running test 917257daa0fe.t 0 sec [0]
Running test a0908ae19763.t 0 sec [0]
Running test a4c0b3decb33.t Test a4c0b3decb33.t timed out (may not be a failure)
Running test accept.t 2 sec [2]
Running test accept-link.t 0 sec [0]
Running test accept-reuse.t 0 sec [0]
Running test accept-test.t 0 sec [0]
Running test across-fork.t 0 sec [0]
Running test b19062a56726.t 0 sec [1]
Running test b5837bd5311d.t 0 sec [0]
Running test buf-ring.t bad run 0/0 = -233
test_running(1) failed
Test buf-ring.t failed with ret 1
Running test ce593a6c480a.t 1 sec [1]
Running test close-opath.t 0 sec [0]
Running test connect.t 0 sec [0]
Running test cq-full.t 0 sec [0]
Running test cq-overflow.t 12 sec [12]
Running test cq-peek-batch.t 0 sec [0]
Running test cq-ready.t 0 sec [0]
Running test cq-size.t 0 sec [0]
Running test d4ae271dfaae.t 0 sec [0]
Running test d77a67ed5f27.t 0 sec [0]
Running test defer.t 3 sec [3]
Running test defer-taskrun.t 1 sec [0]
Running test double-poll-crash.t Skipped
Running test drop-submit.t 0 sec [0]
Running test eeed8b54e0df.t 0 sec [0]
Running test empty-eownerdead.t 0 sec [0]
Running test eploop.t 0 sec [0]
Running test eventfd.t 0 sec [0]
Running test eventfd-disable.t 0 sec [0]
Running test eventfd-reg.t 0 sec [0]
Running test eventfd-ring.t 0 sec [0]
Running test evloop.t 0 sec [0]
Running test exec-target.t 0 sec [0]
Running test exit-no-cleanup.t 0 sec [0]
Running test fadvise.t 0 sec [0]
Running test fallocate.t 0 sec [0]
Running test fc2a85cb02ef.t Test needs failslab/fail_futex/fail_page_alloc enabled, skipped
Skipped
Running test fd-pass.t 1 sec [0]
Running test file-register.t 4 sec [4]
Running test files-exit-hang-poll.t 1 sec [1]
Running test files-exit-hang-timeout.t 1 sec [1]
Running test file-update.t 0 sec [0]
Running test file-verify.t Found 98528, wanted 622816
Buffered novec reg test failed
Test file-verify.t failed with ret 1
Running test fixed-buf-iter.t 0 sec [0]
Running test fixed-link.t 0 sec [0]
Running test fixed-reuse.t 0 sec [0]
Running test fpos.t 1 sec [0]
Running test fsnotify.t Skipped
Running test fsync.t 0 sec [0]
Running test hardlink.t 0 sec [1]
Running test io-cancel.t 4 sec [3]
Running test iopoll.t 2 sec [2]
Running test iopoll-leak.t 0 sec [0]
Running test iopoll-overflow.t 1 sec [1]
Running test io_uring_enter.t 1 sec [0]
Running test io_uring_passthrough.t Skipped
Running test io_uring_register.t Unable to map a huge page.  Try increasing /proc/sys/vm/nr_hugepages by at least 1.
Skipping the hugepage test
0 sec [0]
Running test io_uring_setup.t 0 sec [0]
Running test lfs-openat.t 0 sec [0]
Running test lfs-openat-write.t 0 sec [0]
Running test link.t 0 sec [0]
Running test link_drain.t 3 sec [2]
Running test link-timeout.t 1 sec [2]
Running test madvise.t 1 sec [0]
Running test mkdir.t 0 sec [0]
Running test msg-ring.t 0 sec [1]
Running test msg-ring-flags.t Skipped
Running test msg-ring-overflow.t 0 sec [0]
Running test multicqes_drain.t 26 sec [25]
Running test nolibc.t Skipped
Running test nop-all-sizes.t 0 sec [1]
Running test nop.t 1 sec [0]
Running test openat2.t 0 sec [0]
Running test open-close.t 0 sec [0]
Running test open-direct-link.t 0 sec [0]
Running test open-direct-pick.t 0 sec [0]
Running test personality.t Not root, skipping
0 sec [0]
Running test pipe-bug.t 6 sec [5]
Running test pipe-eof.t 0 sec [0]
Running test pipe-reuse.t 0 sec [0]
Running test poll.t 1 sec [1]
Running test poll-cancel.t 0 sec [0]
Running test poll-cancel-all.t 0 sec [0]
Running test poll-cancel-ton.t 0 sec [0]
Running test poll-link.t 1 sec [1]
Running test poll-many.t 19 sec [19]
Running test poll-mshot-overflow.t 0 sec [0]
Running test poll-mshot-update.t 21 sec [21]
Running test poll-race.t 2 sec [2]
Running test poll-race-mshot.t Skipped
Running test poll-ring.t 0 sec [0]
Running test poll-v-poll.t 0 sec [0]
Running test pollfree.t 0 sec [0]
Running test probe.t 0 sec [0]
Running test read-before-exit.t 3 sec [5]
Running test read-write.t Not root, skipping test_write_efbig
8 sec [8]
Running test recv-msgall.t 1 sec [0]
Running test recv-msgall-stream.t 0 sec [0]
Running test recv-multishot.t 3 sec [3]
Running test register-restrictions.t 0 sec [0]
Running test rename.t 0 sec [0]
Running test ringbuf-read.t cqe res -233
dio test failed
Test ringbuf-read.t failed with ret 1
Running test ring-leak2.t 1 sec [1]
Running test ring-leak.t 0 sec [0]
Running test rsrc_tags.t 16 sec [16]
Running test rw_merge_test.t 0 sec [0]
Running test self.t 0 sec [0]
Running test sendmsg_fs_cve.t chroot not allowed, skip
0 sec [0]
Running test send_recv.t 0 sec [0]
Running test send_recvmsg.t do_recvmsg: failed cqe: -233
send_recvmsg 0 1 0 1 0 failed
Test send_recvmsg.t failed with ret 1
Running test send-zerocopy.t Test send-zerocopy.t timed out (may not be a failure)
Running test shared-wq.t 0 sec [0]
Running test short-read.t 0 sec [0]
Running test shutdown.t 0 sec [0]
Running test sigfd-deadlock.t 0 sec [0]
Running test single-issuer.t 0 sec [0]
Running test skip-cqe.t 0 sec [0]
Running test socket.t 0 sec [0]
Running test socket-rw.t 0 sec [0]
Running test socket-rw-eagain.t 0 sec [0]
Running test socket-rw-offset.t 0 sec [0]
Running test splice.t 0 sec [0]
Running test sq-full.t 0 sec [1]
Running test sq-full-cpp.t 0 sec [0]
Running test sqpoll-cancel-hang.t Skipped
Running test sqpoll-disable-exit.t 2 sec [2]
Running test sq-poll-dup.t 6 sec [5]
Running test sqpoll-exit-hang.t 1 sec [1]
Running test sq-poll-kthread.t 2 sec [3]
Running test sq-poll-share.t 13 sec [11]
Running test sqpoll-sleep.t 0 sec [0]
Running test sq-space_left.t 0 sec [0]
Running test stdout.t This is a pipe test
This is a fixed pipe test
0 sec [1]
Running test submit-and-wait.t 1 sec [1]
Running test submit-link-fail.t 0 sec [0]
Running test submit-reuse.t 1 sec [1]
Running test symlink.t 0 sec [0]
Running test sync-cancel.t 1 sec [0]
Running test teardowns.t 0 sec [0]
Running test thread-exit.t 0 sec [1]
Running test timeout.t 6 sec [6]
Running test timeout-new.t 3 sec [2]
Running test timeout-overflow.t Skipped
Running test tty-write-dpoll.t 0 sec [0]
Running test unlink.t 0 sec [0]
Running test version.t 0 sec [0]
Running test wakeup-hang.t 2 sec [2]
Running test xattr.t 0 sec [0]
Running test statx.t 0 sec [0]
Running test sq-full-cpp.t 1 sec [0]
Tests timed out (2): <a4c0b3decb33.t> <send-zerocopy.t>
Tests failed (4): <buf-ring.t> <file-verify.t> <ringbuf-read.t> <send_recvmsg.t>
make[1]: *** [Makefile:250: runtests] Error 1
make[1]: Leaving directory '/home/dave/gnu/liburing/liburing/test'
make: *** [Makefile:21: runtests] Error 2

I modified poll-race-mshot.t to skip on hppa.  Added handle_tw_list and io_uring_try_cancel_requests fixes.
This appears to have fixed stalls.

Dave

-- 
John David Anglin  [email protected]



* Re: liburing test results on hppa
  2023-02-16 23:00 liburing test results on hppa John David Anglin
@ 2023-02-16 23:12 ` Jens Axboe
  2023-02-16 23:26   ` John David Anglin
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2023-02-16 23:12 UTC (permalink / raw)
  To: John David Anglin, linux-parisc; +Cc: io-uring, Helge Deller

On 2/16/23 4:00 PM, John David Anglin wrote:
> Running test buf-ring.t bad run 0/0 = -233
> test_running(1) failed
> Test buf-ring.t failed with ret 1

As mentioned previously, this one and the other -233 I suspect are due
to the same coloring issue as was fixed by Helge's patch for the ring
mmaps, as the provided buffer rings work kinda the same way. The
application allocates some aligned memory, and registers it and the
kernel then maps it.

I wonder if these would work properly if the address was aligned to
0x400000? Should be easy to verify, just modify the alignment for the
posix_memalign() calls in test/buf-ring.c.
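
Roughly the change I mean, as an untested sketch rather than the actual
test code (the helper name and sizes here are made up for illustration):

#include <liburing.h>
#include <stdlib.h>

/* Sketch: register a provided buffer ring whose backing memory is
 * aligned to 0x400000 instead of the usual page size, to test the
 * cache coloring theory.  'entries' should be a power of two and
 * 'bgid' is an arbitrary buffer group id. */
static int register_aligned_ring(struct io_uring *ring, unsigned int entries,
                                 int bgid)
{
        struct io_uring_buf_reg reg = { };
        void *ptr;

        /* 4 MiB alignment instead of 4096 */
        if (posix_memalign(&ptr, 0x400000, entries * sizeof(struct io_uring_buf)))
                return -1;

        reg.ring_addr = (unsigned long) ptr;
        reg.ring_entries = entries;
        reg.bgid = bgid;
        return io_uring_register_buf_ring(ring, &reg, 0);
}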

> Running test file-verify.t Found 98528, wanted 622816
> Buffered novec reg test failed
> Test file-verify.t failed with ret 1

Unsure what this is.

> Tests timed out (2): <a4c0b3decb33.t> <send-zerocopy.t>

I suspect the box is just too slow to run these before the script
decides they have timed out.

> I modified poll-race-mshot.t to skip on hppa.  Added handle_tw_list
> and io_uring_try_cancel_requests fixes. This appears to have fixed
> stalls.

poll-race-mshot is the most interesting one, but I'll need some actual
info on that one to make guesses as to what is going on. A raw hex trace
doesn't really help me very much...

But I don't think we'll make much progress here unless someone dives in
and takes a closer look. So while I appreciate the test report, we need
to dig a bit deeper to figure out poll-race-mshot and file-verify. QEMU
may be useful for some things, but it's not of much help here.

-- 
Jens Axboe



* Re: liburing test results on hppa
  2023-02-16 23:12 ` Jens Axboe
@ 2023-02-16 23:26   ` John David Anglin
  2023-02-16 23:32     ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: John David Anglin @ 2023-02-16 23:26 UTC (permalink / raw)
  To: Jens Axboe, linux-parisc; +Cc: io-uring, Helge Deller

On 2023-02-16 6:12 p.m., Jens Axboe wrote:
> On 2/16/23 4:00 PM, John David Anglin wrote:
>> Running test buf-ring.t bad run 0/0 = -233
>> test_running(1) failed
>> Test buf-ring.t failed with ret 1
> As mentioned previously, this one and the other -233 I suspect are due
> to the same coloring issue as was fixed by Helge's patch for the ring
> mmaps, as the provided buffer rings work kinda the same way. The
> application allocates some aligned memory, and registers it and the
> kernel then maps it.
>
> I wonder if these would work properly if the address was aligned to
> 0x400000? Should be easy to verify, just modify the alignment for the
> posix_memalign() calls in test/buf-ring.c.
Doesn't help.  Same error.  Can you point to where the kernel maps it?

-- 
John David Anglin  [email protected]



* Re: liburing test results on hppa
  2023-02-16 23:26   ` John David Anglin
@ 2023-02-16 23:32     ` Jens Axboe
  2023-02-17  2:33       ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2023-02-16 23:32 UTC (permalink / raw)
  To: John David Anglin, linux-parisc; +Cc: io-uring, Helge Deller

On 2/16/23 4:26 PM, John David Anglin wrote:
> On 2023-02-16 6:12 p.m., Jens Axboe wrote:
>> On 2/16/23 4:00 PM, John David Anglin wrote:
>>> Running test buf-ring.t bad run 0/0 = -233
>>> test_running(1) failed
>>> Test buf-ring.t failed with ret 1
>> As mentioned previously, this one and the other -233 I suspect are due
>> to the same coloring issue as was fixed by Helge's patch for the ring
>> mmaps, as the provided buffer rings work kinda the same way. The
>> application allocates some aligned memory, and registers it and the
>> kernel then maps it.
>>
>> I wonder if these would work properly if the address was aligned to
>> 0x400000? Should be easy to verify, just modify the alignment for the
>> posix_memalign() calls in test/buf-ring.c.
> Doesn't help.  Same error.  Can you point to where the kernel maps it?

Yep, it goes io_uring.c:io_uring_register() ->
kbuf.c:io_register_pbuf_ring() -> rsrc.c:io_pin_pages() which ultimately
calls pin_user_pages() to map the memory.

-- 
Jens Axboe



* Re: liburing test results on hppa
  2023-02-16 23:32     ` Jens Axboe
@ 2023-02-17  2:33       ` Jens Axboe
  2023-02-17  2:59         ` John David Anglin
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2023-02-17  2:33 UTC (permalink / raw)
  To: John David Anglin, linux-parisc; +Cc: io-uring, Helge Deller

On 2/16/23 4:32 PM, Jens Axboe wrote:
> On 2/16/23 4:26 PM, John David Anglin wrote:
>> On 2023-02-16 6:12 p.m., Jens Axboe wrote:
>>> On 2/16/23 4:00 PM, John David Anglin wrote:
>>>> Running test buf-ring.t bad run 0/0 = -233
>>>> test_running(1) failed
>>>> Test buf-ring.t failed with ret 1
>>> As mentioned previously, this one and the other -233 I suspect are due
>>> to the same coloring issue as was fixed by Helge's patch for the ring
>>> mmaps, as the provided buffer rings work kinda the same way. The
>>> application allocates some aligned memory, and registers it and the
>>> kernel then maps it.
>>>
>>> I wonder if these would work properly if the address was aligned to
>>> 0x400000? Should be easy to verify, just modify the alignment for the
>>> posix_memalign() calls in test/buf-ring.c.
>> Doesn't help.  Same error.  Can you point to where the kernel maps it?
> 
> Yep, it goes io_uring.c:io_uring_register() ->
> kbuf.c:io_register_pbuf_ring() -> rsrc.c:io_pin_pages() which ultimately
> calls pin_user_pages() to map the memory.

Followup - a few of the provided buffer ring cases failed to properly
initialize the ring, poll-mshot-race was one of them... I've pushed out
fixes for this. Not sure if it fixes your particular issue, but worth
giving it another run.

-- 
Jens Axboe




* Re: liburing test results on hppa
  2023-02-17  2:33       ` Jens Axboe
@ 2023-02-17  2:59         ` John David Anglin
  2023-02-21 19:55           ` John David Anglin
  0 siblings, 1 reply; 7+ messages in thread
From: John David Anglin @ 2023-02-17  2:59 UTC (permalink / raw)
  To: Jens Axboe, linux-parisc; +Cc: io-uring, Helge Deller

On 2023-02-16 9:33 p.m., Jens Axboe wrote:
> On 2/16/23 4:32 PM, Jens Axboe wrote:
>> On 2/16/23 4:26 PM, John David Anglin wrote:
>>> On 2023-02-16 6:12 p.m., Jens Axboe wrote:
>>> On 2/16/23 4:00 PM, John David Anglin wrote:
>>>>> Running test buf-ring.t bad run 0/0 = -233
>>>>> test_running(1) failed
>>>>> Test buf-ring.t failed with ret 1
>>>> As mentioned previously, this one and the other -233 I suspect are due
>>>> to the same coloring issue as was fixed by Helge's patch for the ring
>>>> mmaps, as the provided buffer rings work kinda the same way. The
>>>> application allocates some aligned memory, and registers it and the
>>>> kernel then maps it.
>>>>
>>>> I wonder if these would work properly if the address was aligned to
>>>> 0x400000? Should be easy to verify, just modify the alignment for the
>>>> posix_memalign() calls in test/buf-ring.c.
>>> Doesn't help.  Same error.  Can you point to where the kernel maps it?
>> Yep, it goes io_uring.c:io_uring_register() ->
>> kbuf.c:io_register_pbuf_ring() -> rsrc.c:io_pin_pages() which ultimately
>> calls pin_user_pages() to map the memory.
> Followup - a few of the provided buffer ring cases failed to properly
> initialize the ring, poll-mshot-race was one of them... I've pushed out
> fixes for this. Not sure if it fixes your particular issue, but worth
> giving it another run.
Results are still the same:
Running test file-verify.t Found 163840, wanted 688128
Buffered novec reg test failed
Test file-verify.t failed with ret 1

Tests timed out (2): <a4c0b3decb33.t> <send-zerocopy.t>
Tests failed (4): <buf-ring.t> <file-verify.t> <ringbuf-read.t> <send_recvmsg.t>

poll-mshot-race still causes HPMC.

-- 
John David Anglin  [email protected]



* Re: liburing test results on hppa
  2023-02-17  2:59         ` John David Anglin
@ 2023-02-21 19:55           ` John David Anglin
  0 siblings, 0 replies; 7+ messages in thread
From: John David Anglin @ 2023-02-21 19:55 UTC (permalink / raw)
  To: Jens Axboe, linux-parisc; +Cc: io-uring, Helge Deller

On 2023-02-16 9:59 p.m., John David Anglin wrote:
>>>>> As mentioned previously, this one and the other -233 I suspect are due
>>>>> to the same coloring issue as was fixed by Helge's patch for the ring
>>>>> mmaps, as the provided buffer rings work kinda the same way. The
>>>>> application allocates some aligned memory, and registers it and the
>>>>> kernel then maps it.
>>>>>
>>>>> I wonder if these would work properly if the address was aligned to
>>>>> 0x400000? Should be easy to verify, just modify the alignment for the
>>>>> posix_memalign() calls in test/buf-ring.c.
>>>> Doesn't help.  Same error.  Can you point to where the kernel maps it?
>>> Yep, it goes io_uring.c:io_uring_register() ->
>>> kbuf.c:io_register_pbuf_ring() -> rsrc.c:io_pin_pages() which ultimately
>>> calls pin_user_pages() to map the memory.
>> Followup - a few of the provided buffer ring cases failed to properly
>> initialize the ring, poll-mshot-race was one of them... I've pushed out
>> fixes for this. Not sure if it fixes your particular issue, but worth
>> giving it another run.
> Results are still the same:
> Running test file-verify.t Found 163840, wanted 688128
> Buffered novec reg test failed
> Test file-verify.t failed with ret 1
>
> Tests timed out (2): <a4c0b3decb33.t> <send-zerocopy.t>
> Tests failed (4): <buf-ring.t> <file-verify.t> <ringbuf-read.t> <send_recvmsg.t>
>
> poll-mshot-race still causes HPMC.

The timeouts are not a problem.  The following change fixed <a4c0b3decb33.t> and <send-zerocopy.t>:

diff --git a/test/a4c0b3decb33.c b/test/a4c0b3decb33.c
index f282d1b..6be73b6 100644
--- a/test/a4c0b3decb33.c
+++ b/test/a4c0b3decb33.c
@@ -124,7 +124,7 @@ static void loop(void)
              if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
                  break;
              sleep_ms(1);
-            if (current_time_ms() - start < 5 * 1000)
+            if (current_time_ms() - start < 100 * 1000)
                  continue;
              kill_and_wait(pid, &status);
              break;
diff --git a/test/runtests.sh b/test/runtests.sh
index 924fdce..8c3a4bf 100755
--- a/test/runtests.sh
+++ b/test/runtests.sh
@@ -1,7 +1,7 @@
  #!/usr/bin/env bash

  TESTS=("$@")
-TIMEOUT=60
+TIMEOUT=300
  DMESG_FILTER="cat"
  TEST_DIR=$(dirname "$0")
  FAILED=()

I believe you are correct about the colouring issue being the problem with the other tests.
I've been playing with the send_recvmsg.t test as it seems the simplest.

On parisc, caches are required to detect that the same physical memory is being accessed by
two virtual addresses if offset bits 42 through 63 are the same in both virtual addresses (i.e.,
the addresses must be equal modulo 0x400000).  There is also a constraint on space bits but
space register hashing is disabled, so it doesn't come into play.

We have a linear offset between kernel virtual and physical addresses in Linux.  So, the user virtual
address must be equivalent to the physical address of a page for user and kernel accesses to
be detected by the caches.
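
Stated as code, just to make the rule above concrete (illustration only,
nothing from the kernel):

#include <stdbool.h>
#include <stdint.h>

/* Two virtual addresses can alias the same physical memory on parisc
 * only if they are congruent modulo 0x400000, i.e. their low 22 bits
 * (offset bits 42 through 63) match. */
#define PARISC_ALIAS_BOUNDARY 0x400000UL

static bool colors_match(uintptr_t va1, uintptr_t va2)
{
        return ((va1 ^ va2) & (PARISC_ALIAS_BOUNDARY - 1)) == 0;
}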

For io_uring to work, I believe the user and kernel addresses used to access the buffers must
be equivalent.  However, as far as I can see, we only set up equivalent aliases for file-backed
mappings with MAP_SHARED.  There doesn't appear to be any connection between the kernel
page addresses allocated for a mapping and the assigned user virtual addresses.  Thus, it doesn't
help to align the user virtual address to 0x400000.  The kernel virtual address still has the wrong
colour.

Maybe something could be done with anonymous MAP_SHARED mappings to make them equivalent?
The mmap man page says "Support for MAP_ANONYMOUS in conjunction with MAP_SHARED was added
in Linux 2.4."
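
Something along these lines is what I have in mind (untested; whether the
kernel would actually give such a mapping an equivalently coloured kernel
alias is exactly the open question):

#include <stddef.h>
#include <sys/mman.h>

/* Sketch: back the provided buffer ring with an anonymous MAP_SHARED
 * mapping instead of posix_memalign() memory.  ring_size is whatever
 * size the test needs. */
static void *alloc_shared_ring(size_t ring_size)
{
        void *ptr;

        ptr = mmap(NULL, ring_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        return ptr == MAP_FAILED ? NULL : ptr;
}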

I tried to use a file-based mapping in send_recvmsg.t (I tried both ring.ring_fd and a temporary
file).  At first, I thought this worked, but it turns out that pin_user_pages fails and returns -EFAULT.

        reg.ring_addr = (unsigned long) ptr;
        reg.ring_entries = 1;
        reg.bgid = BUF_BGID;
        if (io_uring_register_buf_ring(&ring, &reg, 0)) {
                no_pbuf_ring = 1;
                goto out;
        }

So, the io_uring_register_buf_ring call fails and the code bails with no error message.

I'm not sure why pin_user_pages fails.  Today I've been wondering whether an mlock call would
lock the mmap'd buffer into RAM and fix pin_user_pages.
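
The experiment I have in mind is roughly this (just a sketch; ptr and
ring_size stand for the mmap'd buffer and its size):

#include <stdio.h>
#include <stddef.h>
#include <sys/mman.h>

/* Lock the mmap'd ring into RAM before registering it, to see whether
 * pin_user_pages() then succeeds. */
static int lock_ring(void *ptr, size_t ring_size)
{
        if (mlock(ptr, ring_size) != 0) {
                perror("mlock");
                return -1;
        }
        return 0;
}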

Dave

-- 
John David Anglin  [email protected]

