* Re: crash on accept
From: Jens Axboe @ 2020-02-19 20:09 UTC (permalink / raw)
  To: Glauber Costa, io-uring, Avi Kivity

On 2/19/20 9:23 AM, Glauber Costa wrote:
> Hi,
> 
> I started using af0a72622a1fb7179cf86ae714d52abadf7d8635 today so I could consume the new fast poll flag, and one of my tests that was previously passing now crashes

Thanks for testing the new stuff! As always, would really appreciate a
test case that I can run, makes my job so much easier.

-- 
Jens Axboe



* Re: crash on accept
From: Glauber Costa @ 2020-02-19 20:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Avi Kivity

On Wed, Feb 19, 2020 at 3:09 PM Jens Axboe <[email protected]> wrote:
>
> On 2/19/20 9:23 AM, Glauber Costa wrote:
> > Hi,
> >
> > I started using af0a72622a1fb7179cf86ae714d52abadf7d8635 today so I could consume the new fast poll flag, and one of my tests that was previously passing now crashes
>
> Thanks for testing the new stuff! As always, would really appreciate a
> test case that I can run, makes my job so much easier.

Trigger warning:
It's in C++.

I am finishing refactoring some of my code now. It's nothing
substantial so I am positive it will hit again. Once I re-reproduce
I'll send you instructions.

Reading the code it's not obvious to me how it happens, so it'll be
harder for me to cook up a simple C reproducer ATM.


>
> --
> Jens Axboe
>


* Re: crash on accept
From: Jens Axboe @ 2020-02-19 20:12 UTC (permalink / raw)
  To: Glauber Costa; +Cc: io-uring, Avi Kivity

On 2/19/20 1:11 PM, Glauber Costa wrote:
> On Wed, Feb 19, 2020 at 3:09 PM Jens Axboe <[email protected]> wrote:
>>
>> On 2/19/20 9:23 AM, Glauber Costa wrote:
>>> Hi,
>>>
>>> I started using af0a72622a1fb7179cf86ae714d52abadf7d8635 today so I could consume the new fast poll flag, and one of my tests that was previously passing now crashes
>>
>> Thanks for testing the new stuff! As always, would really appreciate a
>> test case that I can run, makes my job so much easier.
> 
> Trigger warning:
> It's in C++.

As long as it reproduces, I don't really have to look at it :-)

> I am finishing refactoring some of my code now. It's nothing
> substantial so I am positive it will hit again. Once I re-reproduce
> I'll send you instructions.
> 
> Reading the code it's not obvious to me how it happens, so it'll be
> harder for me to cook up a simple C reproducer ATM.

I'll look here as well, as time permits.


-- 
Jens Axboe



* Re: crash on accept
From: Glauber Costa @ 2020-02-19 20:25 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Avi Kivity

On Wed, Feb 19, 2020 at 3:13 PM Jens Axboe <[email protected]> wrote:
>
> On 2/19/20 1:11 PM, Glauber Costa wrote:
> > On Wed, Feb 19, 2020 at 3:09 PM Jens Axboe <[email protected]> wrote:
> >>
> >> On 2/19/20 9:23 AM, Glauber Costa wrote:
> >>> Hi,
> >>>
> >>> I started using af0a72622a1fb7179cf86ae714d52abadf7d8635 today so I could consume the new fast poll flag, and one of my tests that was previously passing now crashes
> >>
> >> Thanks for testing the new stuff! As always, would really appreciate a
> >> test case that I can run, makes my job so much easier.
> >
> > Trigger warning:
> > It's in C++.
>
> As long as it reproduces, I don't really have to look at it :-)

Instructions:
1. clone https://github.com/glommer/seastar.git, branch uring-accept-crash
2. git submodule update --recursive --init, because we have a shit-ton
   of submodules because why not.
3. install all dependencies with ./install-dependencies.sh
   note: that does not install liburing yet; you need to have at least
   0.4 (I trust you do), with the patch I just sent to add the fast
   poll flag. It still fails sometimes on my system if liburing is
   installed in /usr/lib instead of /usr/lib64 because cmake is made
   by the devil.
3. ./configure.py --mode=release
4. ninja -C build/release tests/unit/unix_domain_test
5. crash your system (hopefully) by executing
   ./build/release/tests/unit/unix_domain_test -- -c1 --reactor-backend=uring


>
> > I am finishing refactoring some of my code now. It's nothing
> > substantial so I am positive it will hit again. Once I re-reproduce
> > I'll send you instructions.
> >
> > Reading the code it's not obvious to me how it happens, so it'll be
> > harder for me to cook up a simple C reproducer ATM.
>
> I'll look here as well, as time permits.
>
>
> --
> Jens Axboe
>


* Re: crash on accept
From: Avi Kivity @ 2020-02-19 20:29 UTC (permalink / raw)
  To: Glauber Costa, Jens Axboe; +Cc: io-uring

On 2/19/20 10:25 PM, Glauber Costa wrote:
> On Wed, Feb 19, 2020 at 3:13 PM Jens Axboe <[email protected]> wrote:
>> On 2/19/20 1:11 PM, Glauber Costa wrote:
>>> On Wed, Feb 19, 2020 at 3:09 PM Jens Axboe <[email protected]> wrote:
>>>> On 2/19/20 9:23 AM, Glauber Costa wrote:
>>>>> Hi,
>>>>>
>>>>> I started using af0a72622a1fb7179cf86ae714d52abadf7d8635 today so I could consume the new fast poll flag, and one of my tests that was previously passing now crashes
>>>> Thanks for testing the new stuff! As always, would really appreciate a
>>>> test case that I can run, makes my job so much easier.
>>> Trigger warning:
>>> It's in C++.
>> As long as it reproduces, I don't really have to look at it :-)
> Instructions:
> 1. clone https://github.com/glommer/seastar.git, branch uring-accept-crash
> 2. git submodule update --recursive --init, because we have a shit-ton
> of submodules because why not.


Actually, seastar has only one submodule (dpdk) and it is optional, so 
you need not clone it.


> 3. install all dependencies with ./install-dependencies.sh
>      note: that does not install liburing yet, you need to have at
> least 0.4 (I trust you do), with the patch I just sent to add the fast
> poll flag. It still fails sometimes in my system if liburing is
> installed in /usr/lib instead of /usr/lib64 because cmake is made by
> the devil.
> 3. ./configure.py --mode=release


--mode dev will compile many times faster


> 4. ninja -C build/release tests/unit/unix_domain_test
> 5. crash your system (hopefully) by executing
> ./build/release/tests/unit/unix_domain_test -- -c1
> --reactor-backend=uring
>
s/release/dev/ in steps 4, 5 if you use dev mode.


>>> I am finishing refactoring some of my code now. It's nothing
>>> substantial so I am positive it will hit again. Once I re-reproduce
>>> I'll send you instructions.
>>>
>>> Reading the code it's not obvious to me how it happens, so it'll be
>>> harder for me to cook up a simple C reproducer ATM.
>> I'll look here as well, as time permits.
>>
>>
>> --
>> Jens Axboe
>>



* Re: crash on accept
From: Jens Axboe @ 2020-02-19 23:09 UTC (permalink / raw)
  To: Avi Kivity, Glauber Costa; +Cc: io-uring

On 2/19/20 1:29 PM, Avi Kivity wrote:
> On 2/19/20 10:25 PM, Glauber Costa wrote:
>> On Wed, Feb 19, 2020 at 3:13 PM Jens Axboe <[email protected]> wrote:
>>> On 2/19/20 1:11 PM, Glauber Costa wrote:
>>>> On Wed, Feb 19, 2020 at 3:09 PM Jens Axboe <[email protected]> wrote:
>>>>> On 2/19/20 9:23 AM, Glauber Costa wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I started using af0a72622a1fb7179cf86ae714d52abadf7d8635 today so I could consume the new fast poll flag, and one of my tests that was previously passing now crashes
>>>>> Thanks for testing the new stuff! As always, would really appreciate a
>>>>> test case that I can run, makes my job so much easier.
>>>> Trigger warning:
>>>> It's in C++.
>>> As long as it reproduces, I don't really have to look at it :-)
>> Instructions:
>> 1. clone https://github.com/glommer/seastar.git, branch uring-accept-crash
>> 2. git submodule update --recursive --init, because we have a shit-ton
>> of submodules because why not.
> 
> 
> Actually, seastar has only one submodule (dpdk) and it is optional, so 
> you need not clone it.
> 
> 
>> 3. install all dependencies with ./install-dependencies.sh
>>      note: that does not install liburing yet, you need to have at
>> least 0.4 (I trust you do), with the patch I just sent to add the fast
>> poll flag. It still fails sometimes in my system if liburing is
>> installed in /usr/lib instead of /usr/lib64 because cmake is made by
>> the devil.
>> 3. ./configure.py --mode=release
> 
> 
> --mode dev will compile many times faster
> 
> 
>> 4. ninja -C build/release tests/unit/unix_domain_test
>> 5. crash your system (hopefully) by executing
>> ./build/release/tests/unit/unix_domain_test -- -c1
>> --reactor-backend=uring
>>
> s/release/dev/ in steps 4, 5 if you use dev mode.

Thanks, this is great, I can reproduce!


-- 
Jens Axboe



* Re: crash on accept
From: Jens Axboe @ 2020-02-20  1:37 UTC (permalink / raw)
  To: Avi Kivity, Glauber Costa; +Cc: io-uring

On 2/19/20 4:09 PM, Jens Axboe wrote:
> On 2/19/20 1:29 PM, Avi Kivity wrote:
>> On 2/19/20 10:25 PM, Glauber Costa wrote:
>>> On Wed, Feb 19, 2020 at 3:13 PM Jens Axboe <[email protected]> wrote:
>>>> On 2/19/20 1:11 PM, Glauber Costa wrote:
>>>>> On Wed, Feb 19, 2020 at 3:09 PM Jens Axboe <[email protected]> wrote:
>>>>>> On 2/19/20 9:23 AM, Glauber Costa wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I started using af0a72622a1fb7179cf86ae714d52abadf7d8635 today so I could consume the new fast poll flag, and one of my tests that was previously passing now crashes
>>>>>> Thanks for testing the new stuff! As always, would really appreciate a
>>>>>> test case that I can run, makes my job so much easier.
>>>>> Trigger warning:
>>>>> It's in C++.
>>>> As long as it reproduces, I don't really have to look at it :-)
>>> Instructions:
>>> 1. clone https://github.com/glommer/seastar.git, branch uring-accept-crash
>>> 2. git submodule update --recursive --init, because we have a shit-ton
>>> of submodules because why not.
>>
>>
>> Actually, seastar has only one submodule (dpdk) and it is optional, so 
>> you need not clone it.
>>
>>
>>> 3. install all dependencies with ./install-dependencies.sh
>>>      note: that does not install liburing yet, you need to have at
>>> least 0.4 (I trust you do), with the patch I just sent to add the fast
>>> poll flag. It still fails sometimes in my system if liburing is
>>> installed in /usr/lib instead of /usr/lib64 because cmake is made by
>>> the devil.
>>> 3. ./configure.py --mode=release
>>
>>
>> --mode dev will compile many times faster
>>
>>
>>> 4. ninja -C build/release tests/unit/unix_domain_test
>>> 5. crash your system (hopefully) by executing
>>> ./build/release/tests/unit/unix_domain_test -- -c1
>>> --reactor-backend=uring
>>>
>> s/release/dev/ in steps 4, 5 if you use dev mode.
> 
> Thanks, this is great, I can reproduce!

Can you try the current branch? Should be 77aac7e7738 (or newer).

-- 
Jens Axboe



* Re: crash on accept
From: Glauber Costa @ 2020-02-20  2:52 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Avi Kivity, io-uring

I don't see a crash now, thanks.

I can now go back to trying to figure out why the test is just hanging
forever, as I was doing earlier =)
(99.9% sure I broke something with the last rework).

Out of curiosity, as it may help me with the above: I notice you
didn't add a patch on top, but rather rebased the tree.

What was the problem leading to the crash?



* Re: crash on accept
From: Jens Axboe @ 2020-02-20  3:53 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Avi Kivity, io-uring

On 2/19/20 7:52 PM, Glauber Costa wrote:
> I don't see a crash now, thanks.

Updated again, should be more solid now. Just in case you run into
issues.

> I can now go back to trying to figure out why the test is just hanging
> forever, as I was doing earlier =)
> (99.9% I broke something with the last rework).
> 
> Out of curiosity, as it may help me with the above: I notice you
> didn't add a patch on top, but rather rebased the tree.

Yeah, the poll-based async bits are very much a work in progress,
and I'll just fold fixes in for now.

> What was the problem leading to the crash ?

It had to do with repeated retry. E.g. we want to read from something,
we try and get -EAGAIN. We arm the poll handler, and the poll result
says we're good to go. We retry and get -EAGAIN again. Now we give up,
but the state wasn't restored properly.
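
As a rough illustration of that sequence, here is a small userspace
sketch in C. It is not the kernel code and does not use io_uring at
all: struct req_state, struct fake_source, try_io(), try_io_restoring()
and submit() are hypothetical names invented for this example, and the
"poll" step is reduced to setting and clearing a flag. It only shows
the general idea of snapshotting per-request state before an attempt
and putting it back when the attempt returns -EAGAIN, so that giving
up after the second -EAGAIN leaves the request as it started.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-request state; not a kernel structure. */
struct req_state {
    size_t offset;     /* where the next attempt should start */
    size_t remaining;  /* bytes still to transfer */
    int poll_armed;    /* nonzero while a poll handler is armed */
};

/* Hypothetical source that returns -EAGAIN for the first few attempts. */
struct fake_source {
    int eagain_left;
};

/*
 * One non-blocking attempt. It advances the state before it knows
 * whether the transfer can proceed, which is why a failed attempt
 * leaves the state modified unless the caller restores it.
 */
static int try_io(struct fake_source *src, struct req_state *st)
{
    size_t n = st->remaining;

    st->offset += n;
    st->remaining = 0;
    if (src->eagain_left > 0) {
        src->eagain_left--;
        return -EAGAIN;
    }
    return (int)n;
}

/* Snapshot the state, attempt the I/O, and undo the changes on -EAGAIN. */
static int try_io_restoring(struct fake_source *src, struct req_state *st)
{
    struct req_state saved = *st;
    int ret = try_io(src, st);

    if (ret == -EAGAIN)
        *st = saved;
    return ret;
}

/* Try, arm poll on -EAGAIN, retry once, then give up to the caller. */
static int submit(struct fake_source *src, struct req_state *st)
{
    int ret;

    assert(!st->poll_armed);

    ret = try_io_restoring(src, st);
    if (ret != -EAGAIN)
        return ret;

    st->poll_armed = 1;  /* arm the poll handler */
    /* ... the poll result says we are good to go ... */
    st->poll_armed = 0;  /* handler fired and is disarmed */

    /* The retry may return -EAGAIN again; the state is restored either way. */
    return try_io_restoring(src, st);
}
```

With a fake_source that returns -EAGAIN twice, submit() gives up with
-EAGAIN but leaves offset and remaining untouched, so a later fallback
attempt starts from a consistent request. Replacing the
try_io_restoring() calls with plain try_io() reproduces the kind of
stale state described above.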

-- 
Jens Axboe


