Re: [PATCH] io_uring: create percpu io sq thread when IORING_SETUP_SQ_AFF is flagged

public inbox for [email protected]
 help / color / mirror / Atom feed

From: Pavel Begunkov <[email protected]>
To: Xiaoguang Wang <[email protected]>,
	[email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: [PATCH] io_uring: create percpu io sq thread when IORING_SETUP_SQ_AFF is flagged
Date: Wed, 3 Jun 2020 17:47:24 +0300	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 26/05/2020 17:42, Xiaoguang Wang wrote:
> Yes, I don't try to make all io_uring instances in the system share threads, I just
> make io_uring instances which are bound to same cpu core, share one io_sq_thread that
> only is created once for every cpu core.
> Otherwise in current io_uring mainline codes, we'd better not bind different io_uring
> instances to same cpu core,  some instances' busy loop in its sq_thread_idle period will
> impact other instanes who currently there are reqs to handle.

I got a bit carried away from your initial case, but there is the case:
Let's we have 2 unrelated apps that create SQPOLL io_uring instances. Let's say they are
in 2 different docker container for the argument (and let's just assume a docker container
user can create such).

The first app1 submits 32K fat requests as described before. The second one (app2) is a
low-latency app, submits reqs by 1, but expects it to be picked really fast. And let's
assume their SQPOLL threads pinned to the same CPU.

1. old version:
The CPU spends some time allocated by a scheduler on 32K requests of app1,
probably not issuing them all but that's fine. And then it goes to the app2.
So, the submit-to-pickup latency for app2 is capped by a task scheduler.
That's somewhat fair. 

2. your version:
io_sq_thread first processes all 32K of requests of app1, and only then goes to app2.
app2 is screwed, unfair as life can be. And a malicious user can create many io_uring
instances as in app1. So the latency will be further multiplied.

Any solution I can think of is ugly and won't ever land upstream. Like creating your
own scheduling framework for io_uring, wiring kindof cgroups, etc. And actually SQPOLL
shouldn't be so ubiquitous (+needs privileges). E.g. I expect there will be a single
app per system using it, e.g. a database consuming most of the resources anyway.
And that's why I think it's better to not trying to solve your original issue.

However, what the patch can be easily remade into is sharing an SQPOLL thread between
io_uring instances of a single app/user, like passing fd described before.
The most obvious example is to share 1 SQPOLL thread (or N << num_cpus) between all
user threads, so
- still creating io_uring per thread to not synchronise SQ
- retaining CPU time for real user work (instead of having N SQPOLL threads)
- increasing polling efficiency (more work -- less idle polling)
- and scheduling/task migration, etc.

note: would be great to check, that it has all necessary cond_resched()

> 
>> specified set of them. In other words, making yours @percpu_threads not global,
>> but rather binding to a set of io_urings.
>>
>> e.g. create 2 independent per-cpu sets of threads. 1st one for {a1,a2,a3}, 2nd
>> for {b1,b2,b3}.
>>
>> a1 = create_uring()
>> a2 = create_uring(shared_sq_threads=a1)
>> a3 = create_uring(shared_sq_threads=a1)
>>
>> b1 = create_uring()
>> b2 = create_uring(shared_sq_threads=b1)
>> b3 = create_uring(shared_sq_threads=b1)
> Thanks your suggestions, I'll try to consider it in V2 patch.
> 
> But I still have one question: now "shared_sq_threads=a1" and "shared_sq_threads=b1"
> can be bound to same cpu core? I think it's still not efficient. E.g. "shared_sq_threads=a1"
> busy loop in its sq_thread_idle period will impact shared_sq_threads=b1 who currently
> there are reqs to handle.
> 
> Now I also wonder whether a io_uring instance,which has SQPOLL enabed and does not
> specify a sq_thread_cpu, can be used in real business environment, the io_sq_thread
> may run in any cpu cores, which may affect any other application, e.g. cpu resource
> contend. So if trying to use SQPOLL, we'd better allocate specified cpu.

I think it's ok.
- needs privileges to create SQPOLL
- don't expect it used without thinking twice
- for not-pinned case everything will be rescheduled to other cpus as appropriate
  It shouldn't affect much in normal circumstances.

-- 
Pavel Begunkov

next prev parent reply	other threads:[~2020-06-03 14:48 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-20 11:56 [PATCH] io_uring: create percpu io sq thread when IORING_SETUP_SQ_AFF is flagged Xiaoguang Wang
2020-05-20 12:11 ` Xiaoguang Wang
2020-05-22 11:17   ` Yu Jian Wu
2020-05-25  3:16     ` Xiaoguang Wang
2020-05-20 22:09 ` Pavel Begunkov
2020-05-22  8:33   ` Xiaoguang Wang
2020-05-24 11:46     ` Pavel Begunkov
2020-05-26 14:42       ` Xiaoguang Wang
2020-06-03 14:47         ` Pavel Begunkov [this message]
2020-06-03 18:48           ` Pavel Begunkov
2020-05-28  7:56       ` Xiaoguang Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox