From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>,
Xiaoguang Wang <[email protected]>,
[email protected]
Subject: Re: [PATCH 8/8] io_uring: enable IORING_SETUP_ATTACH_WQ to attach to SQPOLL thread too
Date: Mon, 7 Sep 2020 10:11:46 -0600 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 9/7/20 8:00 AM, Pavel Begunkov wrote:
> On 07/09/2020 11:56, Xiaoguang Wang wrote:
>> hi,
>>
>> First thanks for this patchset:) and I have some questions below:
>>
>> 1. Does a uring instance which has SQPOLL enabled but does not specify
>> a valid cpu is meaningful in real business product?
>> IMHO, in a real business product, cpu usage should be under strict management,
>> say a uring instance which has SQPOLL enabled but is not bound to fixed cpu,
>> then this uring instance's corresponding io_sq_thread could be scheduled to any
>> cpu(and can be re-scheduled many times), which may impact any other application
>> greatly because of cpu contention, especially this io_sq_thread is just doing
>> busy loop in its sq_thread_idle period time, so in a real business product, I
>> wonder whether SQPOLL is only meaningful if user specifies a fixed cpu core.
>
> It is meaningful for a part of cases, for the other part you can set
> processes' affinities and pin each SQPOLL thread to a specific CPU, as
> you'd do even without this series. And that gives you more flexibilty,
> IMHO.
Right, and once we have "any ctx can use this poll thread", then percpu
poll threads could be a subset of that. So no reason that kind of
functionality couldn't be layered on top of that.
You'd probably just need an extra SETUP flag for this. Xiaoguang, any
interest in attempting to layer that on top of that current code?
>> 2. Does IORING_SETUP_ATTACH_WQ is really convenient?
>> IORING_SETUP_ATTACH_WQ always needs to ensure a parent uring instance exists,
>> that means it also requires app to regulate the creation oder of uring instanes,
>> which maybe not convenient, just imagine how to integrate this new SQPOLL
>> improvements to fio util.
>
> It may be not so convenient, but it's flexible enough and prevents
> isolation breaking issues. We've discussed that in the thread with
> your patches.
I think there's one minor issue there, and that is matching on just the
fd. That could cause different process rings to attach to the wrong one.
I think we need this on top to match the file as well. This is a bit
different on the SQPOLL side vs the io-wq workqueue.
Something like the below incremental should do it, makes it clear that
we need to be able to get at this file, and doesn't risk false attaching
globally. As a plus, it gets rid of that global list too, and makes it
consistent with io-wq for ATTACH_WQ.
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0490edfcdd88..4bd18e01ae89 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -232,10 +232,6 @@ struct io_restriction {
struct io_sq_data {
refcount_t refs;
- /* global sqd lookup */
- struct list_head all_sqd_list;
- int attach_fd;
-
/* ctx's that are using this sqd */
struct list_head ctx_list;
struct list_head ctx_new_list;
@@ -245,9 +241,6 @@ struct io_sq_data {
struct wait_queue_head wait;
};
-static LIST_HEAD(sqd_list);
-static DEFINE_MUTEX(sqd_lock);
-
struct io_ring_ctx {
struct {
struct percpu_ref refs;
@@ -7034,32 +7027,37 @@ static void io_put_sq_data(struct io_sq_data *sqd)
kthread_stop(sqd->thread);
}
- mutex_lock(&sqd_lock);
- list_del(&sqd->all_sqd_list);
- mutex_unlock(&sqd_lock);
-
kfree(sqd);
}
}
static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
{
- struct io_sq_data *sqd, *ret = ERR_PTR(-ENXIO);
+ struct io_ring_ctx *ctx_attach;
+ struct io_sq_data *sqd;
+ struct fd f;
- mutex_lock(&sqd_lock);
- list_for_each_entry(sqd, &sqd_list, all_sqd_list) {
- if (sqd->attach_fd == p->wq_fd) {
- refcount_inc(&sqd->refs);
- ret = sqd;
- break;
- }
+ f = fdget(p->wq_fd);
+ if (!f.file)
+ return ERR_PTR(-ENXIO);
+ if (f.file->f_op != &io_uring_fops) {
+ fdput(f);
+ return ERR_PTR(-EINVAL);
}
- mutex_unlock(&sqd_lock);
- return ret;
+ ctx_attach = f.file->private_data;
+ sqd = ctx_attach->sq_data;
+ if (!sqd) {
+ fdput(f);
+ return ERR_PTR(-EINVAL);
+ }
+
+ refcount_inc(&sqd->refs);
+ fdput(f);
+ return sqd;
}
-static struct io_sq_data *io_get_sq_data(struct io_uring_params *p, int ring_fd)
+static struct io_sq_data *io_get_sq_data(struct io_uring_params *p)
{
struct io_sq_data *sqd;
@@ -7071,15 +7069,11 @@ static struct io_sq_data *io_get_sq_data(struct io_uring_params *p, int ring_fd)
return ERR_PTR(-ENOMEM);
refcount_set(&sqd->refs, 1);
- sqd->attach_fd = ring_fd;
INIT_LIST_HEAD(&sqd->ctx_list);
INIT_LIST_HEAD(&sqd->ctx_new_list);
mutex_init(&sqd->ctx_lock);
init_waitqueue_head(&sqd->wait);
- mutex_lock(&sqd_lock);
- list_add_tail(&sqd->all_sqd_list, &sqd_list);
- mutex_unlock(&sqd_lock);
return sqd;
}
@@ -7746,7 +7740,7 @@ static int io_init_wq_offload(struct io_ring_ctx *ctx,
}
static int io_sq_offload_create(struct io_ring_ctx *ctx,
- struct io_uring_params *p, int ring_fd)
+ struct io_uring_params *p)
{
int ret;
@@ -7757,7 +7751,7 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
goto err;
- sqd = io_get_sq_data(p, ring_fd);
+ sqd = io_get_sq_data(p);
if (IS_ERR(sqd)) {
ret = PTR_ERR(sqd);
goto err;
@@ -8972,7 +8966,7 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p,
goto err;
}
- ret = io_sq_offload_create(ctx, p, fd);
+ ret = io_sq_offload_create(ctx, p);
if (ret)
goto err;
--
Jens Axboe
next prev parent reply other threads:[~2020-09-07 16:11 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-03 2:20 [PATCHSET for-next 0/8] io_uring SQPOLL improvements Jens Axboe
2020-09-03 2:20 ` [PATCH 1/8] io_uring: io_sq_thread() doesn't need to flush signals Jens Axboe
2020-09-03 2:20 ` [PATCH 2/8] io_uring: allow SQPOLL with CAP_SYS_NICE privileges Jens Axboe
2020-09-03 2:20 ` [PATCH 3/8] io_uring: use private ctx wait queue entries for SQPOLL Jens Axboe
2020-09-03 2:20 ` [PATCH 4/8] io_uring: move SQPOLL post-wakeup ring need wakeup flag into wake handler Jens Axboe
2020-09-03 2:20 ` [PATCH 5/8] io_uring: split work handling part of SQPOLL into helper Jens Axboe
2020-09-03 2:20 ` [PATCH 6/8] io_uring: split SQPOLL data into separate structure Jens Axboe
2020-09-03 2:20 ` [PATCH 7/8] io_uring: base SQPOLL handling off io_sq_data Jens Axboe
2020-09-03 2:20 ` [PATCH 8/8] io_uring: enable IORING_SETUP_ATTACH_WQ to attach to SQPOLL thread too Jens Axboe
2020-09-07 8:56 ` Xiaoguang Wang
2020-09-07 14:00 ` Pavel Begunkov
2020-09-07 16:11 ` Jens Axboe [this message]
2020-09-07 16:14 ` Jens Axboe
2020-09-07 16:18 ` Jens Axboe
2020-09-08 2:28 ` Xiaoguang Wang
2020-09-08 2:53 ` Xiaoguang Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox