On 11/21/24 9:57 AM, Jens Axboe wrote:
> I did run a basic IRQ storage test as-is, and will compare that with the
> llist stuff we have now. Just in terms of overhead. It's not quite a
> networking test, but you do get the IRQ side and some burstiness in
> terms of completions that way too, at high rates. So should be roughly
> comparable.

Perf looks comparable, at about 60M IOPS. There's some fluctuation with
IRQ-driven IO, so I won't render an opinion on whether one is faster than
the other. What is visible, though, is that adding and running local
task_work drops from 2.39% to 2.02% when using spinlock +
io_wq_work_list instead of llist, and we entirely drop the 2.2% spent on
list reversing in the process.
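
For anyone skimming the thread, here is a rough userspace sketch of why
the reversal cost goes away. This is not the io_uring code: the struct
and function names (lifo_*, fifo_*) are made up for illustration, C11
atomics stand in for the kernel's llist, and an atomic_flag stands in for
the spinlock. A lock-free push-at-head list hands back entries
newest-first, so running them in queue order needs an extra pass over the
list. A locked list with a tail pointer is already in order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int seq;		/* order in which the entry was queued */
};

/* Lock-free LIFO, similar in spirit to llist (illustrative only) */
struct lifo {
	_Atomic(struct node *) first;
};

static void lifo_add(struct lifo *l, struct node *n)
{
	struct node *old = atomic_load(&l->first);

	/* push at head; on failure 'old' is refreshed and we retry */
	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&l->first, &old, n));
}

/* Detach the whole list in one shot; result is newest-first */
static struct node *lifo_del_all(struct lifo *l)
{
	return atomic_exchange(&l->first, NULL);
}

/* The extra pass needed to get back to queue order */
static struct node *lifo_reverse(struct node *head)
{
	struct node *prev = NULL;

	while (head) {
		struct node *next = head->next;

		head->next = prev;
		prev = head;
		head = next;
	}
	return prev;
}

/* Locked FIFO with a tail pointer (hypothetical stand-in) */
struct fifo {
	atomic_flag lock;
	struct node *first, *last;
};

static void fifo_lock(struct fifo *f)
{
	while (atomic_flag_test_and_set_explicit(&f->lock, memory_order_acquire))
		;	/* spin */
}

static void fifo_unlock(struct fifo *f)
{
	atomic_flag_clear_explicit(&f->lock, memory_order_release);
}

static void fifo_add(struct fifo *f, struct node *n)
{
	assert(n != NULL);
	n->next = NULL;
	fifo_lock(f);
	if (f->last)
		f->last->next = n;	/* append at tail */
	else
		f->first = n;		/* list was empty */
	f->last = n;
	fifo_unlock(f);
}

/* Detach the whole list; already in queue order, no reversal */
static struct node *fifo_del_all(struct fifo *f)
{
	struct node *head;

	fifo_lock(f);
	head = f->first;
	f->first = f->last = NULL;
	fifo_unlock(f);
	return head;
}
```

The trade-off in the sketch is a lock on the add side in exchange for not
walking every batch a second time before running it.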

-- 
Jens Axboe