public inbox for gwml@vger.gnuweeb.org
From: reyuki <reyuki@gnuweeb.org>
To: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Cc: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>,
	 "GNU/Weeb Mailing List" <gwml@vger.gnuweeb.org>
Subject: Re: [PATCH gwproxy v1 0/3] Initial work for DNS lookup implementation
Date: Fri, 1 Aug 2025 08:49:22 +0700
Message-ID: <CAADvAgonYwm-mwHB-zC_MOb5gYWvd=eH2KeZ7h6WTSF0Tgud0A@mail.gmail.com>
In-Reply-To: <aItyGEFcXFieqp+r@linux.gnuweeb.org>

On Thu, Jul 31, 2025 at 8:39 PM Ammar Faizi wrote:
>
> On Thu, Jul 31, 2025 at 10:07:43AM +0700, Ahmad Gani wrote:
>> Is it preferred to use the current model (spawning dedicated DNS threads)
>> and make the behavior of the resolver the same as getaddrinfo (blocking)?
>> So far I've created the addrinfo interfaces with C-ares style and tested
>> it; although, it's still blocking.
>>
>> I would like to know the numbers for comparison between the thread model
>> vs. the asynchronous model. Which approach is best for this scenario/case
>> (DNS resolution)?
>
> It's better if the DNS resolution can be done non-blocking via epoll
> than a separate thread.
>
> With dedicated DNS worker threads, you need:
>
>   1) Queue.
>   2) Mutex.
>   3) Condvar.
>   4) An eventfd to notify the sleeping epoll.
>   5) A reference count to avoid UAF on cancellation.
>   6) Open-and-close a SOCK_DGRAM for each query.
>   7) eventfd_write() from the producer.
>   8) eventfd_read() from the consumer (sleeping epoll).
>
> The steps to perform just a single query are unnecessarily complicated.
> The cost of communication between multiple threads could have been
> avoided with a non-blocking pollable socket.
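To make the cost in the list above concrete, here is a minimal sketch of only the completion half of the thread model (items 2, 4, 7 and 8). The names done_queue, done_queue_init, done_queue_post and done_queue_reap are hypothetical and not taken from gwproxy; the request queue, the condvar and the reference counting are left out, and nr_done stands in for a real list of finished queries.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical completion queue shared by DNS workers and the epoll thread. */
struct done_queue {
	pthread_mutex_t lock;
	int nr_done;	/* stand-in for a real list of finished queries */
	int evfd;	/* eventfd registered in the epoll set */
};

/* Create the eventfd and register it for EPOLLIN in epfd. */
static int done_queue_init(struct done_queue *q, int epfd)
{
	struct epoll_event ev = { .events = EPOLLIN };

	assert(q != NULL && epfd >= 0);
	q->nr_done = 0;
	q->evfd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
	if (q->evfd < 0)
		return -1;
	pthread_mutex_init(&q->lock, NULL);
	ev.data.fd = q->evfd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, q->evfd, &ev) < 0) {
		close(q->evfd);
		return -1;
	}
	return 0;
}

/* Producer: a DNS worker calls this after its blocking getaddrinfo(). */
static int done_queue_post(struct done_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->nr_done++;
	pthread_mutex_unlock(&q->lock);
	return eventfd_write(q->evfd, 1);	/* wake the sleeping epoll_wait() */
}

/* Consumer: the epoll thread calls this once evfd is reported readable. */
static int done_queue_reap(struct done_queue *q)
{
	eventfd_t cnt;
	int n;

	if (eventfd_read(q->evfd, &cnt) < 0)
		return -1;	/* nothing was posted (EAGAIN) or a real error */
	pthread_mutex_lock(&q->lock);
	n = q->nr_done;
	q->nr_done = 0;
	pthread_mutex_unlock(&q->lock);
	return n;
}
```

Every finished query pays for a lock, an unlock and two extra syscalls (eventfd_write() and eventfd_read()) before the epoll thread can even look at the result.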
>
> With a non-blocking pollable socket, everything is done more efficiently:
>
>   1) No contention waiting on queue mutex lock.
>   2) Only one SOCK_DGRAM socket is needed (can be reused forever).
>   3) No eventfd is needed.
>   4) No mutex is needed (maybe only for the cache, but even then the
>      cost is minimal with an rwlock rather than full mutex protection).
>
> You can scale up the number of SOCK_DGRAM sockets if you ever want to
> use multiple DNS servers. Anyway, since SOCK_DGRAM is stateless, you
> can even use one SOCK_DGRAM to send-and-recv to-from multiple DNS
> servers (see sendto(2) and recvfrom(2)).
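As a rough illustration of the single-socket idea, the sketch below opens one non-blocking SOCK_DGRAM socket, registers it with epoll, and sends a packet to an arbitrary server with sendto(2). The helper names dns_sock_open and dns_sock_send are hypothetical, IPv4 only is an assumption made to keep it short, and building the DNS packet itself is out of scope.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create one non-blocking UDP socket and register it for EPOLLIN in epfd. */
static int dns_sock_open(int epfd)
{
	struct epoll_event ev = { .events = EPOLLIN };
	int fd;

	fd = socket(AF_INET, SOCK_DGRAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
	if (fd < 0)
		return -1;
	ev.data.fd = fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Send one packet to one server. The socket is never connect()ed, so the
 * same fd can be used for any number of different servers.
 */
static ssize_t dns_sock_send(int fd, const char *ip, uint16_t port,
			     const void *pkt, size_t len)
{
	struct sockaddr_in dst;

	assert(pkt != NULL && len > 0);
	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
		return -1;
	return sendto(fd, pkt, len, 0, (struct sockaddr *)&dst, sizeof(dst));
}
```

Because replies from all servers arrive on the same fd, the receiver has to tell them apart itself, for example by the source address that recvfrom(2) fills in.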
>
> Also, with a very busy proxy server, you'll be much further from
> hitting RLIMIT_NOFILE as the number of fds is probably cut in half
> (no eventfd and no socket fds created by getaddrinfo() internally).
>
>> I feel like multi-threading is much faster than single-threading, as it's
>> executed in true parallel (on a multi-core system) compared to
>> concurrently doing things with an event notification mechanism.
>
> Yes, but you should add more threads that call epoll_wait(), not more
> DNS threads. The reason we have so many DNS threads is that we can't
> poll getaddrinfo(), not that we have maxed out a CPU core at 100%.
>
> For now, multithreading makes things faster when performing
> getaddrinfo() because multiple queries are done concurrently. It's not
> because getaddrinfo() calls have fully eaten your CPU core. Right?
>
> Ideally, the same thing could be achieved with epoll_wait() alone,
> without the extra mutex, condvar and eventfd, which is cheaper.
>
> Communication between threads is costly; avoid it if possible. If you
> can have a multithreaded workload with zero communication between
> threads, that's very good. And that's one of the reasons we want to
> invent our own DNS resolver. It's an effort to reduce the communication
> between threads.
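
A minimal sketch of "more threads that call epoll_wait()" with zero communication between them: each worker owns its own epoll instance and its own resolver socket, so nothing is shared and no lock is needed. The struct worker, worker_main and run_workers names are hypothetical, and the single epoll_wait() with a zero timeout is a stand-in for a real event loop.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Per-thread state; nothing in here is shared with other workers. */
struct worker {
	pthread_t tid;
	int epfd;
	int dns_fd;
	int ok;
};

static void *worker_main(void *arg)
{
	struct worker *w = arg;
	struct epoll_event ev = { .events = EPOLLIN };

	w->epfd = epoll_create1(EPOLL_CLOEXEC);
	w->dns_fd = socket(AF_INET, SOCK_DGRAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
	if (w->epfd < 0 || w->dns_fd < 0)
		goto out;
	ev.data.fd = w->dns_fd;
	if (epoll_ctl(w->epfd, EPOLL_CTL_ADD, w->dns_fd, &ev) < 0)
		goto out;
	/* A real worker would loop on epoll_wait() here; this sketch polls once. */
	w->ok = (epoll_wait(w->epfd, &ev, 1, 0) == 0);
out:
	if (w->dns_fd >= 0)
		close(w->dns_fd);
	if (w->epfd >= 0)
		close(w->epfd);
	return NULL;
}

/* Start n workers, wait for them, and return how many set up cleanly. */
static int run_workers(struct worker *ws, int n)
{
	int i, started = 0, ok = 0;

	assert(ws != NULL && n > 0);
	for (i = 0; i < n; i++) {
		ws[i].ok = 0;
		if (pthread_create(&ws[i].tid, NULL, worker_main, &ws[i]) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++) {
		pthread_join(ws[i].tid, NULL);
		ok += ws[i].ok;
	}
	return ok;
}
```

The only cross-thread operations left are pthread_create() and pthread_join(); the per-query path touches thread-local state only.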

Thanks for convincing me. I'll go with the polling approach instead of the
blocking getaddrinfo() approach that runs in dedicated threads.

I think I get the general idea of using a pollable socket:
- mark the UDP socket as non-blocking
- let the caller hold a reference to the socket, so they can poll it

But I have difficulties with the technical implementation:

When the caller is notified, what must they call?

It is a UDP socket, which is said to be stateless, so can connect() possibly
return EINPROGRESS? That seems to be a common case with non-blocking TCP
sockets. And if recv() returns EAGAIN, do we retry from the send?

I find it unclear how the library and the program interact when using the
non-blocking behavior, but I will give it some thought (maybe add an
internal state to address these concerns).

>> I also noticed that in the C-ares example [3], they recommend using the
>> event thread example, and it is similar to the io_uring model in my
>> perspective, where the operation is executed internally instead of
>> letting the caller poll for readiness with ares_process_fd. Maybe we can
>> mimic this aspect?
>
> That's wrong; io_uring arms poll for networking workloads, it does not
> create io-wq threads (except for shutdown()).

I wasn't talking about io-wq threads; I don't have any technical
understanding or knowledge of them. When I said 'the operation is
executed internally,' I meant my general understanding of the io_uring
model: it reports the completion of operations you've requested. That
differs from the epoll model, which reports readiness for a specific
operation and leaves you to execute that operation on your own.

> https://discord.com/channels/1241076672589991966/1241076672589991970/1398781840546074624

I can't view the message at the given link. Is it a private community?

--
Ahmad Gani


Thread overview: 18+ messages
2025-07-31  3:07 [PATCH gwproxy v1 0/3] Initial work for DNS lookup implementation Ahmad Gani
2025-07-31  3:07 ` [PATCH gwproxy v1 1/3] dnslookup: split common functionality and struct into net.c Ahmad Gani
2025-07-31 14:01   ` Ammar Faizi
2025-07-31 18:28     ` Alviro Iskandar Setiawan
2025-07-31 18:36       ` Ammar Faizi
2025-07-31 18:42         ` Alviro Iskandar Setiawan
2025-07-31 18:53           ` Ammar Faizi
2025-07-31 19:03             ` Alviro Iskandar Setiawan
2025-07-31  3:07 ` [PATCH gwproxy v1 2/3] dnslookup: Allow only port string number Ahmad Gani
2025-07-31  3:07 ` [PATCH gwproxy v1 3/3] dnslookup: Initial work for implementation of C-ares-like getaddrinfo function Ahmad Gani
2025-07-31 18:19   ` Alviro Iskandar Setiawan
2025-07-31 19:14   ` Alviro Iskandar Setiawan
2025-08-01  1:51     ` reyuki
2025-08-01 23:32       ` Alviro Iskandar Setiawan
2025-07-31 13:39 ` [PATCH gwproxy v1 0/3] Initial work for DNS lookup implementation Ammar Faizi
2025-08-01  1:49   ` reyuki [this message]
2025-08-01  2:19     ` Ammar Faizi
2025-08-05  6:28       ` reyuki
