From: Ahmad Gani <reyuki@gnuweeb.org>
To: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Cc: Ahmad Gani <reyuki@gnuweeb.org>,
Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>,
GNU/Weeb Mailing List <gwml@vger.gnuweeb.org>
Subject: [PATCH gwproxy v3 0/9] Initial work for DNS lookup implementation
Date: Tue, 5 Aug 2025 13:49:20 +0700 [thread overview]
Message-ID: <20250805064933.109080-1-reyuki@gnuweeb.org> (raw)
Hi Chief,
This is revision v3 of initial work of dns lookup feature, The patches
aren't final, but it's enough to get a grasp of what the interface looks
like. I've provided a temporary entry point main function to test it and
uses epoll to make it asynchronous.
There are 9 patches in this series:
- create net.h and net.c for network related functionality and struct
- add default_port parameter to convert_str_to_ssaddr function
- allow only port string number in service parameter of gwp_dns_queue
- initial work for implementation of C-ares-like getaddrinfo function
- make gw_ares_getaddrinfo function asynchronous
- update unit test on dnsparser.c
- fix serialize_answ function in dnsparser.c
- transaction id creation is delegated to caller
- code style refactor
# Remove TODO(reyuki): hints->ai_family is used to filter results
The filtering has already been completed using a bitmask. As for sorting
the results, that responsibility is delegated to the caller, allowing
them to choose their preferred sorting method.
# Make UDP socket non-blocking
As discussed previously [1], I made the UDP socket non-blocking for
asynchronous operation and provide a function helper for caller to handle
socket-related operation.
# Move socket creation to library initialization
Since we use the UDP protocol, socket creation is moved to library
initialization, so it is just created once and reused as many times as
possible.
# Make UDP socket unconnected
Avoid connect syscall on UDP socket to allow communicating with multiple
recursive resolvers or name servers with sendto and recvfrom syscalls.
# Restructure internal struct and program execution flow
Unlike C-ares, this design requires the getaddrinfo callback to be specified
during initialization, since it typically points to the same handler and
rarely changes between calls.
To support non-blocking behavior, the program needs to track the state of
each request, so I need to make changes to the structure and program
execution flow.
Each gw_ares_getaddrinfo call creates a new request on the internal side
of dnslookup; you can see the gw_ares_request struct for what's stored.
## The changes are made based on the following principles:
1. The DNS-specific details are intentionally hidden from the program
The DNS-specific tasks and operations are intentionally hidden from the
program using the library. The program doesn't need to know anything about
the DNS protocol.
2. How the library and the program interact
The library MUST tell the program if there's an error inside the
gw_ares_getaddrinfo call, or if the getaddrinfo operation has completed.
This is achieved by calling the getaddrinfo_cb function.
The library MUST tell the program if a socket file descriptor is created or
a particular event is adjusted. This is achieved by calling the
sockstate_cb function.
The program is responsible for handling this appropriately using the
callback it has specified.
Although no event is currently adjusted, it may be implemented in a future
patch series.
The program MUST tell the library if a particular socket file descriptor
is ready to be used for a certain operation, like recv or send. This is
achieved by calling the gw_ares_process_event function.
Although currently only recv is executed, other operations may be added
in a future patch series.
## And here's how the library is supposed to work after these changes:
Initialize the library will create a UDP socket that can be used
internally by the library to communicate with the DNS server.
There are two callbacks, and to avoid the confusion of what callback
I'm talking about, let's name them getaddrinfo_cb and sockstate_cb.
You must prepare both and pass it to gw_ares_init through gw_ares_options,
the sockstate_cb will receive a socket file descriptor from the callback
and it will be used in your event notification setup (poll, epoll, etc...).
You can specify ai_family member in the hints structure to filter the
addrinfo in the results, use AF_UNSPEC to select both IPv4 and IPv6.
When you call gw_ares_getaddrinfo, it will attempt to send DNS packet to
DNS server you have specified in gw_ares_options. It will not block your
program, however you are supposed to call gw_ares_process_event later in
your own event loop.
The function gw_ares_process_event is an abstraction provided by the
library to handle socket-related tasks and give you the results by
calling the callback you've specified in the gw_ares_options struct.
# Update unit test of dns parser
Because I wanted to test how the DNS server behaves when handling multiple
requests, I updated the old unit test. As expected, it's possible to send
multiple requests—as long as you don't batch them (i.e., QDCOUNT > 1) [2].
As Sir Ammar pointed out, the UDP protocol is stateless. That means you
can use a single UDP socket to send packets to and receive packets from
multiple destinations. So there's no need to wait for a response (recv)
before sending the next request—you can send several first, then receive
them later.
# Transaction id creation is delegated to caller
Because UDP protocol is unreliable, there's no guarantee in the order of
packet arrival.
A single request can have two transaction IDs if the address family hint
is AF_UNSPEC. To simplify this, the program now uses the same txid for both
IPv4 and IPv6. This avoids the need to store two txids in an array, which
can cause issues when responses arrive out of order.
# Fix logic bug in serialize_answ that can lead to memory error: invalid read
I accidentally found a logic bug in the DNS parser while updating the
unit test code. It was first detected by Valgrind, but the error report
did not directly address the real issue:
==764369== Conditional jump or move depends on uninitialised value(s)
==764369== at 0x49B8C9A: inet_ntop6 (inet_ntop.c:126)
==764369== by 0x49B8C9A: inet_ntop (inet_ntop.c:59)
==764369== by 0x4002473: convert_ssaddr_to_str (net.c:91)
==764369== by 0x4001FFA: test_parse_ipv6 (dnsparser.c:320)
==764369== by 0x400206B: run_all_tests (dnsparser.c:341)
==764369== by 0x4002097: main (dnsparser.c:347)
I wonder why Valgrind said that even though I already memset gs? So I
tried with ASAN, and it detected a stack-buffer-overflow. It is triggered
by a malformed DNS response packet; probably I was copy-pasting the raw
hex that was incomplete, and it led to this finding.
The solution is simple: check how many bytes remain before reading—never
read if there are none left, along with other changes made to fix the problem.
# Changelog
v1 -> v2:
- use existing convert_str_to_ssaddr instead of init_addr
- fix memory leak when init_addr (now it's replaced) failed
- modify convert_str_to_ssaddr to support default port
- bring back __cold attribute on convert_ssaddr_to_str
- don't fill a dangling pointer as Sir Alviro said [3]
- for now it's blocking, attempt_retry label is not needed
v2 -> v3:
- remove TODO(reyuki): hints->ai_family is used to filter results
- make gw_ares_getaddrinfo asynchronous by change UDP socket to non-blocking
- move socket creation to library initialization
- make UDP socket unconnected
- restructure internal struct and program execution flow to support async
- update unit test of dns parser
- transaction id creation is delegated to caller
- fix logic bug in serialize_answ that can lead to memory error: invalid read
- rename parameter prt and split commit message
- move variable declaration in for loop according to the Sir Alviro coding-style
[1]:
- https://lore.gnuweeb.org/gwml/aIwkO9Aw+IFht0vd@biznet-home.integral.gnuweeb.org/
[2]:
- www.ietf.org/rfc/rfc9619.pdf
- https://stackoverflow.com/questions/4082081/requesting-a-and-aaaa-records-in-single-dns-query/4083071
[3]:
- https://lore.gnuweeb.org/gwml/CAOG64qMPmFL-+7OrVv4psmyOt8G-3OZJGa=bQ1a_wuV06WTyng@mail.gmail.com/
Signed-off-by: Ahmad Gani <reyuki@gnuweeb.org>
---
Ahmad Gani (9):
dnslookup: Split common functionality and struct into net.h and net.c
dnslookup: Add a new parameter, default_port.
dnslookup: Allow only port string number
dnslookup: Initial work for implementation of C-ares-like getaddrinfo
function
dnsparser: Update unit test of dns parser
dnsparser: Fix serialize_answ function
dnsparser: Transaction id creation is delegated to caller
dnslookup: Make gw_ares_getaddrinfo asynchronous
dnslookup: code style changes
Makefile | 1 +
man/gwp_dns_queue.3 | 2 +-
src/gwproxy/dns.h | 12 +-
src/gwproxy/dnslookup.c | 432 ++++++++++++++++++++++++++++++++++++++++
src/gwproxy/dnslookup.h | 161 +++++++++++++++
src/gwproxy/dnsparser.c | 416 ++++++++++++++++++++++++++++++++++++++
src/gwproxy/dnsparser.h | 192 ++++++++++++++++++
src/gwproxy/gwproxy.c | 93 +--------
src/gwproxy/net.c | 103 ++++++++++
src/gwproxy/net.h | 35 ++++
10 files changed, 1345 insertions(+), 102 deletions(-)
create mode 100644 src/gwproxy/dnslookup.c
create mode 100644 src/gwproxy/dnslookup.h
create mode 100644 src/gwproxy/dnsparser.c
create mode 100644 src/gwproxy/dnsparser.h
create mode 100644 src/gwproxy/net.c
create mode 100644 src/gwproxy/net.h
base-commit: 0753f2d766e85fcbffc1f83dfd4e67d6206591cb
--
Ahmad Gani
next reply other threads:[~2025-08-05 6:50 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-05 6:49 Ahmad Gani [this message]
2025-08-05 6:49 ` [PATCH gwproxy v3 1/9] dnslookup: Split common functionality and struct into net.h and net.c Ahmad Gani
2025-08-05 9:26 ` Ammar Faizi
2025-08-05 6:49 ` [PATCH gwproxy v3 2/9] dnslookup: Add a new parameter, default_port Ahmad Gani
2025-08-05 9:26 ` Ammar Faizi
2025-08-05 10:15 ` reyuki
2025-08-05 10:26 ` Ammar Faizi
2025-08-05 10:43 ` Ahmad Gani
2025-08-05 10:46 ` Ammar Faizi
2025-08-05 12:45 ` Ahmad Gani
2025-08-05 6:49 ` [PATCH gwproxy v3 3/9] dnslookup: Allow only port string number Ahmad Gani
2025-08-05 6:49 ` [PATCH gwproxy v3 4/9] dnslookup: Initial work for implementation of C-ares-like getaddrinfo function Ahmad Gani
2025-08-05 6:49 ` [PATCH gwproxy v3 5/9] dnsparser: Update unit test of dns parser Ahmad Gani
2025-08-05 9:27 ` Ammar Faizi
2025-08-05 6:49 ` [PATCH gwproxy v3 6/9] dnsparser: Fix serialize_answ function Ahmad Gani
2025-08-05 9:26 ` Ammar Faizi
2025-08-05 12:47 ` Ahmad Gani
2025-08-05 13:04 ` Ammar Faizi
2025-08-05 13:12 ` Ahmad Gani
2025-08-05 13:51 ` Ahmad Gani
2025-08-05 14:02 ` Ammar Faizi
2025-08-05 6:49 ` [PATCH gwproxy v3 7/9] dnsparser: Transaction id creation is delegated to caller Ahmad Gani
2025-08-05 6:49 ` [PATCH gwproxy v3 8/9] dnslookup: Make gw_ares_getaddrinfo asynchronous Ahmad Gani
2025-08-05 6:49 ` [PATCH gwproxy v3 9/9] dnslookup: code style changes Ahmad Gani
2025-08-05 9:26 ` Ammar Faizi
2025-08-05 13:22 ` [PATCH gwproxy v3 0/9] Initial work for DNS lookup implementation Ammar Faizi
2025-08-05 13:28 ` Ahmad Gani
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250805064933.109080-1-reyuki@gnuweeb.org \
--to=reyuki@gnuweeb.org \
--cc=alviro.iskandar@gnuweeb.org \
--cc=ammarfaizi2@gnuweeb.org \
--cc=gwml@vger.gnuweeb.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox