From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server-vie001.gnuweeb.org X-Spam-Level: X-Spam-Status: No, score=-1.2 required=5.0 tests=ALL_TRUSTED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,URIBL_DBL_BLOCKED_OPENDNS, URIBL_ZEN_BLOCKED_OPENDNS autolearn=ham autolearn_force=no version=3.4.6 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=gnuweeb.org; s=new2025; t=1754376630; bh=XXDrQHLmJmeBDTnjR7l6ZV7LlFbB0ajAcHojHe4VMks=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version:Content-Type: Content-Transfer-Encoding:Message-ID:Date:From:Reply-To:Subject:To: Cc:In-Reply-To:References:Resent-Date:Resent-From:Resent-To: Resent-Cc:User-Agent:Content-Type:Content-Transfer-Encoding; b=elSxpFdlIEEkiMLe/1oBT3qWusyqx2xXvg5+nABiYfwiIQKnhjloRzIJgDtFpR3F7 5DqC2PzZO5ehF/LEJ2lj3nHkZ0falHK6C1Uz75YYc4BjgKOnjU5Ct3Lm7KkJgAJuiW Kw/41HdlJ12pqFYLvObn5MsYCr0b9XTlNX1S+tC7iEL2C4QChtK6ve2zUgsv+g3vyR 8SEEqUFCqWS4AHDoKUC31SwlcwRdL4ouxFEOUpB66XDFWcaOZdwBRK9jQ8Ja2bHFtS 2+TDU1KeUMtbSL7YNzIFSsAuTVpk9A37DHZsIOa779mfonmVCcUw77A7TL/PHfarUP KSzANULpgcFvg== Received: from zero (unknown [182.253.151.158]) by server-vie001.gnuweeb.org (Postfix) with ESMTPSA id B60EC312801D; Tue, 5 Aug 2025 06:50:29 +0000 (UTC) From: Ahmad Gani To: Ammar Faizi Cc: Ahmad Gani , Alviro Iskandar Setiawan , GNU/Weeb Mailing List Subject: [PATCH gwproxy v3 0/9] Initial work for DNS lookup implementation Date: Tue, 5 Aug 2025 13:49:20 +0700 Message-ID: <20250805064933.109080-1-reyuki@gnuweeb.org> X-Mailer: git-send-email 2.50.1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit List-Id: Hi Chief, This is revision v3 of initial work of dns lookup feature, The patches aren't final, but it's enough to get a grasp of what the interface looks like. I've provided a temporary entry point main function to test it and uses epoll to make it asynchronous. There are 9 patches in this series: - create net.h and net.c for network related functionality and struct - add default_port parameter to convert_str_to_ssaddr function - allow only port string number in service parameter of gwp_dns_queue - initial work for implementation of C-ares-like getaddrinfo function - make gw_ares_getaddrinfo function asynchronous - update unit test on dnsparser.c - fix serialize_answ function in dnsparser.c - transaction id creation is delegated to caller - code style refactor # Remove TODO(reyuki): hints->ai_family is used to filter results The filtering has already been completed using a bitmask. As for sorting the results, that responsibility is delegated to the caller, allowing them to choose their preferred sorting method. # Make UDP socket non-blocking As discussed previously [1], I made the UDP socket non-blocking for asynchronous operation and provide a function helper for caller to handle socket-related operation. # Move socket creation to library initialization Since we use the UDP protocol, socket creation is moved to library initialization, so it is just created once and reused as many times as possible. # Make UDP socket unconnected Avoid connect syscall on UDP socket to allow communicating with multiple recursive resolvers or name servers with sendto and recvfrom syscalls. # Restructure internal struct and program execution flow Unlike C-ares, this design requires the getaddrinfo callback to be specified during initialization, since it typically points to the same handler and rarely changes between calls. To support non-blocking behavior, the program needs to track the state of each request, so I need to make changes to the structure and program execution flow. Each gw_ares_getaddrinfo call creates a new request on the internal side of dnslookup; you can see the gw_ares_request struct for what's stored. ## The changes are made based on the following principles: 1. The DNS-specific details are intentionally hidden from the program The DNS-specific tasks and operations are intentionally hidden from the program using the library. The program doesn't need to know anything about the DNS protocol. 2. How the library and the program interact The library MUST tell the program if there's an error inside the gw_ares_getaddrinfo call, or if the getaddrinfo operation has completed. This is achieved by calling the getaddrinfo_cb function. The library MUST tell the program if a socket file descriptor is created or a particular event is adjusted. This is achieved by calling the sockstate_cb function. The program is responsible for handling this appropriately using the callback it has specified. Although no event is currently adjusted, it may be implemented in a future patch series. The program MUST tell the library if a particular socket file descriptor is ready to be used for a certain operation, like recv or send. This is achieved by calling the gw_ares_process_event function. Although currently only recv is executed, other operations may be added in a future patch series. ## And here's how the library is supposed to work after these changes: Initialize the library will create a UDP socket that can be used internally by the library to communicate with the DNS server. There are two callbacks, and to avoid the confusion of what callback I'm talking about, let's name them getaddrinfo_cb and sockstate_cb. You must prepare both and pass it to gw_ares_init through gw_ares_options, the sockstate_cb will receive a socket file descriptor from the callback and it will be used in your event notification setup (poll, epoll, etc...). You can specify ai_family member in the hints structure to filter the addrinfo in the results, use AF_UNSPEC to select both IPv4 and IPv6. When you call gw_ares_getaddrinfo, it will attempt to send DNS packet to DNS server you have specified in gw_ares_options. It will not block your program, however you are supposed to call gw_ares_process_event later in your own event loop. The function gw_ares_process_event is an abstraction provided by the library to handle socket-related tasks and give you the results by calling the callback you've specified in the gw_ares_options struct. # Update unit test of dns parser Because I wanted to test how the DNS server behaves when handling multiple requests, I updated the old unit test. As expected, it's possible to send multiple requests—as long as you don't batch them (i.e., QDCOUNT > 1) [2]. As Sir Ammar pointed out, the UDP protocol is stateless. That means you can use a single UDP socket to send packets to and receive packets from multiple destinations. So there's no need to wait for a response (recv) before sending the next request—you can send several first, then receive them later. # Transaction id creation is delegated to caller Because UDP protocol is unreliable, there's no guarantee in the order of packet arrival. A single request can have two transaction IDs if the address family hint is AF_UNSPEC. To simplify this, the program now uses the same txid for both IPv4 and IPv6. This avoids the need to store two txids in an array, which can cause issues when responses arrive out of order. # Fix logic bug in serialize_answ that can lead to memory error: invalid read I accidentally found a logic bug in the DNS parser while updating the unit test code. It was first detected by Valgrind, but the error report did not directly address the real issue: ==764369== Conditional jump or move depends on uninitialised value(s) ==764369== at 0x49B8C9A: inet_ntop6 (inet_ntop.c:126) ==764369== by 0x49B8C9A: inet_ntop (inet_ntop.c:59) ==764369== by 0x4002473: convert_ssaddr_to_str (net.c:91) ==764369== by 0x4001FFA: test_parse_ipv6 (dnsparser.c:320) ==764369== by 0x400206B: run_all_tests (dnsparser.c:341) ==764369== by 0x4002097: main (dnsparser.c:347) I wonder why Valgrind said that even though I already memset gs? So I tried with ASAN, and it detected a stack-buffer-overflow. It is triggered by a malformed DNS response packet; probably I was copy-pasting the raw hex that was incomplete, and it led to this finding. The solution is simple: check how many bytes remain before reading—never read if there are none left, along with other changes made to fix the problem. # Changelog v1 -> v2: - use existing convert_str_to_ssaddr instead of init_addr - fix memory leak when init_addr (now it's replaced) failed - modify convert_str_to_ssaddr to support default port - bring back __cold attribute on convert_ssaddr_to_str - don't fill a dangling pointer as Sir Alviro said [3] - for now it's blocking, attempt_retry label is not needed v2 -> v3: - remove TODO(reyuki): hints->ai_family is used to filter results - make gw_ares_getaddrinfo asynchronous by change UDP socket to non-blocking - move socket creation to library initialization - make UDP socket unconnected - restructure internal struct and program execution flow to support async - update unit test of dns parser - transaction id creation is delegated to caller - fix logic bug in serialize_answ that can lead to memory error: invalid read - rename parameter prt and split commit message - move variable declaration in for loop according to the Sir Alviro coding-style [1]: - https://lore.gnuweeb.org/gwml/aIwkO9Aw+IFht0vd@biznet-home.integral.gnuweeb.org/ [2]: - www.ietf.org/rfc/rfc9619.pdf - https://stackoverflow.com/questions/4082081/requesting-a-and-aaaa-records-in-single-dns-query/4083071 [3]: - https://lore.gnuweeb.org/gwml/CAOG64qMPmFL-+7OrVv4psmyOt8G-3OZJGa=bQ1a_wuV06WTyng@mail.gmail.com/ Signed-off-by: Ahmad Gani --- Ahmad Gani (9): dnslookup: Split common functionality and struct into net.h and net.c dnslookup: Add a new parameter, default_port. dnslookup: Allow only port string number dnslookup: Initial work for implementation of C-ares-like getaddrinfo function dnsparser: Update unit test of dns parser dnsparser: Fix serialize_answ function dnsparser: Transaction id creation is delegated to caller dnslookup: Make gw_ares_getaddrinfo asynchronous dnslookup: code style changes Makefile | 1 + man/gwp_dns_queue.3 | 2 +- src/gwproxy/dns.h | 12 +- src/gwproxy/dnslookup.c | 432 ++++++++++++++++++++++++++++++++++++++++ src/gwproxy/dnslookup.h | 161 +++++++++++++++ src/gwproxy/dnsparser.c | 416 ++++++++++++++++++++++++++++++++++++++ src/gwproxy/dnsparser.h | 192 ++++++++++++++++++ src/gwproxy/gwproxy.c | 93 +-------- src/gwproxy/net.c | 103 ++++++++++ src/gwproxy/net.h | 35 ++++ 10 files changed, 1345 insertions(+), 102 deletions(-) create mode 100644 src/gwproxy/dnslookup.c create mode 100644 src/gwproxy/dnslookup.h create mode 100644 src/gwproxy/dnsparser.c create mode 100644 src/gwproxy/dnsparser.h create mode 100644 src/gwproxy/net.c create mode 100644 src/gwproxy/net.h base-commit: 0753f2d766e85fcbffc1f83dfd4e67d6206591cb -- Ahmad Gani