* [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy
@ 2025-08-29 7:55 Ahmad Gani
2025-08-29 7:55 ` [PATCH gwproxy v8 1/2] dnsparser: Add dns parser code Ahmad Gani
` (2 more replies)
0 siblings, 3 replies; 34+ messages in thread
From: Ahmad Gani @ 2025-08-29 7:55 UTC (permalink / raw)
To: Ammar Faizi; +Cc: Ahmad Gani, Alviro Iskandar Setiawan, GNU/Weeb Mailing List
[[ resend with minor fix ]]
Hi Chief,
This is revision v8 of the initial work on the integration of the DNS
parser lib in gwproxy. This is an RFC draft; the patches themselves
aren't final.
There are 2 patches in this series:
- refactor gwproxy codebase (specifically, DNS system)
- add dns parser code
Thanks for taking a look.
# Changelog
v1 -> v2:
- use existing convert_str_to_ssaddr instead of init_addr
- fix memory leak when init_addr (now it's replaced) failed
- modify convert_str_to_ssaddr to support default port
- bring back __cold attribute on convert_ssaddr_to_str
- don't fill a dangling pointer as Sir Alviro said
- for now it's blocking, attempt_retry label is not needed
v2 -> v3:
- remove TODO(reyuki): hints->ai_family is used to filter results
- make gw_ares_getaddrinfo asynchronous by change UDP socket to non-blocking
- move socket creation to library initialization
- make UDP socket unconnected
- restructure internal struct and program execution flow to support async
- update unit test of dns parser
- transaction id creation is delegated to caller
- fix logic bug in serialize_answ that can lead to memory error: invalid read
- rename parameter prt and split commit message
- move variable declaration in for loop according to the Sir Alviro coding-style
v3 -> v4:
- add else block in the convert_str_to_ssaddr function
- use format specifier %hu for uint16_t in printf
- update base commit to branch master from upstream remote repository
v4 -> v5:
- squash commits to eliminate changes unrelated to the commit subject
- change commit subject: use imperative form for the subject
- update base commit to branch master from upstream remote repository
- changing the email subject
- drop the c-ares style thing
- restructure dnsparser.c and dnsparser.h
v5 -> v6:
- mark this feature as experimental and disabled by default
- fix minor issues from master branch
- fix dns parser
- add dns server as gwproxy' cmdline option
- add fallback mechanism for raw dns backend
v6 -> v7:
- squash commits
- refactor gwproxy codebase (specifically, DNS system)
- add dns parser code
v7 -> v8:
- use __sys_close instead of libc's wrapper close
- add newline at EoF
- fix warning: ‘af’ may be used uninitialized [-Wmaybe-uninitialized]
- remove break that occured after goto statement
Signed-off-by: Ahmad Gani <reyuki@gnuweeb.org>
---
Ahmad Gani (2):
dnsparser: Add dns parser code
gwproxy: refactor code base to add experimental raw DNS backend
Makefile | 2 +-
configure | 8 +
src/gwproxy/dns.c | 241 +++++++++++++++--
src/gwproxy/dns.h | 29 +-
src/gwproxy/dnsparser.c | 581 ++++++++++++++++++++++++++++++++++++++++
src/gwproxy/dnsparser.h | 192 +++++++++++++
src/gwproxy/ev/epoll.c | 67 ++++-
src/gwproxy/gwproxy.c | 54 +++-
src/gwproxy/gwproxy.h | 5 +-
9 files changed, 1130 insertions(+), 49 deletions(-)
create mode 100644 src/gwproxy/dnsparser.c
create mode 100644 src/gwproxy/dnsparser.h
base-commit: b2a7a2d33ba8676d917fd26fc3a86bdde8961d76
--
Ahmad Gani
^ permalink raw reply [flat|nested] 34+ messages in thread* [PATCH gwproxy v8 1/2] dnsparser: Add dns parser code 2025-08-29 7:55 [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy Ahmad Gani @ 2025-08-29 7:55 ` Ahmad Gani 2025-08-29 7:55 ` [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend Ahmad Gani 2025-09-05 9:18 ` [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy Ammar Faizi 2 siblings, 0 replies; 34+ messages in thread From: Ahmad Gani @ 2025-08-29 7:55 UTC (permalink / raw) To: Ammar Faizi; +Cc: Ahmad Gani, Alviro Iskandar Setiawan, GNU/Weeb Mailing List Introducing dns parser for better flexibility over the program flow, the DNS protocol implementation is limited to standard query (OPCODE_QUERY). Also, Delegate creation of transaction id to the caller, It gives the caller more flexibility in handling DNS requests. A single domain name lookup can have two responses: one for IPv4 and one for IPv6. With this change, the caller may use the same txid for both queries, making it easier to group related DNS responses. Signed-off-by: Ahmad Gani <reyuki@gnuweeb.org> --- src/gwproxy/dnsparser.c | 581 ++++++++++++++++++++++++++++++++++++++++ src/gwproxy/dnsparser.h | 192 +++++++++++++ 2 files changed, 773 insertions(+) create mode 100644 src/gwproxy/dnsparser.c create mode 100644 src/gwproxy/dnsparser.h diff --git a/src/gwproxy/dnsparser.c b/src/gwproxy/dnsparser.c new file mode 100644 index 000000000000..cb2a090dfbb0 --- /dev/null +++ b/src/gwproxy/dnsparser.c @@ -0,0 +1,581 @@ +#define _DEFAULT_SOURCE +#include <endian.h> +#include <stdbool.h> +#include <gwproxy/dnsparser.h> +#include <gwproxy/net.h> + +static ssize_t construct_qname(uint8_t *dst, size_t dst_len, const char *qname) +{ + const uint8_t *p = (const uint8_t *)qname; + uint8_t *lp = dst; /* Length position. */ + uint8_t *sp = lp + 1; /* String position. */ + size_t total = 0; + uint16_t l = 0; + + while (1) { + uint8_t c = *p++; + + total++; + if (total >= dst_len) + return -ENAMETOOLONG; + + if (c == '.' || c == '\0') { + if (l < 1 || l > DOMAIN_LABEL_LIMIT) + return -EINVAL; + + *lp = (uint8_t)l; + lp = sp++; + l = 0; + if (!c) + break; + } else { + l++; + *sp = c; + sp++; + } + } + + return total; +} + +static int calculate_question_len(uint8_t *in, size_t in_len) +{ + const uint8_t *p = in; + int tot_len, advance_len; + + tot_len = 0; + while (true) { + if (*p == 0x0) { + tot_len++; + break; + } + + if (tot_len >= (int)in_len) + return -ENOBUFS; + + advance_len = *p + 1; + tot_len += advance_len; + p += advance_len; + } + + if (tot_len > DOMAIN_NAME_LIMIT) + return -ENAMETOOLONG; + + tot_len += 4; + if (tot_len >= (int)in_len) + return -ENOBUFS; + + return tot_len; +} + +static int serialize_answ(uint16_t txid, uint8_t *in, size_t in_len, gwdns_answ_data *out) +{ + gwdns_header_pkt *hdr; + uint16_t raw_flags; + size_t idx, i; + void *ptr; + int ret; + + idx = sizeof(*hdr); + if (idx >= in_len) + return -EAGAIN; + + hdr = (void *)in; + if (memcmp(&txid, &hdr->id, sizeof(txid))) + return -EINVAL; + + memcpy(&raw_flags, &hdr->flags, sizeof(raw_flags)); + raw_flags = ntohs(raw_flags); + /* QR MUST 1 = response from dns server */ + if (!DNS_QR(raw_flags)) + return -EINVAL; + + /* OPCODE MUST 0 = standard query */ + if (DNS_OPCODE(raw_flags)) + return -EINVAL; + + /* RCODE MUST 0 = No error */ + if (DNS_RCODE(raw_flags)) + return -EPROTO; + + // is it safe or recommended to alter the in buffer directly? + hdr->ancount = ntohs(hdr->ancount); + if (!hdr->ancount) + return -ENODATA; + + /* + * Check the sizes upfront. + * + * 1 bytes for variable-length + * in[idx] for the length of first name + * 1 bytes for null terminator + * 2 bytes for qtype + * 2 bytes for qclass + */ + if ((size_t)(1 + in[idx] + 1 + 2 + 2) >= in_len) + return -EINVAL; + + ret = calculate_question_len(&in[idx], in_len - idx); + if (ret <= 0) + return -EINVAL; + + idx += ret; + if (idx >= in_len) + return -EAGAIN; + + out->hdr.ancount = 0; + ptr = malloc(hdr->ancount * sizeof(uint8_t *)); + if (!ptr) + return -ENOMEM; + + out->rr_answ = ptr; + for (i = 0; i < hdr->ancount; i++) { + uint16_t is_compressed, rdlength; + gwdns_serialized_answ *item = malloc(sizeof(gwdns_serialized_answ)); + if (!item) { + ret = -ENOMEM; + goto exit_free; + } + + memcpy(&is_compressed, &in[idx], sizeof(is_compressed)); + is_compressed = ntohs(is_compressed); + is_compressed = DNS_IS_COMPRESSED(is_compressed); + assert(is_compressed); + idx += 2; // NAME + if (idx >= in_len) { + ret = -EAGAIN; + free(item); + goto exit_free; + } + + memcpy(&item->rr_type, &in[idx], 2); + item->rr_type = ntohs(item->rr_type); + idx += 2; // TYPE + if (idx >= in_len) { + ret = -EAGAIN; + free(item); + goto exit_free; + } + memcpy(&item->rr_class, &in[idx], 2); + item->rr_class = ntohs(item->rr_class); + idx += 2; // CLASS + if (idx >= in_len) { + ret = -EAGAIN; + free(item); + goto exit_free; + } + memcpy(&item->ttl, &in[idx], 4); + item->ttl = be32toh(item->ttl); + idx += 4; // TTL + if (idx >= in_len) { + ret = -EAGAIN; + free(item); + goto exit_free; + } + + memcpy(&rdlength, &in[idx], sizeof(rdlength)); + rdlength = ntohs(rdlength); + switch (item->rr_type) { + case TYPE_AAAA: + if (rdlength != sizeof(struct in6_addr)) { + ret = -EINVAL; + free(item); + goto exit_free; + } + break; + case TYPE_A: + if (rdlength != sizeof(struct in_addr)) { + ret = -EINVAL; + free(item); + goto exit_free; + } + break; + case TYPE_CNAME: + idx += 2 + rdlength; + free(item); + continue; + + default: + ret = -EINVAL; + free(item); + goto exit_free; + break; + } + + item->rdlength = rdlength; + idx += 2; + if (idx >= in_len) { + ret = -EAGAIN; + free(item); + goto exit_free; + } + + ptr = malloc(rdlength); + if (!ptr) { + ret = -ENOMEM; + free(item); + goto exit_free; + } + + memcpy(ptr, &in[idx], rdlength); + idx += rdlength; + if (idx > in_len) { + ret = -EINVAL; + free(item); + free(ptr); + goto exit_free; + } + + item->rdata = ptr; + out->rr_answ[out->hdr.ancount] = item; + out->hdr.ancount++; + } + + if (!out->hdr.ancount) { + free(out->rr_answ); + return -ENODATA; + } + + return 0; +exit_free: + for (i = 0; i < out->hdr.ancount; i++) { + free(out->rr_answ[i]->rdata); + free(out->rr_answ[i]); + } + free(out->rr_answ); + return ret; +} + +static void free_serialize_answ(gwdns_answ_data *answ) +{ + size_t i; + for (i = 0; i < answ->hdr.ancount; i++) { + free(answ->rr_answ[i]->rdata); + free(answ->rr_answ[i]); + } + free(answ->rr_answ); +} + +int gwdns_parse_query(uint16_t txid, const char *service, + uint8_t *in, size_t in_len, + struct gwdns_addrinfo_node **ai) +{ + struct gwdns_addrinfo_node *results, *tail; + gwdns_answ_data raw_answ; + int r, port; + size_t i; + + port = atoi(service); + if (port < 0) + return -EINVAL; + port = htons(port); + + r = serialize_answ(txid, in, in_len, &raw_answ); + if (r) + return r; + + results = tail = NULL; + for (i = 0; i < raw_answ.hdr.ancount; i++) { + struct gwdns_addrinfo_node *new_node; + gwdns_serialized_answ *answ; + struct sockaddr_in6 *i6; + struct sockaddr_in *i4; + + answ = raw_answ.rr_answ[i]; + new_node = malloc(sizeof(*new_node)); + if (!new_node) { + r = -ENOMEM; + goto exit_free; + } + new_node->ai_next = NULL; + + if (answ->rr_type == TYPE_AAAA) { + i6 = &new_node->ai_addr.i6; + new_node->ai_family = AF_INET6; + new_node->ai_addrlen = sizeof(i6); + i6->sin6_port = port; + i6->sin6_family = AF_INET6; + assert(sizeof(i6->sin6_addr) == answ->rdlength); + memcpy(&i6->sin6_addr, answ->rdata, answ->rdlength); + } else { + i4 = &new_node->ai_addr.i4; + new_node->ai_family = AF_INET; + new_node->ai_addrlen = sizeof(i4); + i4->sin_port = port; + i4->sin_family = AF_INET; + assert(sizeof(i4->sin_addr) == answ->rdlength); + memcpy(&i4->sin_addr, answ->rdata, answ->rdlength); + new_node->ai_ttl = answ->ttl; + } + + if (!tail) + results = new_node; + else + tail->ai_next = new_node; + tail = new_node; + } + + *ai = results; + r = 0; +exit_free: + free_serialize_answ(&raw_answ); + return r; +} + +void gwdns_free_parsed_query(struct gwdns_addrinfo_node *ai) +{ + struct gwdns_addrinfo_node *tmp, *node = ai; + while (node) { + tmp = node->ai_next; + free(node); + node = tmp; + } +} + +static ssize_t construct_question(gwdns_question_part *question) +{ + gwdns_header_pkt *hdr; + gwdns_query_pkt pkt; + uint16_t qtype, qclass; + size_t required_len; + ssize_t bw; + + switch (question->type) { + case AF_INET6: + question->type = TYPE_AAAA; + break; + case AF_INET: + question->type = TYPE_A; + break; + default: + return -EINVAL; + } + + hdr = &pkt.hdr; + /* + * the memset implicitly set opcode to OPCODE_QUERY + */ + memset(hdr, 0, sizeof(*hdr)); + /* + * no need to htons, so no ntohs for comparison in serialize_answ. + */ + hdr->id = question->txid; + DNS_SET_RD(hdr->flags, true); + hdr->flags = htons(hdr->flags); + hdr->qdcount = htons(1); + + /* + * pkt.body is interpreted as question section + * for layout and format, see RFC 1035 4.1.2. Question section format + */ + bw = construct_qname(pkt.body, sizeof(pkt.body) - 3, question->domain); + if (bw < 0) + return bw; + + pkt.body[bw++] = 0x0; + qtype = htons(question->type); + qclass = htons(CLASS_IN); + memcpy(&pkt.body[bw], &qtype, 2); + bw += 2; + memcpy(&pkt.body[bw], &qclass, 2); + bw += 2; + + required_len = sizeof(pkt.hdr) + bw; + if (question->dst_len < required_len) + return -ENOBUFS; + + memcpy(question->dst_buffer, &pkt, required_len); + + return required_len; +} + +ssize_t gwdns_build_query(uint16_t txid, const char *name, int family, uint8_t *out, size_t out_len) +{ + gwdns_question_part q; + + q.domain = name; + q.type = family; + q.txid = txid; + q.dst_buffer = out; + q.dst_len = out_len; + return construct_question(&q); +} + +#ifdef RUNTEST + +void test_parse_ipv4(void) +{ + struct gwdns_addrinfo_node *d, *node; + uint16_t txid; + + uint8_t recv_pkt[] = { + 0x23, 0xc6, 0x81, 0x80, 0x00, 0x01, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, + 0x06, 0x67, 0x6f, 0x6f, 0x67, 0x6c, 0x65, 0x03, 0x63, 0x6f, 0x6d, 0x00, + 0x00, 0x01, 0x00, 0x01, 0xc0, 0x0c, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, + 0x00, 0x35, 0x00, 0x04, 0x4a, 0x7d, 0x18, 0x8a, 0xc0, 0x0c, 0x00, 0x01, + 0x00, 0x01, 0x00, 0x00, 0x00, 0x35, 0x00, 0x04, 0x4a, 0x7d, 0x18, 0x66, + 0xc0, 0x0c, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, 0x00, 0x35, 0x00, 0x04, + 0x4a, 0x7d, 0x18, 0x64, 0xc0, 0x0c, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, + 0x00, 0x35, 0x00, 0x04, 0x4a, 0x7d, 0x18, 0x8b, 0xc0, 0x0c, 0x00, 0x01, + 0x00, 0x01, 0x00, 0x00, 0x00, 0x35, 0x00, 0x04, 0x4a, 0x7d, 0x18, 0x65, + 0xc0, 0x0c, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, 0x00, 0x35, 0x00, 0x04, + 0x4a, 0x7d, 0x18, 0x71 + }; + + memcpy(&txid, recv_pkt, 2); + d = NULL; + assert(!gwdns_parse_query(txid, "80", recv_pkt, sizeof(recv_pkt), &d)); + assert(d); + node = d; + while (node) { + struct gwdns_addrinfo_node *tmp; + char buff[FULL_ADDRSTRLEN]; + + tmp = node->ai_next; + assert(node->ai_family == AF_INET); + convert_ssaddr_to_str(buff, &node->ai_addr); + printf("IPv4: %s\n", buff); + node = tmp; + } + + gwdns_free_parsed_query(d); +} + +void test_parse_ipv6(void) +{ + struct gwdns_addrinfo_node *d, *node; + uint16_t txid; + + uint8_t recv_pkt[] = { + 0x45, 0x67, + 0x81, 0x80, + 0x00, 0x01, + 0x00, 0x04, + 0x00, 0x00, + 0x00, 0x00, + + 0x06, 0x67, 0x6f, 0x6f, 0x67, 0x6c, 0x65, + 0x03, 0x63, 0x6f, 0x6d, + 0x00, + 0x00, 0x1c, + 0x00, 0x01, + + 0xc0, 0x0c, + 0x00, 0x1c, + 0x00, 0x01, + 0x00, 0x00, 0x09, 0x06, + 0x00, 0x10, + 0x24, 0x04, 0x68, 0x00, 0x40, 0x03, 0x0c, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x71, + + 0xc0, 0x0c, + 0x00, 0x1c, 0x00, 0x01, 0x00, 0x00, 0x09, 0x06, + 0x00, 0x10, + 0x24, 0x04, 0x68, 0x00, 0x40, 0x03, 0x0c, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8a, + + 0xc0, 0x0c, + 0x00, 0x1c, + 0x00, 0x01, + 0x00, 0x00, 0x09, 0x06, + 0x00, 0x10, + 0x24, 0x04, 0x68, 0x00, 0x40, 0x03, 0x0c, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x65, + + 0xc0, 0x0c, + 0x00, 0x1c, + 0x00, 0x01, + 0x00, 0x00, 0x0c, 0x16, + 0x00, 0x10, + 0x24, 0x04, 0x68, 0x00, 0x40, 0x03, 0x0c, 0x11, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x71 + }; + + memcpy(&txid, recv_pkt, sizeof(txid)); + + node = NULL; + assert(!gwdns_parse_query(txid, "80", recv_pkt, sizeof(recv_pkt), &d)); + assert(d); + node = d; + while (node) { + struct gwdns_addrinfo_node *tmp; + char buff[FULL_ADDRSTRLEN]; + + tmp = node->ai_next; + assert(node->ai_family == AF_INET6); + convert_ssaddr_to_str(buff, &node->ai_addr); + printf("IPv6: %s\n", buff); + node = tmp; + } + + gwdns_free_parsed_query(d); +} + +void test_build_ipv4(void) +{ + uint8_t buff[UDP_MSG_LIMIT]; + gwdns_header_pkt *hdr; + uint16_t c; + ssize_t r; + + c = 0xFFFF; + r = gwdns_build_query(c, "google.com", AF_INET, buff, sizeof(buff)); + assert(r > 0); + + hdr = (void *)buff; + assert(ntohs(hdr->qdcount) == 1); + assert(!hdr->nscount); + assert(!DNS_QR(hdr->flags)); + assert(DNS_OPCODE(hdr->flags) == OPCODE_QUERY); + c = htons(TYPE_A); + assert(!memcmp(buff + 12 + 12, &c, 2)); +} + +void test_build_ipv6(void) +{ + uint8_t buff[UDP_MSG_LIMIT]; + gwdns_header_pkt *hdr; + uint16_t c; + ssize_t r; + + c = 0xFFFF; + r = gwdns_build_query(c, "google.com", AF_INET6, buff, sizeof(buff)); + assert(r > 0); + + hdr = (void *)buff; + assert(ntohs(hdr->qdcount) == 1); + assert(!hdr->nscount); + assert(!DNS_QR(hdr->flags)); + assert(DNS_OPCODE(hdr->flags) == OPCODE_QUERY); + c = htons(TYPE_AAAA); + assert(!memcmp(buff + 12 + 12, &c, 2)); +} + +/* + * test mock data of recv in both IPv4 and IPv6 + * + * the mock data are produced by this script: + * https://gist.github.com/realyukii/d7b450b4ddc305c66a2d8cd5600f23c4 + */ +void run_all_tests(void) +{ + /* + * We can't use serialize_answ to parse multiple response at once. + * The caller MUST call serialize_answ more than one time if there's + * more than one response, because txid is passed to only verify one + * response. + */ + fprintf(stderr, "test constructing DNS standard query packet for TYPE_A!\n"); + test_build_ipv4(); + fprintf(stderr, "test constructing DNS standard query packet for TYPE_AAAA!\n"); + test_build_ipv6(); + fprintf(stderr, "test parsing DNS standard query packet for TYPE_A!\n"); + test_parse_ipv4(); + fprintf(stderr, "test parsing DNS standard query packet for TYPE_AAAA!\n"); + test_parse_ipv6(); + fprintf(stderr, "all tests passed!\n"); +} + +int main(void) +{ + run_all_tests(); + return 0; +} + +#endif diff --git a/src/gwproxy/dnsparser.h b/src/gwproxy/dnsparser.h new file mode 100644 index 000000000000..6d5769cc3259 --- /dev/null +++ b/src/gwproxy/dnsparser.h @@ -0,0 +1,192 @@ +#include <stdint.h> +#include <stddef.h> +#include <assert.h> +#include <string.h> +#include <stdio.h> +#include <stdlib.h> +#include <errno.h> +#include <arpa/inet.h> +#include <sys/socket.h> + +#include <gwproxy/net.h> + +#ifndef GWP_DNS_PARSER_H +#define GWP_DNS_PARSER_H + +#ifndef __packed +#define __packed __attribute__((__packed__)) +#endif + +/* + * 4. MESSAGES + * 4.1. Format + * + * All communications inside of the domain protocol are carried in a single + * format called a message. The top-level format of a message is divided + * into 5 sections (some of which may be empty in certain cases), shown below: + * + * +---------------------+ + * | Header | + * +---------------------+ + * | Question | the question for the name server + * +---------------------+ + * | Answer | RRs answering the question + * +---------------------+ + * | Authority | RRs pointing toward an authority + * +---------------------+ + * | Additional | RRs holding additional information + * +---------------------+ + * + * These sections are defined in RFC 1035 §4.1. The Header section is always + * present and includes fields that specify which of the other sections follow, + * as well as metadata such as whether the message is a query or response, + * the opcode, etc. + */ + +/* Flag bit position in little-endian machine */ +#define DNS_QR_BIT 0xF +#define DNS_OPCODE_BIT 0xB // 4-bit field +#define DNS_AA_BIT 0xA +#define DNS_TC_BIT 0x9 +#define DNS_RD_BIT 0x8 +#define DNS_RA_BIT 0x7 +#define DNS_Z_BIT 0x4 // 3-bit field +#define DNS_RCODE_BIT 0x0 // 4-bit field +#define DNS_COMPRESSION_BIT (0x3 << 0xE) + +/* Flag extraction macros for listtle-endian machine */ +#define DNS_QR(flags) (((flags) >> DNS_QR_BIT) & 0x1) +#define DNS_OPCODE(flags) (((flags) >> DNS_OPCODE_BIT) & 0xF) +#define DNS_RCODE(flags) ((flags) & 0xF) +#define DNS_IS_COMPRESSED(mask) ((mask) & DNS_COMPRESSION_BIT) + +/* Flag construction macros for little-endian machine */ +#define DNS_SET_RD(flags, val) (flags) = ((flags) & ~(1 << DNS_RD_BIT)) | ((!!(val)) << DNS_RD_BIT) + +/* as per RFC 1035 §2.3.4. Size limits */ +#define DOMAIN_LABEL_LIMIT 63 +#define DOMAIN_NAME_LIMIT 255 +#define UDP_MSG_LIMIT 512 + +typedef enum { + OPCODE_QUERY = 0, // Standard query (QUERY) +} gwdns_op; + +typedef enum { + TYPE_A = 1, // IPv4 host address + TYPE_CNAME = 5, // the canonical name for an alias + TYPE_AAAA = 28, // IPv6 host address +} gwdns_type; + +typedef enum { + CLASS_IN = 1, // Internet +} gwdns_class; + +typedef struct { + uint16_t id; + uint16_t flags; + uint16_t qdcount; + uint16_t ancount; + uint16_t nscount; + uint16_t arcount; +} __packed gwdns_header_pkt; + +typedef struct { + uint8_t question[UDP_MSG_LIMIT]; + char answr[UDP_MSG_LIMIT]; +} gwdns_question_buffer; + +typedef struct { + uint8_t *dst_buffer; + uint16_t txid; + uint16_t type; + size_t dst_len; + const char *domain; +} gwdns_question_part; + +/* + * 4.1.3. Resource record format + * + * The answer, authority, and additional sections all share the same + * format: a variable number of resource records, where the number of + * records is specified in the corresponding count field in the header. + */ +typedef struct { + uint8_t *name; // DOMAIN NAME: variable‑length sequence of labels (length-byte followed by label, ending in 0), possibly compressed + uint16_t rr_type; // TYPE: two-octet code identifying the RR type (see gwdns_type) + uint16_t rr_class; // CLASS: two-octet code identifying the RR class (see gwdns_class) + uint32_t ttl; // TTL: 32-bit unsigned, time to live in seconds + uint16_t rdlength; // RDLENGTH: length in octets of RDATA + uint8_t *rdata; // RDATA: variable-length data, format depends on TYPE and CLASS +} gwdns_serialized_rr; + +typedef struct { + char qname[DOMAIN_NAME_LIMIT]; + uint16_t qtype; + uint16_t qclass; +} gwdns_serialized_question; + +typedef gwdns_serialized_rr gwdns_serialized_answ; + +typedef struct { + gwdns_header_pkt hdr; + uint8_t body[UDP_MSG_LIMIT]; +} gwdns_query_pkt; + +typedef struct { + gwdns_header_pkt hdr; + gwdns_serialized_question question; + gwdns_serialized_answ **rr_answ; +} gwdns_answ_data; + +struct gwdns_addrinfo_node { + int ai_family; + int ai_ttl; + socklen_t ai_addrlen; + struct gwp_sockaddr ai_addr; + struct gwdns_addrinfo_node *ai_next; +}; + +/* + * Build standard query for domain name lookup. + * + * The caller may need to check for potential transaction ID collisions. + * + * Possible errors are: + * - ENAMETOOLONG name is too long. + * - ENOBUFS length specified by out_len is not sufficient. + * - EINVAL malformed name or unsupported value of family. + * + * @param txid transaction id + * @param name domain name + * @param family choose request for IPv4 or IPv6 + * @param out destination buffer for constructed packet + * @param out_len available capacity of destination buffer + * @return length of bytes written into dst_buffer on success, + * or a negative integer on failure. + */ +ssize_t gwdns_build_query(uint16_t txid, const char *name, int family, uint8_t *out, size_t out_len); + +/* + * Parse name server's answer + * + * Possible errors are: + * -EAGAIN in buffer is not sufficient, no bytes are processed, need more data. + * -EINVAL the content of in buffer is not valid. + * -ENOMEM failed to allocate dynamic memory. + * -ENODATA the packet didn't contain any answers. + * -EPROTO the DNS server can't understand your question + * + * @param txid transaction id of question + * @param service port number in ascii + * @param in a pointer to buffer that need to be parsed + * @param in_len a pointer to buffer that need to be parsed + * @param ai a pointer to address info + * @return zero on success or a negative number on failure + */ +int gwdns_parse_query(uint16_t txid, const char *service, + uint8_t *in, size_t in_len, + struct gwdns_addrinfo_node **ai); +void gwdns_free_parsed_query(struct gwdns_addrinfo_node *addrinfo); + +#endif /* #ifndef GWP_DNS_PARSER_H */ -- Ahmad Gani ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-08-29 7:55 [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy Ahmad Gani 2025-08-29 7:55 ` [PATCH gwproxy v8 1/2] dnsparser: Add dns parser code Ahmad Gani @ 2025-08-29 7:55 ` Ahmad Gani 2025-09-05 16:26 ` Alviro Iskandar Setiawan 2025-09-06 7:47 ` Alviro Iskandar Setiawan 2025-09-05 9:18 ` [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy Ammar Faizi 2 siblings, 2 replies; 34+ messages in thread From: Ahmad Gani @ 2025-08-29 7:55 UTC (permalink / raw) To: Ammar Faizi; +Cc: Ahmad Gani, Alviro Iskandar Setiawan, GNU/Weeb Mailing List The raw DNS backend is now marked as experimental and disabled by default. It can be enabled at build time with the configure option --use-raw-dns, and at runtime with the gwproxy option -r 1. Add fallback mechanism for raw DNS If *_PREFER_* is used in the restyp, the raw DNS backend will attempt to retry DNS query with different address family. Add DNS server option (remove hard-coded DNS server) Introduce --dns-server gwproxy' cmdline option instead of hard-coded value. Signed-off-by: Ahmad Gani <reyuki@gnuweeb.org> --- Makefile | 2 +- configure | 8 ++ src/gwproxy/dns.c | 241 ++++++++++++++++++++++++++++++++++++----- src/gwproxy/dns.h | 29 ++++- src/gwproxy/ev/epoll.c | 67 ++++++++++-- src/gwproxy/gwproxy.c | 54 +++++++-- src/gwproxy/gwproxy.h | 5 +- 7 files changed, 357 insertions(+), 49 deletions(-) diff --git a/Makefile b/Makefile index 3ac35f052793..eb750e0fc31c 100644 --- a/Makefile +++ b/Makefile @@ -43,7 +43,7 @@ LIBGWPSOCKS5_TEST_CC_SOURCES = $(GWPROXY_DIR)/tests/socks5.c LIBGWPSOCKS5_TEST_OBJECTS = $(LIBGWPSOCKS5_TEST_CC_SOURCES:%.c=%.c.o) LIBGWDNS_TARGET = libgwdns.so -LIBGWDNS_CC_SOURCES = $(GWPROXY_DIR)/dns.c $(GWPROXY_DIR)/dns_cache.c +LIBGWDNS_CC_SOURCES = $(GWPROXY_DIR)/dns.c $(GWPROXY_DIR)/dnsparser.c $(GWPROXY_DIR)/dns_cache.c $(GWPROXY_DIR)/net.c LIBGWDNS_OBJECTS = $(LIBGWDNS_CC_SOURCES:%.c=%.c.o) LIBGWDNS_TEST_TARGET = $(GWPROXY_DIR)/tests/dns.t LIBGWDNS_TEST_CC_SOURCES = $(GWPROXY_DIR)/tests/dns.c diff --git a/configure b/configure index cef07b1cb652..539610916149 100755 --- a/configure +++ b/configure @@ -34,6 +34,9 @@ for opt do --cxx=*) cxx="$optarg"; ;; + --use-raw-dns) + use_raw_dns="yes"; + ;; --use-io-uring) use_io_uring="yes"; ;; @@ -87,6 +90,7 @@ Options: [defaults in brackets after descriptions] --cxx=CMD Use CMD as the C++ compiler --debug Build with debug enabled --use-io-uring Enable io_uring support (default: no) + --use-raw-dns Enable experimental raw DNS backend (default: no) --sanitize Enable sanitizers (default: no) EOF exit 0; @@ -333,6 +337,10 @@ if test "${use_io_uring}" = "yes"; then CXXFLAGS="${CXXFLAGS} -I./src/liburing/src/include"; fi; +if test "${use_raw_dns}" = "yes"; then + add_config "CONFIG_RAW_DNS"; +fi; + if test "${use_sanitize}" = "yes"; then add_config "CONFIG_SANITIZE"; if ! add_c_and_cxx_flag "-fsanitize=address"; then diff --git a/src/gwproxy/dns.c b/src/gwproxy/dns.c index cdc62a5bcf78..f22738c912e4 100644 --- a/src/gwproxy/dns.c +++ b/src/gwproxy/dns.c @@ -23,6 +23,7 @@ #include <pthread.h> #include <sys/eventfd.h> +#include <gwproxy/dnsparser.h> struct gwp_dns_ctx; @@ -33,6 +34,14 @@ struct gwp_dns_wrk { }; struct gwp_dns_ctx { +#ifdef CONFIG_RAW_DNS + uint32_t entry_cap; + struct gwp_dns_entry **entries; + int sockfd; + int ns_family; + struct gwp_sockaddr ns_addr; + uint8_t ns_addrlen; +#endif volatile bool should_stop; pthread_mutex_t lock; pthread_cond_t cond; @@ -182,13 +191,122 @@ int gwp_dns_resolve(struct gwp_dns_ctx *ctx, const char *name, return found ? 0 : -EHOSTUNREACH; } +#ifdef CONFIG_RAW_DNS + +void cp_nsaddr(struct gwp_dns_ctx *ctx, struct gwp_sockaddr *addr, uint8_t *addrlen) +{ + *addr = ctx->ns_addr; + *addrlen = ctx->ns_addrlen; +} + +static void _gwp_dns_entry_free(struct gwp_dns_entry *e) +{ + assert(e); + assert(e->udp_fd >= 0); + __sys_close(e->udp_fd); + free(e->name); + free(e); +} + +void gwp_dns_raw_entry_free(struct gwp_dns_ctx *ctx, struct gwp_dns_entry *e) +{ + struct gwp_dns_entry *new_e; + + assert(e); + + new_e = ctx->entries[--ctx->nr_entries]; + assert(ctx->nr_entries == new_e->idx); + new_e->idx = e->idx; + ctx->entries[e->idx] = new_e; + ctx->entries[ctx->nr_entries] = NULL; + + _gwp_dns_entry_free(e); +} + +static void free_all_queued_entries(struct gwp_dns_ctx *ctx) +{ + uint32_t i; + for (i = 0; i < ctx->nr_entries; i++) { + struct gwp_dns_entry *e = ctx->entries[i]; + _gwp_dns_entry_free(e); + } + + free(ctx->entries); +} + +static bool realloc_entries(struct gwp_dns_ctx *ctx) +{ + struct gwp_dns_entry **tmp; + int new_cap; + + new_cap = ctx->entry_cap * 2; + tmp = realloc(ctx->entries, new_cap * sizeof(*tmp)); + if (!tmp) + return 1; + + ctx->entries = tmp; + ctx->entry_cap = new_cap; + + return 0; +} + +int gwp_dns_process(struct gwp_dns_ctx *ctx, struct gwp_dns_entry *e) +{ + struct gwdns_addrinfo_node *ai; + uint8_t buff[UDP_MSG_LIMIT]; + ssize_t r; + + r = __sys_recvfrom( + e->udp_fd, buff, sizeof(buff), 0, + &ctx->ns_addr.sa, (socklen_t *)&ctx->ns_addrlen + ); + if (r <= 0) + return (int)r; + + r = gwdns_parse_query(e->txid, e->service, buff, r, &ai); + if (r) { + if (r == -ENODATA) { + uint16_t txid; + int af; + + /* Fallback to other family if this one yield no results */ + switch (ctx->cfg.restyp) { + case GWP_DNS_RESTYP_PREFER_IPV4: + af = AF_INET6; + break; + case GWP_DNS_RESTYP_PREFER_IPV6: + af = AF_INET; + break; + default: + goto exit_free_ai; + } + + txid = (uint16_t)rand(); + r = gwdns_build_query(txid, e->name, af, e->payload, sizeof(e->payload)); + if (r > 0) { + e->txid = txid; + e->payloadlen = (int)r; + r = -EAGAIN; + } + } + goto exit_free_ai; + } + + e->addr = ai->ai_addr; + +exit_free_ai: + gwdns_free_parsed_query(ai); + return (int)r; +} +#endif /* #ifndef CONFIG_RAW_DNS */ + static void gwp_dns_entry_free(struct gwp_dns_entry *e) { if (!e) return; assert(e->ev_fd >= 0); - close(e->ev_fd); + __sys_close(e->ev_fd); free(e->name); free(e); } @@ -719,33 +837,48 @@ int gwp_dns_ctx_init(struct gwp_dns_ctx **ctx_p, const struct gwp_dns_cfg *cfg) goto out_free_ctx; } - r = pthread_cond_init(&ctx->cond, NULL); - if (r) { - r = -r; + if (cfg->use_raw_dns) { +#ifdef CONFIG_RAW_DNS + r = convert_str_to_ssaddr(cfg->ns_addr_str, &ctx->ns_addr, 53); + if (r) goto out_destroy_mutex; + ctx->ns_addrlen = ctx->ns_addr.sa.sa_family == AF_INET + ? sizeof(ctx->ns_addr.i4) + : sizeof(ctx->ns_addr.i6); + ctx->entry_cap = DEFAULT_ENTRIES_CAP; + ctx->entries = malloc(ctx->entry_cap * sizeof(*ctx->entries)); + if (!ctx->entries) + goto out_destroy_mutex; +#endif + } else { + r = pthread_cond_init(&ctx->cond, NULL); + if (r) { + r = -r; + goto out_destroy_mutex; + } + + ctx->nr_sleeping = 0; + ctx->workers = NULL; + ctx->head = NULL; + ctx->tail = NULL; + r = init_workers(ctx); + if (r) + goto out_destroy_cond; } + ctx->nr_entries = 0; r = init_cache(ctx); if (r) goto out_destroy_cond; - ctx->nr_sleeping = 0; - ctx->nr_entries = 0; - ctx->workers = NULL; - ctx->head = NULL; - ctx->tail = NULL; ctx->should_stop = false; ctx->last_scan = time(NULL); - r = init_workers(ctx); - if (r) - goto out_free_cache; *ctx_p = ctx; return 0; -out_free_cache: - free_cache(ctx->cache); out_destroy_cond: - pthread_cond_destroy(&ctx->cond); + if (!cfg->use_raw_dns) + pthread_cond_destroy(&ctx->cond); out_destroy_mutex: pthread_mutex_destroy(&ctx->lock); out_free_ctx: @@ -762,10 +895,16 @@ static void put_all_queued_entries(struct gwp_dns_ctx *ctx) void gwp_dns_ctx_free(struct gwp_dns_ctx *ctx) { - free_workers(ctx); + if (ctx->cfg.use_raw_dns) { +#ifdef CONFIG_RAW_DNS + free_all_queued_entries(ctx); +#endif + } else { + free_workers(ctx); + pthread_cond_destroy(&ctx->cond); + put_all_queued_entries(ctx); + } pthread_mutex_destroy(&ctx->lock); - pthread_cond_destroy(&ctx->cond); - put_all_queued_entries(ctx); free_cache(ctx->cache); free(ctx); } @@ -791,14 +930,51 @@ struct gwp_dns_entry *gwp_dns_queue(struct gwp_dns_ctx *ctx, { struct gwp_dns_entry *e; size_t nl, sl; +#ifdef CONFIG_RAW_DNS + uint16_t txid; + ssize_t r; + int af; +#endif e = malloc(sizeof(*e)); if (!e) return NULL; - e->ev_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC); - if (e->ev_fd < 0) - goto out_free_e; + if (ctx->cfg.use_raw_dns) { +#ifdef CONFIG_RAW_DNS + if (ctx->nr_entries == ctx->entry_cap && realloc_entries(ctx)) + return NULL; + + r = __sys_socket(ctx->ns_addr.sa.sa_family, SOCK_DGRAM | SOCK_NONBLOCK, 0); + if (r < 0) + goto out_free_e; + e->udp_fd = (int)r; + + switch (ctx->cfg.restyp) { + case GWP_DNS_RESTYP_PREFER_IPV4: + case GWP_DNS_RESTYP_IPV4_ONLY: + af = AF_INET; + break; + case GWP_DNS_RESTYP_PREFER_IPV6: + case GWP_DNS_RESTYP_IPV6_ONLY: + af = AF_INET6; + break; + default: + assert(0); + goto out_close_fd; + } + + txid = (uint16_t)rand(); + r = gwdns_build_query(txid, name, af, e->payload, sizeof(e->payload)); + if (r < 0) + goto out_close_fd; + e->payloadlen = (int)r; +#endif + } else { + e->ev_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC); + if (e->ev_fd < 0) + goto out_free_e; + } /* * Merge name and service into a single allocated string to @@ -811,7 +987,7 @@ struct gwp_dns_entry *gwp_dns_queue(struct gwp_dns_ctx *ctx, sl = service ? strlen(service) : 0; e->name = malloc(nl + 1 + sl + 1); if (!e->name) - goto out_close_ev_fd; + goto out_close_fd; e->service = e->name + nl + 1; memcpy(e->name, name, nl + 1); @@ -820,13 +996,26 @@ struct gwp_dns_entry *gwp_dns_queue(struct gwp_dns_ctx *ctx, else e->service[0] = '\0'; - atomic_init(&e->refcnt, 2); e->res = 0; - push_queue(ctx, e); + if (ctx->cfg.use_raw_dns) { +#ifdef CONFIG_RAW_DNS + e->idx = ctx->nr_entries++; + ctx->entries[e->idx] = e; +#endif + } else { + atomic_init(&e->refcnt, 2); + push_queue(ctx, e); + } return e; -out_close_ev_fd: - close(e->ev_fd); +out_close_fd: + if (ctx->cfg.use_raw_dns) { +#ifdef CONFIG_RAW_DNS + __sys_close(e->udp_fd); +#endif + } else { + __sys_close(e->ev_fd); + } out_free_e: free(e); return NULL; diff --git a/src/gwproxy/dns.h b/src/gwproxy/dns.h index 10c7cea2ebe4..410989f6414c 100644 --- a/src/gwproxy/dns.h +++ b/src/gwproxy/dns.h @@ -10,10 +10,19 @@ #include <stdbool.h> #include <netinet/in.h> #include <gwproxy/net.h> - -struct gwp_dns_wrk; +#include <gwproxy/dnsparser.h> +#include <gwproxy/syscall.h> struct gwp_dns_entry { +#ifdef CONFIG_RAW_DNS + uint32_t idx; + int udp_fd; + int payloadlen; + union { + uint16_t txid; + uint8_t payload[UDP_MSG_LIMIT]; + }; +#endif char *name; char *service; _Atomic(int) refcnt; @@ -31,10 +40,16 @@ enum { GWP_DNS_RESTYP_PREFER_IPV6 = 4, }; +#define DEFAULT_ENTRIES_CAP 255 + struct gwp_dns_cfg { int cache_expiry; /* In seconds. <= 0 to disable cache. */ uint32_t nr_workers; uint32_t restyp; + bool use_raw_dns; +#ifdef CONFIG_RAW_DNS + const char *ns_addr_str; +#endif }; struct gwp_dns_ctx; @@ -76,7 +91,6 @@ void gwp_dns_ctx_free(struct gwp_dns_ctx *ctx); */ struct gwp_dns_entry *gwp_dns_queue(struct gwp_dns_ctx *ctx, const char *name, const char *service); - /** * Release a DNS entry. This function decrements the reference count of the * entry. If the reference count reaches zero, the entry is freed. @@ -87,6 +101,15 @@ struct gwp_dns_entry *gwp_dns_queue(struct gwp_dns_ctx *ctx, */ bool gwp_dns_entry_put(struct gwp_dns_entry *entry); +#ifdef CONFIG_RAW_DNS + +void cp_nsaddr(struct gwp_dns_ctx *ctx, struct gwp_sockaddr *addr, uint8_t *addrlen); + +void gwp_dns_raw_entry_free(struct gwp_dns_ctx *ctx, struct gwp_dns_entry *e); + +int gwp_dns_process(struct gwp_dns_ctx *ctx, struct gwp_dns_entry *e); +#endif + /** * Lookup a DNS entry in the cache. If the entry is found, it fills the * `addr` structure with the resolved address and returns 0. If the entry is diff --git a/src/gwproxy/ev/epoll.c b/src/gwproxy/ev/epoll.c index d46568a6a2b1..357f9a3e0486 100644 --- a/src/gwproxy/ev/epoll.c +++ b/src/gwproxy/ev/epoll.c @@ -778,24 +778,55 @@ static int handle_connect(struct gwp_wrk *w, struct gwp_conn_pair *gcp) return 0; } +#ifdef CONFIG_RAW_DNS +static int arm_poll_for_raw_dns_query(struct gwp_wrk *w, + struct gwp_conn_pair *gcp) +{ + struct gwp_dns_entry *gde = gcp->gde; + struct gwp_dns_ctx *dctx; + struct gwp_sockaddr addr; + uint8_t addrlen; + ssize_t r; + + dctx = w->ctx->dns; + cp_nsaddr(dctx, &addr, &addrlen); + r = __sys_sendto( + gde->udp_fd, gde->payload, gde->payloadlen, MSG_NOSIGNAL, + &addr.sa, addrlen + ); + + return (int)r; +} +#endif + static int arm_poll_for_dns_query(struct gwp_wrk *w, struct gwp_conn_pair *gcp) { struct gwp_dns_entry *gde = gcp->gde; struct epoll_event ev; - int r; + ssize_t r; assert(gde); - assert(gde->ev_fd >= 0); ev.events = EPOLLIN; ev.data.u64 = 0; ev.data.ptr = gcp; ev.data.u64 |= EV_BIT_DNS_QUERY; - r = __sys_epoll_ctl(w->ep_fd, EPOLL_CTL_ADD, gde->ev_fd, &ev); + if (w->ctx->cfg.use_raw_dns) { +#ifdef CONFIG_RAW_DNS + arm_poll_for_raw_dns_query(w, gcp); + + r = __sys_epoll_ctl(w->ep_fd, EPOLL_CTL_ADD, gde->udp_fd, &ev); if (unlikely(r)) - return r; + return (int)r; +#endif + } else { + assert(gde->ev_fd >= 0); + r = __sys_epoll_ctl(w->ep_fd, EPOLL_CTL_ADD, gde->ev_fd, &ev); + if (unlikely(r)) + return (int)r; + } return 0; } @@ -826,13 +857,25 @@ static int handle_ev_dns_query(struct gwp_wrk *w, struct gwp_conn_pair *gcp) int r, ct = gcp->conn_state; assert(gde); - assert(gde->ev_fd >= 0); assert(ct == CONN_STATE_SOCKS5_DNS_QUERY || ct == CONN_STATE_HTTP_DNS_QUERY); - r = __sys_epoll_ctl(w->ep_fd, EPOLL_CTL_DEL, gde->ev_fd, NULL); - if (unlikely(r)) - return r; + if (w->ctx->cfg.use_raw_dns) { +#ifdef CONFIG_RAW_DNS + r = gwp_dns_process(w->ctx->dns, gde); + if (r == -EAGAIN) { + pr_dbg(&w->ctx->lh, "DNS Fallback"); + arm_poll_for_raw_dns_query(w, gcp); + return 0; + } else if (r) + gde->res = r; +#endif + } else { + assert(gde->ev_fd >= 0); + r = __sys_epoll_ctl(w->ep_fd, EPOLL_CTL_DEL, gde->ev_fd, NULL); + if (unlikely(r)) + return r; + } log_dns_query(w, gcp, gde); if (likely(!gde->res)) { @@ -845,7 +888,13 @@ static int handle_ev_dns_query(struct gwp_wrk *w, struct gwp_conn_pair *gcp) r = -EIO; } - gwp_dns_entry_put(gde); + if (w->ctx->cfg.use_raw_dns) { +#ifdef CONFIG_RAW_DNS + gwp_dns_raw_entry_free(w->ctx->dns, gde); +#endif + } else { + gwp_dns_entry_put(gde); + } gcp->gde = NULL; return r; } diff --git a/src/gwproxy/gwproxy.c b/src/gwproxy/gwproxy.c index 7a90609dd5f1..36305e65c4cb 100644 --- a/src/gwproxy/gwproxy.c +++ b/src/gwproxy/gwproxy.c @@ -44,6 +44,10 @@ static const struct option long_opts[] = { { "help", no_argument, NULL, 'h' }, + { "raw-dns", required_argument, NULL, 'r' }, +#ifdef CONFIG_RAW_DNS + { "dns-server", required_argument, NULL, 'j' }, +#endif { "event-loop", required_argument, NULL, 'e' }, { "bind", required_argument, NULL, 'b' }, { "target", required_argument, NULL, 't' }, @@ -54,7 +58,6 @@ static const struct option long_opts[] = { { "socks5-auth-file", required_argument, NULL, 'A' }, { "socks5-dns-cache-secs", required_argument, NULL, 'L' }, { "nr-workers", required_argument, NULL, 'w' }, - { "nr-dns-workers", required_argument, NULL, 'W' }, { "connect-timeout", required_argument, NULL, 'c' }, { "target-buf-size", required_argument, NULL, 'T' }, { "client-buf-size", required_argument, NULL, 'C' }, @@ -74,6 +77,7 @@ static const struct gwp_cfg default_opts = { .event_loop = "epoll", .bind = "[::]:1080", .target = NULL, + .use_raw_dns = false, .as_socks5 = false, .as_http = false, .socks5_prefer_ipv6 = false, @@ -81,7 +85,6 @@ static const struct gwp_cfg default_opts = { .socks5_auth_file = NULL, .socks5_dns_cache_secs = 0, .nr_workers = 4, - .nr_dns_workers = 4, .connect_timeout = 5, .target_buf_size = 2048, .client_buf_size = 2048, @@ -94,6 +97,9 @@ static const struct gwp_cfg default_opts = { .log_level = 3, .log_file = "/dev/stdout", .pid_file = NULL, +#ifdef CONFIG_RAW_DNS + .ns_addr_str = "1.1.1.1" +#endif }; __cold @@ -104,6 +110,10 @@ static void show_help(const char *app) printf(" -h, --help Show this help message and exit\n"); printf(" -e, --event-loop=name Specify the event loop to use (default: %s)\n", default_opts.event_loop); printf(" Available values: epoll, io_uring\n"); + printf(" -r, --raw-dns=0|1 Use experimental raw DNS as the backend (default: %d)\n", default_opts.use_raw_dns); +#ifdef CONFIG_RAW_DNS + printf(" -j, --dns-server=addr DNS server address (default: %s)\n", default_opts.ns_addr_str); +#endif printf(" -b, --bind=addr:port Bind to the specified address (default: %s)\n", default_opts.bind); printf(" -t, --target=addr_port Target address to connect to\n"); printf(" -S, --as-socks5=0|1 Run as a SOCKS5 proxy (default: %d)\n", default_opts.as_socks5); @@ -114,7 +124,6 @@ static void show_help(const char *app) printf(" -L, --socks5-dns-cache-secs=sec SOCKS5 DNS cache duration in seconds (default: %d)\n", default_opts.socks5_dns_cache_secs); printf(" Set to 0 or a negative number to disable DNS caching.\n"); printf(" -w, --nr-workers=nr Number of worker threads (default: %d)\n", default_opts.nr_workers); - printf(" -W, --nr-dns-workers=nr Number of DNS worker threads for SOCKS5 (default: %d)\n", default_opts.nr_dns_workers); printf(" -c, --connect-timeout=sec Connection to target timeout in seconds (default: %d)\n", default_opts.connect_timeout); printf(" -T, --target-buf-size=nr Target buffer size in bytes (default: %d)\n", default_opts.target_buf_size); printf(" -C, --client-buf-size=nr Client buffer size in bytes (default: %d)\n", default_opts.client_buf_size); @@ -159,6 +168,9 @@ static int parse_options(int argc, char *argv[], struct gwp_cfg *cfg) case 'h': show_help(argv[0]); exit(0); + case 'r': + cfg->use_raw_dns = !!atoi(optarg); + break; case 'e': cfg->event_loop = optarg; break; @@ -189,9 +201,6 @@ static int parse_options(int argc, char *argv[], struct gwp_cfg *cfg) case 'w': cfg->nr_workers = atoi(optarg); break; - case 'W': - cfg->nr_dns_workers = atoi(optarg); - break; case 'c': cfg->connect_timeout = atoi(optarg); break; @@ -228,6 +237,11 @@ static int parse_options(int argc, char *argv[], struct gwp_cfg *cfg) case 'p': cfg->pid_file = optarg; break; +#ifdef CONFIG_RAW_DNS + case 'j': + cfg->ns_addr_str = optarg; + break; +#endif default: fprintf(stderr, "Unknown option: %c\n", c); show_help(argv[0]); @@ -689,12 +703,23 @@ static int gwp_ctx_init_dns(struct gwp_ctx *ctx) { struct gwp_cfg *cfg = &ctx->cfg; const struct gwp_dns_cfg dns_cfg = { + .use_raw_dns = cfg->use_raw_dns, .cache_expiry = cfg->socks5_dns_cache_secs, .restyp = cfg->socks5_prefer_ipv6 ? GWP_DNS_RESTYP_PREFER_IPV6 : 0, - .nr_workers = cfg->nr_dns_workers + .nr_workers = 1, +#ifdef CONFIG_RAW_DNS + .ns_addr_str = cfg->ns_addr_str +#endif }; int r; + if (cfg->use_raw_dns) { +#ifndef CONFIG_RAW_DNS + pr_err(&ctx->lh, "raw DNS backend is not enabled in this build"); + return -ENOSYS; +#endif + } + if (!cfg->as_socks5 && !cfg->as_http) { ctx->dns = NULL; return 0; @@ -733,6 +758,10 @@ static int gwp_ctx_parse_ev(struct gwp_ctx *ctx) ctx->ev_used = GWP_EV_EPOLL; pr_dbg(&ctx->lh, "Using event loop: epoll"); } else if (!strcmp(ev, "io_uring") || !strcmp(ev, "iou")) { + if (ctx->cfg.use_raw_dns) { + pr_err(&ctx->lh, "raw DNS backend is not supported for io_uring yet"); + return -ENOSYS; + } ctx->ev_used = GWP_EV_IO_URING; pr_dbg(&ctx->lh, "Using event loop: io_uring"); } else { @@ -990,8 +1019,15 @@ int gwp_free_conn_pair(struct gwp_wrk *w, struct gwp_conn_pair *gcp) if (gcp->timer_fd >= 0) __sys_close(gcp->timer_fd); - if (gcp->gde) - gwp_dns_entry_put(gcp->gde); + if (gcp->gde) { + if (w->ctx->cfg.use_raw_dns) { +#ifdef CONFIG_RAW_DNS + gwp_dns_raw_entry_free(w->ctx->dns, gcp->gde); +#endif + } else { + gwp_dns_entry_put(gcp->gde); + } + } switch (gcp->prot_type) { case GWP_PROT_TYPE_SOCKS5: diff --git a/src/gwproxy/gwproxy.h b/src/gwproxy/gwproxy.h index 095c0ce700c3..cdc5adf88ad6 100644 --- a/src/gwproxy/gwproxy.h +++ b/src/gwproxy/gwproxy.h @@ -24,6 +24,7 @@ struct gwp_cfg { const char *event_loop; const char *bind; const char *target; + bool use_raw_dns; bool as_socks5; bool as_http; bool socks5_prefer_ipv6; @@ -31,7 +32,6 @@ struct gwp_cfg { const char *socks5_auth_file; int socks5_dns_cache_secs; int nr_workers; - int nr_dns_workers; int connect_timeout; int target_buf_size; int client_buf_size; @@ -44,6 +44,9 @@ struct gwp_cfg { int log_level; const char *log_file; const char *pid_file; +#ifdef CONFIG_RAW_DNS + const char *ns_addr_str; +#endif }; struct gwp_ctx; -- Ahmad Gani ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-08-29 7:55 ` [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend Ahmad Gani @ 2025-09-05 16:26 ` Alviro Iskandar Setiawan 2025-09-06 4:32 ` Ahmad Gani 2025-09-06 7:14 ` Alviro Iskandar Setiawan 2025-09-06 7:47 ` Alviro Iskandar Setiawan 1 sibling, 2 replies; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-05 16:26 UTC (permalink / raw) To: Ahmad Gani; +Cc: Ammar Faizi, GNU/Weeb Mailing List Having #ifdef CONFIG_RAW_DNS all over the place is horrible. Please do something else to reduce them. CTRL + F, "def CONFIG_RAW_DNS", 24 matches. We're not taking this. On Fri, Aug 29, 2025 at 2:56 PM Ahmad Gani wrote: > +void cp_nsaddr(struct gwp_dns_ctx *ctx, struct gwp_sockaddr *addr, uint8_t *addrlen) > +{ > + *addr = ctx->ns_addr; > + *addrlen = ctx->ns_addrlen; > +} Why is this exposed globally? I don't think it's worth a function call. Better be inlined. > +int gwp_dns_process(struct gwp_dns_ctx *ctx, struct gwp_dns_entry *e) > +{ > + struct gwdns_addrinfo_node *ai; > + uint8_t buff[UDP_MSG_LIMIT]; > + ssize_t r; > + > + r = __sys_recvfrom( > + e->udp_fd, buff, sizeof(buff), 0, > + &ctx->ns_addr.sa, (socklen_t *)&ctx->ns_addrlen > + ); > + if (r <= 0) > + return (int)r; The subsequent code calls gwdns_parse_query(), and you noted that a return value of 0 indicates success, and by the end of gwp_dns_process(), it returns what was returned by gwdns_parse_query(). That's it. It means, if successful, it returns 0. What happens if __sys_recvfrom() returns 0? Doesn't that mean it's also considered successful? If so, why? > + if (cfg->use_raw_dns) { > +#ifdef CONFIG_RAW_DNS > + r = convert_str_to_ssaddr(cfg->ns_addr_str, &ctx->ns_addr, 53); > + if (r) > goto out_destroy_mutex; > + ctx->ns_addrlen = ctx->ns_addr.sa.sa_family == AF_INET > + ? sizeof(ctx->ns_addr.i4) > + : sizeof(ctx->ns_addr.i6); > + ctx->entry_cap = DEFAULT_ENTRIES_CAP; > + ctx->entries = malloc(ctx->entry_cap * sizeof(*ctx->entries)); > + if (!ctx->entries) > + goto out_destroy_mutex; > +#endif > + } else { Bad indentation. Apart from bad indentation, if the malloc() call fails, r still contains 0, and you goto out_destroy_mutex, returning 0 on failure. Shouldn't you set -ENOMEM instead? > void gwp_dns_ctx_free(struct gwp_dns_ctx *ctx) > { > - free_workers(ctx); > + if (ctx->cfg.use_raw_dns) { > +#ifdef CONFIG_RAW_DNS > + free_all_queued_entries(ctx); > +#endif > + } else { Bad indentation. > + txid = (uint16_t)rand(); Why do you use a random number for the txid? I don't think rand() guarantees uniqueness. Also, I don't find the srand() call anywhere. That's equivalent to having srand(1). What about using a simple incrementing counter instead (which guarantees the uniqueness)? It will indeed reset after 65535, but that should give enough room to complete prior DNS requests first. With a simple counter, the reuse of an ID will always be 65535 queries away. > struct gwp_dns_entry { > +#ifdef CONFIG_RAW_DNS > + uint32_t idx; > + int udp_fd; AFAIU, when using the raw DNS feature you bring. ev_fd is not used, instead you use udp_fd. So I think it is not worth adding extra unused space. Better be an anonymous union if the name needs to be different. No? > +#ifdef CONFIG_RAW_DNS > +static int arm_poll_for_raw_dns_query(struct gwp_wrk *w, > + struct gwp_conn_pair *gcp) > +{ > + struct gwp_dns_entry *gde = gcp->gde; > + struct gwp_dns_ctx *dctx; > + struct gwp_sockaddr addr; > + uint8_t addrlen; > + ssize_t r; > + > + dctx = w->ctx->dns; > + cp_nsaddr(dctx, &addr, &addrlen); > + r = __sys_sendto( > + gde->udp_fd, gde->payload, gde->payloadlen, MSG_NOSIGNAL, > + &addr.sa, addrlen > + ); > + > + return (int)r; > +} > +#endif Why is this function named arm_poll_for_raw_dns_query()? There is no poll-arming here. You're simply calling __sys_sendto(). Please write a function name based on what it does. It's confusing. > + if (w->ctx->cfg.use_raw_dns) { > +#ifdef CONFIG_RAW_DNS > + arm_poll_for_raw_dns_query(w, gcp); > + > + r = __sys_epoll_ctl(w->ep_fd, EPOLL_CTL_ADD, gde->udp_fd, &ev); > if (unlikely(r)) > - return r; > + return (int)r; > +#endif > + } else { Bad indentation. > - gwp_dns_entry_put(gde); > + if (w->ctx->cfg.use_raw_dns) { > +#ifdef CONFIG_RAW_DNS > + gwp_dns_raw_entry_free(w->ctx->dns, gde); > +#endif Bad indentation. Why are there many of those? > + if (gcp->gde) { > + if (w->ctx->cfg.use_raw_dns) { > +#ifdef CONFIG_RAW_DNS > + gwp_dns_raw_entry_free(w->ctx->dns, gcp->gde); > +#endif > + } else { Bad indentation. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-05 16:26 ` Alviro Iskandar Setiawan @ 2025-09-06 4:32 ` Ahmad Gani 2025-09-06 5:16 ` Ahmad Gani ` (3 more replies) 2025-09-06 7:14 ` Alviro Iskandar Setiawan 1 sibling, 4 replies; 34+ messages in thread From: Ahmad Gani @ 2025-09-06 4:32 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Fri, Sep 5, 2025 at 11:26 PM Alviro Iskandar Setiawan wrote: > Having #ifdef CONFIG_RAW_DNS all over the place is horrible. Please do > something else to reduce them. CTRL + F, "def CONFIG_RAW_DNS", 24 > matches. Hi Sir Al, First I want to say thank you for reviewing my patchset (and sir Ammar for pinging it). I thought it was necessary to mark that a certain block belongs to the experimental raw DNS feature. If I reduce the use of these marks, it might introduce unnecessary code into the compiled version when this experimental feature is disabled. > On Fri, Aug 29, 2025 at 2:56 PM Ahmad Gani wrote: >> +void cp_nsaddr(struct gwp_dns_ctx *ctx, struct gwp_sockaddr *addr, uint8_t *addrlen) >> +{ >> + *addr = ctx->ns_addr; >> + *addrlen = ctx->ns_addrlen; >> +} > > Why is this exposed globally? I don't think it's worth a function > call. Better be inlined. I can't access members of the gwp_dns_ctx struct directly from another file, as it seems to be treated as an opaque pointer outside of dns.c. If I make it an inline function, it would only be accessible within dns.c. Should I move the full declaration of the struct to the header file? >> +int gwp_dns_process(struct gwp_dns_ctx *ctx, struct gwp_dns_entry *e) >> +{ >> + struct gwdns_addrinfo_node *ai; >> + uint8_t buff[UDP_MSG_LIMIT]; >> + ssize_t r; >> + >> + r = __sys_recvfrom( >> + e->udp_fd, buff, sizeof(buff), 0, >> + &ctx->ns_addr.sa, (socklen_t *)&ctx->ns_addrlen >> + ); >> + if (r <= 0) >> + return (int)r; > > The subsequent code calls gwdns_parse_query(), and you noted that a > return value of 0 indicates success, and by the end of > gwp_dns_process(), it returns what was returned by > gwdns_parse_query(). That's it. It means, if successful, it returns 0. > What happens if __sys_recvfrom() returns 0? Doesn't that mean it's > also considered successful? If so, why? Yeah, that's an error on my part. I will set r to -EINVAL for lengths in the range 0-38. Since there is no short send in UDP, there's no way it can be a valid standard query if the length is less than 38 bytes: - 12 bytes for the header section - 10 bytes for the query section - 16 bytes for the answer section The shortest valid domain length is 3 characters: 1 for the second-level label and 2 for the TLD label. You can try `dig +short +noedns g.cn` and observe it in Wireshark. So far, the shortest standard query response I've found is 38 bytes. >> + if (cfg->use_raw_dns) { >> +#ifdef CONFIG_RAW_DNS >> + r = convert_str_to_ssaddr(cfg->ns_addr_str, &ctx->ns_addr, 53); >> + if (r) >> goto out_destroy_mutex; >> + ctx->ns_addrlen = ctx->ns_addr.sa.sa_family == AF_INET >> + ? sizeof(ctx->ns_addr.i4) >> + : sizeof(ctx->ns_addr.i6); >> + ctx->entry_cap = DEFAULT_ENTRIES_CAP; >> + ctx->entries = malloc(ctx->entry_cap * sizeof(*ctx->entries)); >> + if (!ctx->entries) >> + goto out_destroy_mutex; >> +#endif >> + } else { > > Bad indentation. Apart from bad indentation, if the malloc() call > fails, r still contains 0, and you goto out_destroy_mutex, returning 0 > on failure. Shouldn't you set -ENOMEM instead? You're right to point that out, I'll fix it. >> + txid = (uint16_t)rand(); > > Why do you use a random number for the txid? I don't think rand() > guarantees uniqueness. Also, I don't find the srand() call anywhere. > That's equivalent to having srand(1). > > What about using a simple incrementing counter instead (which > guarantees the uniqueness)? > It will indeed reset after 65535, but that should give enough room to > complete prior DNS requests first. With a simple counter, the reuse of > an ID will always be 65535 queries away. Without calling srand, it still generates random numbers, so I thought it was fine. But I agree with your suggestion—an incrementing counter is better, because randomness doesn’t guarantee uniqueness and is more likely to cause collisions than a counter. >> struct gwp_dns_entry { >> +#ifdef CONFIG_RAW_DNS >> + uint32_t idx; >> + int udp_fd; > > AFAIU, when using the raw DNS feature you bring. ev_fd is not used, > instead you use udp_fd. So I think it is not worth adding extra unused > space. Better be an anonymous union if the name needs to be different. > No? That's indeed a more efficient way to handle it. Let's go with that idea. >> +#ifdef CONFIG_RAW_DNS >> +static int arm_poll_for_raw_dns_query(struct gwp_wrk *w, >> + struct gwp_conn_pair *gcp) >> +{ >> + struct gwp_dns_entry *gde = gcp->gde; >> + struct gwp_dns_ctx *dctx; >> + struct gwp_sockaddr addr; >> + uint8_t addrlen; >> + ssize_t r; >> + >> + dctx = w->ctx->dns; >> + cp_nsaddr(dctx, &addr, &addrlen); >> + r = __sys_sendto( >> + gde->udp_fd, gde->payload, gde->payloadlen, MSG_NOSIGNAL, >> + &addr.sa, addrlen >> + ); >> + >> + return (int)r; >> +} >> +#endif > > Why is this function named arm_poll_for_raw_dns_query()? There is no > poll-arming here. You're simply calling __sys_sendto(). Please write a > function name based on what it does. It's confusing. Lol, I'm surprised myself to see that function name. I'm not sure what I was thinking when I wrote it—sorry for the confusion. > Bad indentation. Why are there many of those? Sorry about the indentation issues. They often happen when I add a new if block and forget to adjust the indentation afterward, I'll fix it. -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 4:32 ` Ahmad Gani @ 2025-09-06 5:16 ` Ahmad Gani 2025-09-06 6:17 ` Ahmad Gani ` (2 subsequent siblings) 3 siblings, 0 replies; 34+ messages in thread From: Ahmad Gani @ 2025-09-06 5:16 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 11:32 AM Ahmad Gani wrote: > On Fri, Sep 5, 2025 at 11:26 PM Alviro Iskandar Setiawan wrote: > > Having #ifdef CONFIG_RAW_DNS all over the place is horrible. Please do > > something else to reduce them. CTRL + F, "def CONFIG_RAW_DNS", 24 > > matches. > > I thought it was necessary to mark that a certain block belongs to the > experimental raw DNS feature. If I reduce the use of these marks, it might > introduce unnecessary code into the compiled version when this experimental > feature is disabled. However, I will reduce the `#ifdef CONFIG_RAW_DNS` where possible. -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 4:32 ` Ahmad Gani 2025-09-06 5:16 ` Ahmad Gani @ 2025-09-06 6:17 ` Ahmad Gani 2025-09-06 6:48 ` Ahmad Gani 2025-09-06 6:47 ` Alviro Iskandar Setiawan 2025-09-06 11:27 ` Ahmad Gani 3 siblings, 1 reply; 34+ messages in thread From: Ahmad Gani @ 2025-09-06 6:17 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 11:32 AM Ahmad Gani wrote: > > On Fri, Aug 29, 2025 at 2:56 PM Ahmad Gani wrote: > >> +void cp_nsaddr(struct gwp_dns_ctx *ctx, struct gwp_sockaddr *addr, uint8_t *addrlen) > >> +{ > >> + *addr = ctx->ns_addr; > >> + *addrlen = ctx->ns_addrlen; > >> +} > > > > Why is this exposed globally? I don't think it's worth a function > > call. Better be inlined. > > I can't access members of the gwp_dns_ctx struct directly from another > file, as it seems to be treated as an opaque pointer outside of dns.c. If > I make it an inline function, it would only be accessible within dns.c. > Should I move the full declaration of the struct to the header file? Oh, wait—I just realized you were talking about the inline keyword, not static. My previous reply was unrelated; I misunderstood. In that case, I'll add the inline keyword if it's not worth a function call. -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 6:17 ` Ahmad Gani @ 2025-09-06 6:48 ` Ahmad Gani 2025-09-06 7:02 ` Alviro Iskandar Setiawan 0 siblings, 1 reply; 34+ messages in thread From: Ahmad Gani @ 2025-09-06 6:48 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 1:17 PM Ahmad Gani wrote: > On Sat, Sep 6, 2025 at 11:32 AM Ahmad Gani wrote: > > > On Fri, Aug 29, 2025 at 2:56 PM Ahmad Gani wrote: > > >> +void cp_nsaddr(struct gwp_dns_ctx *ctx, struct gwp_sockaddr *addr, uint8_t *addrlen) > > >> +{ > > >> + *addr = ctx->ns_addr; > > >> + *addrlen = ctx->ns_addrlen; > > >> +} > > > > > > Why is this exposed globally? I don't think it's worth a function > > > call. Better be inlined. > > > > I can't access members of the gwp_dns_ctx struct directly from another > > file, as it seems to be treated as an opaque pointer outside of dns.c. If > > I make it an inline function, it would only be accessible within dns.c. > > Should I move the full declaration of the struct to the header file? > > Oh, wait—I just realized you were talking about the inline keyword, not > static. My previous reply was unrelated; I misunderstood. In that case, > I'll add the inline keyword if it's not worth a function call. Actually, it's related, using the inline keyword in the header file required the gwp_dns_ctx struct to be defined in the header file as well. I just get this error: ``` ./src/gwproxy/dns.h:112:20: error: invalid use of undefined type ‘struct gwp_dns_ctx’ 112 | *addr = ctx->ns_addr; | ^~ ./src/gwproxy/dns.h:113:23: error: invalid use of undefined type ‘struct gwp_dns_ctx’ 113 | *addrlen = ctx->ns_addrlen; ``` -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 6:48 ` Ahmad Gani @ 2025-09-06 7:02 ` Alviro Iskandar Setiawan 0 siblings, 0 replies; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-06 7:02 UTC (permalink / raw) To: Ahmad Gani; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 1:49 PM Ahmad Gani wrote: > ./src/gwproxy/dns.h:112:20: error: invalid use of undefined type > ‘struct gwp_dns_ctx’ > 112 | *addr = ctx->ns_addr; > | ^~ > ./src/gwproxy/dns.h:113:23: error: invalid use of undefined type > ‘struct gwp_dns_ctx’ > 113 | *addrlen = ctx->ns_addrlen; My advice would be to expose that struct and kill cp_nsaddr() addr entirely. That __sys_sendto() accepts "const struct sockaddr *" anyway, so it won't change anything. No point in copying to local var. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 4:32 ` Ahmad Gani 2025-09-06 5:16 ` Ahmad Gani 2025-09-06 6:17 ` Ahmad Gani @ 2025-09-06 6:47 ` Alviro Iskandar Setiawan 2025-09-09 2:38 ` reyuki 2025-09-06 11:27 ` Ahmad Gani 3 siblings, 1 reply; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-06 6:47 UTC (permalink / raw) To: Ahmad Gani; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 11:33 AM Ahmad Gani <reyuki@gnuweeb.org> wrote: > I thought it was necessary to mark that a certain block belongs to the > experimental raw DNS feature. If I reduce the use of these marks, it might > introduce unnecessary code into the compiled version when this experimental > feature is disabled. It's necessary, but you're overusing it. Your version clutters the codebase so much. You had better have empty local functions which are compiled to nothing instead of #ifdef/#endif everywhere. Look at https://git.kernel.org/pub/scm/linux/kernel/git/axboe/liburing.git/tree/src/include/liburing/sanitize.h?id=5baeec1e1200b88da1d84ba978605728d791d693 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 6:47 ` Alviro Iskandar Setiawan @ 2025-09-09 2:38 ` reyuki 0 siblings, 0 replies; 34+ messages in thread From: reyuki @ 2025-09-09 2:38 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 1:47 PM Alviro Iskandar Setiawan wrote: > On Sat, Sep 6, 2025 at 11:33 AM Ahmad Gani wrote: > > I thought it was necessary to mark that a certain block belongs to the > > experimental raw DNS feature. If I reduce the use of these marks, it might > > introduce unnecessary code into the compiled version when this experimental > > feature is disabled. > > It's necessary, but you're overusing it. Your version clutters the > codebase so much. You had better have empty local functions which are > compiled to nothing instead of #ifdef/#endif everywhere. Look at > https://git.kernel.org/pub/scm/linux/kernel/git/axboe/liburing.git/tree/src/include/liburing/sanitize.h?id=5baeec1e1200b88da1d84ba978605728d791d693 I have reduced it to 17 using a similar way with the liburing way, is it still too much? you can take a look at this branch [1] FYI, I've pushed some commits implementing your proposal and Sir Ammar's, except for the third point in Sir Ammar's proposal. As Sir Ammar noted, DNS queries are short-lived objects. If the concurrent queries exceed 2^16, I think we can assume we're being DDoS'ed and can safely ignore further requests. If some of you think exceeding 2^16 is realistic in production and are willing to reproduce such a scenario and show me results where it actually happens, I'll reconsider implementing the third point. For now, I believe in the YAGNI principle. [1]: https://github.com/realyukii/gwproxy/tree/archived-experimental-raw-dns ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 4:32 ` Ahmad Gani ` (2 preceding siblings ...) 2025-09-06 6:47 ` Alviro Iskandar Setiawan @ 2025-09-06 11:27 ` Ahmad Gani 2025-09-06 12:01 ` Alviro Iskandar Setiawan 2025-09-06 12:58 ` Ahmad Gani 3 siblings, 2 replies; 34+ messages in thread From: Ahmad Gani @ 2025-09-06 11:27 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 11:32 AM Ahmad Gani wrote: > On Fri, Sep 5, 2025 at 11:26 PM Alviro Iskandar Setiawan wrote: > >> + txid = (uint16_t)rand(); > > > > Why do you use a random number for the txid? I don't think rand() > > guarantees uniqueness. Also, I don't find the srand() call anywhere. > > That's equivalent to having srand(1). > > > > What about using a simple incrementing counter instead (which > > guarantees the uniqueness)? > > It will indeed reset after 65535, but that should give enough room to > > complete prior DNS requests first. With a simple counter, the reuse of > > an ID will always be 65535 queries away. > > Without calling srand, it still generates random numbers, so I thought it > was fine. But I agree with your suggestion—an incrementing counter is > better, because randomness doesn’t guarantee uniqueness and is more > likely to cause collisions than a counter. (( I'll finish this first before working on the single UDP socket model. )) If it's incremented until it is recycled by itself, does it still need to be atomic? I'm considering adding _Atomic because the data probably will be accessed and incremented at the same time from multiple workers. -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 11:27 ` Ahmad Gani @ 2025-09-06 12:01 ` Alviro Iskandar Setiawan 2025-09-06 12:58 ` Ahmad Gani 1 sibling, 0 replies; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-06 12:01 UTC (permalink / raw) To: Ahmad Gani; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 6:28 PM Ahmad Gani wrote: > If it's incremented until it is recycled by itself, does it still need to > be atomic? I'm considering adding _Atomic because the data probably will > be accessed and incremented at the same time from multiple workers. Yes. That's right. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 11:27 ` Ahmad Gani 2025-09-06 12:01 ` Alviro Iskandar Setiawan @ 2025-09-06 12:58 ` Ahmad Gani 2025-09-06 13:30 ` Alviro Iskandar Setiawan 1 sibling, 1 reply; 34+ messages in thread From: Ahmad Gani @ 2025-09-06 12:58 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 6:27 PM Ahmad Gani wrote: > On Sat, Sep 6, 2025 at 11:32 AM Ahmad Gani wrote: > (( I'll finish this first before working on the single UDP socket model. )) Now assuming we are using a single UDP socket, how do we match answers to the DNS queries with the corresponding proxy session? Currently this is achieved by one UDP socket for each query. In the epoll model, when the socket is ready for a particular event, it already stores the pointer to the corresponding proxy session (gwp_conn_pair). And if we are using only one UDP socket for all queries from all proxy sessions, we might not be able to rely on the pointer returned by epoll anymore. So, what are our options in this case? If you want to take a look, I've pushed new commits in the experimental-raw-dns branch [1]. [1]: Now the UDP socket only created once at: https://github.com/realyukii/gwproxy/blob/2b2a85b08b8b178775a064b7e3583540f7397956/src/gwproxy/dns.c#L826-L829 and only registered to epoll once at: https://github.com/realyukii/gwproxy/blob/2b2a85b08b8b178775a064b7e3583540f7397956/src/gwproxy/ev/epoll.c#L1163-L1171 it can send data over the socket just fine at: https://github.com/realyukii/gwproxy/blob/2b2a85b08b8b178775a064b7e3583540f7397956/src/gwproxy/ev/epoll.c#L816-L820 and the question is how we fill out udata upon receiving a response at: https://github.com/realyukii/gwproxy/blob/2b2a85b08b8b178775a064b7e3583540f7397956/src/gwproxy/ev/epoll.c#L1102 one option I can think of is to perform some sort of lookup at: https://github.com/realyukii/gwproxy/blob/2b2a85b08b8b178775a064b7e3583540f7397956/src/gwproxy/dns.c#L255-L263 after recv, we can use the txid received from the buffer as a key to look up the corresponding key in the array, what do you think? -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 12:58 ` Ahmad Gani @ 2025-09-06 13:30 ` Alviro Iskandar Setiawan 2025-09-06 13:44 ` Alviro Iskandar Setiawan ` (2 more replies) 0 siblings, 3 replies; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-06 13:30 UTC (permalink / raw) To: Ahmad Gani; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 7:59 PM Ahmad Gani wrote: > So, what are our options in this case? Ok, good question. That's probably the most challenging part of this implementation. It will require careful planning to ensure that all edge cases are handled properly. My initial thought is, the most efficient data structure for this task would be an array of "struct gwp_conn_pair". Let's drop the atomic counter proposal. I suggest having thread-local incrementing counters. So the plan is, each thread has its own DNS resolver. That will simplify things a lot, because we don't need to worry about synchronization across threads. Each thread will have its own "struct gwp_conn_pair" array that maps the txid to your session. That also means, each thread can only have a max of 65536 concurrent connections. We can expand it by having more than one array. But I think that can be done later. Here's the initial plan (may still change as we go and discover more requirements): 1) Each thread has its own DNS resolver. 2) Each DNS resolver has: struct gwp_conn_pair *pairs[65536]; On a 64-bit machine, that's around 0.5MB per thread. On a 32-bit machine, that's around 0.25MB per thread. You may as well allocate it lazily by calling realloc() starting from a small number and keep reallocating as the number grows to keep the memory footprint low. 3) The array index will be the txid. 4) Each entry in the array will be a pointer to a struct gwp_conn_pair that contains the session information. But note that, since the lifetime of the "struct gwp_conn_pair" is now tied to both: 1) The SOCKS5 session, and 2) The DNS resolution query. You will need to make "struct gwp_conn_pair" refcounted to avoid use-after-free in case the DNS resolution query completes after the SOCKS5 session has been killed. I will think more about this and post something about it later when I have something better in my mind. Let me know your design if you have one too. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 13:30 ` Alviro Iskandar Setiawan @ 2025-09-06 13:44 ` Alviro Iskandar Setiawan 2025-09-06 14:26 ` Ahmad Gani 2025-09-07 4:22 ` Ammar Faizi 2 siblings, 0 replies; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-06 13:44 UTC (permalink / raw) To: Ahmad Gani; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 8:30 PM Alviro Iskandar Setiawan wrote: > But note that, since the lifetime of the "struct > gwp_conn_pair" is now tied to both: > 1) The SOCKS5 session, and > 2) The DNS resolution query. > You will need to make "struct gwp_conn_pair" refcounted to avoid > use-after-free in case the DNS resolution query completes after the > SOCKS5 session has been killed. Thinking about this more, I found that the "struct gwp_conn_pair" does not need to be refcounted. So, we have: struct gwp_conn_pair *sess_map[65536]; Initially, set: sess_map[0..65535] = NULL; then, when a SOCKS5 needs a DNS query, generate a txid, then set: sess_map[txid] = socks5 session; if the socks5 session dies first before the DNS query completes, set: sess_map[txid] = NULL; then, when the DNS query completes and it sees sess_map[txid] == NULL, just drop the response, do nothing. No refcount needed. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 13:30 ` Alviro Iskandar Setiawan 2025-09-06 13:44 ` Alviro Iskandar Setiawan @ 2025-09-06 14:26 ` Ahmad Gani 2025-09-06 14:30 ` Alviro Iskandar Setiawan 2025-09-07 4:22 ` Ammar Faizi 2 siblings, 1 reply; 34+ messages in thread From: Ahmad Gani @ 2025-09-06 14:26 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 8:31 PM Alviro Iskandar Setiawan wrote: > So the plan is, each thread has its own DNS resolver. That > will simplify things a lot, because we don't need to worry about > synchronization across threads. Each thread will have its own "struct > gwp_conn_pair" array that maps the txid to your session. By its own DNS resolver, do you mean each epoll worker thread has its own UDP socket instead of a single UDP socket for all workers? It might already be clear, but I just want to confirm. > I will think more about this and post something about it later > when I have something better in my mind. Let me know your design if > you have one too. I do have a proposal for this, but I think your version is better than mine. -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 14:26 ` Ahmad Gani @ 2025-09-06 14:30 ` Alviro Iskandar Setiawan 0 siblings, 0 replies; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-06 14:30 UTC (permalink / raw) To: Ahmad Gani; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 9:27 PM Ahmad Gani wrote: > By its own DNS resolver, do you mean each epoll worker thread has its own > UDP socket instead of a single UDP socket for all workers? Yes. It's better that way to avoid expensive synchronization across multiple threads. And it's still a lot better than open-close socket per query of course. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 13:30 ` Alviro Iskandar Setiawan 2025-09-06 13:44 ` Alviro Iskandar Setiawan 2025-09-06 14:26 ` Ahmad Gani @ 2025-09-07 4:22 ` Ammar Faizi 2025-09-07 5:57 ` Ammar Faizi ` (2 more replies) 2 siblings, 3 replies; 34+ messages in thread From: Ammar Faizi @ 2025-09-07 4:22 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ahmad Gani, GNU/Weeb Mailing List On Sat, Sep 06, 2025 at 08:30:54PM +0700, Alviro Iskandar Setiawan wrote: > Let me know your design if you have one too. I have a small proposal. Writing this email from my Android phone right now. I am still traveling. Sitting and waiting are boring, so I decided to exercise my brain designing this solution. It keeps my brain active. Have these two structs: struct stack_u16 { uint16_t sp; uint16_t bp; uint16_t *arr; }; struct dns_resolver { struct stack_u16 stack; struct gwp_conn_pair **sess_map; uint16_t sess_map_cap; }; [ The struct dns_resolver MAY also cover socket, addr, etc. Now my primary point here is about the session mapping data structure. ] 1) @stack is used to keep track of unused indexes in @sess_map. - Push all unused indexes into the @stack, the index will be used as the DNS query txid. - When creating a DNS query, pop the stack, use the popped number as the txid. The txid is also used to store the corresponding gwp_conn_pair session (sess_map[txid] = ptr to conn session). - When the sess_map[txid] is no longer used, set (sess_map[txid] = NULL) and push the txid back into the @stack. 2) @sess_map_cap is used to keep track of the allocated size of @sess_map. Don't allocate 65536 at once to keep the memory footprint low when the proxy server is not fully utilized. - Start allocating @sess_map from size 16. Push 15, 14, 13, ..., 0 into the @stack. - Double the size when the slot is exhausted. When double-ing the size, the new allocated indexes are all pushed into the @stack. The slot exhaustion happens when the @stack is empty, it can be identified at the pop operation. - @sess_map array can be expanded up to 65536. - @sess_map MAY only be shrunk back to 16 when all members in @sess_map are NULL. When to shrink the array can be identified at the push operation. If the number of elements in the @stack is equal to @sess_map_cap, that means @sess_map is 100% free and can be reset to the initial state (back to size 16). The content of the @stack MUST also be reset at the shrink operation to keep the next pop and push operations in sync. 3) Handle more than 65535 clients in a single thread? Doable! Just have more than one struct dns_resolver per thread, create a new DNS resolver in the same thread once the 65535 limit is hit. Client 0 to 65535, use DNS resolver A. Client 65536 to 131071, use DNS resolver B (new UDP socket!). and so on... I doubt we will reach that limit. I think it is better to add more threads to utilize more CPU cores than keep adding more clients in a single thread. But we should still support it. -- Ammar Faizi ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-07 4:22 ` Ammar Faizi @ 2025-09-07 5:57 ` Ammar Faizi 2025-09-07 6:39 ` Ahmad Gani 2025-09-07 9:52 ` Ahmad Gani 2 siblings, 0 replies; 34+ messages in thread From: Ammar Faizi @ 2025-09-07 5:57 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ahmad Gani, GNU/Weeb Mailing List On Sun, Sep 07, 2025 at 01:23:05PM +0900, Ammar Faizi wrote: > I have a small proposal. Writing this email from my Android phone right > now. I am still traveling. Sitting and waiting are boring, so I decided > to exercise my brain designing this solution. It keeps my brain active. I am back to my laptop now. Just pushed a new commit to speed up the configure script invokation. As gwproxy currently does not have any C++ code, there is no need to append CXXFLAGS, so, skip the CXXFLAGS tests. Consider rebasing if needed. https://github.com/GNUWeeb/gwproxy/commit/b1e70e468bab135a14a8faee3ea535ace9eac211 -- Ammar Faizi ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-07 4:22 ` Ammar Faizi 2025-09-07 5:57 ` Ammar Faizi @ 2025-09-07 6:39 ` Ahmad Gani 2025-09-07 6:40 ` Ahmad Gani 2025-09-07 7:06 ` Ammar Faizi 2025-09-07 9:52 ` Ahmad Gani 2 siblings, 2 replies; 34+ messages in thread From: Ahmad Gani @ 2025-09-07 6:39 UTC (permalink / raw) To: Ammar Faizi; +Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List On Sun, Sep 7, 2025 at 11:23 AM Ammar Faizi wrote: > Have these two structs: > > struct stack_u16 { > uint16_t sp; > uint16_t bp; > uint16_t *arr; > }; > > struct dns_resolver { > struct stack_u16 stack; > struct gwp_conn_pair **sess_map; > uint16_t sess_map_cap; > }; > > [ The struct dns_resolver MAY also cover socket, addr, etc. Now my > primary point here is about the session mapping data structure. ] > > 1) @stack is used to keep track of unused indexes in @sess_map. > > - Push all unused indexes into the @stack, the index will be used as > the DNS query txid. > > - When creating a DNS query, pop the stack, use the popped number as > the txid. The txid is also used to store the corresponding > gwp_conn_pair session (sess_map[txid] = ptr to conn session). > > - When the sess_map[txid] is no longer used, set > (sess_map[txid] = NULL) and push the txid back into the @stack. Is using a stack more advantageous than incrementing txid until it recycles itself? we can still keep track of unused indexes by checking if the session_map[txid] is NULL or not. > 2) @sess_map_cap is used to keep track of the allocated size of > @sess_map. Don't allocate 65536 at once to keep the memory footprint > low when the proxy server is not fully utilized. I think 65536 is a pretty low memory footprint in modern systems; as Sir Alviro noted, it's around 0.25 or 0.5 MB. Maybe it's more beneficial to allocate at once in the stack rather than add instructions for growing/shrinking the allocated memory? > 3) Handle more than 65535 clients in a single thread? Doable! > > Just have more than one struct dns_resolver per thread, create a new > DNS resolver in the same thread once the 65535 limit is hit. > > Client 0 to 65535, use DNS resolver A. > Client 65536 to 131071, use DNS resolver B (new UDP socket!). > and so on... > > I doubt we will reach that limit. I think it is better to add more > threads to utilize more CPU cores than keep adding more clients in > a single thread. But we should still support it. Yeah, I prefer to close the connection and add more workers if the concurrent clients in that worker are beyond 65535. But closing the connection because of this makes gwproxy not a robust application; I will consider allowing it to scale beyond 65535 in a single worker. -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-07 6:39 ` Ahmad Gani @ 2025-09-07 6:40 ` Ahmad Gani 2025-09-07 6:43 ` Ammar Faizi 2025-09-07 7:06 ` Ammar Faizi 1 sibling, 1 reply; 34+ messages in thread From: Ahmad Gani @ 2025-09-07 6:40 UTC (permalink / raw) To: Ammar Faizi; +Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List On Sun, Sep 7, 2025 at 1:39 PM Ahmad Gani wrote: > Is using a stack more advantageous than incrementing txid until it > recycles itself? we can still keep track of unused indexes by checking > if the session_map[txid] is NULL or not. Oh wait, I think it's indeed more efficient to use stack rather than checking for NULL, if the session_map is occupied in linear, the checking may take longer to iterate until it finds NULL. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-07 6:40 ` Ahmad Gani @ 2025-09-07 6:43 ` Ammar Faizi 0 siblings, 0 replies; 34+ messages in thread From: Ammar Faizi @ 2025-09-07 6:43 UTC (permalink / raw) To: Ahmad Gani; +Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List On Sun, Sep 07, 2025 at 01:40:35PM +0700, Ahmad Gani wrote: > Oh wait, I think it's indeed more efficient to use stack rather than > checking for NULL, if the session_map is occupied in linear, the checking > may take longer to iterate until it finds NULL. That's right, using stack will give you O(1) lookup. Constant time. -- Ammar Faizi ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-07 6:39 ` Ahmad Gani 2025-09-07 6:40 ` Ahmad Gani @ 2025-09-07 7:06 ` Ammar Faizi 2025-09-07 7:17 ` Ahmad Gani 1 sibling, 1 reply; 34+ messages in thread From: Ammar Faizi @ 2025-09-07 7:06 UTC (permalink / raw) To: Ahmad Gani; +Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List On Sun, Sep 07, 2025 at 01:39:05PM +0700, Ahmad Gani wrote: > I think 65536 is a pretty low memory footprint in modern systems; as Sir > Alviro noted, it's around 0.25 or 0.5 MB. > > Maybe it's more beneficial to allocate at once in the stack rather than > add instructions for growing/shrinking the allocated memory? Yes, it's small. But I prefer to save the space for it. The memory saving is very beneficial for a small Linux device like Raspberry Pi, 4G modems, or even your lovely home routers. Also, when you have 65536 clients in a single thread, that doesn't necessarily mean your DNS mapping are fully used from index 0 to 65535. A DNS query is a short-lived object. You only need to keep it until the DNS query receives a response. Having 65536 outstanding DNS queries is very unlikely even with 100K concurrent connections. Especially given the cache feature, you will skip the DNS resolution if identical names are accessed very often. I believe allocating it lazily is a good way for this design. If the initial size, 16, feels too low for you. You may consider adding an option to customize the initial size via CLI args. -- Ammar Faizi ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-07 7:06 ` Ammar Faizi @ 2025-09-07 7:17 ` Ahmad Gani 0 siblings, 0 replies; 34+ messages in thread From: Ahmad Gani @ 2025-09-07 7:17 UTC (permalink / raw) To: Ammar Faizi; +Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List On Sun, Sep 7, 2025 at 2:06 PM Ammar Faizi wrote: > On Sun, Sep 07, 2025 at 01:39:05PM +0700, Ahmad Gani wrote: > > I think 65536 is a pretty low memory footprint in modern systems; as Sir > > Alviro noted, it's around 0.25 or 0.5 MB. > > > > Maybe it's more beneficial to allocate at once in the stack rather than > > add instructions for growing/shrinking the allocated memory? > > Yes, it's small. But I prefer to save the space for it. The memory > saving is very beneficial for a small Linux device like Raspberry Pi, > 4G modems, or even your lovely home routers. Unfortunately I can't install gwproxy on my lovely fiberhome HG6243C routers :( > Also, when you have 65536 clients in a single thread, that doesn't > necessarily mean your DNS mapping are fully used from index 0 to 65535. > A DNS query is a short-lived object. You only need to keep it until the > DNS query receives a response. > > Having 65536 outstanding DNS queries is very unlikely even with 100K > concurrent connections. Especially given the cache feature, you will > skip the DNS resolution if identical names are accessed very often. I > believe allocating it lazily is a good way for this design. > > If the initial size, 16, feels too low for you. You may consider adding > an option to customize the initial size via CLI args. Fair enough, I will try to adopt this proposal as well. Thanks for the idea! -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-07 4:22 ` Ammar Faizi 2025-09-07 5:57 ` Ammar Faizi 2025-09-07 6:39 ` Ahmad Gani @ 2025-09-07 9:52 ` Ahmad Gani 2025-09-07 10:19 ` Ammar Faizi 2 siblings, 1 reply; 34+ messages in thread From: Ahmad Gani @ 2025-09-07 9:52 UTC (permalink / raw) To: Ammar Faizi; +Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List On Sun, Sep 7, 2025 at 11:23 AM Ammar Faizi wrote: > Have these two structs: > > struct stack_u16 { > uint16_t sp; > uint16_t bp; > uint16_t *arr; > }; > > struct dns_resolver { > struct stack_u16 stack; > struct gwp_conn_pair **sess_map; > uint16_t sess_map_cap; > }; > > [ The struct dns_resolver MAY also cover socket, addr, etc. Now my > primary point here is about the session mapping data structure. ] > > 1) @stack is used to keep track of unused indexes in @sess_map. > > - Push all unused indexes into the @stack, the index will be used as > the DNS query txid. > > - When creating a DNS query, pop the stack, use the popped number as > the txid. The txid is also used to store the corresponding > gwp_conn_pair session (sess_map[txid] = ptr to conn session). > > - When the sess_map[txid] is no longer used, set > (sess_map[txid] = NULL) and push the txid back into the @stack. Could you elaborate on the usage of bp and sp? I'm a bit confused about how they are intended to be used in practice. My assumption is that bp points to index 0 as the base, and sp points to index sess_map_cap - 1. Is that correct? If so, why do we need bp in the first place, under what circumstances would bp change to something else? > - Double the size when the slot is exhausted. When double-ing the > size, the new allocated indexes are all pushed into the @stack. > The slot exhaustion happens when the @stack is empty, it can be > identified at the pop operation. So the @stack.arr is considered to be empty when sp == bp or zero and at that moment, no more txid left to be used for incoming queries. > If the number of elements in the @stack is > equal to @sess_map_cap, that means @sess_map is 100% free and can > be reset to the initial state (back to size 16). Because I treat @stack.sp as an index that gets incremented gradually each time a push operation occurs, I assume you're talking about @stack.sp + 1 == @sess_map_cap. -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-07 9:52 ` Ahmad Gani @ 2025-09-07 10:19 ` Ammar Faizi 2025-09-07 10:36 ` Ahmad Gani 0 siblings, 1 reply; 34+ messages in thread From: Ammar Faizi @ 2025-09-07 10:19 UTC (permalink / raw) To: Ahmad Gani; +Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List That works. I suugest not to blindly copy anything as is. That was just a general idea. Since the array size in the stack is always the same as @sess_map_cap, it is fine to use @sess_map_cap as the stack boundary obviouslÿ,/h\x02\x02 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-07 10:19 ` Ammar Faizi @ 2025-09-07 10:36 ` Ahmad Gani 0 siblings, 0 replies; 34+ messages in thread From: Ahmad Gani @ 2025-09-07 10:36 UTC (permalink / raw) To: Ammar Faizi; +Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List On Sun, Sep 7, 2025 at 5:19 PM Ammar Faizi wrote: > That works. I suugest not to blindly copy anything as is. That was just a general idea. Since the array size in the stack is always the same as @sess_map_cap, it is fine to use @sess_map_cap as the stack boundary obviouslÿ. I just thought I might be missing something, so I wanted to confirm the usage of bp. I realize now it was just meant as a general idea. Sorry for being pedantic. -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-05 16:26 ` Alviro Iskandar Setiawan 2025-09-06 4:32 ` Ahmad Gani @ 2025-09-06 7:14 ` Alviro Iskandar Setiawan 2025-09-06 7:21 ` Ahmad Gani 1 sibling, 1 reply; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-06 7:14 UTC (permalink / raw) To: Ahmad Gani; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Fri, Sep 5, 2025 at 11:26 PM Alviro Iskandar Setiawan wrote: > > + if (w->ctx->cfg.use_raw_dns) { > > +#ifdef CONFIG_RAW_DNS > > + arm_poll_for_raw_dns_query(w, gcp); > > + > > + r = __sys_epoll_ctl(w->ep_fd, EPOLL_CTL_ADD, gde->udp_fd, &ev); > > if (unlikely(r)) > > - return r; > > + return (int)r; > > +#endif > > + } else { > > Bad indentation. Back to this section. Apart from that, the arm_poll_for_raw_dns_query() call here does not seem to care about what happens in __sys_sendto(). What if it fails? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 7:14 ` Alviro Iskandar Setiawan @ 2025-09-06 7:21 ` Ahmad Gani 0 siblings, 0 replies; 34+ messages in thread From: Ahmad Gani @ 2025-09-06 7:21 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 2:14 PM Alviro Iskandar Setiawan wrote: > On Fri, Sep 5, 2025 at 11:26 PM Alviro Iskandar Setiawan wrote: > > > + if (w->ctx->cfg.use_raw_dns) { > > > +#ifdef CONFIG_RAW_DNS > > > + arm_poll_for_raw_dns_query(w, gcp); > > > + > > > + r = __sys_epoll_ctl(w->ep_fd, EPOLL_CTL_ADD, gde->udp_fd, &ev); > > > if (unlikely(r)) > > > - return r; > > > + return (int)r; > > > +#endif > > > + } else { > > > > Bad indentation. > > Back to this section. Apart from that, the > arm_poll_for_raw_dns_query() call here does not seem to care about > what happens in __sys_sendto(). What if it fails? Oops, apparently I forgot to add the error checking on that. I guess if it returns an error, just drop the query and close the proxy session (?) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-08-29 7:55 ` [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend Ahmad Gani 2025-09-05 16:26 ` Alviro Iskandar Setiawan @ 2025-09-06 7:47 ` Alviro Iskandar Setiawan 2025-09-06 11:01 ` Ahmad Gani 1 sibling, 1 reply; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-06 7:47 UTC (permalink / raw) To: Ahmad Gani; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Fri, Aug 29, 2025 at 2:56 PM Ahmad Gani wrote: > +#ifdef CONFIG_RAW_DNS > + if (ctx->nr_entries == ctx->entry_cap && realloc_entries(ctx)) > + return NULL; > + > + r = __sys_socket(ctx->ns_addr.sa.sa_family, SOCK_DGRAM | SOCK_NONBLOCK, 0); > + if (r < 0) > + goto out_free_e; > + e->udp_fd = (int)r; > + > + switch (ctx->cfg.restyp) { > + case GWP_DNS_RESTYP_PREFER_IPV4: > + case GWP_DNS_RESTYP_IPV4_ONLY: > + af = AF_INET; > + break; > + case GWP_DNS_RESTYP_PREFER_IPV6: > + case GWP_DNS_RESTYP_IPV6_ONLY: > + af = AF_INET6; > + break; > + default: > + assert(0); > + goto out_close_fd; > + } > + > + txid = (uint16_t)rand(); > + r = gwdns_build_query(txid, name, af, e->payload, sizeof(e->payload)); > + if (r < 0) > + goto out_close_fd; > + e->payloadlen = (int)r; > +#endif Back to this section again. Isn't one of the advantages of inventing your own DNS resolver is also to avoid opening and closing a UDP socket for each DNS query? This design feels wrong, the txid here means nothing if each query has its own UDP connection. Can't you use the same UDP socket for all queries? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend 2025-09-06 7:47 ` Alviro Iskandar Setiawan @ 2025-09-06 11:01 ` Ahmad Gani 0 siblings, 0 replies; 34+ messages in thread From: Ahmad Gani @ 2025-09-06 11:01 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ammar Faizi, GNU/Weeb Mailing List On Sat, Sep 6, 2025 at 2:47 PM Alviro Iskandar Setiawan wrote: > On Fri, Aug 29, 2025 at 2:56 PM Ahmad Gani wrote: > > +#ifdef CONFIG_RAW_DNS > > + if (ctx->nr_entries == ctx->entry_cap && realloc_entries(ctx)) > > + return NULL; > > + > > + r = __sys_socket(ctx->ns_addr.sa.sa_family, SOCK_DGRAM | SOCK_NONBLOCK, 0); > > + if (r < 0) > > + goto out_free_e; > > + e->udp_fd = (int)r; > > + > > + switch (ctx->cfg.restyp) { > > + case GWP_DNS_RESTYP_PREFER_IPV4: > > + case GWP_DNS_RESTYP_IPV4_ONLY: > > + af = AF_INET; > > + break; > > + case GWP_DNS_RESTYP_PREFER_IPV6: > > + case GWP_DNS_RESTYP_IPV6_ONLY: > > + af = AF_INET6; > > + break; > > + default: > > + assert(0); > > + goto out_close_fd; > > + } > > + > > + txid = (uint16_t)rand(); > > + r = gwdns_build_query(txid, name, af, e->payload, sizeof(e->payload)); > > + if (r < 0) > > + goto out_close_fd; > > + e->payloadlen = (int)r; > > +#endif > > Back to this section again. Isn't one of the advantages of > inventing your own DNS resolver is also to avoid opening and closing a > UDP socket for each DNS query? This design feels wrong, the txid here > means nothing if each query has its own UDP connection. Can't you use > the same UDP socket for all queries? Mhmmm, you have a point; I'll give some thought to the single UDP socket approach. -- Ahmad Gani ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy 2025-08-29 7:55 [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy Ahmad Gani 2025-08-29 7:55 ` [PATCH gwproxy v8 1/2] dnsparser: Add dns parser code Ahmad Gani 2025-08-29 7:55 ` [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend Ahmad Gani @ 2025-09-05 9:18 ` Ammar Faizi 2025-09-05 9:34 ` Alviro Iskandar Setiawan 2 siblings, 1 reply; 34+ messages in thread From: Ammar Faizi @ 2025-09-05 9:18 UTC (permalink / raw) To: Alviro Iskandar Setiawan; +Cc: Ahmad Gani, GNU/Weeb Mailing List On Fri, Aug 29, 2025 at 2:56 PM Ahmad Gani wrote: > This is revision v8 of the initial work on the integration of the DNS > parser lib in gwproxy. This is an RFC draft; the patches themselves > aren't final. Al, can you review this series? -- Ammar Faizi ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy 2025-09-05 9:18 ` [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy Ammar Faizi @ 2025-09-05 9:34 ` Alviro Iskandar Setiawan 0 siblings, 0 replies; 34+ messages in thread From: Alviro Iskandar Setiawan @ 2025-09-05 9:34 UTC (permalink / raw) To: Ammar Faizi; +Cc: Alviro Iskandar Setiawan, Ahmad Gani, GNU/Weeb Mailing List On Fri, Sep 5, 2025 at 4:18 PM Ammar Faizi wrote: > Al, can you review this series? It slipped my mind and was not on my radar. I'll be looking into it shortly. tq -- Software Engineer & ITPM Officer Alviro Iskandar Setiawan ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2025-09-09 2:39 UTC | newest] Thread overview: 34+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-08-29 7:55 [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy Ahmad Gani 2025-08-29 7:55 ` [PATCH gwproxy v8 1/2] dnsparser: Add dns parser code Ahmad Gani 2025-08-29 7:55 ` [PATCH gwproxy v8 2/2] gwproxy: refactor code base to add experimental raw DNS backend Ahmad Gani 2025-09-05 16:26 ` Alviro Iskandar Setiawan 2025-09-06 4:32 ` Ahmad Gani 2025-09-06 5:16 ` Ahmad Gani 2025-09-06 6:17 ` Ahmad Gani 2025-09-06 6:48 ` Ahmad Gani 2025-09-06 7:02 ` Alviro Iskandar Setiawan 2025-09-06 6:47 ` Alviro Iskandar Setiawan 2025-09-09 2:38 ` reyuki 2025-09-06 11:27 ` Ahmad Gani 2025-09-06 12:01 ` Alviro Iskandar Setiawan 2025-09-06 12:58 ` Ahmad Gani 2025-09-06 13:30 ` Alviro Iskandar Setiawan 2025-09-06 13:44 ` Alviro Iskandar Setiawan 2025-09-06 14:26 ` Ahmad Gani 2025-09-06 14:30 ` Alviro Iskandar Setiawan 2025-09-07 4:22 ` Ammar Faizi 2025-09-07 5:57 ` Ammar Faizi 2025-09-07 6:39 ` Ahmad Gani 2025-09-07 6:40 ` Ahmad Gani 2025-09-07 6:43 ` Ammar Faizi 2025-09-07 7:06 ` Ammar Faizi 2025-09-07 7:17 ` Ahmad Gani 2025-09-07 9:52 ` Ahmad Gani 2025-09-07 10:19 ` Ammar Faizi 2025-09-07 10:36 ` Ahmad Gani 2025-09-06 7:14 ` Alviro Iskandar Setiawan 2025-09-06 7:21 ` Ahmad Gani 2025-09-06 7:47 ` Alviro Iskandar Setiawan 2025-09-06 11:01 ` Ahmad Gani 2025-09-05 9:18 ` [PATCH gwproxy v8 0/2] Initial work on integration of DNS parser lib in gwproxy Ammar Faizi 2025-09-05 9:34 ` Alviro Iskandar Setiawan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox