DGA and DNS-Tunnel Hunting at Scale — ML domain anomaly and tunnel volumetrics on VPC Flow Logs

From the hunt desk. Ask any SOC analyst on call this week what their DGA detection rule looks like, and you will get one of three answers: a SIEM macro that counts NXDOMAIN responses per host, a Sigma rule borrowed from a 2019 talk, or, most commonly, a slow head-shake and “we rely on the resolver vendor.” All three are wrong answers for 2026. Modern operators rotate domains within minutes, blend DGA queries into legitimate DNS traffic, and tunnel command-and-control payloads through TXT records on hijacked third-party domains that no vendor feed will ever flag. The detection has to live in your own pipeline, not somebody else’s allow-list.

This post is the playbook for hunting Domain Generation Algorithm (DGA) callbacks and DNS-tunnel exfiltration using AWS VPC Flow Logs as the primary signal, augmented with Route 53 Resolver query logs for lexical analysis. We cover entropy scoring on subdomain labels, n-gram dictionaries trained on legitimate Alexa-style traffic, NXDOMAIN burst detection, and inter-arrival timing analysis on UDP/53 flows — all wired into a pipeline you can run on commodity Athena + Lambda inside a $50/month AWS budget.

It is post #6 in our running VPC Flow Log detection-engineering series. The companions are adaptive C2 beaconing (FFT + DBSCAN), lateral movement via graph analysis, low-and-slow exfiltration, botnet coordination clustering, and living-off-the-land Markov chains. Together, these are the hunts every team using AWS for production workloads should have running before the end of this quarter.

What VPC Flow Logs Actually Tell You About DNS (and What They Don’t)

The first sin most teams commit when hunting DNS-based attacks is assuming VPC Flow Logs carry DNS payload data. They don’t. A VPC Flow Log record for a DNS query gives you the five-tuple (source IP, source port, destination IP, destination port, protocol), the timestamps, byte counts, packet counts, and a TCP-flag summary that is useless for UDP/53. What you can derive from that is volumetric and temporal (query rate per host, byte size per query, inter-arrival timing) but not the queried domain itself. One further limit: flow logs do not record traffic between instances and the Amazon-provided DNS server, so the UDP/53 flows you do see are queries sent to other resolvers.

For lexical analysis (entropy, n-gram, dictionary distance) you need Route 53 Resolver query logs. They are free to generate, cheap to store in S3, and Athena-queryable out of the box. If your account is not emitting them yet, that is a 10-minute fix and the prerequisite for everything below. The split looks like this:

  • VPC Flow Logs: query rate, byte volume, IAT, NXDOMAIN-versus-NOERROR ratio (when you correlate with response sizes), spikes in UDP/53 to non-resolver IPs.
  • Route 53 Resolver query logs: full query strings, record types (A, AAAA, TXT, NULL, CNAME), response codes, and the answers returned.

The pipeline below uses both. If you are stuck on VPC Flow Logs alone — air-gapped, regulatory constraint, whatever the reason — you can still catch the tunneling and burst patterns, you just lose the lexical-classifier branch.
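Inter-arrival timing is one of the temporal features you can derive from flow records alone. The following is a minimal sketch, assuming you have already pulled the `start` timestamps (epoch seconds) for one host's UDP/53 flow records. The function name and the coefficient-of-variation summary are illustrative choices, not part of the flow-log format. Flow records are aggregated over the capture window, so these are gaps between records rather than between individual packets.

```python
from statistics import mean, pstdev

def inter_arrival_stats(start_times):
    """Summarise gaps between consecutive flow start times (epoch seconds).

    Returns (mean_gap, coefficient_of_variation), or None when fewer than
    three timestamps are available (at least two gaps are needed).
    """
    ordered = sorted(start_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None
    avg = mean(gaps)
    if avg == 0:
        # All records share one timestamp; no spread to measure.
        return (0.0, 0.0)
    return (avg, pstdev(gaps) / avg)
```

A coefficient of variation near zero means the gaps are almost identical, which is what machine-regular polling tends to look like. Interactive workloads usually produce far more irregular gaps.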

Why Naive DGA Detection Fails in 2026

Every DGA detection rule that ships in default SIEM content was tuned against Conficker, CryptoLocker, or Necurs. Those families used predictable domain seeds, generated 10–10,000 candidate domains per day, and produced massive NXDOMAIN spikes that any threshold rule would catch. That world ended around 2018. Modern operators do four things that break the old rules:

  • Mixed-language dictionary DGAs. Instead of xq7p2k8a4mz.com, the algorithm produces quietlampriverstone.net: four real English words concatenated. Entropy scores look like a small business website. The old “high entropy = bad” heuristic produces nothing but false positives on Cloudflare R2 buckets and Azure storage URLs.
  • Sparse query patterns. Polling 3 candidate domains per hour instead of 1,000 per day. NXDOMAIN bursts never form because the operator only requires one successful resolution.
  • Living off legitimate domains. Encoding C2 payloads into subdomain queries on a compromised real domain (aGVsbG8tdGhlcmU.legit-saas.example). The parent domain is on every allow-list; the subdomain is the C2 channel.
  • DoH and DoT. DNS over HTTPS and DNS over TLS hide the queries entirely. If your workload uses a public DoH resolver, all DNS traffic looks like 443/TCP to Cloudflare or Google. You need to either force-route DNS through Route 53 Resolver or detect the DoH pattern itself.

None of these are theoretical. We have observed all four in real engagements during the past 18 months — including one case where a compromised CI/CD runner exfiltrated 22 MB of source code through TXT-record queries on a third-party SaaS domain that nobody on the team had any reason to question. The volumetric signal was visible the entire time in VPC Flow Logs. Nobody was looking.

The Detection Pipeline

Figure: DGA and DNS-tunnel detection pipeline, a five-step architecture from VPC Flow Logs through the ML classifier to alert.

Five stages, run daily. Each stage is independently inspectable and tunable.

  1. Ingest. Route 53 Resolver query logs land in S3 as JSON, partitioned by date and account. VPC Flow Logs land in their usual location as Parquet. Both are queryable from a single Athena workgroup.
  2. Per-domain lexical features. For each unique queried domain (excluding the in-org allow-list), compute: subdomain length, character entropy, dictionary-word coverage, vowel ratio, and character n-gram log-likelihood against a pretrained model. The model is a 3-gram character-level Markov chain trained on the Tranco top-100k (a free, daily-updated alternative to the discontinued Alexa list).
  3. Random Forest classifier. Inputs the lexical features from stage 2. Trained on a labelled corpus of legitimate top-100k domains and a publicly available DGA corpus (DGArchive’s open dataset is the standard reference; caveats below). Output: dga_score between 0 and 1. Score above 0.7 = candidate DGA.
  4. Volumetric overlay (VPC Flow Logs). For each candidate DGA source host, compute NXDOMAIN-ratio over the prior hour (using Resolver log response codes), DNS query rate, and DNS byte volume relative to baseline. Anomalies on any axis raise the host’s overall threat score.
  5. Alert + IOC export. Final score combines DGA classifier confidence and host-level volumetric anomaly. Above threshold, output an alert to SNS / SIEM, and push the candidate domain into MISP / OpenCTI as a STIX 2.1 indicator with TTL of 7 days. Cyber-threat-intel teams elsewhere benefit; you benefit when the same domain appears again from another host.
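The IOC export in stage 5 can be sketched as a hand-built STIX 2.1 Indicator using only the standard library. The function name, the `name` text, and the `indicator_types` value are assumptions made for illustration. Only the 7-day validity window comes from the step above. The sketch assumes the domain has already been validated and contains no quote characters.

```python
import json
import uuid
from datetime import datetime, timedelta, timezone

STIX_TIME = "%Y-%m-%dT%H:%M:%S.000Z"

def build_domain_indicator(domain, dga_score, now=None, ttl_days=7):
    """Build a STIX 2.1 Indicator dict for a candidate DGA domain.

    Assumes `domain` is already validated (no quotes or backslashes),
    because it is interpolated directly into the STIX pattern.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime(STIX_TIME)
    expiry = (now + timedelta(days=ttl_days)).strftime(STIX_TIME)
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": stamp,
        "modified": stamp,
        "name": f"Candidate DGA domain (dga_score={dga_score:.2f})",
        "indicator_types": ["malicious-activity"],
        "pattern": f"[domain-name:value = '{domain}']",
        "pattern_type": "stix",
        "valid_from": stamp,
        "valid_until": expiry,  # the 7-day TTL
    }

# Serialise for whatever ingestion endpoint your platform exposes.
example_json = json.dumps(
    build_domain_indicator("quietlampriverstone.net", 0.91), indent=2
)
```

How the JSON reaches MISP or OpenCTI depends on your deployment, so that step is deliberately left out.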

The DGArchive caveat: the corpus is research-grade and aging. Operators we see in 2026 frequently produce DGAs that postdate the training data and bypass classifiers trained on it. Re-train monthly against your own NXDOMAIN bursts (sampled, validated by an analyst), and you will catch the families that no public corpus has yet seen.
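The character-level 3-gram model from stage 2 is small enough to sketch in full. This is a minimal illustration with add-one smoothing, not the production model. The function names, the padding markers, and the 38-symbol alphabet are assumptions. In practice the training input would be the Tranco labels rather than a toy list.

```python
import math
from collections import defaultdict

# Assumed alphabet: a-z, 0-9, hyphen, plus the end marker "$".
ALPHABET_SIZE = 38

def train_trigram_model(labels):
    """Count next-character frequencies for every two-character context."""
    counts = defaultdict(lambda: defaultdict(int))
    for label in labels:
        padded = "^^" + label.lower() + "$"
        for i in range(2, len(padded)):
            counts[padded[i - 2:i]][padded[i]] += 1
    return counts

def avg_log_likelihood(model, label):
    """Mean log-probability per transition, using add-one smoothing."""
    padded = "^^" + label.lower() + "$"
    total = 0.0
    for i in range(2, len(padded)):
        context, char = padded[i - 2:i], padded[i]
        seen = model.get(context, {})  # .get avoids creating new entries
        numerator = seen.get(char, 0) + 1
        denominator = sum(seen.values()) + ALPHABET_SIZE
        total += math.log(numerator / denominator)
    return total / (len(padded) - 2)

# Toy training set standing in for a real top-sites list.
toy_model = train_trigram_model([
    "mail", "login", "cloud", "portal", "static",
    "images", "update", "secure", "support", "download",
])
```

Averaging per transition keeps long and short labels comparable. Labels built from familiar character sequences score closer to zero. Labels full of unseen transitions score lower, and that value is what the classifier consumes as a feature.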

Feature Engineering From the Telemetry You Have

| Feature | Source | Formula / method | What it captures |
| --- | --- | --- | --- |
| Subdomain entropy | Resolver logs | Shannon entropy on character distribution | Random-looking labels |
| N-gram log-likelihood | Resolver logs | log P(s) under trained char-level Markov model | How English-like the label is |
| Dictionary coverage | Resolver logs | Fraction of label covered by dictionary words | Catches mixed-language DGAs |
| Subdomain length | Resolver logs | length(label) excluding TLD | Tunnels use very long labels |
| NXDOMAIN ratio | Resolver logs | nxdomain_count / total_queries per host/hour | Sparse-DGA hosts deviate |
| DNS query rate | VPC Flow Logs (UDP/53) | Flows per minute per source | Burst detection |
| DNS payload volume | VPC Flow Logs (UDP/53) | Bytes per query (avg + p95) | Tunnel signature; legit DNS is < 200 bytes/query |
| Resolver target diversity | VPC Flow Logs | Distinct resolver IPs per host | Catches hosts bypassing default resolver |
| TXT-record query share | Resolver logs | txt_queries / total_queries | Tunnels lean heavily on TXT |
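The per-label lexical features can be sketched in a few lines. This is an illustrative version under stated assumptions: the function names are hypothetical, the word list is a stand-in for a real dictionary, and the greedy longest-match rule for dictionary coverage is one simple choice among several.

```python
import math
from collections import Counter

VOWELS = set("aeiou")

def shannon_entropy(label):
    """Shannon entropy of the label's character distribution, in bits."""
    if not label:
        return 0.0
    n = len(label)
    return -sum((c / n) * math.log2(c / n) for c in Counter(label).values())

def vowel_ratio(label):
    """Share of alphabetic characters that are vowels."""
    letters = [ch for ch in label.lower() if ch.isalpha()]
    return sum(ch in VOWELS for ch in letters) / len(letters) if letters else 0.0

def dictionary_coverage(label, words, min_len=3):
    """Fraction of the label covered by greedy longest-match dictionary words."""
    label = label.lower()
    if not label:
        return 0.0
    by_length = sorted((w for w in words if len(w) >= min_len), key=len, reverse=True)
    covered, i = 0, 0
    while i < len(label):
        match = next((w for w in by_length if label.startswith(w, i)), None)
        if match:
            covered += len(match)
            i += len(match)
        else:
            i += 1
    return covered / len(label)

def lexical_features(label, words):
    """Bundle the per-label features into one dict for the classifier."""
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        "vowel_ratio": vowel_ratio(label),
        "dictionary_coverage": dictionary_coverage(label, words),
    }
```

Running this on the two example labels from earlier in the post shows why entropy alone is not enough. Raw entropy is not length-normalised, so the longer dictionary-style label does not score lower than the random-looking one. Dictionary coverage separates the two cleanly.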

Athena SQL — Putting the Pieces Together

The first query is the volumetric host profile from VPC Flow Logs. Numeric protocol 17 is UDP; port 53 is DNS. The 100-byte threshold on average bytes per query is the single most useful filter for tunnel detection — legitimate DNS rarely exceeds it.

-- Volumetric DNS profile per source host from VPC Flow Logs.
-- A flow record aggregates packets over the capture window, so per-query
-- figures are derived from packet counts, not from the number of records.
WITH dns_flows AS (
    SELECT srcaddr, dstaddr, bytes, packets
    FROM central_vpc_flow_logs
    WHERE action = 'ACCEPT'
      AND srcaddr LIKE '10.%'                 -- internal sources in 10.0.0.0/8 only
      AND protocol = 17 AND dstport = 53      -- UDP/53
      AND day BETWEEN '2026/05/09' AND '2026/05/15'
)
SELECT srcaddr,
       SUM(packets)                                                        AS dns_query_count,
       COUNT(DISTINCT dstaddr)                                             AS resolver_diversity,
       CAST(SUM(bytes) AS DOUBLE) / NULLIF(SUM(packets), 0)                AS avg_bytes_per_query,
       APPROX_PERCENTILE(CAST(bytes AS DOUBLE) / NULLIF(packets, 0), 0.95) AS p95_bytes,
       SUM(bytes)                                                          AS total_dns_bytes
FROM dns_flows
GROUP BY srcaddr
HAVING CAST(SUM(bytes) AS DOUBLE) / NULLIF(SUM(packets), 0) > 100
    OR COUNT(DISTINCT dstaddr) > 2
    OR SUM(bytes) > 5000000
ORDER BY total_dns_bytes DESC;

The second query runs against the Route 53 Resolver logs (Athena table r53_resolver_logs) and surfaces NXDOMAIN bursts and TXT-record concentration per host:

-- Per-host NXDOMAIN and TXT profile from Route 53 Resolver query logs.
SELECT srcaddr,
       COUNT(*)                                                                AS total_queries,
       COUNT_IF(rcode = 'NXDOMAIN')                                            AS nxdomain_count,
       CAST(COUNT_IF(rcode = 'NXDOMAIN') AS DOUBLE) / NULLIF(COUNT(*), 0)      AS nxdomain_ratio,
       COUNT_IF(query_type = 'TXT')                                            AS txt_queries,
       AVG(LENGTH(query_name))                                                 AS avg_query_length,
       MAX(LENGTH(query_name))                                                 AS max_query_length,
       APPROX_PERCENTILE(LENGTH(query_name), 0.95)                             AS p95_query_length
FROM r53_resolver_logs
WHERE day BETWEEN '2026/05/09' AND '2026/05/15'
GROUP BY srcaddr
-- Athena does not resolve SELECT aliases in HAVING, so the expressions are repeated.
HAVING CAST(COUNT_IF(rcode = 'NXDOMAIN') AS DOUBLE) / NULLIF(COUNT(*), 0) > 0.30
    OR COUNT_IF(query_type = 'TXT') > 50
    OR MAX(LENGTH(query_name)) > 100
ORDER BY total_queries DESC;

Pair both outputs in your Lambda / SageMaker scoring step. The volumetric layer (VPC Flow Logs) is your first-line filter — it drops 99% of hosts. The lexical layer (Resolver logs + classifier) is the second-line. Together, the false-positive rate sits around 1–3 alerts per million queries in a typical enterprise environment, and analyst triage takes under five minutes per alert.
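Pairing the two query outputs and blending the scores might look like the sketch below. Everything here is an assumption made for illustration: the function names, the choice of an inner join on `srcaddr`, and the 0.6/0.4 weights. The post does not specify how the final score is combined, so treat the weights as a starting point to tune.

```python
def pair_by_host(volumetric_rows, resolver_rows):
    """Inner-join the two query outputs on srcaddr.

    Hosts that appear in only one result set are dropped, which matches
    using the volumetric layer as a first-line filter.
    """
    resolver_by_host = {row["srcaddr"]: row for row in resolver_rows}
    paired = []
    for vol in volumetric_rows:
        res = resolver_by_host.get(vol["srcaddr"])
        if res is not None:
            paired.append({**vol, **res})
    return paired

def threat_score(dga_score, volumetric_anomaly, w_dga=0.6, w_vol=0.4):
    """Weighted blend of classifier confidence and volumetric anomaly.

    Both inputs are expected in [0, 1]; the weights are hypothetical.
    """
    for value in (dga_score, volumetric_anomaly):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores must be in [0, 1]")
    return w_dga * dga_score + w_vol * volumetric_anomaly
```

A linear blend is easy to explain to an analyst during triage, which matters more at this stage than squeezing out the last bit of accuracy.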

What Real Tunnels Look Like in Your Data

If you have never seen DNS tunnel traffic in your own logs, the signature is unmistakable once you know what to look for:

  • Average bytes per query above 200. Legitimate DNS A-record queries are typically 60–80 bytes; AAAA queries are 70–95 bytes; even SRV and MX queries rarely exceed 200. iodine and dnscat2 tunnels routinely push 400+ bytes per query.
  • TXT-record share above 5%. Most production workloads make under 1% TXT queries (mostly SPF/DKIM lookups by mail servers). A web server suddenly making 30% TXT queries is very likely tunneling or exfiltrating and should be triaged immediately.
  • NXDOMAIN ratio above 30%. Sparse DGA hosts produce a steady stream of failed lookups as they search for the live operator domain.
  • Resolver IP diversity. A host that normally queries only the VPC resolver suddenly bypassing it to hit 1.1.1.1, 8.8.8.8, or 9.9.9.9 is either misconfigured or doing something the security team should know about.
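The four signatures above translate directly into a per-host check. This is a minimal sketch, assuming one dict per host with the column names produced by the two Athena queries. The flag names and the default resolver limit of 2 are illustrative. The byte, TXT, and NXDOMAIN thresholds are the ones listed above.

```python
def tunnel_flags(host, max_avg_bytes=200, max_txt_share=0.05,
                 max_nxdomain_ratio=0.30, max_resolvers=2):
    """Return the list of tunnel/DGA signatures a host profile trips."""
    total = host["total_queries"]
    txt_share = host["txt_queries"] / total if total else 0.0
    nx_ratio = host["nxdomain_count"] / total if total else 0.0

    flags = []
    if host["avg_bytes_per_query"] > max_avg_bytes:
        flags.append("large_queries")
    if txt_share > max_txt_share:
        flags.append("txt_heavy")
    if nx_ratio > max_nxdomain_ratio:
        flags.append("nxdomain_heavy")
    if host["resolver_diversity"] > max_resolvers:
        flags.append("resolver_bypass")
    return flags
```

Returning the individual flags, rather than a single boolean, lets the analyst see at a glance which signature fired.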

Limits and False-Positive Sources

  • CDN-fronted SaaS. Cloudflare R2, Akamai EdgeWorkers, AWS CloudFront URLs all use long-random-looking subdomain segments that look DGA-ish to entropy classifiers. Maintain an ASN allow-list for legitimate CDN traffic.
  • Browser canary domains. Chrome issues queries for three random single-label hostnames (7 to 15 letters each) on startup to detect DNS interception. These return NXDOMAIN and will trip naïve NXDOMAIN detectors. Filter on user-agent-correlated source ports if you can.
  • Collaboration platforms. These use DNS-heavy connection-rotation patterns that look suspicious in volumetric features. Allow-list by destination ASN.
  • Anti-malware / EDR resolvers. Enterprise endpoint security platforms generate elevated DNS traffic to vendor cloud resolvers. Allow-list by destination IP block.
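Allow-listing by destination IP block needs nothing beyond the standard library. This is a sketch with hypothetical function names. The CIDR ranges used with it should be your vendors' published blocks, which are not reproduced here.

```python
import ipaddress

def build_allowlist(cidrs):
    """Parse CIDR strings once so lookups do not re-parse them."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]

def is_allowlisted(dst_ip, networks):
    """True when dst_ip falls inside any allow-listed network."""
    addr = ipaddress.ip_address(dst_ip)
    return any(addr in net for net in networks)

def drop_allowlisted(flow_rows, networks):
    """Keep only flow rows whose dstaddr is not in the allow-list."""
    return [row for row in flow_rows if not is_allowlisted(row["dstaddr"], networks)]
```

Apply the filter before computing resolver diversity, so that a known vendor resolver does not count toward a host's distinct-resolver total.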

MITRE ATT&CK Techniques Covered by This Detection

| ATT&CK ID | Technique / sub-technique | Coverage | Hunter notes |
| --- | --- | --- | --- |
| T1071.004 | Application Layer Protocol: DNS | Full | The core surface: entropy + volume features |
| T1568 | Dynamic Resolution (parent) | Full | |
| T1568.002 | Dynamic Resolution: Domain Generation Algorithms | Full | Classifier is purpose-built for this technique |
| T1568.003 | Dynamic Resolution: DNS Calculation | Partial | Catches some via NXDOMAIN burst |
| T1568.001 | Dynamic Resolution: Fast Flux DNS | Partial | Resolver-target diversity feature |
| T1048.003 | Exfiltration Over Unencrypted Non-C2 Protocol (DNS exfil) | Full | iodine / dnscat2 / custom tunnels |
| T1572 | Protocol Tunneling | Full | DNS tunnel is the canonical case |
| T1090 | Proxy | Partial | DoH/DoT proxy bypassing the corp resolver visible in resolver-diversity feature |
| T1102 | Web Service (when DoH is used) | Partial | |
| T1071.001 | Web Protocols (DoH variants) | Partial | |
| T1041 | Exfiltration Over C2 Channel | Out of scope | See series post #3 (Isolation Forest + LSTM) |

Adversary emulation. The cleanest validation is to run iodine or dnscat2 from a lab host against a controlled subdomain you own. The volumetric features should fire within 60 seconds of the tunnel starting. Public adversary-emulation atomics for T1048.003 provide scripted variants. For DGA emulation specifically, domain_generation_algorithms is the academic reference implementation: generate samples from a half-dozen families and confirm your classifier scores them correctly.

Sigma / detection-as-code. The volumetric Sigma rule fires on the SIEM directly from VPC Flow Logs. The lexical detection runs in your ML pipeline and emits a dga_score field; Sigma then thresholds on that. Two rules, both maintainable.

Adversary groups. Persistent DGA users in current threat-intel include G0035 — Dragonfly, G0080 — Cobalt Group, and most of the financially-motivated ransomware affiliates that run on commercial C2 frameworks with custom DNS profiles. DNS tunneling specifically shows up in G0050 — APT32 (OceanLotus) tradecraft repeatedly.

Where This Sits in a Mature Threat Hunting Programme

Closing Thoughts

DNS is the protocol every adversary still gets to use. Block it and your business stops; allow it without inspection and the back door is permanently open. The pipeline above is not exotic — it is the bare minimum any cloud-native SOC should run on its own DNS traffic in 2026. Build it, tune it, send us your tuning lessons. The threat hunters who share their false-positive curves help the entire community converge faster on the right defaults.

Happy threat hunting.
