
From the hunt desk. Flow-by-flow lateral-movement rules (“alert when a host opens SMB to N new destinations”) catch worms and miss operators. Every red team you have ever paid for already knows to spread the activity across hosts, protocols, and time. The detection you actually want sees the shape of the communication graph, not individual flows. This post is the graph-analytic playbook for MITRE ATT&CK TA0008 (Lateral Movement) and TA0007 (Discovery), including the full sub-technique coverage table further down the page and validation paths that use public adversary-emulation atomics and open-source adversary-emulation frameworks.
Mature attackers do not move laterally in a straight line. They pivot through multiple hosts, change protocols at each hop — SMB to WinRM to RDP to SSH — and time their movements to blend with normal background traffic. A single-hop detection rule, no matter how well tuned, will miss the kill chain. The pattern is only visible when you stop looking at flows individually and start looking at the shape of the communication graph as a whole.
This playbook walks through a detection pipeline that builds a directed, weighted graph from internal-to-internal VPC Flow Log records, baselines it over a 14-day window, and then uses PageRank anomaly scoring, Louvain community detection, and Graph Neural Network (GNN) anomaly scoring to surface compromised pivot hosts, broken segment isolation, and complete multi-hop kill chains — all within attacker-feasible time windows.
It is part of a five-post series on production-grade VPC Flow Log detection engineering. The companion posts cover adaptive C2 beacon detection with FFT, low-and-slow data exfiltration with Isolation Forest and LSTM, botnet coordination with clustering, and living-off-the-land kill chains with Markov models. Together, the five pipelines cover the network-side of the entire MITRE ATT&CK kill chain — from initial access through exfiltration — using telemetry you already collect.
Why Per-Hop Lateral Movement Rules Miss the Kill Chain
A typical SIEM rule for lateral movement looks something like: “alert when a single host opens SMB to more than five new internal destinations in one hour.” That rule works for noisy worms — WannaCry, NotPetya — and almost nothing else. Modern operators avoid it the same way they avoid every other threshold-based rule: by spreading the activity across hosts, protocols, and time.
The structural problem is that lateral movement is not a property of any single flow. It is a property of paths. A web server reaching an LDAP server is normal. An LDAP server reaching a database server is normal. A database server reaching a domain controller is normal. The sequence — web → LDAP → DB → DC inside a 30-minute window, with the same attacker session driving each hop — is not. No flow-level alert can see that sequence because each hop in isolation looks routine.
Graph analysis fixes this by treating the network as what it actually is: a directed weighted graph where nodes are hosts and edges are communications. Once the graph is in memory, three classes of detection become possible that flow-by-flow rules cannot achieve:
- Centrality-anomaly detection — a host whose betweenness centrality or PageRank score spikes above its baseline is acting as a pivot. Even if every individual flow looks legitimate, the host has stepped into a structural role it never previously played.
- Edge-novelty detection — an edge that never existed in the 14-day baseline and uses a known lateral-movement port carries a 10× weight multiplier. Combined with cross-community detection (the edge crosses a segmentation boundary), it is a near-perfect signal for early-stage lateral movement.
- Path-traversal detection — temporal depth-first search from a centrality-anomalous node, restricted to 30-minute windows, reconstructs the actual multi-hop sequence. The output is not just “host X is suspicious” but “host X → host Y → host Z → host W within 22 minutes” — a kill chain ready for analyst triage.
Building the Communication Graph from VPC Flow Logs
The graph construction is straightforward. Nodes are unique internal IPs (both srcAddr and dstAddr in RFC 1918 space). Edges are directed and weighted by total flow count and total bytes for that (src, dst) pair over the analysis window. Edge attributes capture protocol distribution (set of destination ports), temporal distribution (how the edge’s flows are spread across the window), and TCP flag patterns.
The lateral-movement-relevant destination ports we focus on are:
- SMB / NetBIOS: 445, 135, 139
- WinRM: 5985, 5986
- RDP: 3389
- SSH / Telnet: 22, 23
- Databases: 1433 (SQL Server), 3306 (MySQL), 5432 (Postgres), 27017 (Mongo), 6379 (Redis)
Filtering at the VPC Flow Log level to only these ports — combined with srcaddr LIKE '10.%' AND dstaddr LIKE '10.%' — typically reduces a daily flow log volume by 95–99%, which means even very large enterprises can keep the graph in memory on a single Lambda or Glue worker.
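The construction described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the production pipeline: the record layout (dicts keyed by `srcaddr`, `dstaddr`, `dstport`, `bytes`) and the helper names `is_internal` and `build_edges` are assumptions made for the example. Unlike the `LIKE '10.%'` filter, it checks all three RFC 1918 ranges.

```python
import ipaddress
from collections import defaultdict

# Lateral-movement ports from the list above.
LATERAL_PORTS = {445, 135, 139, 5985, 5986, 3389, 22, 23,
                 1433, 3306, 5432, 27017, 6379}

# The three RFC 1918 private ranges.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_internal(addr: str) -> bool:
    """True when addr falls inside RFC 1918 space."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)


def build_edges(flows):
    """Aggregate flow records into directed edges keyed by (src, dst).

    `flows` is an iterable of dicts; the key names are an assumption
    modelled on the Athena column names used in this post.
    """
    edges = defaultdict(lambda: {"flow_count": 0, "total_bytes": 0, "ports": set()})
    for f in flows:
        if f["dstport"] not in LATERAL_PORTS:
            continue  # not a lateral-movement port
        if not (is_internal(f["srcaddr"]) and is_internal(f["dstaddr"])):
            continue  # keep internal-to-internal traffic only
        edge = edges[(f["srcaddr"], f["dstaddr"])]
        edge["flow_count"] += 1
        edge["total_bytes"] += f["bytes"]
        edge["ports"].add(f["dstport"])
    return dict(edges)
```

The resulting dict maps directly onto a weighted directed graph in whichever graph library you prefer: each key is an edge, and the value carries its attributes.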
Baselining the Graph Over a 14-Day Window
You cannot detect anomalies without a baseline. The pipeline computes a per-node and per-edge baseline over a rolling 14-day window. Per-node we capture:
- In-degree, out-degree — the simplest measures of how connected the host is.
- Betweenness centrality — the proportion of shortest paths between all node pairs that pass through this node. High betweenness identifies the host as a “bridge.”
- PageRank — recursive importance score; a host pointed to by many high-PageRank hosts is itself high-PageRank. Originally devised for web search (Page & Brin, 1998), the algorithm transfers cleanly to internal traffic.
- Clustering coefficient — how interconnected a node’s neighbours are. Captures the local topology around each host.
Per-edge we capture the weight (flow count and byte count) and the temporal pattern (uniform across the window, bursty, or one-off). We also run the Louvain community-detection algorithm over the baseline to discover the network’s natural segmentation — even if the team has not formally documented its subnets. Each host gets a community label, and edges that cross community boundaries become high-interest under the anomaly model.
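To make the per-node baseline concrete, here is a small dependency-free sketch that computes PageRank by power iteration and then summarises it per node across baseline snapshots. The function names, the damping factor of 0.85, and the fixed iteration count are assumptions for illustration. A graph library's built-in PageRank would normally replace the hand-rolled version, and Louvain community labelling is deliberately left out.

```python
from statistics import mean, pstdev


def pagerank(edges, damping=0.85, iters=50):
    """Power-iteration PageRank over a dict {(src, dst): weight}."""
    nodes = sorted({n for edge in edges for n in edge})
    out_weight = {n: 0.0 for n in nodes}
    for (src, _), w in edges.items():
        out_weight[src] += w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Rank held by nodes with no outbound edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        nxt = {n: base for n in nodes}
        for (src, dst), w in edges.items():
            nxt[dst] += damping * rank[src] * w / out_weight[src]
        rank = nxt
    return rank


def node_baseline(snapshots):
    """Per-node (mean, std) of PageRank across baseline snapshots.

    A node missing from a snapshot simply contributes fewer samples;
    a real pipeline would need an explicit policy for that case.
    """
    series = {}
    for edges in snapshots:
        for node, score in pagerank(edges).items():
            series.setdefault(node, []).append(score)
    return {n: (mean(v), pstdev(v)) for n, v in series.items()}
```

One caveat worth keeping in mind: PageRank scores sum to 1 over the nodes present, so a snapshot with many more or fewer hosts shifts every score. Comparing snapshots of similar size, or normalising by node count, keeps the baseline meaningful.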
Anomaly Detection: Centrality Spikes and Novel Edges
Once the baseline is established, anomaly detection runs on each new 1-hour graph snapshot. The core comparison is a per-metric z-score:
z = (current_metric − μ_baseline) / σ_baseline
Three signals fire alerts on their own; a fourth signal is structural and earns a 10× multiplier:
- PageRank anomaly: |PR_current(node) − μ_PR_baseline(node)| / σ_PR_baseline(node) > 3.0. A node whose PageRank suddenly triples, because newly compromised hosts are calling out to it, fires this alert.
- Betweenness anomaly: same z-score threshold against the betweenness centrality baseline. Flags hosts that have suddenly become network bridges.
- Out-degree explosion: a host opening connections to many new destinations it has never contacted before. Classic enumeration / scanning behaviour.
- Cross-community novel edges: any edge that (a) is new, (b) uses a lateral-movement port, and (c) crosses a Louvain community boundary. These are flagged automatically and get the 10× weight multiplier when summed into the host’s lateral-movement risk score.
The combination matters. A host with anomalous PageRank and a new SMB edge into a different community is, in operational terms, almost certainly compromised. The same host with only one of those signals is plausibly explainable as a benign change (new service rolled out, planned admin work). Tuning the alert threshold on the combination rather than the individual signals is the secret to a tractable false-positive rate.
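A rough sketch of how these signals might be combined for one host follows. The z-score and the three-part test for a cross-community novel edge come from the text. The way novel edges are summed, the floor on σ, and the names `score_host`, `Z_THRESHOLD`, and `NOVEL_MULTIPLIER` are assumptions made for illustration.

```python
LATERAL_PORTS = {445, 135, 139, 5985, 5986, 3389, 22, 23,
                 1433, 3306, 5432, 27017, 6379}
Z_THRESHOLD = 3.0
NOVEL_MULTIPLIER = 10.0


def z_score(current, mu, sigma, eps=1e-9):
    """(current - mu) / sigma, with a small floor on sigma (an assumption)
    so that a perfectly flat baseline does not divide by zero."""
    return (current - mu) / max(sigma, eps)


def score_host(pr_z, bt_z, edges, baseline_edges, community):
    """Return (signals, edge_score) for one host's outbound edges.

    edges: {(src, dst): {"ports": set, "flow_count": int}}
    baseline_edges: set of (src, dst) pairs seen in the 14-day baseline
    community: {host: Louvain community label}
    """
    signals = []
    if abs(pr_z) > Z_THRESHOLD:
        signals.append("pagerank")
    if abs(bt_z) > Z_THRESHOLD:
        signals.append("betweenness")
    edge_score = 0.0
    for (src, dst), info in edges.items():
        if (src, dst) in baseline_edges:
            continue  # only edges absent from the baseline contribute
        lateral = bool(info["ports"] & LATERAL_PORTS)
        crosses = community.get(src) != community.get(dst)
        if lateral and crosses:
            signals.append("cross_community_novel_edge")
            edge_score += NOVEL_MULTIPLIER * info["flow_count"]
        else:
            edge_score += info["flow_count"]
    return signals, edge_score
```

Alerting only when two or more distinct signals fire for the same host is one simple way to express "tune on the combination"; the exact rule is a tuning decision for your environment.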
Attack-Path Reconstruction with Temporal DFS
Identifying an anomalous node is half the work. The other half is reconstructing the kill chain so an analyst can read it as a story. The pipeline runs a depth-first search from each anomalous node, but with a strict temporal constraint: the next hop must occur within 30 minutes of the previous hop, and only outbound edges are followed. The DFS terminates at one of three boundaries:
- A node whose outbound activity stops (the dead end — typically the final target).
- A node that reaches the public internet via an egress flow (the exfiltration hop).
- The 30-minute timeout (the chain stalled or the analyst is too late).
The reconstructed path is scored:
path_score = hop_count × protocol_diversity × cross_community_count
A four-hop chain that traverses SMB, WinRM, RDP, and HTTPS — and crosses three Louvain communities along the way — scores enormously higher than a four-hop chain that stays within one community on a single protocol. The scoring function deliberately rewards exactly the behaviour real operators exhibit when they pivot toward sensitive assets.
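The temporal DFS and the path score can be sketched as below. The event layout `(ts, src, dst, port)` with epoch-second timestamps and the function names are assumptions. The sketch models the dead-end and 30-minute boundaries but not the internet-egress boundary, and it skips nodes already on the current path to avoid cycles.

```python
from collections import defaultdict

MAX_GAP = 30 * 60  # 30-minute hop window, in seconds


def temporal_paths(events, start, max_gap=MAX_GAP):
    """Enumerate maximal time-respecting outbound paths from `start`.

    events: iterable of (ts, src, dst, port) tuples, ts in epoch seconds.
    """
    outbound = defaultdict(list)
    for ev in events:
        outbound[ev[1]].append(ev)

    paths = []

    def dfs(node, last_ts, path, seen):
        extended = False
        for ev in outbound[node]:
            ts, _, dst, _ = ev
            if dst in seen:
                continue  # do not revisit a host on the current path
            # The first hop is unconstrained; each later hop must follow
            # the previous one within max_gap seconds.
            if last_ts is not None and not (0 <= ts - last_ts <= max_gap):
                continue
            extended = True
            dfs(dst, ts, path + [ev], seen | {dst})
        if not extended and path:
            paths.append(path)  # dead end or timeout: record the chain

    dfs(start, None, [], {start})
    return paths


def path_score(path, community):
    """hop_count x protocol_diversity x cross_community_count."""
    hop_count = len(path)
    protocol_diversity = len({port for _, _, _, port in path})
    cross = sum(1 for _, s, d, _ in path if community[s] != community[d])
    return hop_count * protocol_diversity * cross
```

Taken literally, the product is zero for any path that never leaves its community, so single-community chains drop out entirely. If you want them ranked rather than discarded, using `1 + cross_community_count` is one possible variation.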
Feature Engineering from VPC Flow Logs
| Feature | Source attributes | Formula | What it captures |
|---|---|---|---|
| Out-degree delta | srcAddr, dstAddr | current_unique_dst − baseline_avg_dst | New connections from a host — first signal of enumeration |
| Betweenness centrality | srcAddr, dstAddr | Σ σ_st(v) / σ_st over all pairs (s, t) | Pivot-point detection |
| PageRank anomaly | srcAddr, dstAddr, bytes | PR(current) − PR(baseline) | Structural importance shift |
| Protocol diversity | dstPort per edge | COUNT(DISTINCT dstPort) per src → dst | Multi-protocol lateral movement |
| Cross-segment flag | srcAddr, dstAddr, subnet_id | IF src_subnet ≠ dst_subnet THEN 1 ELSE 0 | Segmentation boundary crossing |
| Temporal chain score | start, srcAddr, dstAddr | Σ (1 / time_gap) for sequential hops | Fast hop sequences |
| TCP flag entropy | tcp_flags | entropy(tcp_flags distribution) | Unusual handshake patterns |
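All seven features come straight from standard VPC Flow Logs (v3+ for subnet_id and tcp_flags; if you are still on v2, enable v3 today). Because subnet_id describes the interface that recorded the flow, src_subnet and dst_subnet have to be derived by joining each address to the subnet_id seen on its own interface's records. The pipeline does not need any external enrichment to function, though pairing the output with identity context (which IAM principal owns which instance) makes triage dramatically faster.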
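Two of the table's formulas are easy to misread, so here is a short sketch of how they might be computed. The one-second floor on `time_gap` and the function names are assumptions. The entropy is plain Shannon entropy in bits over the observed `tcp_flags` values.

```python
import math
from collections import Counter


def temporal_chain_score(hop_times):
    """Sum of 1 / time_gap over consecutive hops (timestamps in seconds).

    Gaps are floored at one second (an assumption) so that simultaneous
    hops do not divide by zero.
    """
    ts = sorted(hop_times)
    return sum(1.0 / max(b - a, 1) for a, b in zip(ts, ts[1:]))


def tcp_flag_entropy(flag_values):
    """Shannon entropy (bits) of the observed tcp_flags distribution."""
    counts = Counter(flag_values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Shorter gaps between hops push the chain score up, and an edge whose flows all carry the same flag value has zero entropy.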
Athena SQL — Edge Extraction for the Graph Pipeline
Athena handles the heavy filtering and aggregation. The query below extracts candidate lateral-movement edges — flows on lateral-movement ports between internal hosts — and aggregates them into edges with attribute summaries ready for the graph layer.
-- Stage 1: accepted internal-to-internal flows on lateral-movement ports.
WITH internal_flows AS (
    SELECT srcaddr, dstaddr, dstport, bytes, packets, start, tcp_flags,
           subnet_id, interface_id, instance_id,
           -- Per-host flow counts: a rough activity proxy, not distinct-neighbour degree.
           COUNT(*) OVER (PARTITION BY srcaddr) AS src_out_degree,
           COUNT(*) OVER (PARTITION BY dstaddr) AS dst_in_degree
    FROM central_vpc_flow_logs
    WHERE action = 'ACCEPT'
      AND srcaddr LIKE '10.%' AND dstaddr LIKE '10.%'
      AND dstport IN (445, 135, 139, 5985, 5986, 3389, 22, 23, 1433, 3306, 5432, 27017, 6379)
      AND day BETWEEN '2026/03/19' AND '2026/03/23'
),
-- Stage 2: collapse flows into one row per directed (src, dst) edge.
edge_summary AS (
    SELECT srcaddr, dstaddr,
           COUNT(*) AS flow_count,
           COUNT(DISTINCT dstport) AS protocol_diversity,
           SUM(bytes) AS total_bytes,
           array_agg(DISTINCT dstport) AS ports_used,
           MIN(start) AS first_contact,
           MAX(start) AS last_contact,
           COUNT(DISTINCT instance_id) AS instances_involved
    FROM internal_flows
    GROUP BY srcaddr, dstaddr
)
-- Stage 3: keep multi-protocol or high-volume edges and rank them.
SELECT *,
       flow_count * protocol_diversity AS lateral_risk_score
FROM edge_summary
WHERE protocol_diversity >= 2 OR flow_count > 50
ORDER BY lateral_risk_score DESC;
A few notes on tuning:
- The WHERE clause in the first CTE filters to internal-to-internal flows in 10.0.0.0/8 on lateral-movement ports. If your private space also uses 172.16.0.0/12, 192.168.0.0/16, or the 100.64.0.0/10 carrier-grade NAT range, extend the filter.
- The final WHERE clause (on edge_summary) is the single tunable parameter that controls noise. protocol_diversity >= 2 is the strongest filter, because legitimate workload-to-workload communication very rarely spans more than one protocol per src → dst pair. flow_count > 50 catches the high-volume legitimate edges that we want in the graph anyway (and which form the baseline for centrality metrics).
- Output volume: a mid-size enterprise typically produces 5,000–50,000 candidate edges per day after this filter, small enough for any graph library (NetworkX, PyTorch Geometric, DGL) to handle in seconds.
The Lateral Movement Risk Score
Once edges are loaded into the graph and centralities are computed, the final risk score for each candidate sequence is:
Lateral Movement Score = Σ (edge_weight × protocol_diversity × cross_segment_penalty)
cross_segment_penalty = 3.0 if src_subnet ≠ dst_subnet
= 1.0 otherwise
The score is computed per reconstructed path, not per node. A high-scoring path is a kill chain: it carries the weight of multiple edges, the diversity of multiple protocols, and the multiplied penalty of every segment boundary crossed. In our experience the alerting threshold lands somewhere between 50 and 200 depending on enterprise size; tune it with a four-week historical backtest on known-clean traffic.
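As a worked illustration of the formula, the sketch below scores a two-edge path. Using flow count as `edge_weight`, the dict layout, and the function name are assumptions; the 3.0 / 1.0 penalty values come from the definition above.

```python
def lateral_movement_score(path_edges):
    """Sum of edge_weight x protocol_diversity x cross_segment_penalty.

    path_edges: list of dicts with edge_weight, protocol_diversity,
    src_subnet and dst_subnet (layout is an assumption for this sketch).
    """
    total = 0.0
    for e in path_edges:
        penalty = 3.0 if e["src_subnet"] != e["dst_subnet"] else 1.0
        total += e["edge_weight"] * e["protocol_diversity"] * penalty
    return total


path = [
    # Crosses a subnet boundary on two protocols: 4 * 2 * 3.0 = 24
    {"edge_weight": 4, "protocol_diversity": 2,
     "src_subnet": "subnet-a", "dst_subnet": "subnet-b"},
    # Stays inside one subnet on one protocol: 5 * 1 * 1.0 = 5
    {"edge_weight": 5, "protocol_diversity": 1,
     "src_subnet": "subnet-b", "dst_subnet": "subnet-b"},
]
score = lateral_movement_score(path)  # 24 + 5 = 29.0
```

At 29, this path would sit below the 50 to 200 threshold band mentioned above. A third boundary-crossing hop with similar weight would push it over the lower end.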
Putting It Into Production
The end-to-end architecture is intentionally lightweight:
- VPC Flow Logs → S3 (Parquet partitioned by date). If you have not enabled this, our VPC Flow Logs hunting primer covers the setup.
- EventBridge → daily Lambda kicks off the Athena query above at 03:00 local time. Output lands in a separate S3 prefix.
- Glue / Spark / SageMaker job loads the edges into a NetworkX or PyTorch Geometric graph, computes centralities, runs Louvain, and produces per-host and per-edge anomaly scores.
- Anomaly scores → SNS/Kinesis → SIEM, with the reconstructed path attached as a structured field.
- 14-day baseline refresh runs nightly with a sliding window. Older data is aged out so the baseline stays current with planned environment changes.
For the GNN variant of the pipeline — useful when you have labelled incidents to train on — PyTorch Geometric or DGL provides everything needed. We start with a 2-layer GraphSAGE classifier over node features (centralities + role tags) and edge features (protocol diversity + temporal pattern), trained on a few hundred labelled incidents from past breach disclosures and your own red-team exercises. The unsupervised pipeline (z-scores + Louvain) works without any labels, which is where most teams start.
Limits and False-Positive Sources
Real networks have legitimate centrality concentrations. Common sources of false positives:
- Domain controllers, DNS resolvers, and centralised log collectors are designed to be high-betweenness, high-PageRank nodes. They produce constant baseline anomalies until allow-listed.
- Backup servers hit every host on a schedule and look like distributed lateral movement to the algorithm.
- Vulnerability scanners (OpenVAS and its commercial counterparts) deliberately do exactly what the pipeline alerts on. Maintain an allow-list of scanner subnets.
- Configuration management agents (Ansible push, Puppet master, Chef server) reach into every host on a schedule.
- Newly deployed services create new edges that look novel to the 14-day baseline. The pipeline will alert until the baseline catches up; use a service-deployment notification channel to pre-suppress.
The cleanest operational pattern is a maintained allow-list of “known structural hubs” — DCs, DNS, scanner subnets, backup endpoints — combined with role-based suppression for newly deployed services during the first 14 days after deployment.
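One way the allow-list plus grace-period pattern could look in code is sketched below. Every address, subnet, and name here is hypothetical, and the 14-day grace period simply mirrors the baseline window.

```python
import ipaddress
from datetime import datetime, timedelta

# Hypothetical allow-list entries; replace with your own inventory.
STRUCTURAL_HUBS = {"10.0.0.10", "10.0.0.53"}                 # e.g. a DC and a DNS resolver
SCANNER_SUBNETS = [ipaddress.ip_network("10.0.250.0/24")]    # scanner subnet
GRACE_PERIOD = timedelta(days=14)                            # matches the baseline window


def should_suppress(host, alert_time, deployed_at=None):
    """Return True when an alert for `host` should be suppressed.

    deployed_at: optional {host: datetime} fed by a deployment
    notification channel (a hypothetical input for this sketch).
    """
    if host in STRUCTURAL_HUBS:
        return True
    ip = ipaddress.ip_address(host)
    if any(ip in net for net in SCANNER_SUBNETS):
        return True
    deployed = (deployed_at or {}).get(host)
    return deployed is not None and alert_time - deployed < GRACE_PERIOD
```

Suppression of this kind is best applied at alert time rather than at graph-build time, so that hubs still contribute to centrality baselines for every other host.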
MITRE ATT&CK Techniques Covered by This Detection
This pipeline targets the Lateral Movement (TA0008) and Discovery (TA0007) tactics, with adjacent coverage of Credential Access (TA0006). The graph-anomaly signal fires on the structural traces an operator leaves behind during hands-on-keyboard movement, regardless of which specific tool they used. The table is your purple-team coverage worksheet.
| ATT&CK ID | Technique / sub-technique | Coverage | Hunter notes |
|---|---|---|---|
| T1021 | Remote Services (parent) | Full | Core surface — every sub-technique below produces graph-level evidence |
| T1021.001 | Remote Desktop Protocol (RDP) | Full | Multi-hop RDP is a classic chain — DFS reconstruction trivial |
| T1021.002 | SMB / Windows Admin Shares | Full | Port 445/135/139 carry the highest edge weight in the model |
| T1021.004 | SSH | Full | Internal-to-internal SSH on port 22 is rare and high-signal |
| T1021.006 | Windows Remote Management (WinRM) | Full | 5985/5986 — pairs naturally with PowerShell remoting hunts |
| T1018 | Remote System Discovery | Full | Out-degree explosion is the canonical scan signature |
| T1046 | Network Service Discovery | Full | Multi-port probing surfaces in protocol_diversity feature |
| T1570 | Lateral Tool Transfer | Full | Byte-weight on lateral edges spikes when payloads move |
| T1210 | Exploitation of Remote Services | Partial | Surface — exploit success is what creates the new edge; pre-exploit recon also surfaces |
| T1550.002 | Pass the Hash | Partial | Network footprint identical to legitimate SMB — pair with auth-event hunts |
| T1550.003 | Pass the Ticket | Partial | — |
| T1558 | Steal or Forge Kerberos Tickets | Partial | Kerberoasting traffic (port 88) flagged when paired with anomalous LDAP enumeration |
| T1558.003 | Kerberoasting | Partial | — |
| T1078 | Valid Accounts | Partial | The hardest case — legitimate creds, legitimate ports, abnormal graph position |
| T1059.001 | Command and Scripting Interpreter: PowerShell | Partial | WinRM channel covered at the network level; payload analysis needs EDR |
| T1047 | Windows Management Instrumentation | Partial | WMI over DCOM (port 135) surfaces in protocol_diversity |
| T1219 | Remote Access Software (legitimate RMM abuse) | Partial | AnyDesk / commercial remote-access tools / RustDesk surface via destination-port anomaly |
| T1572 | Protocol Tunneling | Out of scope | SSH/ICMP tunnels need post #3’s covert-channel hunt |
Adversary emulation / purple-team validation. The highest-value public adversary-emulation atomic tests for this detection are T1021.002 (SMB), T1021.006 (WinRM), and T1046 (service discovery). For a realistic multi-stage chain, run a “discovery-and-lateral-movement” operation profile from an open-source adversary-emulation framework against a 3-tier lab segment. The graph anomaly score should peak as soon as the operator crosses the second segment boundary.
Sigma / detection-as-code. Output your graph anomaly events into the SIEM as structured fields — graph_pagerank_z, graph_betweenness_z, lateral_risk_score, attack_path — and write the Sigma rule as a simple threshold check. This separation keeps the heavy graph maths in Lambda/SageMaker and the alert logic in code review.
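A minimal sketch of the structured event and the threshold logic a rule would then express follows. The four field names come from the paragraph above. The `host` field, the " -> " path encoding, the function names, and the default thresholds are assumptions.

```python
import json


def build_event(host, pagerank_z, betweenness_z, risk_score, path):
    """Build the structured event the graph job would ship to the SIEM."""
    return {
        "host": host,  # assumed extra field for triage
        "graph_pagerank_z": round(pagerank_z, 2),
        "graph_betweenness_z": round(betweenness_z, 2),
        "lateral_risk_score": risk_score,
        "attack_path": " -> ".join(path),
    }


def threshold_alert(event, z_threshold=3.0, risk_threshold=50):
    """Simple threshold check, mirroring what the SIEM-side rule would do."""
    structural = (abs(event["graph_pagerank_z"]) > z_threshold
                  or abs(event["graph_betweenness_z"]) > z_threshold)
    return structural and event["lateral_risk_score"] >= risk_threshold


event = build_event("10.0.1.5", 6.04, 1.2, 120,
                    ["10.0.1.5", "10.0.2.9", "10.0.9.9"])
payload = json.dumps(event)  # what would be published to SNS/Kinesis
```

Keeping the event flat and numeric means the rule on the SIEM side stays a plain field comparison that is easy to read in code review.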
D3FEND mappings. The pipeline implements D3-NTCD (Network Traffic Community Deviation) directly via Louvain community analysis, and D3-NTA (Network Traffic Analysis) as the umbrella defensive technique. Useful framing when you justify the investment to your CISO.
Where This Sits in a Mature Threat Hunting Programme
Graph-based lateral-movement detection pairs naturally with the other four detections in this VPC-Flow-Log series and with the broader hunt patterns already on HACKFORLAB:
- Adaptive C2 beacon detection (FFT + DBSCAN) — the initial-access side.
- VPC Flow Log attack hunting — volumetric and access-pattern hunts.
- Outbound network threat hunting — destination enrichment.
- Cloud attack threat hunting — identity and resource-level evidence.
- Hunting AWS identity attacks — for the IAM side of the pivot.
- AWS Bedrock CloudTrail playbook — for the GenAI service surface.
- Authentication-event threat hunting — for the auth-side correlation.
- Linux threat hunting with CUT, SORT, UNIQ, DIFF — for the host-side investigation once a lateral-movement alert lands.
Closing Thoughts
If your SOC is still hunting lateral movement one flow at a time, you are missing the structural signal that real attackers cannot hide. Graph methods are mature, the libraries are free, and the SQL above is the only piece of plumbing you need to drop the analytical layer in front of your existing VPC Flow Logs. The investment is two engineer-days; the payoff is detection coverage for an entire kill-chain phase that most enterprises currently miss.
Tune the parameters against your environment. Backtest against your own incidents. Send us your war stories. Happy threat hunting.
#threathunting #lateralmovement #vpcflowlogs #awssecurity #cloudsecurity #graphneuralnetwork #pagerank #louvain #communitydetection #mitreattack #soc #blueteam #networkdetection #anomalydetection #cyberdefense #infosec #ml #detectionengineering










