Living-off-the-Land Kill Chain Detection with Markov Chains


From the hunt desk. The hardest attacks blend into normal traffic at every single flow. Every packet looks legitimate; every port is one you allow; every protocol is one your business uses. The signal is in the sequence, not in any one event. This post is the Markov-chain playbook that closes our five-part VPC Flow Log detection series — full kill-chain coverage across MITRE ATT&CK from Initial Access (TA0001) through Exfiltration (TA0010), an ensemble of three ML models, and four documented kill-chain patterns ready to drop into your SIEM as named-chain detection rules.

The most sophisticated cyber attacks blend into normal traffic at every single stage. They use legitimate ports (443, 53, 80), standard protocols, and normal byte volumes. No single flow is anomalous. Looking at any one network event from a living-off-the-land (LOTL) attack tells you nothing — every event individually is something the host could plausibly do on a routine day. The attack only becomes visible when you model the sequence of behaviours as a Markov chain of network states and ask: is this transition pattern probable for this host?

This playbook walks through that pipeline. Each VPC Flow Log record is mapped to a network state — EXTERNAL_HTTPS, METADATA_ACCESS, INTERNAL_LDAP, INTERNAL_SMB, INTERNAL_DATABASE, and roughly fifteen others — chosen to align with MITRE ATT&CK technique categories. We compute a Markov transition matrix from 14 days of baseline behaviour, score each new state sequence by its joint probability under the baseline, and ensemble the result with Isolation Forest (per-host features) and an LSTM (temporal patterns). The output is a kill-chain score: which hosts followed an unlikely sequence of behaviours within an attack-feasible time window.

This is the final post in our five-part VPC Flow Log detection-engineering series. The companions are adaptive C2 beacon detection, lateral movement graph detection, low-and-slow data exfiltration, and botnet coordination clustering.

Why LOTL Attacks Defeat Flow-Level Rules

Modern intrusion sets — China-nexus APTs, Iranian state-sponsored actors, top-tier ransomware affiliates — long ago internalised the lesson that the cheapest way to evade detection is to look like the legitimate users of the environment. They:

Stage payloads through HTTPS to cloud-hosted infrastructure (S3, Azure Blob, Google Cloud Storage).
Steal credentials through ordinary AD queries on standard LDAP ports.
Move laterally with built-in tools — PowerShell remoting on 5985, WMI on 135, RDP on 3389.
Query the EC2 instance metadata service for credentials on a perfectly normal-looking link-local address (169.254.169.254).
Exfiltrate through HTTPS to common cloud providers that every business depends on.

Every one of those actions is something normal users and normal services do every day. A web server reaches HTTPS endpoints. A workstation queries AD. An admin uses WinRM. A developer uses RDP. The signal is in the sequence, not in any single event. A web server that has never previously touched LDAP suddenly running a Kerberos request and then opening SMB to a database server within a 30-minute window — that sequence is improbable enough to be diagnostic of compromise. The Markov model is the formal way to ask “how improbable is this sequence under your historical baseline?”

Mapping Flows to Network States

The first step of the pipeline is a deterministic state map. Every flow becomes one of approximately twenty states based on a triple of (direction, port class, internal/external destination):

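METADATA_ACCESS — destination is 169.254.169.254 (the EC2 IMDS endpoint). VPC Flow Logs do not record traffic to this address, so populating the state requires host or proxy telemetry merged into the state stream.
DNS_QUERY — destination port 53. Queries to the Amazon-provided DNS resolver are not logged either; only traffic to your own DNS servers appears in flow logs.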
INTERNAL_LDAP — destination ports 389 or 636 to an RFC 1918 address.
INTERNAL_KERBEROS — port 88 internal.
INTERNAL_SMB — ports 445, 135, 139 internal.
INTERNAL_WINRM — ports 5985, 5986 internal.
INTERNAL_RDP — port 3389 internal.
INTERNAL_SSH — port 22 internal.
INTERNAL_DATABASE — ports 3306, 5432, 1433, 27017, 6379 internal.
EXTERNAL_HTTPS — port 443 to non-RFC 1918.
EXTERNAL_HTTP — port 80 to non-RFC 1918.
EXTERNAL_HIGH_PORT — non-RFC 1918 on ports > 1024 not otherwise classified.
EXTERNAL_OTHER — everything else outbound.
INTERNAL_OTHER — everything else internal.

Adding or removing states is a one-time editing exercise; we have settled on this set because it cleanly aligns with MITRE ATT&CK tactic categories without exploding the state space. Each state corresponds to a recognisable analyst concept.
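
A minimal Python sketch of the state map described above, assuming the only inputs are the destination address and port. The names `map_state`, `is_internal`, and the lookup tables are ours for illustration; they are not part of any library.

```python
import ipaddress

# RFC 1918 private ranges; anything else is treated as external.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

# Port tables mirror the state list above.
INTERNAL_PORT_STATES = {
    389: "INTERNAL_LDAP", 636: "INTERNAL_LDAP",
    88: "INTERNAL_KERBEROS",
    445: "INTERNAL_SMB", 135: "INTERNAL_SMB", 139: "INTERNAL_SMB",
    5985: "INTERNAL_WINRM", 5986: "INTERNAL_WINRM",
    3389: "INTERNAL_RDP",
    22: "INTERNAL_SSH",
    3306: "INTERNAL_DATABASE", 5432: "INTERNAL_DATABASE",
    1433: "INTERNAL_DATABASE", 27017: "INTERNAL_DATABASE",
    6379: "INTERNAL_DATABASE",
}
EXTERNAL_PORT_STATES = {443: "EXTERNAL_HTTPS", 80: "EXTERNAL_HTTP"}


def is_internal(addr: str) -> bool:
    """True when addr falls inside one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)


def map_state(dstaddr: str, dstport: int) -> str:
    """Deterministically map one flow to a network state."""
    if dstaddr == "169.254.169.254":
        return "METADATA_ACCESS"
    if dstport == 53:
        return "DNS_QUERY"
    if is_internal(dstaddr):
        return INTERNAL_PORT_STATES.get(dstport, "INTERNAL_OTHER")
    if dstport in EXTERNAL_PORT_STATES:
        return EXTERNAL_PORT_STATES[dstport]
    if dstport > 1024:
        return "EXTERNAL_HIGH_PORT"
    return "EXTERNAL_OTHER"
```

Rule order matters: the metadata and DNS checks run first so that they win regardless of whether the destination is internal or external.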

Building the Markov Transition Matrix

Over the 14-day baseline window, we observe each host’s state sequence. From that we compute the conditional probability P(state_j | state_i) for every pair of consecutive states, per host:

P(s_j | s_i) = count(s_i → s_j) / count(s_i)

The matrix is sparse for typical hosts. A web server’s matrix concentrates almost entirely in EXTERNAL_HTTPS → EXTERNAL_HTTPS and EXTERNAL_HTTPS → DNS_QUERY. A workstation has more diversity but still focuses on a small set of routine transitions. The transition matrix essentially encodes the host’s role on the network without any external labelling.

This per-host modelling is what makes the detection sensitive. A transition that is perfectly normal for a domain controller — say, INTERNAL_KERBEROS → INTERNAL_LDAP — is wildly improbable for a web server. The same transition produces a low or high anomaly score depending on the host, exactly as it should.
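
The per-host matrix can be sketched in a few lines of Python. One assumption here goes beyond the formula above: we add Laplace (additive) smoothing with a hypothetical `alpha` parameter so that transitions never seen in the baseline get a small non-zero probability instead of zero. The function name `transition_matrix` is ours.

```python
from collections import Counter


def transition_matrix(states, all_states, alpha=1.0):
    """Return {s_i: {s_j: P(s_j | s_i)}} for one host.

    states     -- the host's chronological state sequence from the baseline
    all_states -- every state in the state map
    alpha      -- additive smoothing constant (our assumption, must be > 0)
    """
    pair_counts = Counter(zip(states, states[1:]))   # count(s_i -> s_j)
    out_counts = Counter(states[:-1])                # count(s_i) as a source
    k = len(all_states)
    return {
        si: {
            sj: (pair_counts[(si, sj)] + alpha) / (out_counts[si] + alpha * k)
            for sj in all_states
        }
        for si in all_states
    }
```

With `alpha` close to zero the result approaches the raw ratio count(s_i → s_j) / count(s_i); larger values flatten the matrix and make the detector less sensitive.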

Sequence Probability Scoring

For each host’s current 1-hour state sequence, the joint probability under the baseline matrix is:

P(sequence) = P(s₁) · Π_{i=1..n-1} P(s_{i+1} | s_i)

For numerical stability we work in log space:

log P(sequence) = Σ_{i=1..n-1} log P(s_{i+1} | s_i) + log P(s₁)

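An anomaly fires when −log P(sequence) exceeds the host’s baseline mean by more than 3 standard deviations. If baseline scores were roughly normally distributed, a one-sided 3σ threshold would be crossed by about 1 in 740 hourly windows, or roughly once a month per host, so treat 3σ as a starting point and tune it against your alert budget. Because −log P grows with sequence length, compare windows of similar length or divide by the number of transitions before thresholding.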
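
A hedged sketch of the scoring step in Python. The `floor` parameter is our assumption for handling transitions with zero baseline probability (the log of zero is undefined); `neg_log_prob` and `is_anomalous` are illustrative names.

```python
import math
from statistics import mean, stdev


def neg_log_prob(seq, matrix, initial, floor=1e-6):
    """Return -log P(sequence) under a host's baseline.

    matrix  -- {s_i: {s_j: P(s_j | s_i)}}
    initial -- {s: P(s)} for the first state of a window
    floor   -- minimum probability used for unseen events (assumption)
    """
    total = -math.log(max(initial.get(seq[0], 0.0), floor))
    for cur, nxt in zip(seq, seq[1:]):
        p = matrix.get(cur, {}).get(nxt, 0.0)
        total -= math.log(max(p, floor))
    return total


def is_anomalous(score, baseline_scores, k=3.0):
    """Fire when score exceeds the baseline mean by more than k std devs."""
    return score > mean(baseline_scores) + k * stdev(baseline_scores)
```

In use, `baseline_scores` would hold the host's own scores from the baseline window, so the threshold adapts to each host's normal variability.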

Kill-Chain Mapping and Phase Coverage

Anomalous sequences are not all equal. A sequence that touches multiple MITRE ATT&CK kill-chain phases in rapid succession is much more concerning than a sequence that wobbles within a single phase. The pipeline computes:

kill_chain_score = Σ_i ( phase_weight[i] · (1 / transition_probability[i]) · time_decay[i] )

time_decay = e^(−Δt / τ),  τ = 1800s (30-minute decay constant)

The decay constant is critical. Compromises happen quickly — hands-on-keyboard operator sessions typically last 10–60 minutes per host. The 30-minute decay ensures the score concentrates on tightly clustered improbable transitions rather than rewarding stale long-tail events.
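
The score can be sketched as follows, assuming Δt is the gap in seconds since the previous transition and that each transition arrives as a `(phase_weight, transition_probability, delta_t)` tuple. The tuple layout and the probability `floor` are our assumptions.

```python
import math

TAU_SECONDS = 1800.0  # 30-minute decay constant from the formula above


def kill_chain_score(transitions, tau=TAU_SECONDS, floor=1e-6):
    """Sum phase_weight * (1 / probability) * exp(-delta_t / tau).

    transitions -- iterable of (phase_weight, transition_probability, delta_t)
    floor       -- lower bound on probability to avoid division by zero
    """
    score = 0.0
    for weight, prob, delta_t in transitions:
        rarity = 1.0 / max(prob, floor)
        decay = math.exp(-delta_t / tau)
        score += weight * rarity * decay
    return score
```

Two identical rare transitions score very differently depending on spacing: one minute apart keeps almost the full weight, while two hours apart retains under 2% of it.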

A handful of high-risk kill-chain patterns we explicitly fingerprint:

EXTERNAL_HTTPS → METADATA_ACCESS → INTERNAL_LDAP → INTERNAL_SMB → INTERNAL_DATABASE
Translation: web-app exploit, SSRF to steal IAM credentials, AD reconnaissance, lateral movement, data access. Classic cloud kill chain.

EXTERNAL_HTTPS → INTERNAL_KERBEROS → INTERNAL_SMB → INTERNAL_WINRM → EXTERNAL_HTTPS
Translation: initial access, Kerberoasting, lateral move, remote execute, exfil. Classic on-prem kill chain.

DNS_QUERY (burst) → EXTERNAL_HIGH_PORT → INTERNAL_SMB → INTERNAL_DATABASE → DNS_QUERY (high-bytes)
Translation: DNS-based C2 setup, callback, spread, data access, DNS exfiltration. Multi-protocol stealth chain.

EXTERNAL_HTTPS → INTERNAL_RDP → INTERNAL_RDP → INTERNAL_RDP → EXTERNAL_HTTPS (high-bytes)
Translation: RDP compromise, multi-hop pivot, high-volume exfil. The “smash and grab” pattern.

The pipeline maintains a small library of these named kill chains. When a current sequence overlaps a known kill chain at 80%+ similarity, the alert is enriched with the kill chain’s name and an analyst-readable narrative.
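
The similarity metric is not specified above, so the sketch below makes an assumption: similarity is the length of the longest common subsequence (LCS) between the observed sequence and the named chain, divided by the chain length. Only two of the four chains are included, and the dictionary keys are hypothetical labels.

```python
KILL_CHAINS = {
    "classic_cloud": ["EXTERNAL_HTTPS", "METADATA_ACCESS", "INTERNAL_LDAP",
                      "INTERNAL_SMB", "INTERNAL_DATABASE"],
    "smash_and_grab": ["EXTERNAL_HTTPS", "INTERNAL_RDP", "INTERNAL_RDP",
                       "INTERNAL_RDP", "EXTERNAL_HTTPS"],
}


def lcs_length(a, b):
    """Length of the longest common subsequence of two state lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def match_chain(sequence, threshold=0.8):
    """Return (chain_name, similarity) for the best match, or None."""
    best_name, best_sim = None, 0.0
    for name, chain in KILL_CHAINS.items():
        sim = lcs_length(sequence, chain) / len(chain)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return (best_name, best_sim) if best_sim >= threshold else None
```

LCS tolerates benign flows interleaved between attack steps, which is the point, but it also means very long windows match more easily. Applying it to the 1-hour window, or to the anomalous sub-sequence only, keeps that in check.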

Ensemble Scoring: Markov + Isolation Forest + LSTM

No single anomaly detector handles all attack patterns. The pipeline therefore ensembles three models:

final_score = w₁ · markov_anomaly
            + w₂ · IF_score
            + w₃ · LSTM_error

initial weights:  w₁ = 0.5,  w₂ = 0.25,  w₃ = 0.25

The Markov model catches improbable transitions; Isolation Forest catches improbable feature vectors; the LSTM catches improbable temporal trajectories. Each model misses some attacks the other models catch. Weights are learned by logistic regression once you have a few dozen labelled incidents.
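
The three raw outputs live on different scales (a log-probability, an isolation score, a reconstruction error), so a weighted sum only makes sense after they are brought onto a common range. The sketch below assumes min-max scaling against bounds observed in the baseline; `scale` and `ensemble_score` are illustrative names, and the scaling choice is ours rather than something stated above.

```python
def scale(value, low, high):
    """Min-max scale value into [0, 1] using baseline bounds, clipping outliers."""
    if high <= low:
        return 0.0
    return min(1.0, max(0.0, (value - low) / (high - low)))


def ensemble_score(markov_anomaly, if_score, lstm_error,
                   weights=(0.5, 0.25, 0.25)):
    """Weighted sum of the three (already scaled) model outputs."""
    w1, w2, w3 = weights
    return w1 * markov_anomaly + w2 * if_score + w3 * lstm_error
```

Once labelled incidents are available, the fitted logistic-regression coefficients would replace the default `weights` tuple.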

Feature Engineering from VPC Flow Logs

Feature	Source attributes	Formula	What it captures
Network state	srcAddr, dstAddr, dstPort, protocol	state_map(direction, port_class)	Behavioural state classification
Transition probability	sequential states per host	P(s_j | s_i) from baseline matrix	Normal vs abnormal transitions
Sequence log-probability	state sequence per window	Σ log P(s_{i+1} | s_i)	Overall chain anomaly
Kill-chain phase coverage	state sequence	count(unique ATT&CK phases in window)	Multi-phase attack indicator
Role deviation	all per-host features	cosine_distance(current_vector, role_centroid)	Host deviating from its normal role
Temporal acceleration	start (consecutive flows per host)	1 / AVG(time_between_state_transitions)	Fast attack chain execution
Cross-host correlation	srcAddr, state, start	count(hosts with same rare transitions)	Coordinated multi-host attack
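
Two of the features in the table reduce to a few lines each. The sketch below shows temporal acceleration and role deviation, assuming start times are epoch seconds and feature vectors are plain lists of equal length; the handling of empty or zero inputs is our choice.

```python
import math


def temporal_acceleration(start_times):
    """1 / average gap (seconds) between consecutive state transitions."""
    gaps = [b - a for a, b in zip(start_times, start_times[1:])]
    if not gaps:
        return 0.0
    avg_gap = sum(gaps) / len(gaps)
    return 1.0 / avg_gap if avg_gap > 0 else float("inf")


def cosine_distance(current_vector, role_centroid):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(a * b for a, b in zip(current_vector, role_centroid))
    norm_u = math.sqrt(sum(a * a for a in current_vector))
    norm_v = math.sqrt(sum(b * b for b in role_centroid))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty vector as maximally deviant (assumption)
    return 1.0 - dot / (norm_u * norm_v)
```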

Athena SQL — State Sequence Extraction

WITH flows AS (
    -- Flag RFC 1918 destinations once (10/8, 172.16/12, 192.168/16) so every rule shares the same test.
    SELECT srcaddr, dstaddr, dstport, start, bytes, protocol,
           regexp_like(dstaddr, '^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)') AS is_internal
    FROM central_vpc_flow_logs
    WHERE action = 'ACCEPT'
      -- Example window; widen it to cover the 14-day baseline.
      AND day BETWEEN '2026/03/19' AND '2026/03/23'
),
state_mapped AS (
    SELECT srcaddr, dstaddr, dstport, start, bytes, protocol,
           CASE
             WHEN dstaddr = '169.254.169.254'                               THEN 'METADATA_ACCESS'
             WHEN dstport = 53                                              THEN 'DNS_QUERY'
             WHEN is_internal AND dstport IN (389, 636)                     THEN 'INTERNAL_LDAP'
             WHEN is_internal AND dstport = 88                              THEN 'INTERNAL_KERBEROS'
             WHEN is_internal AND dstport IN (445, 135, 139)                THEN 'INTERNAL_SMB'
             WHEN is_internal AND dstport IN (5985, 5986)                   THEN 'INTERNAL_WINRM'
             WHEN is_internal AND dstport = 3389                            THEN 'INTERNAL_RDP'
             WHEN is_internal AND dstport = 22                              THEN 'INTERNAL_SSH'
             WHEN is_internal AND dstport IN (3306, 5432, 1433, 27017, 6379) THEN 'INTERNAL_DATABASE'
             WHEN is_internal                                               THEN 'INTERNAL_OTHER'
             -- Everything below this line is an external destination.
             WHEN dstport = 443                                             THEN 'EXTERNAL_HTTPS'
             WHEN dstport = 80                                              THEN 'EXTERNAL_HTTP'
             WHEN dstport > 1024                                            THEN 'EXTERNAL_HIGH_PORT'
             ELSE 'EXTERNAL_OTHER'
           END AS network_state
    FROM flows
),
state_transitions AS (
    -- Pair each flow with the host's next flow to form a state transition.
    SELECT srcaddr, network_state AS current_state,
           LEAD(network_state) OVER (PARTITION BY srcaddr ORDER BY start) AS next_state,
           LEAD(start) OVER (PARTITION BY srcaddr ORDER BY start) - start AS transition_time,
           start, bytes
    FROM state_mapped
)
SELECT srcaddr, current_state, next_state,
       COUNT(*)                AS transition_count,
       AVG(transition_time)    AS avg_transition_time,
       MIN(transition_time)    AS min_transition_time
FROM state_transitions
WHERE next_state IS NOT NULL
GROUP BY srcaddr, current_state, next_state
ORDER BY transition_count DESC;

Tuning notes:

The state-map CASE is the only piece you may want to customise. Add states for application-specific ports (your SIEM forwarder 9997, Elastic 9200, Redis Sentinel 26379, etc.) so they do not collapse into INTERNAL_OTHER.
LEAD() over the partition by srcaddr chronologically pairs each flow with its next flow — the basis for the transition counts.
transition_time in seconds is reused in the temporal-acceleration feature and in the kill-chain time-decay calculation.

Putting It Into Production

VPC Flow Logs → S3 (Parquet partitioned by day).
EventBridge → daily Athena query at 04:30 local time produces the per-host transition tables.
Lambda or SageMaker job computes per-host transition probabilities, then scores each new sequence against the baseline matrix. Markov computation is trivial — almost all the cost is the SQL.
Ensemble scoring combines Markov, Isolation Forest, and LSTM outputs. The pre-computed Isolation Forest and LSTM models are loaded once and applied per host.
Alerts → SIEM, enriched with the matched kill chain (if any), the most improbable transitions, and the time decay weight.
Weekly baseline refresh — re-compute transition matrices and re-train downstream models.

The full pipeline runs comfortably inside a $100/month AWS budget for an environment of 5,000 hosts.

Detection Coverage Matrix

Attack class	Signature	Detection
Web exploit → SSRF → cloud cred theft	EXTERNAL_HTTPS → METADATA_ACCESS transition (very rare for most hosts)	Trivial — METADATA_ACCESS is an immediate diagnostic state
Kerberoasting + lateral move	INTERNAL_KERBEROS → INTERNAL_SMB → INTERNAL_WINRM sequence	Strong — three-state chain is extremely rare for workstations
DNS-tunnel C2 + exfil	DNS_QUERY burst at start and end of chain, internal pivot between	Strong — the bookend pattern is diagnostic
Multi-hop RDP pivot	INTERNAL_RDP → INTERNAL_RDP → INTERNAL_RDP sequence	Strong — RDP-to-RDP-to-RDP rarely legitimate
Living-off-the-land workstation compromise	Workstation state-transition matrix departs from its baseline	Strong — per-host baseline catches role deviation
Single-state attack (e.g., pure HTTPS C2)	No state changes	Uncovered — see post #1 (FFT beacon detection)
Slow drift exfiltration without protocol change	No new transitions, just slowly increasing volume	Uncovered by Markov — see post #3 (LSTM autoencoder)
Distributed coordinated attack across many hosts	Markov alerts for many hosts simultaneously	Covered through cross-host correlation feature; cross-reference post #4
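
The cross-host correlation feature referenced in the last row can be sketched as a simple inversion: given each host's set of rare transitions, count how many hosts share each one. What counts as "rare" and the `min_hosts` cut-off are assumptions to tune per environment.

```python
from collections import defaultdict


def cross_host_correlation(rare_by_host, min_hosts=3):
    """Return {transition: host_count} for rare transitions seen on many hosts.

    rare_by_host -- {host: set of (current_state, next_state) pairs that are
                    improbable under that host's own baseline}
    min_hosts    -- minimum number of hosts sharing a transition (assumption)
    """
    hosts_by_transition = defaultdict(set)
    for host, transitions in rare_by_host.items():
        for transition in transitions:
            hosts_by_transition[transition].add(host)
    return {
        transition: len(hosts)
        for transition, hosts in hosts_by_transition.items()
        if len(hosts) >= min_hosts
    }
```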

Limits and False-Positive Sources

Newly deployed services change their state-transition matrix during the first 14 days after deployment. Pre-suppress through a deployment-notification channel.
Admin maintenance windows produce concentrated bursts of unusual transitions. Tag and suppress.
CI runners running build pipelines produce transition matrices that resemble enumeration. Tag by source IAM role or instance tag.
Vulnerability scanners deliberately produce kill-chain-looking traffic. Allow-list the scanner subnets.
Multi-tenant containers running varied workloads on the same host produce wide state distributions. Aim baselines at the container or pod level if possible (use the instance_id + container metadata enrichment).
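
Several of these suppressions amount to a pre-filter applied before alerting. A minimal sketch, in which the scanner subnet and the tag names are hypothetical placeholders to replace with your own inventory:

```python
import ipaddress

# Hypothetical allow-list entries; substitute your own subnets and tags.
SCANNER_SUBNETS = [ipaddress.ip_network("10.50.0.0/24")]
SUPPRESSED_TAGS = {"ci-runner", "new-deployment", "maintenance-window"}


def should_suppress(srcaddr, host_tags):
    """True when an alert from this host should be held back."""
    ip = ipaddress.ip_address(srcaddr)
    if any(ip in net for net in SCANNER_SUBNETS):
        return True
    return bool(SUPPRESSED_TAGS & set(host_tags))
```

Suppressed alerts are worth logging rather than discarding, since an attacker who lands on a CI runner or scanner host inherits its exemptions.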

MITRE ATT&CK Techniques Covered by This Detection

The Markov state-sequence pipeline is the only detection in this five-part series that sees the entire kill chain as one signal. Where the other four pipelines specialise (C2, lateral movement, exfiltration, coordination), this one stitches them together by modelling the transitions between behavioural states. Coverage spans Initial Access (TA0001) through Exfiltration (TA0010) — every adversary technique that produces a network signal contributes to the state sequence.

ATT&CK ID	Technique / sub-technique	Coverage	Hunter notes
T1190	Exploit Public-Facing Application (initial access)	Partial	Surfaces as EXTERNAL_HTTPS → unusual internal-state transition
T1078	Valid Accounts (the heart of LOTL)	Full	Per-host transition matrix catches role deviation even with valid creds
T1078.004	Cloud Accounts	Full	METADATA_ACCESS → EXTERNAL_HTTPS chain is the classic SSRF cred-theft signature
T1552.005	Unsecured Credentials: Cloud Instance Metadata API	Full	METADATA_ACCESS state is purpose-built for this
T1558	Steal or Forge Kerberos Tickets (parent)	Full	INTERNAL_KERBEROS → unusual next-state transitions
T1558.003	Kerberoasting	Full	Workstation → KRB5 → SMB chain is wildly improbable for non-admin hosts
T1558.004	AS-REP Roasting	Partial	—
T1003	OS Credential Dumping (network signature only)	Partial	Network egress of dump file surfaces; on-host dump needs EDR
T1021	Remote Services (full series)	Full	All sub-techniques produce state transitions
T1059	Command and Scripting Interpreter (network footprint)	Partial	—
T1059.001	PowerShell (via WinRM transitions)	Partial	WinRM state used; payload analysis needs EDR
T1568	Dynamic Resolution	Partial	DNS_QUERY burst patterns surface in transition matrix
T1071	Application Layer Protocol (parent)	Full	Every protocol becomes a state
T1574	Hijack Execution Flow	Out of scope	Host-side; EDR territory
T1136	Create Account (network footprint)	Partial	Unusual LDAP-write transitions on hosts that normally read-only
T1098	Account Manipulation	Partial	Same as T1136 — LDAP write traffic from non-IT hosts
T1218	System Binary Proxy Execution (LOLBins)	Out of scope	Network signature is the resulting EXTERNAL_HTTPS / INTERNAL_SMB — pipeline catches downstream effect
T1140	Deobfuscate/Decode Files or Information	Out of scope	Endpoint-only
T1090	Proxy	Full	Proxy hops become recurring state transitions
T1102	Web Service (legitimate-looking C2)	Full	Cloud-storage dead-drops surface as repeated EXTERNAL_HTTPS to high-value destinations
T1567.002	Exfiltration to Cloud Storage	Full	Closes the kill chain — final EXTERNAL_HTTPS state with high byte volume
T1041	Exfiltration Over C2 Channel	Full	Same chain end-state
T1486	Data Encrypted for Impact (ransomware staging)	Partial	Pre-ransomware lateral movement is the dominant signal

Adversary emulation / purple-team validation. The Markov pipeline is best validated against full kill-chain emulation rather than atomic tests. The four named kill chains in the body of this post each map to a complete adversary profile in open-source adversary-emulation frameworks; the “cloud-pivot-exfil” profile in particular reproduces the EXTERNAL_HTTPS → METADATA_ACCESS → INTERNAL_LDAP → INTERNAL_SMB → INTERNAL_DATABASE sequence end-to-end. Run that operation in a lab segment and confirm the Markov anomaly score peaks within the 30-minute time-decay window. Public adversary-emulation atomics also provide individual tests for T1552.005 (IMDS) and T1558.003 (Kerberoasting) that validate two of the most distinctive transitions.

Sigma / detection-as-code. The Markov pipeline produces a kill_chain_score field per host per hour. The simplest Sigma rule is a threshold check on that score, but the high-value rules are the named-chain matchers that fire when the current state sequence overlaps any of the four documented kill chains at 80% similarity or higher. Those rules give the analyst an immediate story rather than a generic anomaly.

D3FEND mappings. This pipeline maps to D3-NTA (Network Traffic Analysis) as the umbrella and to D3-UBA (User Behavior Analysis) when host-role baselines are framed as user/host behaviour profiles. The transition matrix is, structurally, a behaviour profile.

ATLAS for AI-targeted attacks. If your environment runs Amazon Bedrock, Azure OpenAI, or Vertex AI workloads, extend the state map with BEDROCK_INVOKE, BEDROCK_AGENT, and BEDROCK_KB states (or their equivalents for the other platforms). Map these to the MITRE ATLAS framework, the counterpart to ATT&CK for attacks on AI and ML systems. We covered the Bedrock side in detail in AWS Bedrock Threat Hunting; this pipeline naturally fuses with that one when the state map is extended.

Where This Sits in a Mature Threat Hunting Programme

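The Markov kill-chain detection is the apex of the five-part series: it sits on top of the other four pipelines and gives the analyst a single sequence-level alert that aggregates everything. Pair it with the four companion pipelines: adaptive C2 beacon detection, lateral movement graph detection, low-and-slow data exfiltration, and botnet coordination clustering.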

Closing Thoughts on the Series

This five-post series argues a single thesis: VPC Flow Logs are an under-exploited gold mine of detection signal, and the ML methods to extract it are mature, free, and small enough to deploy on commodity AWS infrastructure. The five pipelines together cover periodicity (FFT), structure (graph), volume (Isolation Forest + LSTM), coordination (clustering), and sequence (Markov + ensemble). Every modern attack leaves a fingerprint in at least one of those dimensions.

If you have rolled out any of the five pipelines and want to compare notes — successful detections, false-positive war stories, parameter tuning lessons — please get in touch via the contact page. We are planning a follow-up series on EDR + VPC fusion that will reuse much of this infrastructure on top of process telemetry.

Happy threat hunting.

