core.entropy
Reusable Shannon entropy computation for TLS memory dump analysis.
Provides sliding-window entropy profiling with O(1) incremental updates per step. Used by change_point detection and entropy visualization.
All functions are stdlib-only (math) with no external dependencies.
- entropy_from_freq(freq, total)
Compute Shannon entropy from byte frequency counts.
- Iterates over 256 frequency bins and applies the standard formula:
H = -sum(p * log2(p)) for each p = count / total where count > 0
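The formula above can be sketched as follows. This is a minimal illustration, not the module's actual source: the name entropy_from_freq_sketch and the zero-entropy result for an empty table are assumptions made for the example.

```python
import math
from typing import Sequence


def entropy_from_freq_sketch(freq: Sequence[int], total: int) -> float:
    """Shannon entropy in bits per byte from a table of byte counts."""
    if total <= 0:
        return 0.0  # assumption: an empty table is treated as zero entropy
    h = 0.0
    for count in freq:
        if count > 0:  # skip empty bins, since log2(0) is undefined
            p = count / total
            h -= p * math.log2(p)
    return h


# Two byte values, each seen half the time, carry 1 bit per byte.
two_values = [0] * 256
two_values[0x00] = 4
two_values[0xFF] = 4
print(entropy_from_freq_sketch(two_values, 8))

# All 256 byte values equally often gives the maximum of 8 bits per byte.
uniform = [1] * 256
print(entropy_from_freq_sketch(uniform, 256))
```

The result is always between 0 (a single repeated byte value) and 8 (all 256 values equally likely).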
- shannon_entropy(data)
Compute Shannon entropy of raw byte data.
Builds a full frequency table over the input and computes entropy in a single pass.
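A whole-buffer computation along these lines might look like the sketch below. The name shannon_entropy_sketch and the handling of empty input are assumptions for illustration; only the math module is used, in keeping with the stdlib-only note above.

```python
import math


def shannon_entropy_sketch(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte."""
    total = len(data)
    if total == 0:
        return 0.0  # assumption: empty input is treated as zero entropy

    # Count how often each of the 256 possible byte values occurs.
    freq = [0] * 256
    for byte in data:
        freq[byte] += 1

    # Apply H = -sum(p * log2(p)) over the non-empty bins.
    h = 0.0
    for count in freq:
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


print(shannon_entropy_sketch(b"\x00" * 64))        # constant data, lowest entropy
print(shannon_entropy_sketch(bytes(range(256))))   # every value once, highest entropy
print(shannon_entropy_sketch(b"hello world"))      # short text, somewhere in between
```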
- compute_entropy_profile(data, window=32, step=1)
Sliding-window entropy profile over byte data.
Uses an incremental frequency table that adds the incoming byte and removes the outgoing byte at each step, keeping each step O(1) regardless of window size.
For large dumps (e.g. 10 MB), use step=16 to produce ~625K sample points instead of ~10M.
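The sample counts quoted above can be checked with a short calculation. The helper below is hypothetical and assumes one sample per full window at offsets 0, step, 2*step, and so on; it also reads "10 MB" as 10,000,000 bytes.

```python
def window_count(n: int, window: int = 32, step: int = 1) -> int:
    """Number of full windows of `window` bytes in `n` bytes, sampled every `step` bytes."""
    if n < window:
        return 0  # no full window fits
    return (n - window) // step + 1


dump_size = 10_000_000  # assumed reading of "10 MB"
print(window_count(dump_size, window=32, step=1))   # 9999969
print(window_count(dump_size, window=32, step=16))  # 624999
```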
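- Parameters:
data – Raw byte data to profile.
window (int) – Window size in bytes. Default 32.
step (int) – Number of bytes to advance between sample points. Default 1.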
- Returns:
List of (offset, entropy) tuples, one per window position. Returns an empty list when data is shorter than the window.
- Return type:
List[Tuple[int, float]]
- find_high_entropy_regions(profile, threshold=7.5, min_width=32)
Find contiguous high-entropy regions in an entropy profile.
Scans the profile for runs of consecutive points at or above the threshold, then filters by minimum width.
- Parameters:
profile (List[Tuple[int, float]]) – Output of compute_entropy_profile – list of (offset, entropy) tuples, assumed sorted by offset.
threshold (float) – Minimum entropy (bits/byte) to qualify as high-entropy. Default 7.5 targets near-random data.
min_width (int) – Minimum span (end - start) in bytes for a region to be reported. Default 32 (one AES-256 key length).
- Returns:
List of (start_offset, end_offset, mean_entropy) tuples for each qualifying region. Offsets refer to the window start positions from the profile.
- Return type:
List[Tuple[int, int, float]]
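The scan-and-filter behaviour described for find_high_entropy_regions can be sketched as below. The name high_entropy_regions_sketch and the hand-written profile are hypothetical; the sketch assumes a region's start and end are the offsets of the first and last qualifying profile points in a run.

```python
from typing import List, Tuple


def high_entropy_regions_sketch(
    profile: List[Tuple[int, float]],
    threshold: float = 7.5,
    min_width: int = 32,
) -> List[Tuple[int, int, float]]:
    """Return (start_offset, end_offset, mean_entropy) for each qualifying run."""
    regions: List[Tuple[int, int, float]] = []
    run: List[Tuple[int, float]] = []  # current run of points at or above threshold

    def flush() -> None:
        # Close the current run and keep it only if it is wide enough.
        if run:
            start, end = run[0][0], run[-1][0]
            if end - start >= min_width:
                mean = sum(h for _, h in run) / len(run)
                regions.append((start, end, mean))
            run.clear()

    for offset, h in profile:
        if h >= threshold:
            run.append((offset, h))
        else:
            flush()
    flush()  # a run may extend to the end of the profile
    return regions


# Hand-written profile with one wide run (16..48) and one isolated point (80).
profile = [(0, 1.0), (16, 7.6), (32, 7.8), (48, 7.7), (64, 2.0), (80, 7.9), (96, 3.0)]
print(high_entropy_regions_sketch(profile))
```

Only the run from offset 16 to 48 is reported; the single point at offset 80 spans zero bytes and falls below min_width. When choosing a threshold, note that a window of w bytes contains at most w distinct values, so its entropy cannot exceed log2(w): a 32-byte window tops out at 5 bits/byte, and a threshold of 7.5 is reachable only with windows of at least 182 bytes.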