core.entropy

Reusable Shannon entropy computation for TLS memory dump analysis.

Provides sliding-window entropy profiling with O(1) incremental updates per step. Used by change_point detection and entropy visualization.

All functions are stdlib-only (math) with no external dependencies.

entropy_from_freq(freq, total)

Compute Shannon entropy from byte frequency counts.

Iterates over 256 frequency bins and applies the standard formula:

H = -sum(p_i * log2(p_i)), with p_i = freq[i] / total, summed over bins where freq[i] > 0

Parameters:
  • freq (list) – List of 256 integer counts (one per byte value).

  • total (int) – Sum of all counts (window size).

Returns:

Entropy in bits per byte, in range [0.0, 8.0]. Returns 0.0 when total is zero.

Return type:

float
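
The formula above can be sketched in a few lines. This is an illustrative re-implementation under the documented contract, not the module's actual source; the name entropy_from_freq_sketch is hypothetical.

```python
import math

def entropy_from_freq_sketch(freq, total):
    """Hypothetical stand-in for entropy_from_freq (not the real source)."""
    if total <= 0:
        return 0.0  # documented behaviour: zero total yields 0.0
    h = 0.0
    for count in freq:
        if count > 0:  # empty bins contribute nothing (p * log2(p) -> 0)
            p = count / total
            h -= p * math.log2(p)
    return h

# Every byte value seen exactly once: the maximum, 8 bits per byte.
uniform = [1] * 256
print(entropy_from_freq_sketch(uniform, 256))   # 8.0

# A single byte value repeated 64 times: no uncertainty at all.
constant = [0] * 256
constant[0x41] = 64
print(entropy_from_freq_sketch(constant, 64))   # 0.0
```

Passing total separately lets a caller that already tracks the window size skip re-summing the table on every call.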

shannon_entropy(data)

Compute Shannon entropy of raw byte data.

Builds a full frequency table over the input and computes entropy in a single pass.

Parameters:

data (bytes) – Arbitrary byte sequence.

Returns:

Entropy in bits per byte, in range [0.0, 8.0]. Returns 0.0 for empty input.

Return type:

float
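
A minimal sketch of the behaviour described above, assuming the function counts bytes into a 256-entry table and then applies the entropy formula. The name shannon_entropy_sketch is hypothetical and the real implementation may differ in detail.

```python
import math

def shannon_entropy_sketch(data):
    """Hypothetical stand-in for shannon_entropy (not the real source)."""
    if not data:
        return 0.0  # documented behaviour for empty input
    freq = [0] * 256
    for b in data:  # iterating over bytes yields ints in 0..255
        freq[b] += 1
    total = len(data)
    h = 0.0
    for count in freq:
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

print(shannon_entropy_sketch(b""))                 # 0.0
print(shannon_entropy_sketch(b"\x00" * 64))        # 0.0
print(shannon_entropy_sketch(b"abab"))             # 1.0
print(shannon_entropy_sketch(bytes(range(256))))   # 8.0
```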

compute_entropy_profile(data, window=32, step=1)

Sliding-window entropy profile over byte data.

Uses an incremental frequency table: as the window advances, the outgoing bytes are removed and the incoming bytes are added, so the cost of each step is independent of window size (O(1) for a fixed step).

For large dumps (e.g. 10 MB), use step=16 to produce ~625K sample points instead of ~10M.
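
The sample counts quoted above can be checked with a short calculation. It assumes that 10 MB means 10,000,000 bytes, that windows start at offset 0, and that only full windows are emitted; sample_count is a hypothetical helper, not part of the module.

```python
def sample_count(n_bytes, window=32, step=1):
    """Number of full windows of `window` bytes when advancing by `step`."""
    if n_bytes < window:
        return 0  # matches the documented empty result for short input
    return (n_bytes - window) // step + 1

n = 10_000_000  # assumed meaning of "10 MB"
print(sample_count(n, step=1))    # 9999969
print(sample_count(n, step=16))   # 624999
```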

Parameters:
  • data (bytes) – Raw memory dump bytes.

  • window (int) – Sliding window size in bytes (default 32).

  • step (int) – Advance step in bytes (default 1).

Returns:

List of (offset, entropy) tuples, one per window position. Returns an empty list when data is shorter than the window.

Return type:

List[Tuple[int, float]]

find_high_entropy_regions(profile, threshold=7.5, min_width=32)

Find contiguous high-entropy regions in an entropy profile.

Scans the profile for runs of consecutive points at or above the threshold, then filters by minimum width.

Parameters:
  • profile (List[Tuple[int, float]]) – Output of compute_entropy_profile – list of (offset, entropy) tuples, assumed sorted by offset.

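  • threshold (float) – Minimum entropy (bits/byte) to qualify as high-entropy. Default 7.5 targets near-random data. A window of n bytes can reach at most log2(min(n, 256)) bits, so 7.5 is attainable only for profiles computed with a window of at least 182 bytes; a 32-byte window tops out at 5.0.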

  • min_width (int) – Minimum span (end - start) in bytes for a region to be reported. Default 32 (one AES-256 key length).

Returns:

List of (start_offset, end_offset, mean_entropy) tuples for each qualifying region. Offsets refer to the window start positions from the profile.

Return type:

List[Tuple[int, int, float]]
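
The scan-and-filter step can be sketched as below. It assumes a region's start and end are the offsets of the first and last profile points in the run, and that mean_entropy is the arithmetic mean over those points; find_regions_sketch is a hypothetical name and the real function may differ in these details.

```python
def find_regions_sketch(profile, threshold=7.5, min_width=32):
    """Hypothetical stand-in for find_high_entropy_regions."""
    regions = []
    run = []  # (offset, entropy) points of the run currently being built

    def flush():
        # Close the current run and keep it if it spans enough bytes.
        if run:
            start, end = run[0][0], run[-1][0]
            if end - start >= min_width:
                mean = sum(h for _, h in run) / len(run)
                regions.append((start, end, mean))
            run.clear()

    for offset, h in profile:
        if h >= threshold:
            run.append((offset, h))
        else:
            flush()
    flush()  # a run may extend to the end of the profile
    return regions

# Synthetic profile: one wide high-entropy run (10..50), one narrow (60..70).
profile = [(i, 7.9 if 10 <= i <= 50 or 60 <= i <= 70 else 3.0)
           for i in range(100)]
regions = find_regions_sketch(profile)
print([(start, end) for start, end, _ in regions])   # [(10, 50)]
```

The narrow run spans only 10 bytes, so the default min_width of 32 filters it out.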