Low-level Classes
C++17 classes exposed via pybind11. Use them for streaming (one value at a time) or to read multiple statistics from a single pass.
- class robustrolling.SlidingMean(window_size)
Rolling mean — prefix sum with optional ARM NEON / AVX2 SIMD, O(n) batch.
- Parameters:
window_size – int
- process_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
    import numpy as np
    import robust_rolling_core as rrc

    sm = rrc.SlidingMean(3)
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    sm.process_batch(x)  # [1.0, 1.5, 2.0, 3.0, 4.0]
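The prefix-sum idea behind the rolling mean can be sketched in pure Python. This is an illustration only, not the library's C++ code: the function name `rolling_mean_prefix` is hypothetical, the `min_periods` handling (0 behaves like 1) is inferred from the example output above, and NaN handling and SIMD are left out.

```python
import math


def rolling_mean_prefix(values, window_size, min_periods=0):
    """Rolling mean from a prefix-sum array; O(n) for the whole batch."""
    # prefix[i] holds sum(values[:i]), so any window sum is one subtraction.
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)

    out = []
    for i in range(len(values)):
        start = max(0, i + 1 - window_size)  # first index inside the window
        count = i + 1 - start                # observations currently in the window
        if count < max(min_periods, 1):
            out.append(math.nan)
        else:
            out.append((prefix[i + 1] - prefix[start]) / count)
    return out


print(rolling_mean_prefix([1.0, 2.0, 3.0, 4.0, 5.0], 3))
# [1.0, 1.5, 2.0, 3.0, 4.0]
```

Each output costs one subtraction and one division regardless of the window size, which is where the O(n) batch cost comes from.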
- class robustrolling.MonotonicMax(window_size)
Rolling maximum — monotonic deque, O(1) amortised.
- Parameters:
window_size – int
- process_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
    mm = rrc.MonotonicMax(3)
    mm.update(1.0)
    mm.update(3.0)
    mm.update(2.0)
    mm.get_max()  # 3.0
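A minimal pure-Python sketch of the monotonic-deque technique follows. The class name `DequeMax` is hypothetical and the code is not the library's implementation; it only shows why each update is O(1) amortised: every value is appended once and removed at most once.

```python
from collections import deque


class DequeMax:
    """Illustrative rolling maximum over the last `window_size` values."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.index = 0             # position of the next incoming value
        self.candidates = deque()  # (index, value) pairs, values strictly decreasing

    def update(self, value):
        # Older values that are <= the new one can never be the maximum again.
        while self.candidates and self.candidates[-1][1] <= value:
            self.candidates.pop()
        self.candidates.append((self.index, value))
        # Evict the front candidate once it has slid out of the window.
        if self.candidates[0][0] <= self.index - self.window_size:
            self.candidates.popleft()
        self.index += 1

    def get_max(self):
        return self.candidates[0][1]


dm = DequeMax(3)
for v in [1.0, 3.0, 2.0]:
    dm.update(v)
print(dm.get_max())  # 3.0
```

A rolling minimum follows from the same structure by flipping the comparison so the deque stays strictly increasing.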
- class robustrolling.MonotonicMin(window_size)
Rolling minimum — monotonic deque, O(1) amortised.
- Parameters:
window_size – int
- process_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
Median classes
Four classes implement the rolling median. SlidingMedian is the
recommended default: its constructor selects the fastest sub-implementation
once, so no algorithm selection happens per update (the chosen implementation
is held in a std::variant and invoked through std::visit).
| Class | Algorithm | Best for |
|---|---|---|
| SlidingMedian | Auto-dispatcher | All cases; picks one of the three below |
| FlatMedian | Sorted std::vector | Small windows (w ≤ 600) or NaN-heavy data |
| MultisetMedian | std::multiset | Medium windows (601–2 000), clean data |
| TwoHeapMedian | Two heaps + lazy deletion | Large windows (w > 2 000) or NaN-heavy data |
- class robustrolling.SlidingMedian(window_size, expect_nan=False)
Rolling median auto-dispatcher. Selects the fastest of FlatMedian, MultisetMedian, or TwoHeapMedian based on window_size and expect_nan at construction time.
Dispatch thresholds (windows > 2 000 always use TwoHeapMedian):
| expect_nan | w ≤ 600 | 601–1 500 | 1 501–2 000 |
|---|---|---|---|
| False | FlatMedian | MultisetMedian | MultisetMedian |
| True | FlatMedian | FlatMedian | TwoHeapMedian |
- Parameters:
window_size – int
expect_nan – bool; hint that the input contains many NaN values (default False)
- process_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
    import numpy as np
    import robust_rolling_core as rrc

    x = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
    rrc.SlidingMedian(3).process_batch(x)  # array([1., 2., 2., 3., 4.])

    # NaN-heavy data: pass expect_nan=True (it changes dispatch for windows above 600)
    rrc.SlidingMedian(700, expect_nan=True).process_batch(x)
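The dispatch thresholds can be restated as a small Python function. The helper `choose_median_impl` is hypothetical and not part of the library; it only encodes the documented table and returns the class name that would be selected.

```python
def choose_median_impl(window_size, expect_nan=False):
    """Return the class name the documented thresholds would select."""
    if window_size > 2000:
        return "TwoHeapMedian"       # large windows, regardless of expect_nan
    if window_size <= 600:
        return "FlatMedian"          # small windows, regardless of expect_nan
    if not expect_nan:
        return "MultisetMedian"      # 601-2000 on clean data
    # NaN-heavy data: FlatMedian up to 1500, TwoHeapMedian for 1501-2000
    return "FlatMedian" if window_size <= 1500 else "TwoHeapMedian"


print(choose_median_impl(700, expect_nan=True))  # FlatMedian
print(choose_median_impl(700))                   # MultisetMedian
```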
- class robustrolling.FlatMedian(window_size)
Rolling median using a sorted std::vector with binary-search insertion and eviction. O(w) insert/evict but cache-friendly; fastest for small windows (w ≤ 600) and for NaN-heavy streams where iterator tracking would degrade.
- Parameters:
window_size – int
- process_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
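The sorted-container approach can be sketched with Python's `bisect` module standing in for the sorted std::vector. The function name `rolling_median_sorted` is hypothetical, and NaN handling and `min_periods` are omitted, so this is a sketch of the technique rather than of the library's behaviour.

```python
from bisect import bisect_left, insort
from collections import deque


def rolling_median_sorted(values, window_size):
    """Rolling median that keeps the current window in a sorted list."""
    window = deque()  # values in arrival order, used to find what to evict
    ordered = []      # the same values, kept sorted
    out = []
    for v in values:
        window.append(v)
        insort(ordered, v)  # binary search, then an O(w) shift
        if len(window) > window_size:
            old = window.popleft()
            del ordered[bisect_left(ordered, old)]  # binary search, O(w) shift
        n = len(ordered)
        mid = n // 2
        if n % 2:
            out.append(ordered[mid])
        else:
            out.append((ordered[mid - 1] + ordered[mid]) / 2)
    return out


print(rolling_median_sorted([1.0, 3.0, 2.0, 5.0, 4.0], 3))
# [1.0, 2.0, 2.0, 3.0, 4.0]
```

The shifts are linear in the window size, but they move contiguous memory, which is why this layout tends to do well for small windows.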
- class robustrolling.MultisetMedian(window_size)
Rolling median using a std::multiset (red-black tree) with a tracked mid_iterator. O(log w) insert/evict; fastest for medium windows (601–2 000) on clean data. Degrades significantly on NaN-heavy streams because iterator repositioning must scan the tree.
- Parameters:
window_size – int
- process_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
- class robustrolling.TwoHeapMedian(window_size)
Rolling median using two heaps (max-heap for the lower half, min-heap for the upper half) with lazy deletion via a pending map. O(log w) amortised; memory layout is less cache-friendly than FlatMedian but unaffected by NaN density, making it the best choice for large windows or NaN-heavy data.
- Parameters:
window_size – int
- process_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
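The two-heap scheme with lazy deletion can be sketched with Python's `heapq`. The class `LazyTwoHeapMedian` and its method names are hypothetical, and NaN handling is omitted. Evicted values are only counted in `pending` and are physically removed once they reach the top of a heap.

```python
import heapq
from collections import defaultdict, deque


class LazyTwoHeapMedian:
    """Illustrative rolling median; not the library's implementation."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()            # values in arrival order
        self.low = []                    # max-heap (negated values): lower half
        self.high = []                   # min-heap: upper half
        self.pending = defaultdict(int)  # value -> deletions still owed
        self.low_n = 0                   # live counts, excluding owed deletions
        self.high_n = 0

    def _prune_low(self):
        while self.low and self.pending[-self.low[0]] > 0:
            self.pending[-self.low[0]] -= 1
            heapq.heappop(self.low)

    def _prune_high(self):
        while self.high and self.pending[self.high[0]] > 0:
            self.pending[self.high[0]] -= 1
            heapq.heappop(self.high)

    def _rebalance(self):
        # Invariant: low_n == high_n or low_n == high_n + 1.
        if self.low_n > self.high_n + 1:
            heapq.heappush(self.high, -heapq.heappop(self.low))
            self.low_n -= 1
            self.high_n += 1
            self._prune_low()
        elif self.low_n < self.high_n:
            heapq.heappush(self.low, -heapq.heappop(self.high))
            self.low_n += 1
            self.high_n -= 1
            self._prune_high()

    def _insert(self, value):
        if not self.low or value <= -self.low[0]:
            heapq.heappush(self.low, -value)
            self.low_n += 1
        else:
            heapq.heappush(self.high, value)
            self.high_n += 1
        self._rebalance()

    def _erase(self, value):
        self.pending[value] += 1  # defer the physical removal
        if value <= -self.low[0]:
            self.low_n -= 1
            if value == -self.low[0]:
                self._prune_low()
        else:
            self.high_n -= 1
            if value == self.high[0]:
                self._prune_high()
        self._rebalance()

    def update(self, value):
        self.window.append(value)
        self._insert(value)
        if len(self.window) > self.window_size:
            self._erase(self.window.popleft())

    def get_median(self):
        if self.low_n > self.high_n:
            return float(-self.low[0])
        return (-self.low[0] + self.high[0]) / 2


m = LazyTwoHeapMedian(3)
medians = []
for v in [1.0, 3.0, 2.0, 5.0, 4.0]:
    m.update(v)
    medians.append(m.get_median())
print(medians)  # [1.0, 2.0, 2.0, 3.0, 4.0]
```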
- class robustrolling.SlidingWelford(window_size)
Rolling sample variance (ddof=1): Welford algorithm with ring buffer, O(1) per update.
- Parameters:
window_size – int
- process_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
    sw = rrc.SlidingWelford(3)
    for v in [1., 2., 3., 4.]:
        sw.update(v)
    sw.get_variance()  # 1.0
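A windowed Welford update can be sketched as follows. The class `RingWelford` is hypothetical; a `deque` stands in for the ring buffer, and the removal step is the standard inverse of Welford's add step, which is an assumption about how the library evicts old values.

```python
from collections import deque


class RingWelford:
    """Illustrative sliding-window sample variance (ddof=1)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, value):
        if len(self.window) == self.window_size:
            # Remove the oldest value by inverting the Welford add step.
            old = self.window.popleft()
            n = len(self.window)  # count after removal
            if n == 0:
                self.mean, self.m2 = 0.0, 0.0
            else:
                new_mean = self.mean - (old - self.mean) / n
                self.m2 -= (old - self.mean) * (old - new_mean)
                self.mean = new_mean
        # Standard Welford add step.
        self.window.append(value)
        n = len(self.window)
        delta = value - self.mean
        self.mean += delta / n
        self.m2 += delta * (value - self.mean)

    def get_variance(self):
        n = len(self.window)
        return self.m2 / (n - 1) if n > 1 else float("nan")


rw = RingWelford(3)
for v in [1.0, 2.0, 3.0, 4.0]:
    rw.update(v)
print(rw.get_variance())  # 1.0
```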
- class robustrolling.SlidingMoments(window_size)
Rolling mean, skewness, and excess kurtosis: Terriberry’s 4th-moment algorithm, O(1) per update. Requires ≥ 3 observations for skewness, ≥ 4 for kurtosis.
- Parameters:
window_size – int
- reset()
- process_mean_batch(x: numpy.ndarray, min_periods: int) → numpy.ndarray
- process_skewness_batch(x: numpy.ndarray, min_periods: int) → numpy.ndarray
- process_kurtosis_batch(x: numpy.ndarray, min_periods: int) → numpy.ndarray
Note: min_periods is a required positional argument in the process_*_batch methods (no default).
    sm = rrc.SlidingMoments(4)
    for v in [1., 2., 3., 4.]:
        sm.update(v)
    sm.get_mean(), sm.get_skewness(), sm.get_kurtosis()  # (2.5, 0.0, -1.2)

    # Batch usage
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    rrc.SlidingMoments(3).process_skewness_batch(x, 0)  # [nan, nan, 0., 0., 0.]
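The kurtosis of -1.2 for [1, 2, 3, 4] is what the bias-corrected sample estimators give, and their denominators (n - 2) and (n - 2)(n - 3) would explain the minimum observation counts. That the library uses these exact estimators is inferred from the example output, not stated in the source. The sketch below computes them directly from one window; the function name `sample_skew_kurt` is hypothetical.

```python
import math


def sample_skew_kurt(window):
    """Bias-corrected skewness and excess kurtosis of one window."""
    n = len(window)
    mean = sum(window) / n
    # Biased central moments of the window.
    m2 = sum((v - mean) ** 2 for v in window) / n
    m3 = sum((v - mean) ** 3 for v in window) / n
    m4 = sum((v - mean) ** 4 for v in window) / n

    skew = math.nan
    kurt = math.nan
    if n >= 3 and m2 > 0:
        skew = math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5
    if n >= 4 and m2 > 0:
        g2 = m4 / m2 ** 2 - 3.0  # biased excess kurtosis
        kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)
    return skew, kurt


skew, kurt = sample_skew_kurt([1.0, 2.0, 3.0, 4.0])
print(skew, round(kurt, 6))  # skewness is 0.0, kurtosis is approximately -1.2
```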
- class robustrolling.SlidingCovariance(window_size)
Rolling sample covariance and Pearson correlation: 2-D Welford algorithm, O(1) per update.
- Parameters:
window_size – int
- process_covariance_batch(x: numpy.ndarray, y: numpy.ndarray) → numpy.ndarray
- process_correlation_batch(x: numpy.ndarray, y: numpy.ndarray) → numpy.ndarray
    sc = rrc.SlidingCovariance(3)
    for x, y in [(1, 2), (2, 4), (3, 6)]:
        sc.update(x, y)
    sc.get_covariance(), sc.get_correlation()  # (2.0, 1.0)
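The two-dimensional Welford update extends the variance recurrence with a co-moment term. The sketch below is an assumption-laden illustration: the class `RingCovariance` is hypothetical, a `deque` replaces the ring buffer, and eviction is done by inverting the add step.

```python
import math
from collections import deque


class RingCovariance:
    """Illustrative sliding sample covariance and Pearson correlation."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.mean_x = self.mean_y = 0.0
        self.cxy = self.m2x = self.m2y = 0.0  # co-moment and squared deviations

    def _add(self, x, y):
        n = len(self.window)  # count including the new pair
        dx = x - self.mean_x
        dy = y - self.mean_y
        self.mean_x += dx / n
        self.mean_y += dy / n
        self.cxy += dx * (y - self.mean_y)
        self.m2x += dx * (x - self.mean_x)
        self.m2y += dy * (y - self.mean_y)

    def _remove(self, x, y):
        n = len(self.window)  # count after removal
        if n == 0:
            self.mean_x = self.mean_y = 0.0
            self.cxy = self.m2x = self.m2y = 0.0
            return
        dx = x - self.mean_x  # deviations from the old means
        dy = y - self.mean_y
        self.mean_x -= dx / n
        self.mean_y -= dy / n
        self.cxy -= dx * (y - self.mean_y)
        self.m2x -= dx * (x - self.mean_x)
        self.m2y -= dy * (y - self.mean_y)

    def update(self, x, y):
        if len(self.window) == self.window_size:
            self._remove(*self.window.popleft())
        self.window.append((x, y))
        self._add(x, y)

    def get_covariance(self):
        n = len(self.window)
        return self.cxy / (n - 1) if n > 1 else math.nan

    def get_correlation(self):
        denom = math.sqrt(self.m2x * self.m2y)
        return self.cxy / denom if denom > 0 else math.nan


rc = RingCovariance(3)
for px, py in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:
    rc.update(px, py)
print(rc.get_covariance(), rc.get_correlation())  # 2.0 1.0
```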
- class robustrolling.SlidingMomentsPrefix(window_size)
Stateless batch engine for variance, skewness, and kurtosis using prefix sums of raw moments. Faster than SlidingMoments but susceptible to catastrophic cancellation for data with large values and small variance. Use when numerical precision is not critical.
- Parameters:
window_size – int
- variance_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
- skewness_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
- kurtosis_batch(x: numpy.ndarray, min_periods: int = 0) → numpy.ndarray
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    rrc.SlidingMomentsPrefix(3).variance_batch(x)  # [nan, 0.5, 1., 1., 1.]
    rrc.SlidingMomentsPrefix(3).skewness_batch(x)  # [nan, nan, 0., 0., 0.]
    rrc.SlidingMomentsPrefix(4).kurtosis_batch(x)  # [nan, nan, nan, -1.2, -1.2]
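To see where the cancellation risk comes from, here is a pure-Python sketch of rolling variance built from prefix sums of x and x². The function name `variance_prefix` is hypothetical and only the variance is shown. The final subtraction takes the difference of two nearly equal large numbers when the data has a large mean and a small spread.

```python
import math


def variance_prefix(values, window_size):
    """Rolling sample variance (ddof=1) from prefix sums of raw moments."""
    s1, s2 = [0.0], [0.0]  # running sums of x and x**2
    for v in values:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    out = []
    for i in range(len(values)):
        start = max(0, i + 1 - window_size)
        n = i + 1 - start
        if n < 2:
            out.append(math.nan)
            continue
        total = s1[i + 1] - s1[start]
        total_sq = s2[i + 1] - s2[start]
        # Cancellation happens here when total_sq and total**2 / n are close.
        out.append((total_sq - total * total / n) / (n - 1))
    return out


print(variance_prefix([1.0, 2.0, 3.0, 4.0, 5.0], 3))
# [nan, 0.5, 1.0, 1.0, 1.0]

# Shifting every value by 1e9 leaves the true variances unchanged,
# but the printed results may drift away from them.
print(variance_prefix([v + 1e9 for v in [1.0, 2.0, 3.0, 4.0, 5.0]], 3))
```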