Anomaly
polars_ts.bayesian.anomaly
Bayesian anomaly scoring via posterior predictive p-values.
Scores each observation using an online conjugate normal model that maintains a running posterior over mean and variance, then computes tail-area probabilities and Bayes factors.
References
- Gelman et al. (2013), Bayesian Data Analysis, Chapter 6
- Adams & MacKay (2007), Bayesian Online Changepoint Detection
BayesianAnomalyResult
dataclass
Container for Bayesian anomaly scoring output.
Attributes
scores
DataFrame with anomaly scores per observation.
n_anomalies
Total number of flagged anomalies.
_NIGState
dataclass
Normal-Inverse-Gamma sufficient statistics for online updating.
mu | sigma^2 ~ N(mu0, sigma^2 / kappa0)
sigma^2 ~ IG(alpha0, beta0)
update(y)
Update posterior with a single observation.
predictive_params()
Return (mean, scale) of the posterior predictive t-distribution.
predictive_df()
Degrees of freedom for the posterior predictive t-distribution.
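The conjugate update and the predictive parameters can be sketched with the textbook Normal-Inverse-Gamma formulas. This is a minimal sketch, not the library's code: the class name `NIGState`, its field names, and the default values are assumptions made for illustration, and the actual `_NIGState` may store or compute these quantities differently.

```python
from dataclasses import dataclass
import math


@dataclass
class NIGState:
    """Hypothetical stand-in for _NIGState; field names are assumptions."""

    mu: float = 0.0     # current posterior mean of mu
    kappa: float = 1.0  # pseudo-count backing the mean
    alpha: float = 2.0  # Inverse-Gamma shape
    beta: float = 1.0   # Inverse-Gamma scale

    def update(self, y: float) -> None:
        # Textbook one-observation conjugate update.
        kappa_n = self.kappa + 1.0
        # beta must use the *old* mu and kappa, so update it first.
        self.beta += self.kappa * (y - self.mu) ** 2 / (2.0 * kappa_n)
        self.mu = (self.kappa * self.mu + y) / kappa_n
        self.kappa = kappa_n
        self.alpha += 0.5

    def predictive_params(self) -> tuple[float, float]:
        # Location and scale of the Student-t posterior predictive.
        scale = math.sqrt(self.beta * (self.kappa + 1.0) / (self.alpha * self.kappa))
        return self.mu, scale

    def predictive_df(self) -> float:
        # Degrees of freedom of the Student-t posterior predictive.
        return 2.0 * self.alpha


state = NIGState(mu=0.0, kappa=1.0, alpha=2.0, beta=1.0)
for y in [1.0, 3.0]:
    state.update(y)

mean, scale = state.predictive_params()
print(mean, scale, state.predictive_df())
```

Each update adds one pseudo-observation to `kappa` and half a unit to `alpha`, so the predictive degrees of freedom grow by one per observation and the predictive tails thin as evidence accumulates.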
BayesianAnomalyDetector
Bayesian anomaly detector using posterior predictive p-values.
Maintains an online Normal-Inverse-Gamma conjugate model per series, scoring each observation via its posterior predictive tail probability.
Parameters
threshold
P-value threshold below which an observation is flagged (default 0.01).
prior_mu
Prior mean for the normal model. If None, uses first observation.
prior_kappa
Prior strength on mean (higher = more confident prior).
prior_alpha
Inverse-Gamma shape for variance prior.
prior_beta
Inverse-Gamma scale for variance prior.
warmup
Number of initial observations used for prior calibration before scoring begins.
anomaly_scale
Scale multiplier for the anomaly hypothesis in Bayes factor.
id_col
Column identifying each time series.
target_col
Column with target values.
time_col
Column with timestamps.
score(df)
Score each observation for anomalousness.
Parameters
df
Input DataFrame with time series data.
Returns
BayesianAnomalyResult
Result containing the scores DataFrame and the anomaly count.
_score_single(gid, values)
Score a single time series.
_t_cdf(x, df)
CDF of Student's t-distribution using scipy (lazy import).
_compute_pvalue(y, state)
Compute two-sided posterior predictive p-value.
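A two-sided posterior predictive p-value is the probability, under the Student-t predictive, of a value at least as far from the predictive mean as the observation. The sketch below illustrates that calculation under stated assumptions: the function names and the `(mean, scale, df)` argument layout are hypothetical, and Simpson's rule is swapped in for the SciPy CDF call only to keep the example dependency-free. It is reasonable for moderate standardized residuals but loses accuracy far in the tails.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of the standard Student-t distribution."""
    log_norm = (
        math.lgamma((df + 1.0) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1.0) / 2.0 * math.log1p(x * x / df))


def two_sided_pvalue(y: float, mean: float, scale: float, df: float,
                     steps: int = 2000) -> float:
    """P(|T| >= |z|) for the standardized residual z = (y - mean) / scale.

    Hypothetical helper: integrates the density over [0, |z|] with
    Simpson's rule (steps must be even) instead of calling a CDF routine.
    """
    z = abs(y - mean) / scale
    h = z / steps
    total = t_pdf(0.0, df) + t_pdf(z, df)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * t_pdf(i * h, df)
    central_half = total * h / 3.0  # mass between 0 and |z|
    return max(0.0, 1.0 - 2.0 * central_half)


p = two_sided_pvalue(y=5.0, mean=1.0, scale=2.0, df=2.0)
print(p)
```

An observation would then be flagged when the p-value falls below `threshold`.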
_compute_bayes_factor(y, state, anomaly_scale=10.0)
Bayes factor: evidence for normal model vs anomaly model.
H0: y ~ predictive(state)
H1: y ~ predictive(state, scale * anomaly_scale)
Returns BF01 (>1 favors normal, <1 favors anomaly).
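Read literally, the two hypotheses above give a ratio of two Student-t densities that share the same location and degrees of freedom and differ only in scale. The following sketch computes that ratio in log space; the function names and the explicit `(mean, scale, df)` arguments are assumptions for illustration, and the library's `_compute_bayes_factor` may differ in detail.

```python
import math


def t_logpdf(y: float, mean: float, scale: float, df: float) -> float:
    """Log-density of a location-scale Student-t distribution."""
    z = (y - mean) / scale
    return (
        math.lgamma((df + 1.0) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
        - (df + 1.0) / 2.0 * math.log1p(z * z / df)
    )


def bayes_factor_01(y: float, mean: float, scale: float, df: float,
                    anomaly_scale: float = 10.0) -> float:
    """BF01 = p(y | H0) / p(y | H1); hypothetical helper.

    H1 reuses the predictive location and df but inflates the scale.
    """
    log_h0 = t_logpdf(y, mean, scale, df)
    log_h1 = t_logpdf(y, mean, scale * anomaly_scale, df)
    return math.exp(log_h0 - log_h1)


bf_center = bayes_factor_01(y=0.0, mean=0.0, scale=1.0, df=6.0)
bf_far = bayes_factor_01(y=20.0, mean=0.0, scale=1.0, df=6.0)
print(bf_center, bf_far)
```

Under this formulation BF01 peaks at `anomaly_scale` when the observation sits exactly at the predictive mean, and drops below 1 once the observation is far enough out that the wider H1 density explains it better.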
bayesian_anomaly_score(df, threshold=0.01, prior_mu=None, prior_kappa=1.0, prior_alpha=2.0, prior_beta=1.0, warmup=10, anomaly_scale=10.0, id_col='unique_id', target_col='y', time_col='ds')
Bayesian anomaly scoring convenience function.
Scores each observation using an online conjugate normal model with posterior predictive p-values and Bayes factors.
Parameters
df
Input DataFrame.
threshold
P-value threshold for flagging anomalies (default 0.01).
prior_mu
Prior mean (default: first observation).
prior_kappa
Prior strength on mean.
prior_alpha
Inverse-Gamma shape for variance prior.
prior_beta
Inverse-Gamma scale for variance prior.
warmup
Warmup observations before scoring.
anomaly_scale
Scale for anomaly hypothesis in Bayes factor.
id_col, target_col, time_col
Column names.
Returns
pl.DataFrame
Scores with p_value, bayes_factor, is_anomaly columns.