Anomaly

polars_ts.bayesian.anomaly

Bayesian anomaly scoring via posterior predictive p-values.

Scores each observation using an online conjugate normal model that maintains a running posterior over mean and variance, then computes tail-area probabilities and Bayes factors.

References

  • Gelman et al. (2013), Bayesian Data Analysis, Chapter 6
  • Adams & MacKay (2007), Bayesian Online Changepoint Detection

BayesianAnomalyResult dataclass

Container for Bayesian anomaly scoring output.

Attributes

  • scores: DataFrame with anomaly scores per observation.
  • n_anomalies: Total number of flagged anomalies.

_NIGState dataclass

Normal-Inverse-Gamma sufficient statistics for online updating.

mu | sigma^2 ~ N(mu0, sigma^2 / kappa0)

sigma^2 ~ IG(alpha0, beta0)

update(y)

Update posterior with a single observation.

predictive_params()

Return (mean, scale) of the posterior predictive t-distribution.

predictive_df()

Degrees of freedom for the posterior predictive t-distribution.
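
The three methods above can be illustrated with the textbook Normal-Inverse-Gamma conjugate update. This is a minimal sketch, not the library's code: the class name `NIGState` and its field names are hypothetical, and it assumes `_NIGState` follows the standard single-observation update and the standard Student's t posterior predictive (degrees of freedom 2·alpha).

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class NIGState:
    """Hypothetical stand-in for _NIGState; field names are assumptions."""

    mu: float      # current mean of the normal component
    kappa: float   # pseudo-observation count behind mu
    alpha: float   # Inverse-Gamma shape
    beta: float    # Inverse-Gamma scale

    def update(self, y: float) -> None:
        """Fold one observation into the posterior (standard conjugate update)."""
        kappa_n = self.kappa + 1.0
        # beta uses the pre-update mean, so compute it before overwriting mu.
        self.beta += 0.5 * self.kappa * (y - self.mu) ** 2 / kappa_n
        self.mu = (self.kappa * self.mu + y) / kappa_n
        self.kappa = kappa_n
        self.alpha += 0.5

    def predictive_params(self) -> Tuple[float, float]:
        """Return (mean, scale) of the Student's t posterior predictive."""
        scale = math.sqrt(self.beta * (self.kappa + 1.0) / (self.alpha * self.kappa))
        return self.mu, scale

    def predictive_df(self) -> float:
        """Degrees of freedom of the Student's t posterior predictive."""
        return 2.0 * self.alpha


# Start from the prior mu0=0, kappa0=1, alpha0=2, beta0=1 and observe y=2.
state = NIGState(mu=0.0, kappa=1.0, alpha=2.0, beta=1.0)
state.update(2.0)
mean, scale = state.predictive_params()
```

Each update costs constant time and memory, which is what makes per-observation scoring of long series practical.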

BayesianAnomalyDetector

Bayesian anomaly detector using posterior predictive p-values.

Maintains an online Normal-Inverse-Gamma conjugate model per series, scoring each observation via its posterior predictive tail probability.

Parameters

  • threshold: P-value threshold below which an observation is flagged (default 0.01).
  • prior_mu: Prior mean for the normal model. If None, uses the first observation.
  • prior_kappa: Prior strength on the mean (higher = more confident prior).
  • prior_alpha: Inverse-Gamma shape for the variance prior.
  • prior_beta: Inverse-Gamma scale for the variance prior.
  • warmup: Number of initial observations to use for prior calibration before scoring begins.
  • anomaly_scale: Scale multiplier for the anomaly hypothesis in the Bayes factor.
  • id_col: Column identifying each time series.
  • target_col: Column with target values.
  • time_col: Column with timestamps.

score(df)

Score each observation for anomalousness.

Parameters

  • df: Input DataFrame with time series data.

Returns

BayesianAnomalyResult: Result containing the scores DataFrame and the anomaly count.

_score_single(gid, values)

Score a single time series.

_t_cdf(x, df)

CDF of Student's t-distribution using scipy (lazy import).

_compute_pvalue(y, state)

Compute two-sided posterior predictive p-value.
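
A two-sided posterior predictive p-value can be sketched as follows: standardize the observation against the predictive mean and scale, then double the upper tail of the Student's t-distribution. The function names here are hypothetical. The library is documented as using scipy for the CDF; to stay dependency-free, this sketch swaps in Simpson's-rule integration of the density, which is only adequate for moderate standardized residuals.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of the standard Student's t-distribution."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: float, steps: int = 2000) -> float:
    """CDF via Simpson's rule on [0, |x|] (steps must be even)."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = min(total * h / 3, 0.5)  # mass between 0 and |x|, capped at 0.5
    return 0.5 + half if x > 0 else 0.5 - half


def two_sided_pvalue(y: float, mean: float, scale: float, df: float) -> float:
    """P(|T| >= |z|) for the standardized residual z = (y - mean) / scale."""
    z = (y - mean) / scale
    return 2.0 * (1.0 - t_cdf(abs(z), df))


# An observation 6 predictive scales from the mean, with 5 degrees of freedom.
p = two_sided_pvalue(y=7.0, mean=1.0, scale=1.0, df=5.0)
is_anomaly = p < 0.01  # compare against the detector's threshold
```

An observation exactly at the predictive mean gets a p-value of 1, and the value shrinks toward 0 as the observation moves into either tail.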

_compute_bayes_factor(y, state, anomaly_scale=10.0)

Bayes factor: evidence for normal model vs anomaly model.

H0: y ~ predictive(state)

H1: y ~ predictive(state, scale * anomaly_scale)

Returns BF01 (>1 favors normal, <1 favors anomaly).
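
As an illustration of the two hypotheses, the sketch below takes BF01 as the ratio of two Student's t densities evaluated at the observation. It assumes H1 keeps the predictive mean and degrees of freedom and only inflates the scale by `anomaly_scale`; the function names are hypothetical and the library's implementation may differ in detail.

```python
import math


def t_logpdf(y: float, mean: float, scale: float, df: float) -> float:
    """Log-density of a location-scale Student's t-distribution."""
    z = (y - mean) / scale
    return (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - (df + 1) / 2 * math.log1p(z * z / df))


def bayes_factor_01(y: float, mean: float, scale: float, df: float,
                    anomaly_scale: float = 10.0) -> float:
    """BF01 = p(y | H0) / p(y | H1), computed in log space for stability."""
    log_h0 = t_logpdf(y, mean, scale, df)
    log_h1 = t_logpdf(y, mean, scale * anomaly_scale, df)
    return math.exp(log_h0 - log_h1)


bf_typical = bayes_factor_01(y=1.0, mean=1.0, scale=1.0, df=5.0)   # at the mean
bf_extreme = bayes_factor_01(y=21.0, mean=1.0, scale=1.0, df=5.0)  # 20 scales away
```

Under this assumption, an observation at the predictive mean yields a Bayes factor equal to `anomaly_scale`, the strongest possible evidence for the normal model, while observations far in the tails drive it below 1.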

bayesian_anomaly_score(df, threshold=0.01, prior_mu=None, prior_kappa=1.0, prior_alpha=2.0, prior_beta=1.0, warmup=10, anomaly_scale=10.0, id_col='unique_id', target_col='y', time_col='ds')

Bayesian anomaly scoring convenience function.

Scores each observation using an online conjugate normal model with posterior predictive p-values and Bayes factors.

Parameters

  • df: Input DataFrame.
  • threshold: P-value threshold for flagging anomalies (default 0.01).
  • prior_mu: Prior mean (default: first observation).
  • prior_kappa: Prior strength on the mean.
  • prior_alpha: Inverse-Gamma shape for the variance prior.
  • prior_beta: Inverse-Gamma scale for the variance prior.
  • warmup: Warmup observations before scoring.
  • anomaly_scale: Scale for the anomaly hypothesis in the Bayes factor.
  • id_col, target_col, time_col: Column names.

Returns

pl.DataFrame: Scores with p_value, bayes_factor, and is_anomaly columns.