
docs: add Understanding Your Baseline Period section to ML configuration#22064

Draft
nedi-app[bot] wants to merge 1 commit into master from docs/ml-baseline-period-explanation

Conversation


@nedi-app nedi-app bot commented Mar 27, 2026

Summary

  • Adds a new "Understanding Your Baseline Period" section to ml-configuration.md
  • Explains how to calculate the comparison baseline period using the formula: baseline_period = number of models per dimension × train every
  • Includes example calculations for both default configuration (54 hours) and custom configurations (e.g., 6 hours)
  • Clarifies the relationship between individual model training windows and the overall baseline period
  • Helps users understand what time period anomaly alerts are compared against
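The formula described in the bullets above can be sketched as a tiny helper. This is illustrative only: the function name is invented for this example and is not part of Netdata's code or API.

```python
# Hypothetical helper illustrating the formula this PR documents:
#   baseline_period = number of models per dimension × train every
# Not Netdata code; the function name is invented for illustration.

def baseline_hours(models_per_dimension: int, train_every_hours: int) -> int:
    """Total look-back window that anomaly scores are effectively compared against."""
    return models_per_dimension * train_every_hours

# Default configuration from the docs: 18 models, retrained every 3 hours.
print(baseline_hours(18, 3))  # 54 hours
# Custom configuration from the session query: 6 models, retrained hourly.
print(baseline_hours(6, 1))   # 6 hours
```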

Changes

Added a new section after "Multiple Models and False Positive Reduction" that:

  1. Provides the baseline period formula
  2. Shows example calculations with default and custom configurations
  3. Explains how each model's training window relates to the total baseline
  4. Uses Docusaurus admonition syntax for the formula tip

Audit Context

Session Query (translated from Russian):
Our current config:
```
[ml]
enabled = yes
dimension anomaly score threshold = 0.5
number of models per dimension = 6
num samples to smooth = 1
minimum num samples to train = 600
train every = 1h
```

When an anomaly alert fires, how can we tell what it was compared against, and over what period?

Decision Verdict: Surgical documentation edit to add a formula and explanation for calculating the ML anomaly detection baseline period based on configuration parameters.
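Applied to the configuration from the session query, the documented formula yields a 6-hour baseline. The sketch below reads an `[ml]` section like the one above and derives that number; it is a simplified illustration (the `train every` parsing assumes whole hours, while the real config accepts other duration forms).

```python
# Sketch only: parse an [ml] section like the session query's and derive the
# baseline period. Duration parsing is simplified to whole hours ("1h").
import configparser

CONFIG = """
[ml]
enabled = yes
number of models per dimension = 6
train every = 1h
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG)
models = parser.getint("ml", "number of models per dimension")
train_every_h = int(parser.get("ml", "train every").rstrip("h"))

print(f"baseline period = {models} x {train_every_h}h = {models * train_every_h}h")
# → baseline period = 6 x 1h = 6h
```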


Summary by cubic

Adds an "Understanding Your Baseline Period" section to ml-configuration.md to explain the time window anomaly alerts are compared against. Includes a simple formula and examples to map ML config to a concrete baseline period.

  • New Features
    • Defines baseline formula: baseline_period = number of models per dimension × train every.
    • Adds examples: default 18×3h → 54h; custom 6×1h → 6h.
    • Clarifies how per-model training windows relate to the total baseline and that alerts use consensus across all models.
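The consensus behavior mentioned in the last bullet can be illustrated with a minimal sketch. This assumes an "all models must agree" rule for the sake of the example; the exact aggregation inside Netdata's ML engine may differ.

```python
# Illustrative consensus check (assumption: a point counts as anomalous only
# when every model in the pool scores it above the threshold; Netdata's
# actual internals may aggregate differently).
def is_anomalous(scores: list[float], threshold: float = 0.5) -> bool:
    """Return True only if all per-model anomaly scores exceed the threshold."""
    return all(score > threshold for score in scores)

print(is_anomalous([0.7, 0.9, 0.6]))  # True: all models agree
print(is_anomalous([0.7, 0.2, 0.6]))  # False: no consensus, treated as a false positive
```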

Written for commit 72df1a3. Summary will update on new commits.

Add a new section explaining how to calculate the comparison baseline period
for anomaly detection based on ML configuration parameters. Includes the
formula, example calculations for default and custom configs, and explanation
of how the baseline period affects anomaly detection.
@github-actions github-actions bot added the area/docs and area/ml (Machine Learning Related Issues) labels on Mar 27, 2026

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 1 file

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.
Architecture diagram

```mermaid
sequenceDiagram
    participant TSDB as Time-Series Database
    participant MLEngine as ML Engine
    participant Models as Model Store (N models)
    participant Alerter as Alerting System

    Note over MLEngine,Models: NEW logic: Total Baseline = (Models per Dim) × (Train Every)

    loop Every "train every" interval (e.g., 1h or 3h)
        MLEngine->>TSDB: Fetch historical samples (maximum num samples)
        TSDB-->>MLEngine: Training data window
        MLEngine->>MLEngine: Train new model instance
        MLEngine->>Models: CHANGED: Rotate/Update model in pool
        Note right of Models: Total pool size = "number of models per dimension"
    end

    Note over TSDB,Alerter: Anomaly Detection Flow

    TSDB->>MLEngine: New data point collected

    loop For each model in pool (up to N)
        MLEngine->>Models: Query data point against model
        Models-->>MLEngine: Individual Anomaly Score
    end

    alt Consensus reached (All models agree)
        MLEngine->>MLEngine: Calculate aggregate Anomaly Score
        opt Score > "dimension anomaly score threshold"
            MLEngine->>Alerter: Trigger Anomaly Alert
            Note over Alerter: Alert context: Validated against<br/>total baseline period
        end
    else No Consensus
        MLEngine->>MLEngine: Suppress false positive
    end
```
