docs: add Alert Notification Field Reference section to health documentation #22074

Draft
nedi-app[bot] wants to merge 1 commit into master from docs/alert-notification-field-reference

@nedi-app nedi-app bot commented Mar 27, 2026

Summary

  • Adds a comprehensive "Alert Notification Field Reference" section to src/health/REFERENCE.md
  • Documents all notification payload fields including duration, non_clear_duration, status, old_status, and other common fields
  • Clarifies the critical distinction between duration (time in previous state) and non_clear_duration (total time in WARNING/CRITICAL state)
  • Includes practical examples showing how to interpret duration values in real alert scenarios
  • Cross-references existing custom notification integration documentation

Changes

  • Added new section "Alert Notification Field Reference" with:
    • Core alert fields table (host, chart, name, status, old_status, value, etc.)
    • Duration fields table with detailed explanation of duration vs non_clear_duration
    • Duration field examples with JSON payloads and interpretations
    • Identification, context, navigation, summary, and aggregate field tables
    • Usage guidance for custom notification integrations

Motivation

This documentation addresses a common source of confusion where users misinterpret the duration field as "time in current state" when it actually represents "time in the previous state before this transition". The correct interpretation is critical for proper troubleshooting decisions.

Audit Context

Session Query (translated from Russian):

I receive a message like this: { "host": "netdata", "chart": "prometheus_user_registrations.registrations_countryBrowser-country_browser=ru_chrome", ... "duration": "2700", ... }. What does duration mean here?

Decision Verdict: surgical_edit - Add documentation explaining the notification payload fields, specifically clarifying that duration represents the time in the previous alarm state (not current state), and distinguishing it from non_clear_duration.
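
As a rough illustration of the interpretation described in the verdict, the sketch below reads a trimmed version of the payload from the session query and reports duration as time spent in the previous state. The original message elides most fields, so the values of status, old_status, and non_clear_duration here are assumptions made for the example, and describe_transition is a hypothetical helper, not part of Netdata.

```python
import json

# Trimmed from the session query. The values of "status", "old_status" and
# "non_clear_duration" are ASSUMED for illustration; the original elides them.
payload = json.loads("""
{
  "host": "netdata",
  "status": "WARNING",
  "old_status": "CLEAR",
  "duration": "2700",
  "non_clear_duration": "0"
}
""")


def describe_transition(p: dict) -> str:
    """Explain the duration fields of one notification payload (hypothetical helper)."""
    # "duration" is quoted in the payload from the query, so convert it to an int.
    duration = int(p["duration"])
    non_clear = int(p.get("non_clear_duration", 0))
    return (
        f"{p['host']}: {p['old_status']} -> {p['status']}; "
        f"spent {duration}s ({duration // 60} min) in {p['old_status']} "
        f"before this transition; "
        f"{non_clear}s total in WARNING/CRITICAL"
    )


print(describe_transition(payload))
```

Under these assumed values, the 2700 seconds are the 45 minutes the alert spent in CLEAR before it moved to WARNING, not the time it has spent in WARNING.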


Summary by cubic

Adds an Alert Notification Field Reference to health docs, explaining notification payload fields. Clarifies duration (time in previous state) vs non_clear_duration (total time in WARNING/CRITICAL), with examples.

  • New Features
    • New section in src/health/REFERENCE.md documenting core, duration, and summary fields.
    • JSON examples for new, escalating, and recovering alerts with interpretations.
    • Tables for identification, context, navigation, and aggregate fields.
    • Guidance for custom notification integrations and field/variable naming nuances.

Written for commit 9847be0. Summary will update on new commits.

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 1 file

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.
Architecture diagram
sequenceDiagram
    participant Health as Health Engine
    participant Notify as Notification Dispatcher
    participant Webhook as External Webhook / API
    participant Script as Custom Notification Script

    Note over Health: Alert State Transition (e.g., CLEAR -> WARNING)
    
    Health->>Health: NEW: Calculate duration (Time spent in PREVIOUS state)
    
    alt If entering WARNING/CRITICAL
        Health->>Health: NEW: Calculate non_clear_duration (Total active time)
    else If returning to CLEAR
        Health->>Health: NEW: Reset non_clear_duration to 0
    end

    Health->>Notify: Dispatch notification payload
    
    Note over Notify: Formatting based on integration type

    alt Webhook Integration
        Notify->>Notify: Format JSON (snake_case)
        Notify->>Webhook: POST payload (includes duration & non_clear_duration)
        Webhook-->>Notify: 200 OK
    else Custom Script
        Notify->>Notify: Export environment variables
        Notify->>Script: Execute with ${duration} & ${non_clear_duration}
        Script-->>Notify: Exit Code
    end

    Note over Webhook,Script: Reviewer: note the distinction between duration (previous state) and non_clear_duration (total)
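
The state-transition part of the diagram can be sketched as a toy model. This follows the diagram as drawn (duration is the time spent in the previous state; non_clear_duration accumulates while the alert stays in WARNING/CRITICAL and resets to 0 on return to CLEAR). It is not taken from the Netdata source: AlertTracker, its fields, and the integer timestamps are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

NON_CLEAR = {"WARNING", "CRITICAL"}


@dataclass
class AlertTracker:
    """Toy model of the diagram's duration bookkeeping (hypothetical, not Netdata code)."""
    status: str = "CLEAR"
    last_change: int = 0                   # timestamp of the last transition
    non_clear_since: Optional[int] = None  # when the alert last left CLEAR

    def transition(self, new_status: str, now: int) -> dict:
        # duration: time spent in the PREVIOUS state, per the diagram.
        duration = now - self.last_change
        if new_status in NON_CLEAR:
            # Entering or staying in WARNING/CRITICAL: measure total active time.
            if self.non_clear_since is None:
                self.non_clear_since = now
            non_clear_duration = now - self.non_clear_since
        else:
            # Returning to CLEAR: the diagram resets the counter to 0.
            self.non_clear_since = None
            non_clear_duration = 0
        payload = {
            "old_status": self.status,
            "status": new_status,
            "duration": duration,
            "non_clear_duration": non_clear_duration,
        }
        self.status, self.last_change = new_status, now
        return payload


tracker = AlertTracker()
new_alert = tracker.transition("WARNING", now=2700)
escalated = tracker.transition("CRITICAL", now=3000)
recovered = tracker.transition("CLEAR", now=3600)
print(new_alert, escalated, recovered, sep="\n")
```

In this model the escalation to CRITICAL reports a duration of 300 (time in WARNING) and a non_clear_duration of 300, while the recovery reports a duration of 600 (time in CRITICAL) and a non_clear_duration of 0.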