
docs: add disk usage estimation section to disk-requirements-and-retention#22071

Draft
nedi-app[bot] wants to merge 1 commit into master from docs-add-disk-usage-estimation

Conversation

@nedi-app
Contributor

@nedi-app nedi-app bot commented Mar 27, 2026

Summary

This PR adds a new "Estimating Disk Usage" section to the disk-requirements-and-retention documentation, providing users with a clear formula and worked example to calculate their own database size requirements.

Changes

  • Added new section "Estimating Disk Usage" after "Default Disk Footprint"
  • Included the formula: Disk per tier (bytes) = metrics_per_second × retention_seconds × bytes_per_sample
  • Added reference table for bytes_per_sample per tier (tier0: 0.6, tier1: 6, tier2: 18)
  • Provided a worked example calculating 10,000 metrics/sec with 30-day tier0 retention
  • Added note about ~1 GiB overhead for SQLite/metadata
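The formula and per-tier table listed above can be sketched in code. This is an illustrative estimate only: the 0.6/6/18 bytes-per-sample figures are the averages quoted in this PR (actual compression varies), and the function names are mine, not a Netdata API:

```python
# Approximate on-disk bytes per stored sample, by dbengine tier
# (values taken from this PR's reference table).
BYTES_PER_SAMPLE = {"tier0": 0.6, "tier1": 6.0, "tier2": 18.0}

METADATA_OVERHEAD_GIB = 1.0  # ~1 GiB for SQLite/metadata, per this PR


def disk_per_tier_bytes(metrics_per_second: float,
                        retention_seconds: float,
                        tier: str = "tier0") -> float:
    """Disk per tier (bytes) = metrics_per_second x retention_seconds x bytes_per_sample."""
    return metrics_per_second * retention_seconds * BYTES_PER_SAMPLE[tier]


def total_disk_gib(per_tier_bytes: list[float]) -> float:
    """Sum all tiers, convert to GiB, and add the constant metadata overhead."""
    return sum(per_tier_bytes) / 2**30 + METADATA_OVERHEAD_GIB
```

For example, the PR's worked case (10,000 metrics/s, 30-day tier0 retention) comes out to roughly 15.5 GiB once the ~1 GiB metadata overhead is included.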

Audit Context

Session Query: how does one estimate the database usage by netdata

Decision Verdict: surgical_edit

The validation identified that while most information about database sizing was accurate, the documentation lacked a clear formula and calculation guidance for users to estimate their own disk requirements. This PR addresses that gap by providing a structured approach with practical examples.

Checklist

  • New 'Estimating Disk Usage' section exists after 'Default Disk Footprint'
  • Formula is clearly presented with all variables explained
  • Worked example demonstrates the calculation end-to-end
  • SQLite/metadata overhead (~1 GiB) is mentioned
  • Existing content remains unchanged
  • Uses Docusaurus admonitions (:::note) instead of blockquote-style notices

Summary by cubic

Adds an “Estimating Disk Usage” section to the disk requirements doc with a simple formula and a worked example to size Netdata storage per tier. Includes a bytes-per-sample table (tier0 0.6, tier1 6, tier2 18) and notes ~1 GiB overhead for SQLite/metadata.

Written for commit e10af99. Summary will update on new commits.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 1 file

Confidence score: 5/5

  • Minor docs-only issue: the worked example in docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md mixes GB and GiB, which could slightly confuse readers about sizing.
  • Low-severity, high-confidence documentation discrepancy with no code impact keeps merge risk minimal.
  • Pay close attention to docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md - correct the GB vs GiB conversion in the example.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md">

<violation number="1" location="docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md:76">
P2: The worked example mixes GB and GiB units: 15,552,000,000 bytes is about 14.5 GiB, not 15.6 GiB.</violation>
</file>
Architecture diagram
```mermaid
sequenceDiagram
    participant App as External System / Collector
    participant Agent as Netdata Agent
    participant DB as DBEngine (Multi-tier)
    participant SQLite as Metadata Store (SQLite)
    participant Disk as Persistent Storage

    Note over App,Disk: Runtime Data Flow & Retention Logic

    App->>Agent: Ingest metrics (metrics_per_second)

    Agent->>DB: NEW: Store Tier 0 samples
    Note right of DB: Cost: 0.6 bytes / sample

    Agent->>DB: NEW: Store Tier 1 samples (Downsampled)
    Note right of DB: Cost: 6 bytes / sample

    Agent->>DB: NEW: Store Tier 2 samples (Downsampled)
    Note right of DB: Cost: 18 bytes / sample

    Agent->>SQLite: NEW: Update metadata & alert transitions
    Note right of SQLite: Constant Overhead: ~1 GiB

    loop For each Tier
        DB->>Disk: Commit compressed data blocks
        alt Retention logic (retention_seconds reached)
            DB->>Disk: Reclaim space (FIFO / Page deletion)
        else Storage within limits
            DB->>Disk: Append new data
        end
    end

    Note over Agent,Disk: Total Disk = (Tier Sums) + 1 GiB Metadata
```

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

For a system collecting **10,000 metrics per second** with a **30-day retention** for `tier0`:

1. Calculate seconds in 30 days: 30 days × 24 hours × 60 minutes × 60 seconds = **2,592,000 seconds**
2. Apply the formula: 10,000 × 2,592,000 × 0.6 = **15,552,000,000 bytes** ≈ **15.6 GiB**
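The byte arithmetic above is correct, but the final conversion mixes units: dividing by 10^9 gives about 15.6 GB, while dividing by 2^30 gives about 14.5 GiB, which is the discrepancy the review comment flags. A quick check:

```python
# Verify the worked example: 10,000 metrics/s, 30-day tier0 retention,
# 0.6 bytes per tier0 sample (figure from this PR's table).
retention_seconds = 30 * 24 * 60 * 60          # 2,592,000 s
raw_bytes = 10_000 * retention_seconds * 0.6   # ~15,552,000,000 bytes

gib = raw_bytes / 2**30   # binary gibibytes: ~14.5
gb = raw_bytes / 10**9    # decimal gigabytes: ~15.6

print(f"{raw_bytes:,.0f} bytes = {gib:.1f} GiB = {gb:.1f} GB")
```

So the example's byte total is right, and the label should read either 14.5 GiB or 15.6 GB.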
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 27, 2026


P2: The worked example mixes GB and GiB units: 15,552,000,000 bytes is about 14.5 GiB, not 15.6 GiB.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md, line 76:

<comment>The worked example mixes GB and GiB units: 15,552,000,000 bytes is about 14.5 GiB, not 15.6 GiB.</comment>

<file context>
@@ -49,6 +49,40 @@ In practice, with default settings and an ingestion rate of about 4,000 metrics
+For a system collecting **10,000 metrics per second** with a **30-day retention** for `tier0`:
+
+1. Calculate seconds in 30 days: 30 days × 24 hours × 60 minutes × 60 seconds = **2,592,000 seconds**
+2. Apply the formula: 10,000 × 2,592,000 × 0.6 = **15,552,000,000 bytes** ≈ **15.6 GiB**
+
+:::note
</file context>
Suggested change
2. Apply the formula: 10,000 × 2,592,000 × 0.6 = **15,552,000,000 bytes** ≈ **15.6 GiB**
2. Apply the formula: 10,000 × 2,592,000 × 0.6 = **15,552,000,000 bytes** ≈ **14.5 GiB**
