
Enhancement: Inflection point detection for performance-sensitive API calls #178

@gregeva

Description


Problem

ltl surfaces latency statistics and time-bucket visualizations, but does not currently identify when a specific API call (log message) has crossed a performance degradation threshold — the point at which incremental load begins to cost disproportionately more in latency.

This inflection point is the classic signature of a resource constraint being hit: connection pool exhaustion, thread contention, queue saturation, or CPU bottleneck. Without automated detection, spotting it requires manual inspection of per-message latency trends across the dataset.

Proposed Enhancement

Introduce inflection point detection that, for each unique log message across the full dataset (%log_messages), identifies the load level at which the relationship between execution count and latency changes character — where the slope of latency-vs-count steepens materially.

The ratio of the post-inflection slope to the pre-inflection slope becomes a severity score. A gentle rolloff is low severity; a sharp cliff is high severity. Severity should be expressed as named tiers (e.g. low / medium / high / critical) rather than a raw ratio — the specific tier boundaries and naming are to be determined during research.
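To make the proposal concrete, below is a minimal sketch of one possible detector: an exhaustive two-segment least-squares fit over a message's (execution count, latency) points, scoring severity as the ratio of the two slopes. The function names are hypothetical and the point construction is assumed rather than existing ltl code; the actual algorithm choice is a Phase 1 research question.

```perl
use strict;
use warnings;

# Least-squares slope of an arrayref of [x, y] points.
sub slope {
    my ($pts) = @_;
    my $n = scalar @$pts;
    my ($sx, $sy, $sxx, $sxy) = (0, 0, 0, 0);
    for my $p (@$pts) {
        $sx  += $p->[0];
        $sy  += $p->[1];
        $sxx += $p->[0] ** 2;
        $sxy += $p->[0] * $p->[1];
    }
    my $den = $n * $sxx - $sx ** 2;
    return $den ? ($n * $sxy - $sx * $sy) / $den : 0;
}

# Sum of squared residuals of the least-squares line through the points.
sub sse {
    my ($pts) = @_;
    my $n = scalar @$pts;
    my ($sx, $sy) = (0, 0);
    for my $p (@$pts) { $sx += $p->[0]; $sy += $p->[1]; }
    my $m = slope($pts);
    my $b = ($sy - $m * $sx) / $n;
    my $e = 0;
    $e += ($_->[1] - ($m * $_->[0] + $b)) ** 2 for @$pts;
    return $e;
}

# Two-segment fit: try each candidate breakpoint, keep the split that
# minimizes combined residual error, and score severity as the ratio of
# the post-break slope to the pre-break slope.
sub find_inflection {
    my ($pts) = @_;                      # sorted ascending by count
    my ($best_i, $best_err);
    for my $i (2 .. $#$pts - 2) {        # at least 3 points per segment
        my $err = sse([ @{$pts}[0 .. $i] ])
                + sse([ @{$pts}[$i .. $#$pts] ]);
        ($best_i, $best_err) = ($i, $err)
            if !defined $best_err || $err < $best_err;
    }
    return unless defined $best_i;       # too few points to split
    my $pre  = slope([ @{$pts}[0 .. $best_i] ]);
    my $post = slope([ @{$pts}[$best_i .. $#$pts] ]);
    return unless $pre > 0;              # flat/negative baseline: no score
    return { break_count => $pts->[$best_i][0], ratio => $post / $pre };
}
```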

When active, inflection point detection becomes a filter: the bar graph and summary table show only messages where an inflection was detected at or above the user-specified severity tier. No other UI changes are intended at this stage, though a per-message severity indicator (e.g. a severity column in the summary table, sortable descending) should be evaluated during requirements and prototyping before implementation begins.
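For illustration only, a possible tier mapping and filter pass. The boundary values, the tier ranks, and the {inflection}{tier} field on %log_analysis are all placeholders until research settles them:

```perl
# Placeholder ratio-to-tier boundaries, to be replaced by the values
# determined empirically during Phase 1.
sub severity_tier {
    my ($ratio) = @_;
    return 'critical' if $ratio >= 10;
    return 'high'     if $ratio >= 5;
    return 'medium'   if $ratio >= 2;
    return 'low';
}

# Filter pass: keep only messages whose detected tier is at or above
# the user-specified minimum.
my $min_tier = 'medium';   # example: user-selected minimum tier
my %rank = (low => 0, medium => 1, high => 2, critical => 3);
my @visible = grep {
    my $inf = $log_analysis{$_}{inflection};
    $inf && $rank{ $inf->{tier} } >= $rank{$min_tier};
} keys %log_analysis;
```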

Implementation Phases

This feature requires a research-first approach before any implementation:

Phase 1 — Research

  • Determine which percentile(s) are most meaningful for inflection detection (P50, P90, P99, or a combination)
  • Determine what count distribution and observation minimum are required per message to produce a statistically reliable result — some messages may appear too infrequently for a meaningful curve fit (a minimal eligibility gate is sketched after this list)
  • Assess whether histogram data is required for accuracy, or whether the existing percentile statistics are sufficient
  • Determine severity tier boundaries and naming from empirical data
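A minimal sketch of the eligibility gate from the second bullet, assuming a {count} field per message in %log_messages; the threshold value is a placeholder for whatever the research determines:

```perl
# Hypothetical gate: skip messages with too few observations to fit two
# segments reliably. MIN_OBS is a placeholder pending Phase 1 research;
# the {count} field name is assumed, not current schema.
use constant MIN_OBS => 30;

my @eligible = grep {
    ($log_messages{$_}{count} // 0) >= MIN_OBS
} keys %log_messages;
```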

Phase 2 — Requirements

  • Define any required enhancements to the existing data model and statistics (%log_stats, %log_analysis, histogram structures) to support the analysis (one possible per-message record shape is sketched after this list)
  • Specify the severity tier structure and filtering interface
  • Specify how severity is surfaced per API call in the output (severity column, sort order, or other — TBD)
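To anchor the data-model discussion, one possible shape for a per-message detection record hung off %log_analysis; every field name here is a placeholder rather than current schema:

```perl
# Hypothetical per-message detection record.
$log_analysis{$msg}{inflection} = {
    detected    => 1,
    break_count => 1850,    # execution count where the slope changes
    pre_slope   => 0.12,    # latency growth per execution, pre-break
    post_slope  => 0.97,    # and post-break
    ratio       => 8.1,     # post/pre: raw severity
    tier        => 'high',  # named tier derived from the ratio
};
```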

Phase 3 — Prototype

  • Validate the detection algorithm against real log data before full implementation (a possible validation pass is sketched below)
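A possible validation pass, reusing the earlier sketches plus a hypothetical points_for() helper that builds a message's (count, latency) series; it prints the sharpest cliffs first so they can be checked by eye against the bar graph:

```perl
# Run the sketched detector over eligible messages; points_for() is a
# hypothetical helper returning the message's [count, latency] points.
for my $msg (@eligible) {
    my $inf = find_inflection( points_for($msg) ) or next;
    $inf->{tier} = severity_tier( $inf->{ratio} );
    $log_analysis{$msg}{inflection} = $inf;
}

# Print detected inflections, sharpest cliff first, for manual review.
for my $msg (
    sort { $log_analysis{$b}{inflection}{ratio}
             <=> $log_analysis{$a}{inflection}{ratio} }
    grep { $log_analysis{$_}{inflection} }
    keys %log_analysis
) {
    my $inf = $log_analysis{$msg}{inflection};
    printf "%-8s ratio=%6.1f break\@count=%d  %s\n",
        $inf->{tier}, $inf->{ratio}, $inf->{break_count}, $msg;
}
```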

Phase 4 — Implementation

Key Design Constraints

  • Detection operates on %log_messages across the entire dataset — not per time bucket. Each unique log message is treated as a distinct API call.
  • The x-axis for slope analysis is execution count vs. latency per message, not latency over time.
  • Named severity tiers are preferred over raw ratio values for usability — boundaries to be determined empirically during research.
  • Minimum observation requirements per message must be established during research; messages below the threshold are silently excluded from detection.
