feat: Add VariantStatistics materialized view #7093
Open
AntoineToussaint wants to merge 6 commits into main
Create variant-level aggregation tables that group metrics by (function_name, variant_name, minute), analogous to the existing ModelProviderStatistics, which groups by (model_name, model_provider_name).
ClickHouse: AggregatingMergeTree table fed by three MVs:
- VariantStatisticsModelView (ModelInference → tokens/cost via InferenceById JOIN)
- VariantStatisticsChatView (ChatInference → latency quantiles + count)
- VariantStatisticsJsonView (JsonInference → latency quantiles + count)
Postgres: variant_statistics table with incremental refresh function that joins chat_inferences + json_inferences with model_inferences.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
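The grouping this commit describes can be illustrated with a small in-memory sketch. This is not the migration's SQL: the `Row` and `Stats` structs, their field names, and the `aggregate` function are hypothetical, timestamps are assumed to be Unix seconds, and latency quantiles are left out for brevity.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened record: one model inference already joined
/// to the function and variant of its parent inference.
#[derive(Debug, Clone)]
pub struct Row {
    pub function_name: String,
    pub variant_name: String,
    pub timestamp_secs: u64,
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cost: Option<f64>, // None when the provider reported no cost
}

/// Hypothetical per-bucket aggregate (field names are assumptions).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Stats {
    pub total_input_tokens: u64,
    pub total_output_tokens: u64,
    pub total_cost: f64,
    pub count_with_cost: u64,
}

/// Bucket key: (function_name, variant_name, start of minute in seconds).
pub type Key = (String, String, u64);

/// Groups rows by (function_name, variant_name, minute) and sums the metrics.
pub fn aggregate(rows: &[Row]) -> BTreeMap<Key, Stats> {
    let mut out: BTreeMap<Key, Stats> = BTreeMap::new();
    for row in rows {
        // Truncate the timestamp to the start of its minute.
        let minute = row.timestamp_secs / 60 * 60;
        let key = (row.function_name.clone(), row.variant_name.clone(), minute);
        let stats = out.entry(key).or_default();
        stats.total_input_tokens += row.input_tokens;
        stats.total_output_tokens += row.output_tokens;
        // Only rows that actually carry a cost contribute to the cost metrics.
        if let Some(cost) = row.cost {
            stats.total_cost += cost;
            stats.count_with_cost += 1;
        }
    }
    out
}
```

Two rows for the same function and variant land in the same bucket only when their timestamps fall within the same minute, which is the behavior the cross-minute test exercises.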
Tests cover: basic aggregation with retries, multiple variants isolation, cross-minute bucketing, mixed chat+json inferences, high-volume (20 inferences with retry patterns), and orphan inferences without model data. All tests run against both ClickHouse and Postgres via make_db_test! macro.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix cross_minute test: count_with_cost assertion was 3 but should be 4
(a1, a2, b1, b3 all have cost; b2 and c1 don't)
- Fix mixed_chat_json test: Postgres json_inferences requires non-null
output_schema — provide `{"type": "object"}` in make_json_inference
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
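The corrected `count_with_cost` value can be checked by hand. The sketch below assumes the assertion counts model inferences with a cost across all six fixtures named in the commit message; the function names and the cost values are placeholders, and only the presence or absence of a cost is taken from the message.

```rust
/// Counts how many model inferences carry a cost value.
pub fn count_with_cost(costs: &[(&str, Option<f64>)]) -> usize {
    costs.iter().filter(|(_, cost)| cost.is_some()).count()
}

/// Fixture labels from the commit message; the cost amounts are placeholders.
pub fn cross_minute_fixture() -> Vec<(&'static str, Option<f64>)> {
    vec![
        ("a1", Some(0.01)),
        ("a2", Some(0.01)),
        ("b1", Some(0.01)),
        ("b2", None), // no cost
        ("b3", Some(0.01)),
        ("c1", None), // no cost
    ]
}
```

With a1, a2, b1, and b3 carrying a cost, `count_with_cost(&cross_minute_fixture())` evaluates to 4, matching the fixed assertion.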
ClickHouse cannot parse `null` for JSON string columns like `input`. Use `Some(StoredInput::default())` for input and `Some(vec![])` for output, matching the pattern used in evaluation_queries tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ClickHouse rejects null for JSON string columns (input, inference_params, tool_params, extra_body, auxiliary_content). Provide default values matching the pattern used in evaluation_queries tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- `VariantStatistics` aggregation tables that group metrics by `(function_name, variant_name, minute)`: the variant-level counterpart of `ModelProviderStatistics`
- ClickHouse: `AggregatingMergeTree` table fed by three materialized views:
  - `VariantStatisticsModelView` (tokens/cost from `ModelInference` via `InferenceById` JOIN)
  - `VariantStatisticsChatView` (latency quantiles + count from `ChatInference`)
  - `VariantStatisticsJsonView` (same from `JsonInference`)
- Postgres: `variant_statistics` table with `refresh_variant_statistics_incremental()` function joining `chat_inferences UNION ALL json_inferences` with `model_inferences`
- `prepare_variant_statistics()` test helper for both backends
- `count_with_cost` across multiple model inferences per inference

Closes #6980
Test plan
- `cargo check --all-targets --all-features`
- `cargo clippy --all-targets --all-features -- -D warnings`
- `cargo fmt`
- `cargo test-unit-fast` (12 pre-existing failures, no new ones)
- `test_variant_statistics_aggregation_clickhouse`: inserts chat + model inferences, verifies aggregated row
- `test_variant_statistics_aggregation_postgres`: same test on Postgres backend
- `test_rollback_up_to_migration_index_46`

🤖 Generated with Claude Code