Skip to content

Conversation

@Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented Nov 4, 2025

Summary

  • add migration 0041 to install the tensorzero_uint_to_uuid ClickHouse UDF and register it with the migration runner
  • update runtime ClickHouse queries and tests in Rust/TypeScript to call tensorzero_uint_to_uuid instead of uint_to_uuid
  • bump the migration count/test harness to include migration 0041

Testing

  • cargo fmt
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test-unit (fails: no such command nextest)
  • pnpm --filter tensorzero-node run build
  • pnpm run format
  • pnpm run lint
  • pnpm run typecheck

https://chatgpt.com/codex/tasks/task_b_690a36bd8b1483298c44e06b45012aa2


Important

Add tensorzero_uint_to_uuid ClickHouse function and update queries and tests to use it instead of uint_to_uuid.

  • Behavior:
    • Add tensorzero_uint_to_uuid ClickHouse UDF in migration_0041.rs and register it with the migration runner.
    • Update queries in clickhouse.rs, dataset_queries.rs, select_queries.rs, test_helpers.rs, mod.rs, evaluations.server.ts, function.ts, inference.server.ts, workflow_evaluations.server.ts, and inference.test.ts to use tensorzero_uint_to_uuid instead of uint_to_uuid.
  • Testing:
    • Update tests in clickhouse.rs, inference.test.ts, and observability.episodes.spec.ts to reflect the new function usage.
    • Run various test commands including cargo test-unit, pnpm run build, and pnpm run lint.

This description was created by Ellipsis for ab1ea5e. You can customize this summary. It will automatically update as commits are pushed.

Copilot AI review requested due to automatic review settings November 4, 2025 18:13
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR renames the ClickHouse user-defined function from uint_to_uuid to tensorzero_uint_to_uuid to avoid naming conflicts with the ClickHouse built-in function introduced in ClickHouse version 24.8. The changes include:

  • Creating a new migration (migration_0041) that defines the tensorzero_uint_to_uuid function
  • Updating all SQL queries across the codebase to use tensorzero_uint_to_uuid instead of uint_to_uuid
  • Incrementing the migration count from 34 to 35 and adding test coverage for the new migration

Reviewed Changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs New migration file that creates the tensorzero_uint_to_uuid function
tensorzero-core/src/db/clickhouse/migration_manager/mod.rs Registers the new migration and updates NUM_MIGRATIONS constant
tensorzero-core/src/db/clickhouse/migration_manager/migrations/mod.rs Adds the new migration module
tensorzero-core/tests/e2e/clickhouse.rs Adds migration index 34 to the rollback test array
tensorzero-core/src/db/clickhouse/select_queries.rs Updates queries to use tensorzero_uint_to_uuid
tensorzero-core/src/db/clickhouse/dataset_queries.rs Updates queries to use tensorzero_uint_to_uuid
tensorzero-core/src/db/clickhouse/test_helpers.rs Updates queries to use tensorzero_uint_to_uuid
tensorzero-core/src/endpoints/feedback/mod.rs Updates queries to use tensorzero_uint_to_uuid
tensorzero-core/tests/load/feedback/src/main.rs Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/inference.server.ts Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/inference.test.ts Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/workflow_evaluations.server.ts Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/function.ts Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/evaluations.server.ts Updates queries to use tensorzero_uint_to_uuid
ui/e2e_tests/observability.episodes.spec.ts Updates queries to use tensorzero_uint_to_uuid
examples/integrations/cursor/feedback/src/clickhouse.rs Updates queries to use tensorzero_uint_to_uuid

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 4, 2025

TensorZero CI Bot Automated Comment

Thanks for the PR! The ClickHouse E2E test failures are caused by the new migration (Migration0041) not being recognized as “applied” by the test harness.

Root cause:

  • Migration0041’s should_apply/has_succeeded checks use `system.functions` to determine whether the SQL UDF `tensorzero_uint_to_uuid` exists. On ClickHouse, SQL UDFs are listed in `system.sql_functions`, not `system.functions`. As a result, after creating the function, the follow-up check still doesn’t find it, so the migration is deemed not applied, which triggers:
    • test_rollback_up_to_migration_index_34: fails with “Migration Migration0041 should have succeeded”
    • test_clickhouse_migration_manager: fails because it expects all migrations to be applied/logged
    • test_concurrent_clickhouse_migrations: fails with “TooFew” since the last migration is skipped

Fix:

  • Update Migration0041 to detect the function in `system.sql_functions` first, falling back to `system.functions` (for future compatibility), and reuse the same check in has_succeeded. This makes the migration reliable across ClickHouse versions.

No changes to GitHub Actions are required.

Warning

I encountered an error while trying to create a follow-up PR: Failed to create follow-up PR using remote https://x-access-token:***@github.com/tensorzero/tensorzero.git: git apply --whitespace=nowarn /tmp/tensorzero-pr-KiAbIs/repo/tensorzero.patch failed: error: corrupt patch at line 102
.

The patch I tried to generate is as follows:

diff --git a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
index d41c79b30..f0d2d0b7e 100644
--- a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
+++ b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
@@ -1,6 +1,7 @@
 use crate::db::clickhouse::migration_manager::migration_trait::Migration;
 use crate::db::clickhouse::ClickHouseConnectionInfo;
 use crate::error::{Error, ErrorDetails};
+use crate::result::ResultExt;
 use async_trait::async_trait;
 
 pub struct Migration0041<'a> {
@@ -10,22 +11,54 @@ pub struct Migration0041<'a> {
 #[async_trait]
 impl Migration for Migration0041<'_> {
     async fn can_apply(&self) -> Result<(), Error> {
         Ok(())
     }
 
     async fn should_apply(&self) -> Result<bool, Error> {
-        let query =
-            "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
-        let result = self
-            .clickhouse
-            .run_query_synchronous_no_params(query)
-            .await?;
-        if result.response.contains('1') {
-            return Ok(false);
-        }
-        if result.response.trim().is_empty() || result.response.contains("0") {
-            return Ok(true);
-        }
-        Err(Error::new(ErrorDetails::ClickHouseMigration {
-            id: "0041".to_string(),
-            message: format!(
-                "Unexpected response when checking for tensorzero_uint_to_uuid function: {}",
-                result.response.trim()
-            ),
-        }))
+        // Check if the SQL UDF already exists. On ClickHouse, SQL UDFs are exposed via
+        // `system.sql_functions` (not `system.functions`). Fall back to `system.functions`
+        // defensively in case of version differences.
+        async fn function_exists(clickhouse: &ClickHouseConnectionInfo) -> Result<bool, Error> {
+            // Prefer system.sql_functions if available
+            let sql_fn_check = "SELECT 1 FROM system.sql_functions WHERE name = 'tensorzero_uint_to_uuid'";
+            match clickhouse
+                .run_query_synchronous_no_params(sql_fn_check.to_string())
+                .await
+            {
+                Ok(res) => {
+                    if !res.response.trim().is_empty() && res.response.contains('1') {
+                        return Ok(true);
+                    }
+                }
+                // If the system table doesn't exist on this server, ignore and try the fallback.
+                Err(_e) => {}
+            }
+
+            // Fallback: some builds expose user-defined SQL functions under system.functions as well.
+            let fallback_check =
+                "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'";
+            let res = clickhouse
+                .run_query_synchronous_no_params(fallback_check.to_string())
+                .await?;
+            Ok(!res.response.trim().is_empty() && res.response.contains('1'))
+        }
+
+        let exists = function_exists(self.clickhouse).await?;
+        Ok(!exists)
     }
 
     async fn apply(&self, _clean_start: bool) -> Result<(), Error> {
         let on_cluster_name = self.clickhouse.get_on_cluster_name();
         let query = format!(
             r"CREATE FUNCTION IF NOT EXISTS tensorzero_uint_to_uuid{on_cluster_name} AS (x) -> reinterpretAsUUID(
             concat(
                 substring(reinterpretAsString(x), 9, 8),
                 substring(reinterpretAsString(x), 1, 8)
             )
         );",
         );
         self.clickhouse
             .run_query_synchronous_no_params(query)
             .await?;
 
         Ok(())
     }
 
     fn rollback_instructions(&self) -> String {
         "DROP FUNCTION IF EXISTS tensorzero_uint_to_uuid;".to_string()
     }
 
     async fn has_succeeded(&self) -> Result<bool, Error> {
-        let should_apply = self.should_apply().await?;
-        Ok(!should_apply)
+        // Reuse the existence logic from should_apply: if we should not apply anymore,
+        // then the migration has succeeded.
+        let should_apply = self.should_apply().await?;
+        Ok(!should_apply)
     }
 }

@github-actions
Copy link
Contributor

github-actions bot commented Nov 4, 2025

TensorZero CI Bot Automated Comment

Thanks for the PR! The ClickHouse CI is failing because the new user-defined function tensorzero_uint_to_uuid introduced in migration 0041 isn't being detected reliably in tests, and the migration sometimes doesn't appear to run or doesn't report success.

Root causes observed:

  • The migration’s should_apply() checks system.functions without a deterministic output format and simply string-matches for '1'/'0'. This is brittle and can cause false positives/negatives, leading to “TooFew” migrations applied in concurrency/migration-manager tests.
  • The migration’s apply() likely succeeds, but has_succeeded() relies on the same fragile check, so tests conclude it didn’t succeed (“Migration0041 should have succeeded”).
  • As a result, tests that expect to see “Applying migration: Migration0041” in the logs don’t, and e2e migration tests fail.

Fix summary:

  • Make should_apply() and has_succeeded() robust by using a deterministic query with FORMAT JSONEachRow and EXISTS to check for the function’s presence, then interpreting an empty vs non-empty response reliably.
  • Slightly harden the CREATE FUNCTION body to ensure the reinterpretation gets a fixed-length buffer (toFixedString(..., 16)), which is safer across ClickHouse versions.

These changes should:

  • Ensure the migration applies exactly once when needed.
  • Ensure has_succeeded() deterministically reports success after creation.
  • Unblock the failing Rust ClickHouse tests (test_concurrent_clickhouse_migrations, test_clickhouse_migration_manager, test_rollback_apply_rollback).

No GitHub Actions changes are required.

Warning

I encountered an error while trying to create a follow-up PR: Failed to create follow-up PR using remote https://x-access-token:***@github.com/tensorzero/tensorzero.git: git apply --whitespace=nowarn /tmp/tensorzero-pr-1NbBe9/repo/tensorzero.patch failed: error: corrupt patch at line 100
.

The patch I tried to generate is as follows:

diff --git a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
index d41c79b30..10f36e579 100644
--- a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
+++ b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
@@ -1,63 +1,83 @@
 use crate::db::clickhouse::migration_manager::migration_trait::Migration;
 use crate::db::clickhouse::ClickHouseConnectionInfo;
 use crate::error::{Error, ErrorDetails};
 use async_trait::async_trait;
 
 pub struct Migration0041<'a> {
     pub clickhouse: &'a ClickHouseConnectionInfo,
 }
 
 #[async_trait]
 impl Migration for Migration0041<'_> {
     async fn can_apply(&self) -> Result<(), Error> {
         Ok(())
     }
 
     async fn should_apply(&self) -> Result<bool, Error> {
-        let query =
-            "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
-        let result = self
-            .clickhouse
-            .run_query_synchronous_no_params(query)
-            .await?;
-        if result.response.contains('1') {
-            return Ok(false);
-        }
-        if result.response.trim().is_empty() || result.response.contains("0") {
-            return Ok(true);
-        }
-        Err(Error::new(ErrorDetails::ClickHouseMigration {
-            id: "0041".to_string(),
-            message: format!(
-                "Unexpected response when checking for tensorzero_uint_to_uuid function: {}",
-                result.response.trim()
-            ),
-        }))
+        // Deterministically check if the function exists by returning a row only when it does.
+        // If no row is returned, we should apply the migration.
+        let query = r"
+            SELECT 1 AS exists
+            WHERE EXISTS(
+                SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'
+            )
+            FORMAT JSONEachRow
+        "
+        .to_string();
+        let result = self.clickhouse.run_query_synchronous_no_params(query).await?;
+        Ok(result.response.trim().is_empty())
     }
 
     async fn apply(&self, _clean_start: bool) -> Result<(), Error> {
         let on_cluster_name = self.clickhouse.get_on_cluster_name();
         let query = format!(
             r"CREATE FUNCTION IF NOT EXISTS tensorzero_uint_to_uuid{on_cluster_name} AS (x) -> reinterpretAsUUID(
-            concat(
-                substring(reinterpretAsString(x), 9, 8),
-                substring(reinterpretAsString(x), 1, 8)
-            )
-        );",
+                toFixedString(
+                    concat(
+                        substring(reinterpretAsString(x), 9, 8),
+                        substring(reinterpretAsString(x), 1, 8)
+                    ),
+                    16
+                )
+            );",
         );
         self.clickhouse
             .run_query_synchronous_no_params(query)
             .await?;
 
         Ok(())
     }
 
     fn rollback_instructions(&self) -> String {
         "DROP FUNCTION IF EXISTS tensorzero_uint_to_uuid;".to_string()
     }
 
     async fn has_succeeded(&self) -> Result<bool, Error> {
-        let should_apply = self.should_apply().await?;
-        Ok(!should_apply)
+        // Succeeded if the function now exists (i.e., the query returns a row).
+        let query = r"
+            SELECT 1 AS exists
+            WHERE EXISTS(
+                SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'
+            )
+            FORMAT JSONEachRow
+        "
+        .to_string();
+        let result = self.clickhouse.run_query_synchronous_no_params(query).await?;
+        Ok(!result.response.trim().is_empty())
     }
 }

@virajmehta
Copy link
Member

Please snooze this PR 1 day. I would like to get Migration0041 in from my tool work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants