Conversation
Greptile Summary

This PR extends the external database sync system to support ClickHouse user synchronization alongside the existing PostgreSQL support. The implementation adds comprehensive infrastructure for syncing user data to ClickHouse for analytics purposes.

Key Changes
Architecture

The sync operates in batches of 1,000 rows, pulling data from the internal PostgreSQL using sequence IDs as the cursor. The status route provides comprehensive monitoring, including metadata tracking, backlog calculation (internal max sequence ID minus last synced sequence ID), and user table statistics.

Confidence Score: 4/5
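The backlog figure is plain arithmetic over two high-water marks. Below is a minimal TypeScript sketch of that calculation; the helper signatures are assumptions for illustration, not the PR's actual API.

```typescript
// Sketch of the status route's backlog calculation, assuming two hypothetical
// helpers that expose the relevant sequence-ID high-water marks.
type SequenceIdSource = {
  // Highest sequence ID assigned by the sequencer in the internal PostgreSQL.
  getInternalMaxSequenceId: () => Promise<bigint>,
  // Last sequence ID recorded in ClickHouse's _stack_sync_metadata table.
  getLastSyncedSequenceId: () => Promise<bigint>,
};

async function computeBacklog(source: SequenceIdSource): Promise<bigint> {
  const internalMax = await source.getInternalMaxSequenceId();
  const lastSynced = await source.getLastSyncedSequenceId();
  // backlog = internal max sequence ID - last synced sequence ID
  return internalMax - lastSynced;
}
```

A backlog of 0 means the external ClickHouse is fully caught up; a growing value means the poller is falling behind the sequencer.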
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
participant Cron as Cron Job
participant Sequencer as Sequencer API
participant Poller as Poller API
participant SyncEngine as External DB Sync Engine
participant PG as Internal PostgreSQL
participant CH as ClickHouse
participant Status as Status API
Note over Cron,CH: User Sync Flow
Cron->>Sequencer: POST /external-db-sync/sequencer
Sequencer->>PG: Update ProjectUser.sequenceId
Sequencer->>PG: Update DeletedRow.sequenceId
Cron->>Poller: POST /external-db-sync/poller
Poller->>SyncEngine: syncExternalDatabases(tenancy)
alt ClickHouse Database
SyncEngine->>CH: getClickhouseLastSyncedSequenceId()
CH-->>SyncEngine: lastSequenceId
loop Batch Processing
SyncEngine->>PG: SELECT users WHERE sequence_id > lastSequenceId
PG-->>SyncEngine: rows (max 1000)
SyncEngine->>SyncEngine: normalizeClickhouseBoolean()
SyncEngine->>CH: INSERT INTO analytics_internal.users
SyncEngine->>CH: INSERT INTO _stack_sync_metadata
end
end
alt Postgres Database
SyncEngine->>PG: SELECT last_synced_sequence_id FROM _stack_sync_metadata
PG-->>SyncEngine: lastSequenceId
loop Batch Processing
SyncEngine->>PG: SELECT users WHERE sequence_id > lastSequenceId
PG-->>SyncEngine: rows (max 1000)
SyncEngine->>PG: UPSERT into external DB
end
end
Note over Status,CH: Status Monitoring
Status->>CH: Query _stack_sync_metadata
CH-->>Status: metadata rows
Status->>CH: Query users table stats
CH-->>Status: user counts & timestamps
Status-->>Cron: Sync status with backlog info
```
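This is not the PR's actual code, but a hedged sketch of what one iteration of the ClickHouse branch above might look like in TypeScript using the official @clickhouse/client package. The table names (analytics_internal.users, _stack_sync_metadata) and the 1,000-row batch size come from the summary and diagram; the connection setup, metadata schema, and the fetchUsersAfter helper are assumptions for illustration.

```typescript
import { createClient } from '@clickhouse/client';

const clickhouse = createClient({ url: process.env.CLICKHOUSE_URL });

type UserRow = Record<string, unknown> & { sequence_id: string };

// Runs one batch of the ClickHouse sync loop; the caller repeats until it returns false.
async function syncOneBatch(
  // Pulls up to `limit` rows from the internal PostgreSQL with sequence_id > afterSequenceId.
  fetchUsersAfter: (afterSequenceId: string, limit: number) => Promise<UserRow[]>,
): Promise<boolean> {
  // 1. Read the last synced sequence ID from the metadata table
  //    (getClickhouseLastSyncedSequenceId in the diagram).
  const metaResult = await clickhouse.query({
    query: 'SELECT max(last_synced_sequence_id) AS last FROM _stack_sync_metadata',
    format: 'JSONEachRow',
  });
  const metaRows = (await metaResult.json()) as Array<{ last: string | null }>;
  const lastSyncedSequenceId = metaRows[0]?.last ?? '0';

  // 2. Pull the next batch (max 1000 rows) from the internal PostgreSQL.
  const rows = await fetchUsersAfter(lastSyncedSequenceId, 1000);
  if (rows.length === 0) return false; // nothing left to sync

  // 3. Insert the batch into the analytics users table. (The PR also normalizes
  //    boolean columns via normalizeClickhouseBoolean before this step; omitted here.)
  await clickhouse.insert({
    table: 'analytics_internal.users',
    values: rows,
    format: 'JSONEachRow',
  });

  // 4. Record the new high-water mark; the real metadata row likely carries more columns.
  await clickhouse.insert({
    table: '_stack_sync_metadata',
    values: [{ last_synced_sequence_id: rows[rows.length - 1].sequence_id }],
    format: 'JSONEachRow',
  });
  return true;
}
```

Per the diagram, a loop like this would run inside syncExternalDatabases(tenancy) until a batch comes back empty.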
https://www.loom.com/share/9e6b13061a314bcb94bc5cb7232c80fb