### What

- **Add StarRocks engine support to SQLMesh** via StarRocks’ MySQL-compatible protocol.
- Ship **engine adapter + docs + real integration tests** to ensure generated SQL works on StarRocks.

### Why

- **User demand / adoption**: StarRocks is a common OLAP choice; SQLMesh users want to run the same model lifecycle (build, incremental maintenance, views/MVs) on StarRocks without bespoke SQL.
- **Engine-specific semantics**: StarRocks differs from vanilla MySQL in DDL/DML constraints (e.g., key types, delete behavior, rename caveats). An adapter is needed to produce correct and predictable SQL.
- **Confidence & maintainability**: Documenting config patterns + codifying behavior with integration tests prevents regressions and makes support “real” (not just “it parses”).

### Scope (what’s supported)

- **Connectivity**: Connect through MySQL protocol (e.g., `pymysql`).
- **Table creation / DDL**:
  - Key table types via `physical_properties`: **DUPLICATE KEY (default)**, **PRIMARY KEY (recommended for incremental)**, **UNIQUE KEY**
  - **Partitioning**: simple `partitioned_by` and advanced `partition_by` (complex expression partitioning) + optional initial `partitions`
  - **Distribution**: `distributed_by` structured form or string fallback (HASH / RANDOM; buckets required)
  - **Ordering**: `order_by` / `clustered_by`
  - **Generic PROPERTIES passthrough** (string key/value)
- **Views**:
  - Regular views
  - **Materialized views** via `kind VIEW(materialized true)` with StarRocks-specific notes/constraints
- **DML / maintenance**:
  - Insert/select/update basics
  - Delete behavior handled with StarRocks compatibility constraints (PRIMARY KEY tables recommended for robust deletes)

### Changes

- **Engine adapter**: `sqlmesh/core/engine_adapter/starrocks.py`
- **Docs**: `docs/integrations/engines/starrocks.md`
- **Integration tests**: `tests/core/engine_adapter/integration/test_integration_starrocks.py` and `tests/core/engine_adapter/test_starrocks.py`

### Verification

- **Integration tests require a running StarRocks** instance.
- Ran:
  - set `STARROCKS_HOST/PORT/USER/PASSWORD`
  - `pytest -m "starrocks and docker" tests/core/engine_adapter/integration/test_integration_starrocks.py`

### Known limitations / caveats

- **No sync MV support (currently)**
- **No tuple IN**: `(c1, c2) IN ((v1, v2), ...)`
- **No `SELECT ... FOR UPDATE`**
- **RENAME caveat**: rename target can’t be qualified with a database name

### Notes on compatibility

- **Changes are StarRocks-scoped** (adapter/docs/tests) and should not impact other engines.

Signed-off-by: jaogoy <jaogoy@gmail.com>
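
For readers wondering how the "No tuple IN" caveat above can be worked around, here is a minimal sketch that rewrites `(c1, c2) IN ((v1, v2), ...)` into an equivalent OR of AND-ed equality predicates. The helper name `expand_tuple_in` is hypothetical and is not taken from the adapter in this PR; the `%s` placeholders assume a `pymysql`-style parameterized query, and column names are assumed to be trusted identifiers.

```python
from typing import Any, List, Sequence, Tuple


def expand_tuple_in(
    columns: Sequence[str], rows: Sequence[Sequence[Any]]
) -> Tuple[str, List[Any]]:
    """Hypothetical helper (not the adapter's actual code).

    Rewrites (c1, c2) IN ((v1, v2), ...) as
    (c1 = %s AND c2 = %s) OR (c1 = %s AND c2 = %s) ...
    and returns the predicate together with the flattened parameters.
    """
    if not columns:
        raise ValueError("at least one column is required")
    if not rows:
        raise ValueError("at least one value tuple is required")

    clauses: List[str] = []
    params: List[Any] = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each value tuple must match the column count")
        # One AND-ed group per value tuple; values stay as bound parameters.
        conjunction = " AND ".join(f"{col} = %s" for col in columns)
        clauses.append(f"({conjunction})")
        params.extend(row)

    return " OR ".join(clauses), params


predicate, values = expand_tuple_in(["c1", "c2"], [(1, "a"), (2, "b")])
delete_sql = f"DELETE FROM my_table WHERE {predicate}"  # my_table is a placeholder name
```

The resulting `delete_sql` and `values` could then be passed to a cursor's `execute(delete_sql, values)`. Whether the adapter uses this rewrite or a different strategy is not stated in the description.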
@erindru Hi Erin, would you be willing to review this PR? It is similar to #5033, but adds StarRocks support to SQLMesh. I'd be very glad to see your comments. I'm currently fixing the CI problem and some test cases.
And optimize some test cases. Signed-off-by: jaogoy <jaogoy@gmail.com>
@erindru Hi Erin, tobymao/sqlglot#6827 in SQLGlot has been merged.
### Acknowledgement
This implementation was largely inspired by #5033 — thanks to @xinge-ji for the solid groundwork.