**docs.feldera.com/docs/pipelines/checkpoint-sync.md** (2 changes: 1 addition, 1 deletion)

```diff
@@ -45,7 +45,7 @@ Here is a sample configuration:
 | `provider` \* | `string` | | The S3 provider identifier. Must match [rclone’s list](https://rclone.org/s3/#providers). Case-sensitive. Use `"Other"` if unsure. |
 | `access_key` | `string` | | S3 access key. Not required if using environment-based auth (e.g., IRSA). |
 | `secret_key` | `string` | | S3 secret key. Not required if using environment-based auth. |
-| `start_from_checkpoint` | `string` | | Checkpoint UUID to resume from, or `latest` to restore from the latest checkpoint. |
+| `start_from_checkpoint` | `string` | | Checkpoint UUID to resume from, or `latest` to restore from the latest checkpoint. <b>If a checkpoint already exists locally, we prefer the checkpoint that has processed more records.</b> |
 | `fail_if_no_checkpoint` | `boolean` | `false` | When `true` the pipeline will fail to initialize if fetching the specified checkpoint fails. <p> When `false`, the pipeline will start from scratch instead. Ignored if `start_from_checkpoint` is not set. </p> |
 | `standby` | `boolean` | `false` | When `true`, the pipeline starts in **standby** mode. <p> To start processing the data the pipeline must be activated (`POST /activate`). </p> <p> If a previously activated pipeline is restarted without clearing storage, it auto-activates. </p> `start_from_checkpoint` must be set to use standby mode. |
 | `pull_interval` | `integer(u64)` | `10` | Interval (in seconds) between fetch attempts for the latest checkpoint while standby. |
```
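For illustration, the fields in the table above might be combined along these lines (a hypothetical sketch; the surrounding `synchronizer` key and the overall structure are assumptions, not the documented schema — see the sample configuration referenced in the docs for the authoritative shape):

```json
{
  "synchronizer": {
    "provider": "Other",
    "start_from_checkpoint": "latest",
    "fail_if_no_checkpoint": true,
    "standby": false,
    "pull_interval": 10
  }
}
```

With `start_from_checkpoint: "latest"` and the change in this PR, a locally present checkpoint that has processed more records than the fetched one would win.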
**python/tests/platform/test_checkpoint_sync.py** (13 changes: 12 additions, 1 deletion)

```diff
@@ -3,7 +3,7 @@
 import sys
 import time
 from typing import Optional
-from uuid import uuid4, UUID
+from uuid import UUID, uuid4

 from feldera.enums import FaultToleranceModel, PipelineStatus
 from feldera.runtime_config import RuntimeConfig, Storage
@@ -173,6 +173,16 @@ def test_checkpoint_sync(
         if chk_uuid is not None:
             assert UUID(uuid) >= UUID(chk_uuid)

+        if not clear_storage:
```
> **Contributor:** Do we have tests covering a few cases here? E.g., local but no remote, local only, remote only, local ahead of remote, local == remote, and local behind remote?

> **@abhizer (Contributor, Author) — Feb 23, 2026:** We test:
>
> - remote only
> - local only isn't covered, since for the purposes of sync it is the same as local + remote
>
> We do not test for a local checkpoint that has made less progress than the remote one. We will add new tests with the new sync implementation discussed as part of the RFC.

```diff
+            # If we aren't clearing storage, verify that the local checkpoint is preferred over the remote one.
+            self.pipeline.input_json("t0", [{"c0": 999, "c1": "local_only"}])
+            got_before = list(self.pipeline.query("SELECT * FROM v0"))
+            print(
+                f"{self.pipeline.name}: records after local only insert: {total}, {got_before}",
+                file=sys.stderr,
+            )
+            self.pipeline.checkpoint(wait=True)
+
         self.pipeline.stop(force=True)

         if clear_storage:
@@ -233,6 +243,7 @@ def test_checkpoint_sync(
         if expect_empty:
             got_before = []

+        # Validate that the outputs before starting from checkpoint are the same as after.
         self.assertCountEqual(got_before, got_after)

         self.pipeline.stop(force=True)
```
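The preference the docs change describes ("if a checkpoint already exists locally, we prefer the checkpoint that has processed more records") can be sketched as a small selection function. This is an illustrative sketch, not Feldera's actual implementation: the `Checkpoint` shape and the tie-break toward local (which avoids an unnecessary download) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Checkpoint:
    uuid: str
    processed_records: int


def pick_checkpoint(
    local: Optional[Checkpoint], remote: Optional[Checkpoint]
) -> Optional[Checkpoint]:
    """Prefer whichever checkpoint has processed more records.

    Falls back to whichever side exists; on a tie, keeps the local
    checkpoint so no download is needed.
    """
    if local is None:
        return remote
    if remote is None:
        return local
    return remote if remote.processed_records > local.processed_records else local
```

Under this sketch, the reviewer's cases map directly: "local ahead of remote" and "local == remote" both resolve to the local checkpoint, while "local behind remote" (the case the author notes is not yet tested) resolves to the remote one.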