Conversation
There was a problem hiding this comment.
Orca Security Scan Summary
| Status | Check | Issues by priority | |
|---|---|---|---|
| Infrastructure as Code | View in Orca | ||
| SAST | View in Orca | ||
| Secrets | View in Orca | ||
| Vulnerabilities | View in Orca |
There was a problem hiding this comment.
Pull request overview
This PR adds support for incremental backups to the Weaviate Python client by introducing a new base_backup_id parameter to the backup creation methods. This allows users to create incremental backups based on a previous backup.
Changes:
- Added
base_backup_idparameter to thecreatemethod signature across sync and async backup implementations - The parameter is mapped to
incremental_backup_base_idin the API payload
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| weaviate/backup/executor.py | Added base_backup_id parameter to the create method implementation and included it in the backup creation payload |
| weaviate/backup/sync.pyi | Added base_backup_id parameter type hint to the sync backup create method stub |
| weaviate/backup/async_.pyi | Added base_backup_id parameter type hint to the async backup create method stub |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| backend: BackupStorage, | ||
| include_collections: Union[List[str], str, None] = None, | ||
| exclude_collections: Union[List[str], str, None] = None, | ||
| base_backup_id: Optional[str] = None, |
There was a problem hiding this comment.
The docstring for the create method is missing documentation for the new base_backup_id parameter. The docstring should include a description of this parameter in the Args section to explain its purpose for incremental backups.
| backend: BackupStorage, | ||
| include_collections: Union[List[str], str, None] = None, | ||
| exclude_collections: Union[List[str], str, None] = None, | ||
| base_backup_id: Optional[str] = None, |
There was a problem hiding this comment.
The new base_backup_id parameter lacks test coverage. Since the test file integration/test_backup_v4.py contains comprehensive backup tests, there should be at least one test case that verifies incremental backup functionality using this new parameter.
| backend: BackupStorage, | ||
| include_collections: Union[List[str], str, None] = None, | ||
| exclude_collections: Union[List[str], str, None] = None, | ||
| base_backup_id: Optional[str] = None, |
There was a problem hiding this comment.
There is a naming inconsistency between the parameter name base_backup_id and the payload key incremental_backup_base_id. Consider renaming the parameter to incremental_backup_base_id for better clarity and consistency with the API payload, or if brevity is preferred, ensure this mapping is well-documented.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1928 +/- ##
==========================================
- Coverage 86.45% 86.38% -0.07%
==========================================
Files 274 274
Lines 19966 19977 +11
==========================================
- Hits 17261 17258 -3
- Misses 2705 2719 +14 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| backend: BackupStorage, | ||
| include_collections: Union[List[str], str, None] = None, | ||
| exclude_collections: Union[List[str], str, None] = None, | ||
| base_backup_id: Optional[str] = None, |
There was a problem hiding this comment.
Is there a more unambiguous name for this parameter?
IMO the fact that setting a "base backup id" (which looks and feels like a "default prefix for your backup IDs") changes the backup strategy is surprising.
Having a separate "backup_strategy" field would make the effect of using it clearer and allow adding some other strategy in the future (e.g. "scheduled" or sth)
There was a problem hiding this comment.
Is there a more unambiguous name for this parameter?
Happy to use another name, do you have any proposals?
Having a separate "backup_strategy" field would make the effect of using it clearer and allow adding some other strategy in the future (e.g. "scheduled" or sth)
Hmmm, not a fan of this:
- I am not sure if I would call incremental backups a "strategy"
- We don't have anything else planned that would fall under strategy
- If you would add any other strategy, it would mean that it is mutually exclusive with incremental backups. Why would "scheduled" mean that the backup is not incremental for example?
There was a problem hiding this comment.
I mean, "scheduled" is just as example (I guess a bad one heh) and "base backup id" could stay if the effect of setting it was clearer. My main concern is this:
the fact that setting a "base backup id" (which looks and feels like a "default prefix for your backup IDs") changes the backup strategy
If "incremental" is not exactly a strategy (doesn't fit in an enum) then perhaps it could be a boolean parameter?
Could the server (in principle) use a default "base backup id" if one wasn't provided?
There was a problem hiding this comment.
As I said, I am happy to use a better name if you can think of one.
But I dont think adding a second parameter (it being an enum or a bool) makes things easier, but rather adds more failure cases:
- bool set, but not the base backup id
- bot not set, but base backup id
Could the server (in principle) use a default "base backup id" if one wasn't provided?
No
There was a problem hiding this comment.
As I said, I am happy to use a better name if you can think of one.
I'd second this comment then: #1928 (comment)
No description provided.