-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
improve S3 v3 storage layer #8927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
||
| def accept_state_visitor(self, visitor: StateVisitor): | ||
| visitor.visit(s3_stores) | ||
| visitor.visit(AssetDirectory(self._storage_backend.root_directory)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: this won't work for community cloud pods, as there's a check that the path should start with config.dirs.data, and files might not be saved here in the community version as they're spooled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
acceptable limitation for now 👍
bcce35f to
626bc96
Compare
edfad3a to
12b47d5
Compare
555135b to
2c1a505
Compare
12b47d5 to
952e428
Compare
4dedad3 to
bc4a45b
Compare
bc4a45b to
68e49ac
Compare
thrau
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! glad we did the iterations and now have a clean distinction between reset/flush/close in the contract 👍
|
|
||
| def accept_state_visitor(self, visitor: StateVisitor): | ||
| visitor.visit(s3_stores) | ||
| visitor.visit(AssetDirectory(self._storage_backend.root_directory)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
acceptable limitation for now 👍
| def create_bucket(self, bucket: BucketName): | ||
| pass | ||
|
|
||
| def delete_bucket(self, bucket: BucketName): | ||
| pass | ||
|
|
||
| def flush(self): | ||
| pass | ||
|
|
||
| def close(self): | ||
| pass | ||
|
|
||
| def reset(self): | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe some doc strings would be helpful to explain what the different behavior of flush,close,reset are
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do 👍 funnily, I don't think we use flush anywhere at the moment, as persistence is always "on" for the filesystem storage backend. But we could keep it for some fun behaviour with community pods or something?
| def __init__(self) -> None: | ||
| super().__init__() | ||
| self._storage_backend = EphemeralS3ObjectStore() | ||
| self._storage_backend = EphemeralS3ObjectStore(DEFAULT_S3_TMP_DIR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i mentioned this before: maybe we could allow injecting a storage backend into the constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. But isn't having the logic of the storage backend inside providers.py, depending on config be a bit weird? Are there other services doing this?
Thanks for the quick chat on Slack, I will make it optional so that we can inject it from pro and have the same default value. Awesome 👍
Motivation
Improving the current S3 storage layer, introduce 2 more operations to allow easier cleanup of resources with bucket creation and deletion. Also separate between
closeandresetthe store, to have better semantics between these 2 operations (closing does not always mean deleting all underlying objects, especially with persistence enabled).Separate between the
closecall for the Ephemeral object, so that it doesn't close the underlying file object, which would delete it in Ephemeral mode, following the fix to the response with #8926.Changes
Introduce said changes above,
create_bucketanddelete_bucketfor the storage layer, and preparing support for persistence.Also added a quick fix regarding creating bucket in
us-east-1, I wasn't returning early from the method and ended up overwriting the bucket in the store.Make
closea noop forEphemeralS3StoredObject, and implement adeletemethod for cleaning up.