chore(bigframes): optimize system test teardown#17443
Conversation
There was a problem hiding this comment.
Code Review
This pull request parallelizes table deletion using a ThreadPoolExecutor and moves the Python UDF cleanup process to a background daemon thread in anonymous_dataset.py. Additionally, it configures pytest to use the worksteal distribution algorithm in noxfile.py. Feedback on these changes highlights potential issues with silent failures during table deletion due to unhandled exceptions in the thread pool, and warns that running the UDF cleanup in a daemon thread may lead to incomplete cleanup and secondary exceptions during interpreter shutdown.
| ) | ||
| warnings.warn(msg, category=bfe.CleanupFailedWarning) | ||
|
|
||
| threading.Thread(target=run_cleanup, daemon=True, name="bigframes-udf-cleanup").start() |
There was a problem hiding this comment.
Starting the UDF cleanup in a daemon thread (daemon=True) means that if the Python process exits shortly after close() is called (which is very common at the end of a script or test run), the thread will be abruptly terminated. This can prevent the cleanup from completing, leading to resource accumulation (the very issue b/450913424 is trying to solve).
Additionally, calling warnings.warn from a daemon thread during interpreter shutdown can raise secondary exceptions (like TypeError or AttributeError due to modules being cleaned up and set to None).
Consider whether this cleanup should be synchronous, or if there is a way to register an atexit handler to ensure the thread completes before the process exits.
chalmerlowe
left a comment
There was a problem hiding this comment.
By and large, this LGTM.
I think it is reasonable to check in on the note that gemini-code-assist provided.
There may be some merit to adding that defensive pattern in place.
I leave the decision up to you and do not need to re-review, unless there is a major change.
brings down bigframes system tests from 23 mins to 12 mins i.e. 2x speed up.