gh-142518: Document thread-safety guarantees of list operations #142519
Force-pushed from 1858ef5 to fd400f0.
Merged main to fix CI.
I think any note about the atomicity of list operations should probably go after the documentation about the type. I can imagine a beginner to Python going to the documentation to read about lists and being confused about what the free-threading semantics are talking about. I also wonder whether this should be a note at all. Perhaps it would be better to describe, as part of the type description, which operations are atomic and which are not?
encukou left a comment:
This looks like a good start.
My general opinion: reference documentation needs to be precise first, complete second. Succinctness comes after that. Edit: Also, IMO, it's fine to add information first, then reorganize when a better structure becomes more obvious.
Force-pushed from 1a99a5e to 38483c9.
I've addressed most of the feedback and added a lot more details. I also inadvertently force-pushed. Sorry about that.
The following operations/methods are not fully atomic:

.. code-block::
   :class: maybe
Maybe this could say something like “Operations/methods that involve iteration are generally not atomic, except when used with specific built-in types”, and iteration itself can be moved here?
Mentioning iteration might help people make sense of this, i.e. it's no longer two arbitrary lists of operations/methods.
Then the bad section below would be left only with examples of “manually” combining multiple operations.
I don't know how I feel about this. "Operations/methods that involve iteration are generally not atomic" is probably not a mnemonic we want people to use, because there are methods that are atomic but traverse the list. Granted most of those are ones that also mutate it, but e.g. list.copy doesn't.
But the idea of separating iteration from manually combining multiple operations is good. Maybe we should do just that?
Sounds good.
(I guess it's “operations that involve arbitrary iterators and/or comparison functions”, but that's too long; readers whom that would help can figure it out from the list.)
As for iteration, it sounds like the guarantees are the same for single-threaded code: iteration of a list that is being modified may skip elements or yield repeated elements, but will not crash or produce elements that were never part of the list. Is that right?
Is this the place to document list iterators -- i.e. what happens if you use a shared iterator in several threads?
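To make the single-threaded baseline described above concrete, here is a minimal sketch of a list being modified while it is iterated. The helper names (`iterate_while_removing`, `iterate_while_inserting`) are made up for illustration. The sketch only shows what a single thread sees with CPython's index-based list iterator; it makes no claim about what a concurrent writer in another thread would produce.

```python
def iterate_while_removing(values):
    """Remove even items during iteration; return (items seen, final list)."""
    items = list(values)
    seen = []
    for item in items:
        seen.append(item)
        if item % 2 == 0:
            # Removal shifts later items left, so the next item is skipped.
            items.remove(item)
    return seen, items


def iterate_while_inserting(values, limit):
    """Insert at the front during iteration until the list has `limit` items."""
    items = list(values)
    seen = []
    for item in items:
        seen.append(item)
        if len(items) < limit:
            # Insertion shifts items right, so the current item is yielded again.
            items.insert(0, 0)
    return seen, items


skipped_seen, skipped_final = iterate_while_removing([1, 2, 3, 4, 5, 6])
repeated_seen, repeated_final = iterate_while_inserting([1, 2, 3], 5)
print(skipped_seen, skipped_final)
print(repeated_seen, repeated_final)
```

In the first call the odd items 3 and 5 are never visited; in the second the item 1 is visited three times. In both cases every yielded value was an element of the list at some point.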
I've completely reworded the docs to account for lock-free operations. Is this better?
I think documenting iterators should be done in the Iterator docs, not really individually for each iterator type.
process(item)  # another thread may modify lst
Consider external synchronization when sharing :class:`list` instances
across threads. See :ref:`freethreading-python-howto` for more information.
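One possible reading of "external synchronization" in the quoted text is a `threading.Lock` that every thread takes before touching the shared list. The sketch below is an illustration under that assumption, not wording or code from the PR; `shared`, `shared_lock`, `append_if_absent`, `snapshot`, and `run_workers` are hypothetical names.

```python
import threading

shared = []                      # hypothetical list shared between threads
shared_lock = threading.Lock()   # hypothetical lock guarding every access to `shared`


def append_if_absent(value):
    """Check-then-act: the membership test and the append share one critical section."""
    with shared_lock:
        if value not in shared:
            shared.append(value)


def snapshot():
    """Copy the list under the lock so callers can iterate without holding it."""
    with shared_lock:
        return shared.copy()


def run_workers(count=50):
    """Start `count` threads that each try to add one of five distinct values."""
    threads = [
        threading.Thread(target=append_if_absent, args=(i % 5,))
        for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(snapshot())


print(run_workers())
```

Iterating over the result of `snapshot()` rather than over `shared` itself means a `process(item)` loop is not affected by later modifications, at the cost of one copy per iteration pass.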
This is getting long enough that maybe it deserves to be in its own page in the reference docs, along with thread-safety notes for all the builtin types. Then we can link to those reference docs from here.
I think our hope when we originally wanted to include these notes directly in the docs for the builtins was that these notes would be pretty short. But it turns out there are a decent number of caveats and that hope might not be realistic.
I agree with this. Splitting everything out in a new page in the reference docs and then cross-linking sounds like the better approach, especially since dict docs are going to be even longer. @encukou What do you think?
All of the above methods/operations are also lock-free. They do not block
concurrent modifications. Other operations that hold a lock will not block
these from observing intermediate states.
Is "intermediate states" correct, or is it just that readers might observe the state either before or after concurrent modification? The way it's written now a very literal-minded reader might conclude that a reader might observe a torn read or other C data race.
It's not about data races, it's about intermediate states when the lock-free operations race with operations that touch multiple elements. For example, if list.index and list.reverse race with one another (I know, bad example, because list.reverse does have a data race, but let's assume that gets fixed), list.index might return the reversed index, even though some elements might not have been reversed yet.
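A deterministic, single-threaded simulation may help pin down what "intermediate state" means in the `list.index` / `list.reverse` example. The generator `reverse_in_steps` below is hypothetical: it performs one swap at a time and pauses after each, standing in for the points at which a lock-free reader could run. It is not how CPython implements `list.reverse`.

```python
def reverse_in_steps(lst):
    """Reverse `lst` in place one swap at a time, pausing after each swap."""
    left, right = 0, len(lst) - 1
    while left < right:
        lst[left], lst[right] = lst[right], lst[left]
        left += 1
        right -= 1
        yield  # a lock-free reader could observe the list here


data = ["a", "b", "c", "d", "e", "f"]
before = data.copy()
after = before[::-1]

observed = []  # list contents after each swap
for _ in reverse_in_steps(data):
    observed.append(data.copy())

# After the first swap, "f" already sits at its reversed index 0,
# while "b" is still at its original index 1.
first = observed[0]
print(first, first.index("f"), first.index("b"))
```

The first two observed states match neither the original nor the fully reversed list, yet every element in them is a valid element of the list. That is a different situation from a torn read of a single slot.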
list appears empty for the duration of the sort.

The following operations may allow lock-free operations to observe
intermediate states since they modify multiple elements in place:
It's not clear to me what "intermediate states" means here. See my comment above. It would maybe help to clarify globally (i.e. somewhere not in the list docs) what observing an object in an intermediate state means. Can a reader observe any layout the list passes through while it's being processed?
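The quoted remark that the list "appears empty for the duration of the sort" describes a CPython implementation detail that can be observed even from a single thread, by inspecting the list from inside the key function. The sketch below assumes CPython; `lengths_seen_by_key` is a hypothetical helper, and other implementations may behave differently.

```python
def lengths_seen_by_key(values):
    """Sort a copy of `values`, recording len() of the list during each key call."""
    items = list(values)
    seen_lengths = []

    def key(value):
        # Inspecting the list mid-sort is officially undefined behavior;
        # CPython happens to make the list look empty while sorting.
        seen_lengths.append(len(items))
        return value

    items.sort(key=key)
    return seen_lengths, items


seen_lengths, sorted_items = lengths_seen_by_key([3, 1, 2])
print(seen_lengths, sorted_items)
```

A lock-free reader racing with `list.sort` in another thread would presumably see the same kind of state: a valid but temporarily empty list rather than a partially sorted one.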
I'm opening this to start a discussion on what we want the structure of such documentation to be, as well as how to balance detail with succinctness. Feedback very welcome!
📚 Documentation preview 📚: https://cpython-previews--142519.org.readthedocs.build/en/142519/library/stdtypes.html#lists