Conversation

@lysnikolaou
Member

@lysnikolaou lysnikolaou commented Dec 10, 2025

I'm opening this to start a discussion on what we want the structure of such documentation to be, as well as how to balance detail with succinctness. Feedback very welcome!


📚 Documentation preview 📚: https://cpython-previews--142519.org.readthedocs.build/en/142519/library/stdtypes.html#lists

@lysnikolaou lysnikolaou force-pushed the list-thread-safety-docs branch from 1858ef5 to fd400f0 Compare December 10, 2025 15:07
@emmatyping
Member

Merged main to fix CI

@emmatyping
Member

I think any note about the atomicity of list operations should probably go after the documentation about the type. I can imagine a beginner to Python going to the documentation to read about lists and being confused about what free-threading semantics are talking about.

I also wonder if this should not be a note. Perhaps it would be better to describe as part of the type description what operations are atomic and which are not?

Member

@encukou encukou left a comment

This looks like a good start.

@encukou
Member

encukou commented Dec 11, 2025

My general opinion: reference documentation needs to be precise first; complete second. Succinctness comes after that.

Edit: Also, IMO, it's fine to add information first, then reorganize when a better structure becomes more obvious.

@lysnikolaou lysnikolaou force-pushed the list-thread-safety-docs branch from 1a99a5e to 38483c9 Compare December 11, 2025 13:38
@lysnikolaou
Member Author

I've addressed most of the feedback and added a lot more details.

I also inadvertently force-pushed. Sorry about that.

The following operations/methods are not fully atomic:

.. code-block::
   :class: maybe
Member

Maybe this could say something like “Operations/methods that involve iteration are generally not atomic, except when used with specific built-in types”, and iteration itself can be moved here?
Mentioning iteration might help people make sense of this, i.e. it's no longer two arbitrary lists of operations/methods.

Then the bad section below would be left only with examples of “manually” combining multiple operations.

Member Author

@lysnikolaou lysnikolaou Dec 11, 2025

I don't know how I feel about this. "Operations/methods that involve iteration are generally not atomic" is probably not a mnemonic we want people to use, because there are methods that are atomic but traverse the list. Granted, most of those are ones that also mutate it, but list.copy, for example, doesn't.

But the idea of separating iteration from manually combining multiple operations is good. Maybe we should do just that?

Member

Sounds good.
(I guess it's “operations that involve arbitrary iterators and/or comparison functions”, but that's too long; readers whom that would help can figure it out from the list.)

As for iteration, it sounds like the guarantees are the same for single-threaded code: iteration of a list that is being modified may skip elements or yield repeated elements, but will not crash or produce elements that were never part of the list. Is that right?
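
For illustration, a minimal single-threaded sketch of the skip/repeat behaviour described here. It assumes CPython's index-based list iterator; the helper names are hypothetical and not part of the PR.

```python
def visit_while_removing(items):
    """Iterate over items, removing the value 1 when it is reached."""
    seen = []
    for x in items:
        seen.append(x)
        if x == 1:
            items.remove(x)  # later elements shift left by one slot
    return seen


def visit_while_inserting(items):
    """Iterate over items, inserting once at the front when 'b' is reached."""
    seen = []
    inserted = False
    for x in items:
        seen.append(x)
        if x == "b" and not inserted:
            items.insert(0, "z")  # later elements shift right by one slot
            inserted = True
    return seen


skipped = visit_while_removing([0, 1, 2, 3])        # 2 is never visited
repeated = visit_while_inserting(["a", "b", "c"])   # "b" is visited twice
```

In both cases every value yielded was a member of the list at some point; the iterator only skips or repeats elements.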


Is this the place to document list iterators -- i.e. what happens if you use a shared iterator in several threads?

Member Author

I've completely reworded the docs to account for lock-free operations. Is this better?


I think documenting iterators should be done in the Iterator docs, not really individually for each iterator type.

process(item) # another thread may modify lst
Consider external synchronization when sharing :class:`list` instances
across threads. See :ref:`freethreading-python-howto` for more information.
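
One possible shape for the external synchronization mentioned in the quoted text, as a sketch only: it assumes every reader and writer goes through the same `threading.Lock`. The names `lst_lock`, `process_all`, and `append_many` are hypothetical and do not come from the PR.

```python
import threading

lst = list(range(5))
lst_lock = threading.Lock()  # hypothetical lock guarding every access to lst


def process_all(process):
    # Holding the lock for the whole loop keeps writers out until it ends.
    with lst_lock:
        for item in lst:
            process(item)


def append_many(values):
    # Each append takes the same lock, so it cannot interleave with the loop.
    for value in values:
        with lst_lock:
            lst.append(value)


seen = []
writer = threading.Thread(target=append_many, args=(range(5, 105),))
reader = threading.Thread(target=process_all, args=(seen.append,))
writer.start()
reader.start()
writer.join()
reader.join()
```

With this arrangement the reader sees a consistent snapshot of the list as it was when the loop started, at the cost of blocking writers for the duration of the loop.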
Contributor

This is getting long enough that maybe it deserves to be in its own page in the reference docs, along with thread-safety notes for all the builtin types. Then we can link to those reference docs from here.

I think our hope when we originally wanted to include these notes directly in the docs for the builtins was that these notes would be pretty short. But it turns out there are a decent number of caveats and that hope might not be realistic.

Member Author

I agree with this. Splitting everything out in a new page in the reference docs and then cross-linking sounds like the better approach, especially since dict docs are going to be even longer. @encukou What do you think?

All of the above methods/operations are also lock-free. They do not block
concurrent modifications. Other operations that hold a lock will not block
these from observing intermediate states.
Contributor

Is "intermediate states" correct, or is it just that readers might observe the state either before or after a concurrent modification? As written, a very literal-minded reader might conclude that a lock-free read could observe a torn read or another C data race.

Member Author

It's not about data races; it's about intermediate states when the lock-free operations race with operations that touch multiple elements. For example, if list.index and list.reverse race with one another (I know, bad example, because list.reverse does have a data race, but let's assume that gets fixed), list.index might return the reversed index, even though some elements might not have been reversed yet.
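
A single-threaded model of this scenario may make "intermediate state" more concrete. The generator below is a hypothetical stand-in for an in-place reversal that pauses after each swap; it is an assumption for illustration, not how `list.reverse` is implemented.

```python
def reverse_in_steps(items):
    """Reverse items in place, pausing (yielding) after each pairwise swap."""
    i, j = 0, len(items) - 1
    while i < j:
        items[i], items[j] = items[j], items[i]
        yield
        i, j = i + 1, j - 1


letters = list("abcdef")
steps = reverse_in_steps(letters)
next(steps)                         # only the outermost pair has been swapped
midway = letters.copy()             # neither the original nor the reversed order
index_midway = letters.index("a")   # already the fully reversed position
for _ in steps:                     # finish the remaining swaps
    pass
```

Here `index_midway` matches the index in the fully reversed list even though the inner elements of `midway` are still in their original order, which is the kind of observation the comment describes.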

list appears empty for the duration of the sort.

The following operations may allow lock-free operations to observe
intermediate states since they modify multiple elements in place:
Contributor

It's not clear to me what "intermediate states" means here. See my comment above. It would probably help to clarify globally (i.e. somewhere outside the list docs) what observing an object in an intermediate state means. Is any layout of the list possible while it's being processed?
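
One concrete intermediate state is the one in the quoted line about sorting: on CPython the list appears empty while `list.sort` runs. A minimal single-threaded sketch, with a hypothetical key function that records what it observes (other implementations may behave differently):

```python
data = [3, 1, 2]
lengths_seen = []


def recording_key(value):
    # Inspect the list from inside the sort; CPython reports it as empty here.
    lengths_seen.append(len(data))
    return value


data.sort(key=recording_key)
```

On CPython, `lengths_seen` contains only zeros, and `data` holds the sorted result once the call returns.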

6 participants