Merged
172 changes: 164 additions & 8 deletions Doc/glossary.rst
@@ -134,6 +134,14 @@ Glossary
iterator's :meth:`~object.__anext__` method until it raises a
:exc:`StopAsyncIteration` exception. Introduced by :pep:`492`.

atomic operation
An operation that appears to execute as a single, indivisible step: no
other thread can observe it half-done, and its effects become visible all
at once. Python does not guarantee that high-level statements are atomic
(for example, ``x += 1`` performs multiple bytecode operations and is not
atomic). Atomicity is only guaranteed where explicitly documented. See
also :term:`race condition` and :term:`data race`.
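
As a rough illustration of the ``x += 1`` remark above, the sketch below disassembles a hypothetical ``increment`` function and checks that the load and the store are separate bytecode instructions. The function name and the module-level ``counter`` are assumptions made for the example, and the exact instruction list is a CPython implementation detail that differs between versions.

```python
import dis

counter = 0


def increment():
    """Hypothetical function whose body is a single augmented assignment."""
    global counter
    counter += 1


# Collect the names of the bytecode instructions that make up increment().
opnames = [instruction.opname for instruction in dis.get_instructions(increment)]

# The read of ``counter`` and the write back to it are distinct instructions,
# so another thread may run in between them.
load_index = opnames.index("LOAD_GLOBAL")
store_index = opnames.index("STORE_GLOBAL")
print(opnames)
```

Because the load and the store are separate steps, two threads can both read the same old value and one update can be lost unless the statement is protected by a lock.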

attached thread state

A :term:`thread state` that is active for the current OS thread.
@@ -289,6 +297,22 @@ Glossary
advanced mathematical feature. If you're not aware of a need for them,
it's almost certain you can safely ignore them.

concurrency
The ability of a computer program to perform multiple tasks at the same
time. Python provides libraries for writing programs that make use of
different forms of concurrency. :mod:`asyncio` is a library for dealing
with asynchronous tasks and coroutines. :mod:`threading` provides
access to operating system threads and :mod:`multiprocessing` to
operating system processes. Multi-core processors can execute threads and
processes on different CPU cores at the same time (see
:term:`parallelism`).

concurrent modification
When multiple threads modify shared data at the same time. Concurrent
modification without proper synchronization can cause
:term:`race conditions <race condition>`, and might also trigger a
:term:`data race <data race>`, data corruption, or both.

context
This term has different meanings depending on where and how it is used.
Some common meanings:
@@ -363,6 +387,28 @@ Glossary
the :term:`cyclic garbage collector <garbage collection>` is to identify these groups and break the reference
cycles so that the memory can be reclaimed.

data race
Contributor

If we are only talking about native code (C or C++ or Rust or whatever), maybe we should leave this out of the glossary for now.

Member Author

Hmm, not sure about this. The fact that a program might be crashing because of a data race in native code affects Python programmers, so maybe it warrants at least an entry in the glossary. It'll also be nice to be able to link to this glossary entry from other places where we talk about data races.

Contributor

One idea we had was for ecosystem documentation to link back here. In that context it'll really help.

Contributor

Having both data race and race condition feels a bit redundant. Wouldn't ecosystem libraries be able to only refer to race conditions?

Am I guessing correctly that here you want to distinguish between races that produce errors in application logic and races that produce a segfault or something low-level like that? If that's the case it might be useful when talking about bug reports, yes. But it might be a bit niche, hopefully?

Member Author

Extension modules can cause data races, so I feel it's appropriate to talk about both, especially since data races are undefined behavior.

A situation where multiple threads access the same memory location
concurrently, at least one of the accesses is a write, and the threads
do not use any synchronization to control their access. Data races
lead to :term:`non-deterministic` behavior and can cause data corruption.
Proper use of :term:`locks <lock>` and other :term:`synchronization primitives
<synchronization primitive>` prevents data races. Note that data races
can only happen in native code, but that :term:`native code` might be
exposed in a Python API. See also :term:`race condition` and
:term:`thread-safe`.

deadlock
A situation in which two or more tasks (threads, processes, or coroutines)
wait indefinitely for each other to release resources or complete actions,
preventing any from making progress. For example, if thread A holds lock
1 and waits for lock 2, while thread B holds lock 2 and waits for lock 1,
both threads will wait indefinitely. In Python this often arises from
acquiring multiple locks in conflicting orders or from circular
join/await dependencies. Deadlocks can be avoided by always acquiring
multiple :term:`locks <lock>` in a consistent order. See also
:term:`lock` and :term:`reentrant`.
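
The consistent-ordering advice above can be sketched as follows. The ``Account`` class and ``transfer`` function are hypothetical names invented for this example, and the sketch assumes account names are unique so that sorting by name gives every thread the same acquisition order.

```python
import threading


class Account:
    """Hypothetical account guarded by its own lock."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    if src is dst:
        return  # Nothing to move; taking the same Lock twice would block.
    # Always acquire the two locks in name order, whatever the direction
    # of the transfer, so two threads can never hold one lock each.
    first, second = sorted((src, dst), key=lambda account: account.name)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


alice = Account("alice", 100)
bob = Account("bob", 100)


def worker(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)


# Two threads transfer in opposite directions at the same time.
threads = [
    threading.Thread(target=worker, args=(alice, bob)),
    threading.Thread(target=worker, args=(bob, alice)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

If ``transfer`` instead locked ``src`` first and ``dst`` second, the two workers could each hold one lock and wait forever for the other.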

decorator
A function returning another function, usually applied as a function
transformation using the ``@wrapper`` syntax. Common examples for
@@ -662,6 +708,14 @@ Glossary
requires the GIL to be held in order to use it. This refers to having an
:term:`attached thread state`.

global state
Data that is accessible throughout a program, such as module-level
variables, class variables, or C static variables in :term:`extension modules
<extension module>`. In multi-threaded programs, global state shared
between threads typically requires synchronization to avoid
:term:`race conditions <race condition>` and
:term:`data races <data race>`.

hash-based pyc
A bytecode cache file that uses the hash rather than the last-modified
time of the corresponding source file to determine its validity. See
@@ -706,7 +760,9 @@ Glossary
tuples. Such an object cannot be altered. A new object has to
be created if a different value has to be stored. They play an important
role in places where a constant hash value is needed, for example as a key
in a dictionary.
in a dictionary. Immutable objects are inherently :term:`thread-safe`
because their state cannot be modified after creation, eliminating concerns
about improperly synchronized :term:`concurrent modification`.

import path
A list of locations (or :term:`path entries <path entry>`) that are
@@ -796,8 +852,9 @@ Glossary

CPython does not consistently apply the requirement that an iterator
define :meth:`~iterator.__iter__`.
And also please note that the free-threading CPython does not guarantee
the thread-safety of iterator operations.
And also please note that :term:`free-threaded <free threading>`
CPython does not guarantee :term:`thread-safe` behavior of iterator
operations.


key function
@@ -835,10 +892,11 @@ Glossary
:keyword:`if` statements.

In a multi-threaded environment, the LBYL approach can risk introducing a
race condition between "the looking" and "the leaping". For example, the
code, ``if key in mapping: return mapping[key]`` can fail if another
:term:`race condition` between "the looking" and "the leaping". For example,
the code, ``if key in mapping: return mapping[key]`` can fail if another
thread removes *key* from *mapping* after the test, but before the lookup.
This issue can be solved with locks or by using the EAFP approach.
This issue can be solved with :term:`locks <lock>` or by using the
:term:`EAFP` approach. See also :term:`thread-safe`.
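
The two remedies named above might look like the following minimal sketch. The ``cache`` dictionary, ``cache_lock``, and both helper functions are hypothetical names chosen for illustration. The locked variant only helps if every writer also takes ``cache_lock``.

```python
import threading

cache = {"a": 1}
cache_lock = threading.Lock()


def get_lbyl_locked(key, default=None):
    """LBYL, with the test and the lookup done under one lock."""
    with cache_lock:
        if key in cache:
            return cache[key]
        return default


def get_eafp(key, default=None):
    """EAFP: attempt the lookup and handle the failure."""
    try:
        return cache[key]
    except KeyError:
        return default


found = get_eafp("a")
missing = get_eafp("missing", default=0)
```

The EAFP version performs a single lookup, so there is no gap between a test and a use for another thread to slip into.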

lexical analyzer

@@ -857,6 +915,19 @@ Glossary
clause is optional. If omitted, all elements in ``range(256)`` are
processed.

lock
A :term:`synchronization primitive` that allows only one thread at a
time to access a shared resource. A thread must acquire a lock before
accessing the protected resource and release it afterward. If a thread
attempts to acquire a lock that is already held by another thread, it
will block until the lock becomes available. Python's :mod:`threading`
module provides :class:`~threading.Lock` (a basic lock) and
:class:`~threading.RLock` (a :term:`reentrant` lock). Locks are used
to prevent :term:`race conditions <race condition>` and ensure
:term:`thread-safe` access to shared data. Alternative design patterns
to locks exist such as queues, producer/consumer patterns, and
thread-local state. See also :term:`deadlock` and :term:`reentrant`.
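
A minimal sketch of the acquire/release pattern described above, using :class:`threading.Lock` as a context manager. The ``Counter`` class and ``run`` helper are hypothetical names invented for this example.

```python
import threading


class Counter:
    """Hypothetical counter whose value is protected by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "with" acquires the lock on entry and releases it on exit,
        # even if the body raises an exception.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def run(n_threads=4, n_increments=10_000):
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


total = run()
```

Every increment happens while the lock is held, so the final total equals the number of threads times the number of increments per thread.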

loader
An object that loads a module.
It must define the :meth:`!exec_module` and :meth:`!create_module` methods
@@ -942,8 +1013,11 @@ Glossary
See :term:`method resolution order`.

mutable
Mutable objects can change their value but keep their :func:`id`. See
also :term:`immutable`.
An :term:`object` with state that is allowed to change during the course
of the program. In multi-threaded programs, mutable objects that are
shared between threads require careful synchronization to avoid
:term:`race conditions <race condition>`. See also :term:`immutable`,
:term:`thread-safe`, and :term:`concurrent modification`.

named tuple
The term "named tuple" applies to any type or class that inherits from
@@ -995,6 +1069,13 @@ Glossary

See also :term:`module`.

native code
Code that is compiled to machine instructions and runs directly on the
processor, as opposed to code that is interpreted or runs in a virtual
machine. In the context of Python, native code typically refers to
C, C++, Rust or Fortran code in :term:`extension modules <extension module>`
that can be called from Python. See also :term:`extension module`.

nested scope
The ability to refer to a variable in an enclosing definition. For
instance, a function defined inside another function can refer to
@@ -1011,6 +1092,15 @@ Glossary
properties, :meth:`~object.__getattribute__`, class methods, and static
methods.

non-deterministic
Behavior where the outcome of a program can vary between executions with
the same inputs. In multi-threaded programs, non-deterministic behavior
often results from :term:`race conditions <race condition>` where the
relative timing or interleaving of threads affects the result.
Proper synchronization using :term:`locks <lock>` and other
:term:`synchronization primitives <synchronization primitive>` helps
ensure deterministic behavior.

object
Any data with state (attributes or value) and defined behavior
(methods). Also the ultimate base class of any :term:`new-style
@@ -1041,6 +1131,16 @@ Glossary

See also :term:`regular package` and :term:`namespace package`.

parallelism
Executing multiple operations at the same time (e.g. on multiple CPU
cores). In Python builds with the
:term:`global interpreter lock (GIL) <global interpreter lock>`, only one
thread runs Python bytecode at a time, so taking advantage of multiple
CPU cores typically involves multiple processes
(e.g. :mod:`multiprocessing`) or native extensions that release the GIL.
In :term:`free-threaded <free threading>` Python, multiple Python threads
can run Python code simultaneously on different cores.

parameter
A named entity in a :term:`function` (or method) definition that
specifies an :term:`argument` (or in some cases, arguments) that the
@@ -1215,6 +1315,18 @@ Glossary
>>> email.mime.text.__name__
'email.mime.text'

race condition
A condition of a program where its behavior
depends on the relative timing or ordering of events, particularly in
multi-threaded programs. Race conditions can lead to
:term:`non-deterministic` behavior and bugs that are difficult to
reproduce. A :term:`data race` is a specific type of race condition
involving unsynchronized access to shared memory. The :term:`LBYL`
coding style is particularly susceptible to race conditions in
multi-threaded code. Using :term:`locks <lock>` and other
:term:`synchronization primitives <synchronization primitive>`
helps prevent race conditions.

reference count
The number of references to an object. When the reference count of an
object drops to zero, it is deallocated. Some objects are
@@ -1236,6 +1348,25 @@ Glossary

See also :term:`namespace package`.

reentrant
A property of a function or :term:`lock` that allows it to be called or
acquired multiple times by the same thread without causing errors or a
:term:`deadlock`.

For functions, reentrancy means the function can be safely called again
before a previous invocation has completed, which is important when
functions may be called recursively or from signal handlers. Thread-unsafe
functions may be :term:`non-deterministic` if they're called reentrantly in a
multithreaded program.

For locks, Python's :class:`threading.RLock` (reentrant lock) is
reentrant, meaning a thread that already holds the lock can acquire it
again without blocking. In contrast, :class:`threading.Lock` is not
reentrant - attempting to acquire it twice from the same thread will cause
a deadlock.

See also :term:`lock` and :term:`deadlock`.
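
The difference between the two lock types can be observed without actually deadlocking, as in the sketch below. It uses a non-blocking ``acquire(blocking=False)`` on the plain lock, so the second attempt fails instead of waiting forever. The variable names are illustrative only.

```python
import threading

# An RLock can be acquired again by the thread that already holds it.
rlock = threading.RLock()
nested_ok = False
with rlock:
    with rlock:
        nested_ok = True

# A plain Lock cannot. A blocking acquire here would deadlock, so this
# sketch asks for a non-blocking attempt, which reports failure instead.
lock = threading.Lock()
with lock:
    second_attempt = lock.acquire(blocking=False)
```

Here ``second_attempt`` is ``False`` because the lock is already held, whereas the nested ``with rlock`` block runs normally.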

REPL
An acronym for the "read–eval–print loop", another name for the
:term:`interactive` interpreter shell.
@@ -1340,6 +1471,18 @@ Glossary

See also :term:`borrowed reference`.

synchronization primitive
A basic building block for coordinating (synchronizing) the execution of
multiple threads to ensure :term:`thread-safe` access to shared resources.
Python's :mod:`threading` module provides several synchronization primitives
including :class:`~threading.Lock`, :class:`~threading.RLock`,
:class:`~threading.Semaphore`, :class:`~threading.Condition`,
:class:`~threading.Event`, and :class:`~threading.Barrier`. Additionally,
the :mod:`queue` module provides multi-producer, multi-consumer queues
that are especially useful in multithreaded programs. These
primitives help prevent :term:`race conditions <race condition>` and
coordinate thread execution. See also :term:`lock`.
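
As one possible illustration of the :mod:`queue` remark above, the sketch below passes work from a producer thread to a consumer thread through a :class:`queue.Queue`. The ``producer`` and ``consumer`` functions and the ``None`` sentinel are conventions assumed for this example, not something the glossary prescribes.

```python
import queue
import threading


def producer(work_queue, items):
    for item in items:
        work_queue.put(item)
    work_queue.put(None)  # Sentinel: tells the consumer to stop.


def consumer(work_queue, results):
    while True:
        item = work_queue.get()  # Blocks until an item is available.
        if item is None:
            break
        results.append(item * item)


work_queue = queue.Queue()
results = []  # Only the consumer thread appends to this list.

threads = [
    threading.Thread(target=producer, args=(work_queue, range(5))),
    threading.Thread(target=consumer, args=(work_queue, results)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The queue does its own internal locking, so neither function needs an explicit lock. With a single consumer, results come out in the order the items were put in.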

t-string
t-strings
String literals prefixed with ``t`` or ``T`` are commonly called
@@ -1392,6 +1535,19 @@ Glossary
See :ref:`Thread State and the Global Interpreter Lock <threads>` for more
information.

thread-safe
A module, function, or class that behaves correctly when used by multiple
threads concurrently. Thread-safe code uses appropriate
:term:`synchronization primitives <synchronization primitive>` like
:term:`locks <lock>` to protect shared mutable state, or is designed
to avoid shared mutable state entirely. In the
:term:`free-threaded <free threading>` build, built-in types like
:class:`dict`, :class:`list`, and :class:`set` use internal locking
to make many operations thread-safe, although thread safety is not
necessarily guaranteed. Code that is not thread-safe may experience
:term:`race conditions <race condition>` and :term:`data races <data race>`
when used in multi-threaded programs.
Comment on lines +1539 to +1549
Contributor

I think this one's very tricky, definitely because thread-safe in itself is very vague.

I took a look at my uni textbook on concurrency and couldn't find a definition of thread-safe. (I checked, it actually never mentions thread-safe or thread safe.) And I think that's because what thread-safe means fundamentally depends on the semantics of an application.

Is it thread-safe to call my_list[0] += 1, assuming my_list is shared? It depends. And that's why there is this unsatisfactory line here:

[...] built-in types like
:class:`dict`, :class:`list`, and :class:`set` use internal locking
to make many operations thread-safe, although thread safety is not
necessarily guaranteed.

Maybe we can turn the problem around and only define thread-safety as the absence of thread-unsafety?

Suggested change
A module, function, or class that behaves correctly when used by multiple
threads concurrently. Thread-safe code uses appropriate
:term:`synchronization primitives <synchronization primitive>` like
:term:`locks <lock>` to protect shared mutable state, or is designed
to avoid shared mutable state entirely. In the
:term:`free-threaded <free threading>` build, built-in types like
:class:`dict`, :class:`list`, and :class:`set` use internal locking
to make many operations thread-safe, although thread safety is not
necessarily guaranteed. Code that is not thread-safe may experience
:term:`race conditions <race condition>` and :term:`data races <data race>`
when used in multi-threaded programs.
A module, function, or class that behaves correctly when used by multiple
threads concurrently. Thread-safe code uses appropriate
:term:`synchronization primitives <synchronization primitive>` like
:term:`locks <lock>` to protect shared mutable state, or is designed
to avoid shared mutable state entirely. Code that is not thread-safe may experience
:term:`race conditions <race condition>` and :term:`data races <data race>`
when used in multi-threaded programs.

Member Author

Yes, talking about thread-safety is always bound to be vague. Having said that, the sentence about dict, list and set feels like it's okay since it doesn't necessarily make any statement, but makes the reader understand that there's subtlety there which they can find out more about in other places in the docs.


token

A small unit of source code, generated by the