gh-111965: Using critical sections to make io.StringIO thread safe.#112116
corona10 merged 12 commits into python:main
colesbury left a comment
Thanks @aisk.
I think we want critical sections in stringio_closed and stringio_line_buffering for the accesses to closed, because the object might be concurrently closed. You'll need to be careful because the CHECK_ macros may return, and it's not safe to return from the middle of a critical section. It may be easiest to split those functions in two.
Similarly for stringio_newlines we want the critical section to cover at least the CHECK_INITIALIZED and CHECK_CLOSED.
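The concurrent-close scenario described here can be probed from Python. The following is a minimal sketch, not part of the PR: `hammer_getters` and `run_close_race` are hypothetical helper names, and the sketch assumes that `line_buffering` and `newlines` raise ValueError on a closed StringIO (the CHECK_CLOSED behavior) while `closed` just returns a bool. It can only surface unexpected exceptions or crashes; it does not prove that no race exists.

```python
import io
import threading


def hammer_getters(buf, errors, stop):
    """Read the three getters until told to stop; record unexpected failures."""
    while not stop.is_set():
        try:
            buf.closed
            buf.line_buffering
            buf.newlines
        except ValueError:
            # Expected once the buffer has been closed by the main thread.
            pass
        except Exception as exc:
            # Anything else would point at a problem worth investigating.
            errors.append(exc)


def run_close_race(rounds=50, readers=4):
    """Close a StringIO while reader threads are hitting its getters."""
    errors = []
    for _ in range(rounds):
        buf = io.StringIO("data")
        stop = threading.Event()
        threads = [
            threading.Thread(target=hammer_getters, args=(buf, errors, stop))
            for _ in range(readers)
        ]
        for thread in threads:
            thread.start()
        buf.close()  # races with the getters running in the reader threads
        stop.set()
        for thread in threads:
            thread.join()
    return errors
```

Calling `run_close_race()` and checking that the returned list is empty gives a quick smoke test on both the default and the --disable-gil build.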
Some basic performance measurements would not be a bad idea, but don't go overboard. They'll be more important after other changes to I/O objects.
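A basic measurement of this kind could look like the sketch below. It is not the benchmark used in this PR (that script is not shown in the thread); the helper names `stringio_roundtrip` and `best_time_per_call` and the workload sizes are assumptions chosen for illustration.

```python
import io
import timeit


def stringio_roundtrip(n_writes=1000):
    """Write small chunks, rewind, and read everything back."""
    buf = io.StringIO()
    for _ in range(n_writes):
        buf.write("x" * 16)
    buf.seek(0)
    return len(buf.read())


def best_time_per_call(repeat=5, number=200):
    """Return the best observed time per call, in seconds."""
    timings = timeit.repeat(stringio_roundtrip, repeat=repeat, number=number)
    return min(timings) / number
```

Running `best_time_per_call()` on builds before and after the change, ideally with and without --disable-gil, gives a rough comparison. Taking the minimum over several repeats reduces noise from other processes.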
We don't need critical sections in |
Updated. I ran a small performance test on my local machine and did not see any noticeable performance difference before and after the commits. I just realized that in the default build (no |
colesbury left a comment
Thanks @aisk. What's the performance impact on --disable-gil builds using your benchmark?
I suggested a different structure for the three getter functions that I think will make them more consistent with the rest of the file (in particular, not having to inline CHECK_INITIALIZED/CHECK_CLOSED).
I found that adding the critical section guard to getters and setters is really common, which makes the work repetitive and error-prone. Can we add a code generation step, like Argument Clinic, or just introduce a C macro to reduce the repetition, like: |
@aisk, yeah, I think adding support for getters/setters to Argument Clinic is the way to do it.
Modules/_io/stringio.c
Outdated
#include "Python.h"
#include <stddef.h> // offsetof()
#include "pycore_object.h"
#include "pycore_critical_section.h"
Modules/_io/stringio.c
Outdated
state: object
Suggested change:
- state: object
+ state: object
+ /
@colesbury
Out of curiosity:
Should users explicitly care about thread safety when implementing an object's dunder methods in each case?
Or will it be handled on the interpreter side?
Modules/_io/clinic/stringio.c.h
Outdated
"\n");

#define _IO_STRINGIO___SETSTATE___METHODDEF \
    {"__setstate__", _PyCFunction_CAST(_io_StringIO___setstate__), METH_FASTCALL|METH_KEYWORDS, _io_StringIO___setstate____doc__},
__setstate__ should be METH_O not METH_FASTCALL|METH_KEYWORDS
See: https://github.com/python/cpython/pull/112116/files#r1398121900
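From Python, the difference is visible in how `__setstate__` can be called: a METH_O method takes exactly one positional argument and rejects keywords. The sketch below illustrates this under the assumption that the running interpreter defines `StringIO.__setstate__` as METH_O; `copy_state` and `accepts_keyword_state` are hypothetical helper names.

```python
import io


def copy_state(src):
    """Clone a StringIO by passing its state as a single positional argument."""
    dst = io.StringIO()
    dst.__setstate__(src.__getstate__())
    return dst


def accepts_keyword_state(buf, state):
    """Return True if __setstate__ accepts the state as a keyword argument."""
    try:
        buf.__setstate__(state=state)
    except TypeError:
        # A METH_O method takes no keyword arguments.
        return False
    return True
```

If the generated method definition used METH_FASTCALL|METH_KEYWORDS instead, `accepts_keyword_state` would be expected to return True, which is the behavior change the review comment guards against.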
Just realized the difference, thanks a lot for pointing it out.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Modules/_io/stringio.c
Outdated
#include <stddef.h> // offsetof()
#include "pycore_object.h"
#include "pycore_critical_section.h" // Py_BEGIN_CRITICAL_SECTION()
Since we updated the AC tool, we no longer have to add the header manually.
#112251
Please rebase the PR and run make clinic one more time.
Co-authored-by: Donghee Na <donghee.na@python.org>
da13ca2 to 0fbf608
Added the "critical sections" tags for most methods of the StringIO class, except __new__ and __init__. The reference implementation from colesbury/nogil-3.12@6323ca60f9 doesn't do so either. I think it's because during __new__ and __init__, other threads will not touch the newly created instance, so it does not need a lock.

And for the getters closed and line_buffering, there are only a few read-only field accesses for condition checks (which raise an exception on failure), so they don't need protection.

I don't know if I'm right on these two points. And do we need some performance benchmarks for the change? If so, I can provide some help later.
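The property the description aims for can be checked from Python with a small stress test. This is a minimal sketch rather than a test from the PR: `concurrent_writes` is a hypothetical helper, and it assumes that each `write()` call is atomic with respect to other writers, which holds under the GIL and is what the critical sections are meant to provide on --disable-gil builds. The buffer is shared with the worker threads only after it has been constructed, matching the reasoning that __new__ and __init__ need no lock.

```python
import io
import threading


def concurrent_writes(n_threads=4, writes_per_thread=500, chunk="ab"):
    """Write the same chunk from several threads and return the final contents."""
    buf = io.StringIO()  # constructed before any other thread can see it

    def worker():
        for _ in range(writes_per_thread):
            buf.write(chunk)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return buf.getvalue()
```

If every write is atomic, the result consists of whole chunks only, so its length equals `n_threads * writes_per_thread * len(chunk)`. A shorter or interleaved result would indicate lost or torn writes.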
--disable-gil builds) #111965