@Christopher-Chianelli Christopher-Chianelli commented Apr 16, 2023

bpo-46430 caused an interesting side effect: the code `x = 'a'; x[0] is x` no longer returns True. This happens because there are two different cached versions of 'a':

  • One that was cached when code in frozen modules was compiled (and is stored in the interned_dict)
  • One that is stored as a runtime-global object that is used during function calls (and is stored in _Py_SINGLETON(strings))

However, some characters do not show this behaviour (for example, 'g', 'u', and 'z'). I suspect this is because these characters are not used in the co_consts of frozen modules.
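A quick way to see which characters are affected on a given build is to probe every lowercase letter. This is only a sketch: `probe` is a hypothetical helper, not part of CPython, and it compiles a fresh literal on the assumption that this sends the constant through the same interning path as ordinary source code. Object identity of strings is an implementation detail, so the result depends on the interpreter version.

```python
import string


def probe(ch):
    """Return whether `x = <ch>; x[0] is x` holds for a freshly compiled literal."""
    # Compile the literal instead of passing the object in, so the constant
    # is created by the compiler rather than reused from the caller.
    code = compile(f"x = {ch!r}\nresult = x[0] is x", "<probe>", "exec")
    namespace = {}
    exec(code, namespace)
    return namespace["result"]


# Characters for which indexing returns a different object than the literal.
mismatched = [ch for ch in string.ascii_lowercase if not probe(ch)]
print("characters where x[0] is not x:", mismatched)
```

An empty list means every single-character literal resolves to the same object that indexing returns.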

The interned_dict is per-interpreter and is initialized by init_interned_dict(PyInterpreterState *). The prior implementation initialized it to an empty dict, which allowed code in frozen modules to use its own (different, per-interpreter) singleton strings instead of the runtime-global ones.

The new implementation adds all runtime-global singleton strings to the interned_dict when it is initialized, so frozen modules use the same immortal singleton strings and `x = 'a'; x[0] is x` returns True.
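The difference between the two initialization strategies can be illustrated with a toy model in pure Python. It is not CPython's C implementation: `Token`, `SINGLETONS`, `intern_token`, and `index_first` are made-up stand-ins, and only the name `init_interned_dict` is borrowed from the description above.

```python
class Token:
    """Stand-in for a string object; only identity matters here."""

    def __init__(self, text):
        self.text = text


# Runtime-global singletons, loosely analogous to _Py_SINGLETON(strings).
SINGLETONS = {ch: Token(ch) for ch in "abc"}


def init_interned_dict(seed_with_singletons):
    """Toy analogue: build a fresh per-interpreter interning table."""
    return dict(SINGLETONS) if seed_with_singletons else {}


def intern_token(interned, token):
    """Return the canonical token for this text, registering it if absent."""
    return interned.setdefault(token.text, token)


def index_first(token):
    """Toy analogue of x[0]: single characters come from the singletons."""
    return SINGLETONS[token.text[0]]


def literal_is_own_first_char(seed_with_singletons):
    interned = init_interned_dict(seed_with_singletons)
    # A constant 'a' from a frozen module arrives as a separate object.
    x = intern_token(interned, Token("a"))
    return index_first(x) is x


print(literal_is_own_first_char(False))  # empty table: constant keeps its own object
print(literal_is_own_first_char(True))   # seeded table: constant resolves to the singleton
```

With an empty table the interned constant and the singleton are distinct objects, so the identity check fails. Seeding the table makes interning return the singleton itself.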

…reter interned_dict

Using it in the intern dict seems to cause its reference count
to be decremented elsewhere, which cannot be fixed
with Py_INCREF. However, updating the tuple inside
intern_string_constants does not cause the extra
decrement.

@kumaraditya303

See my comment on the issue.
