Skip to content

implement multi-phase init#1024

Open
provinzkraut wants to merge 5 commits into
msgspec:mainfrom
provinzkraut:multi-phase-init
Open

implement multi-phase init#1024
provinzkraut wants to merge 5 commits into
msgspec:mainfrom
provinzkraut:multi-phase-init

Conversation

@provinzkraut

Copy link
Copy Markdown
Member

Closes #563.

Implement multi-phase init.

Relatively straightforward, the only slight hiccough is msgspec_get_global_state, which relied on PyState_FindModule / PyState_AddModule, which in turn don't work (well) with multi-phase init. I've replaced them with a global static MsgspecState, set during module init; This is not safe for sub-interpreters, but handling it properly would be a large refactor that I think is best kept for separate PRs.


Road to subinterpreter support

This is the first step for msgspec to support subinterpreters, but it's still a long way to go. I'll open a separate tracking issue for that once this gets merged, but it'll likely involve some larger refactors, especially around moving things to heap types where absolutely necessary, and ensuring this does not impact performance negatively.

Comment thread tests/unit/test_module.py

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we're changing how the subinterpreter support is flagged, i wanted some tests to show it's still working

@provinzkraut provinzkraut force-pushed the multi-phase-init branch 2 times, most recently from d65661e to 16fe68d Compare April 18, 2026 15:35
@provinzkraut provinzkraut marked this pull request as draft April 18, 2026 15:36
@provinzkraut provinzkraut marked this pull request as ready for review April 18, 2026 21:13
Comment thread src/msgspec/_core.c
return 0;
}

static struct PyModuleDef msgspecmodule = {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was simply moved down to make the flow easier, as it references _core_exec

@Siyet Siyet left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving a note for the history / backlink: this obsoletes the PyState_AddModule workaround jcrist added in #561, where the body said "in the long run we should move to using multiphase module init which should avoid this problem entirely." Good to have the thread connected.

Comment thread src/msgspec/_core.c Outdated
@goodboy

goodboy commented Apr 21, 2026

Copy link
Copy Markdown

I'll open a separate tracking issue for that once this gets merged,

please do ping if/when that happens 🙏🏼

Comment thread src/msgspec/_core.c Outdated
Comment thread tests/unit/test_module.py Outdated
Comment thread src/msgspec/_core.c Outdated
Comment thread tests/unit/test_module.py Outdated
@Siyet

Siyet commented May 26, 2026

Copy link
Copy Markdown
Contributor

@provinzkraut Happy to push these small fixups to your branch myself if you're short on time, just give me a thumbs up. Keen to get this in so the subinterp work in #1026 can start moving.

@provinzkraut

Copy link
Copy Markdown
Member Author

@Siyet feel free to push. Otherwise I could get it done some time later this week

Comment thread src/msgspec/_core.c
state is populated in _core_exec and cleared and freed by m_clear and m_free
respectively.
*/
static MsgspecState *_core_state = NULL;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would advokate against this pattern :)
It would be required to refactor it anyway for future FT / SI builds.

Maybe we can instead use something like

static MsgspecState *
_get_current_module_state(void)
{
    PyObject *mod = _get_current_module();
    if (mod == NULL) {
        mod = PyImport_ImportModule("msgspec._core");
        if (mod == NULL) {
            return NULL;
        }
    }
    MsgspecState *state = get_module_state(mod);
    Py_DECREF(mod);
    return state;
}

?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or PyType_GetModuleByDef / PyType_GetModule :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried using PyType_GetModule at first, but there were some issues. Can't remember what it was though, it's been a while. Let me re-evaluate that again

Comment thread src/msgspec/_core.c
.m_methods = msgspec_methods,
.m_traverse = msgspec_traverse,
.m_clear = msgspec_clear,
.m_free =(freefunc)msgspec_free,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.m_free =(freefunc)msgspec_free,
.m_free = (freefunc)msgspec_free,

style nit

Comment thread src/msgspec/_core.c
#ifdef Py_GIL_DISABLED
PyUnstable_Module_SetGIL(m, Py_MOD_GIL_NOT_USED);
#if !PY313_PLUS
PyUnstable_Module_SetGIL(m, Py_MOD_GIL_NOT_USED);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed, we set {Py_mod_gil, Py_MOD_GIL_NOT_USED}, module slot

Comment thread src/msgspec/_core.c
if (st->astimezone == NULL) return -1;

/* uuid module imports */
temp_module = PyImport_ImportModule("uuid");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are missing Py_DECREF(temp_module) calls in many places.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this was discussed in a previous PR. I think we should clean those up. Should be a separate PR though IMO

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use multi-phase init for c-extension initialization

4 participants