-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Move cdef globals to module state #6967
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
commit 59d932719d5c5fbdfc3da73b75bce1b1e82e1225 Merge: a139956 164df91 Author: da-woods <dw-git@d-woods.co.uk> Date: Sat Jun 7 15:34:43 2025 +0100 Merge remote-tracking branch 'real_origin/master' into globals-module-state commit a139956 Author: da-woods <dw-git@d-woods.co.uk> Date: Sat May 31 18:08:21 2025 +0100 More working altough not there yet commit f1abd7d Merge: 8e51f9f 5fe4ea8 Author: da-woods <dw-git@d-woods.co.uk> Date: Sat May 31 13:38:00 2025 +0100 Merge remote-tracking branch 'real_origin/master' into globals-module-state commit 8e51f9f Author: da-woods <dw-git@d-woods.co.uk> Date: Mon May 26 21:00:04 2025 +0100 Begin adding globals to the module state commit 9cec077 Author: da-woods <dw-git@d-woods.co.uk> Date: Mon May 26 18:19:11 2025 +0100 Only pass "deprecated" when it'll be understood commit 491434f Author: da-woods <dw-git@d-woods.co.uk> Date: Mon May 26 16:40:13 2025 +0100 Add back in documentation of forrmat commit 66e708a Merge: 90c203a a2e5e39 Author: da-woods <dw-git@d-woods.co.uk> Date: Mon May 26 16:38:31 2025 +0100 Merge remote-tracking branch 'real_origin/master' into pick-limited-api-version-to-test commit 90c203a Author: da-woods <dw-git@d-woods.co.uk> Date: Mon May 26 14:57:02 2025 +0100 Add an option to runtests to run with a specific Limited API version I don't propose using this on the CI (there's just too many combinations to test) but I think it's a useful thing to be able to do locally.
| def put_var_gotref(self, entry): | ||
| self.put_gotref(entry.cname, entry.type) | ||
| cname = entry.cname | ||
| if entry.is_declared_in_module_state(): | ||
| cname = self.name_in_module_state(cname) | ||
| self.put_gotref(cname, entry.type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably a bit of cleanup left
This code suggests to me that someone else needs to know how to build a "name in module state" than "self". (Not sure who, but at least "self" prevents the Entry from doing it itself.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think there is a way to get the module state qualified cname into the Entry? ISTM that that might resolve a lot of quirks in the implementation.
| not self.entry.in_closure and not self.entry.from_closure): | ||
| return False | ||
| # Most likely a global, or a class attribute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I consider it risky to temporarily set object attributes in certain situations and rely on them to be there in other places. It can easily lead to incorrect C code being generated, even silently. Object state should be well defined when the object is created, not extended in an ad-hoc fashion. Static code analysers usually warn about this kind of state change, and for good reason.
Cython/Compiler/ModuleNode.py
Outdated
| if entry.is_declared_in_module_state(): | ||
| storage_class = "" | ||
| dll_linkage = None | ||
| destination_code = globalstate['module_state'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I admit that code might not be the best name for the code writer object, but it's a convention by now and so ubiquitous that it's good that it's short. But destination_code sounds like it's code for/about a destination, not something that allows writing code to a specific destination. I think I used decl_code in other places, which, in retrospect, doesn't sound much better but at least suggests what it should be used for. So, maybe that, or you can come up with something clearer?
Cython/Compiler/ModuleNode.py
Outdated
|
|
||
| if storage_class: | ||
| code.put("%s " % storage_class) | ||
| destination_code.put("%s " % storage_class) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| destination_code.put("%s " % storage_class) | |
| destination_code.put(f"{storage_class} ") |
Cython/Compiler/ModuleNode.py
Outdated
| if init is not None: | ||
| code.put_safe(" = %s" % init) | ||
| code.putln(";") | ||
| destination_code.put_safe(" = %s" % init) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| destination_code.put_safe(" = %s" % init) | |
| destination_code.put_safe(f" = {init}") |
| if isinstance(self.loopvar_node, ExprNodes.TempNode): | ||
| self.loopvar_node.allocate(code) | ||
| elif isinstance(self.loopvar_node, ExprNodes.NameNode): | ||
| self.loopvar_node.generate_evaluation_code(code) | ||
| if isinstance(self.py_loopvar_node, ExprNodes.TempNode): | ||
| self.py_loopvar_node.allocate(code) | ||
| elif isinstance(self.py_loopvar_node, ExprNodes.NameNode): | ||
| self.py_loopvar_node.generate_evaluation_code(code) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does the generate_evaluation_code() only apply to NameNodes? Why do we need to special case them here?
|
|
||
| self.modified_entries.append((entry, entry.cname)) | ||
| code.putln("%s = %s;" % (cname, entry.cname)) | ||
| self.modified_entries.append((entry, entry.cname, entry.force_not_declared_in_module_state)) | ||
| entry_cname = code.entry_cname_in_module_state(entry) | ||
| entry.force_not_declared_in_module_state = True | ||
| code.putln(f"{cname} = {entry_cname};") | ||
| entry.cname = cname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, interesting. I wasn't aware of this 14 years old cname replacement hack. It's unfortunate, because as we see here, it now requires another hack to work around it.
What was the reason again why entries couldn't know their complete module state based cname themselves? That could avoid the need for a force_not_declared_in_module_state flag.
The problem with that is that the module state lookup should be different depending on where you're requesting it from (and not the scope that the entry lives in). For example, in the global scope it probably just has the module as a function argument while in a cdef class it might use Currently most of that isn't implemented and almost everything uses a generic fallback, but I think that should change in future. One other option would be to have a single macro to access the module state and then redefine the macro at the start of every C function. I.e. move the work into the C compiler. I'm not convinced that's better, but it might simplify some of the code within Cython (I'll look at the other comments later) |
So … can we give the |
|
Merging current master in order to update the CI config. |
Co-authored-by: scoder <stefan_ml@behnel.de>
|
I think my current plans for this are:
and it'll either be done or not before the rest of 3.2 and we just make a judgement based on that. |
|
I think there's a GIL related issue I've missed: the module state lookup requires the GIL to be held (and we can't just use I think that's probably a reasonable restriction (although it implies a lot of existing code won't run with module state). But it probably suggests that this isn't really ready for 3.2. |
Draft for the moment because I think the probably a bit of cleanup left to be done until this is nice.
This moves almost all
cdefglobal variables into the module state, which I think is one of the last big barriers to being able to use module state effectively.The exception is
cdef publicvariables which I don't think can be in the module state because they have to appear in a header file.