bpo-45340: Don't create object dictionaries unless actually needed#28802
bpo-45340: Don't create object dictionaries unless actually needed#28802markshannon merged 25 commits intopython:mainfrom
Conversation
… keys more freely.
|
Needs NEWS and some additional documentation. |
Fidget-Spinner
left a comment
There was a problem hiding this comment.
Some comments about LOAD_METHOD.
| if (*owner_dictptr) { | ||
| SPECIALIZATION_FAIL(LOAD_METHOD, SPEC_FAIL_IS_ATTR); | ||
| goto fail; |
There was a problem hiding this comment.
Does this reduce the specialization rates by much?
There was a problem hiding this comment.
I didn't check. Overall performance is up a bit, so that's good enough for this PR 🙂
We will need to recheck the specialization rates and kinds of failures for LOAD_ATTR, STORE_ATTR and LOAD_METHOD. From my superficial scan, it appears that the kinds of failures are more clustered, which bodes well.
But changing all that in this PR would just confuse matters.
| assert(self_cls->tp_dictoffset > 0); | ||
| assert(self_cls->tp_inline_values_offset > 0); | ||
| PyDictObject *dict = *(PyDictObject **)(((char *)self) + self_cls->tp_dictoffset); | ||
| DEOPT_IF(dict != NULL, LOAD_METHOD); |
There was a problem hiding this comment.
Wait is this due to the new scheme that objects shouldn't have dict unless really required?
There was a problem hiding this comment.
Yes, it is. We assume that having no __dict__ is the most common case, so it isn't worth bothering to check for the shared key dict case.
| PyObject *descr = NULL; | ||
| DesciptorClassification kind = 0; | ||
| kind = analyze_descriptor(owner_cls, name, &descr, 0); | ||
| // Store the version right away, in case it's modified halfway through. |
There was a problem hiding this comment.
Out of curiosity: is there a reason why we don't need to do this immediately anymore? Are the new dictionary operations atomic/don't call any _PyType_Lookup?
There was a problem hiding this comment.
We used to call _Py_dict_lookup, which could do anything, now we call _PyDictKeys_StringLookup which cannot.
|
🤖 New build scheduled with the buildbot fleet by @markshannon for commit 7978220 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
|
🤖 New build scheduled with the buildbot fleet by @markshannon for commit 8c894f3 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
|
The buildbot failures are:
|
|
Benchmark of final version: 1% faster |
| if (owner_cls->tp_inline_values_offset) { | ||
| PyObject **owner_dictptr = _PyObject_DictPointer(owner); | ||
| assert(owner_dictptr); | ||
| if (*owner_dictptr) { |
There was a problem hiding this comment.
@markshannon I'm somewhat confused why anything with a dict pointer is now an attribute, even without looking into said dict. Is there a reason elsewhere (or is this so that we don't need another opcode for the other dict form)?
There was a problem hiding this comment.
Anything will a dict pointer will be a miss for LOAD_INSTANCE_VALUE, so I'm assuming it isn't worth optimizing.
Maybe we should attempt to optimize this using LOAD_ATTR_WITH_HINT at this point?
There was a problem hiding this comment.
This code seems to be for LOAD_METHOD_CACHED. I'll try to play around with it and see if the dictionary matters (versus LOAD_ATTR_INSTANCE_VALUE/LOAD_ATTR_WITH_HINT and such.
A "normal" Python objects is conceptually just a pair of pointers, one to the class, and one to the dictionary.
With shared keys, the dictionary is redundant as it is no more than a pair of pointers, one to the keys and one to the values.
By adding a pointer to the values to the object and fetching the keys via the class, we can avoid creating a dictionary for many objects.
See faster-cpython/ideas#72 for more details.
About 1% faster, which is nice, but the main benefit is the reduced memory consumption.
https://bugs.python.org/issue45340