Retain order of keys in dict when deleting a key#2550
Conversation
|
I don't think it's so simple, since there's still indices in |
|
But then maybe that would mess up something in the hash/bucket algorithm? I'm not sure, hashmaps are complicated 🙃 |
I think I'll do that. I'm new to this code base so I'm not sure how the dict implementation works yet :P I'm assuming the indices vector stores a mapping from the hash table's keys(?) to its corresponding index in the entries vector. |
| inner.entries.pop().unwrap() | ||
| } else { | ||
| let last_index = inner.entries.last().unwrap().index; | ||
| let removed = inner.entries.swap_remove(entry_index); |
There was a problem hiding this comment.
oh... I believed it is not swap_remove and we were doing it as how CPython does.
| for move_entry_idx in entries_rem..entries_len { | ||
| let index_index = inner.entries[move_entry_idx].index; | ||
| inner.indices[index_index] = move_entry_idx as i64 - 1; | ||
| } |
There was a problem hiding this comment.
I don't think CPython uses loop for this. What happens if the dict has very many items? Would you check _PyDict_Pop_KnownHash in dictobject.c? I think this is the corresponding funciton in CPython.
There was a problem hiding this comment.
Yeah, this definitely won't be a loop, pretty sure that leads to O(N) deletes.
|
@youknowone here is where I meant to put this comment, not the other PR... Interesting - it looks like CPython does this by just setting the entry's key+value to NULL and just letting it sit there, and only on "resize" does it get rid of the unused entry and recompact. That might be something to try, if you want, or else this looks fine to me. |
|
@coolreader18 I think you can't represent unused entries with the current DictEntry struct. Maybe we could turn DictInner's entry type into |
|
@fermian yeah, I think that would be good to try, if you feel confident in it. |
|
Hey @fermian do you have time to see this through or should I pick it up? Edit: Someone else is working on this, ignore this comment. |
This patch makes dictionaries retain the order of elements after deleting a key (the correct behavior in CPython).