Skip to content

Retain order of keys in dict when deleting a key#2550

Closed
fermian wants to merge 4 commits into
RustPython:masterfrom
fermian:test_dict
Closed

Retain order of keys in dict when deleting a key#2550
fermian wants to merge 4 commits into
RustPython:masterfrom
fermian:test_dict

Conversation

@fermian

@fermian fermian commented Apr 12, 2021

Copy link
Copy Markdown
Contributor

This patch makes dictionaries retain the order of elements after deleting a key (the correct behavior in CPython).

@fermian fermian marked this pull request as draft April 12, 2021 13:19
@coolreader18

Copy link
Copy Markdown
Member

I don't think it's so simple, since there's still indices in indices pointing to the "old" positions of the entries. I think you'd have to recompute/shift over the indices vector as well.

@coolreader18

coolreader18 commented Apr 12, 2021

Copy link
Copy Markdown
Member

But then maybe that would mess up something in the hash/bucket algorithm? I'm not sure, hashmaps are complicated 🙃

@fermian

fermian commented Apr 12, 2021

Copy link
Copy Markdown
Contributor Author

I don't think it's so simple, since there's still indices in indices pointing to the "old" positions of the entries. I think you'd have to recompute/shift over the indices vector as well.

I think I'll do that. I'm new to this code base so I'm not sure how the dict implementation works yet :P

I'm assuming the indices vector stores a mapping from the hash table's keys(?) to its corresponding index in the entries vector.

@fermian fermian marked this pull request as ready for review April 12, 2021 15:53
Comment thread vm/src/dictdatatype.rs
inner.entries.pop().unwrap()
} else {
let last_index = inner.entries.last().unwrap().index;
let removed = inner.entries.swap_remove(entry_index);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh... I believed it is not swap_remove and we were doing it as how CPython does.

Comment thread vm/src/dictdatatype.rs
Comment on lines +562 to +565
for move_entry_idx in entries_rem..entries_len {
let index_index = inner.entries[move_entry_idx].index;
inner.indices[index_index] = move_entry_idx as i64 - 1;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think CPython uses loop for this. What happens if the dict has very many items? Would you check _PyDict_Pop_KnownHash in dictobject.c? I think this is the corresponding funciton in CPython.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this definitely won't be a loop, pretty sure that leads to O(N) deletes.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eldpswp99 is working on this again

@coolreader18

Copy link
Copy Markdown
Member

@youknowone here is where I meant to put this comment, not the other PR...

Interesting - it looks like CPython does this by just setting the entry's key+value to NULL and just letting it sit there, and only on "resize" does it get rid of the unused entry and recompact. That might be something to try, if you want, or else this looks fine to me.

@fermian

fermian commented Apr 15, 2021

Copy link
Copy Markdown
Contributor Author

@coolreader18 I think you can't represent unused entries with the current DictEntry struct. Maybe we could turn DictInner's entry type into Vec<Option<DictEntry<T>>> and rewrite all the dictionary functions?

@coolreader18

Copy link
Copy Markdown
Member

@fermian yeah, I think that would be good to try, if you feel confident in it.

@DimitrisJim

DimitrisJim commented Aug 14, 2021

Copy link
Copy Markdown
Member

Hey @fermian do you have time to see this through or should I pick it up?

Edit: Someone else is working on this, ignore this comment.

@youknowone

Copy link
Copy Markdown
Member

#2902

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants