I want to write a function that returns a list of Counters, one per input dictionary, keeping only the items whose key appears in at least df of the given dictionaries.
Example:
prune([{'a': 1, 'b': 10}, {'a': 1}, {'c': 1}], min_df=2)
[Counter({'a': 1}), Counter({'a': 1})]
prune([{'a': 1, 'b': 10}, {'a': 2}, {'c': 1}], min_df=2)
[Counter({'a': 1}), Counter({'a': 2})]
Since 'a' appears in two of the dictionaries, it is kept in the output; 'b' and 'c' each appear in only one dictionary, so they are dropped.
My approach:
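from collections import Counter

def prune(dicto, df=2):
    # Count in how many dictionaries each key appears.
    new = Counter()
    for d in dicto:
        new += Counter(d.keys())
    # Keep only the keys that appear in at least df dictionaries.
    x = {}
    for key, value in new.items():
        if value >= df:
            x[key] = value
    print(Counter(x))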
Output:
Counter({'a': 2})
This prints a single combined Counter. The key 'a' appears in 2 dictionaries overall, so it satisfies the df condition and is listed, but its value is the number of dictionaries it appears in rather than its original value in each dictionary. How can I change this to get the desired output, a list with one Counter per dictionary?
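One possible direction, as a minimal sketch rather than a definitive answer: count the keys in a first pass, then filter each dictionary separately in a second pass. The parameter name min_df is taken from the example calls above, the argument name dicts is just illustrative, and dropping dictionaries that end up empty is an assumption inferred from the expected output (where {'c': 1} does not appear).

```python
from collections import Counter

def prune(dicts, min_df=2):
    # First pass: count in how many dictionaries each key appears.
    doc_freq = Counter()
    for d in dicts:
        doc_freq.update(d.keys())

    # Second pass: filter each dictionary on its own, keeping original values.
    pruned = []
    for d in dicts:
        kept = Counter({k: v for k, v in d.items() if doc_freq[k] >= min_df})
        # Assumption: dictionaries with no surviving keys are left out.
        if kept:
            pruned.append(kept)
    return pruned

print(prune([{'a': 1, 'b': 10}, {'a': 1}, {'c': 1}], min_df=2))
# [Counter({'a': 1}), Counter({'a': 1})]
```

Returning the list instead of printing inside the function makes the result easier to reuse or test.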