Counter counts how often each item occurs, so it will tell you which values appear more than once. With data taken from your dictionary:
from collections import Counter
from math import nan  # nan stands in for the missing value in the sample row
data = [
['00000000B42852FA', 'ADM_EIG', 'Administratief eigenaar', 'ADM_EIG', 'ADM_EIG'],
['000000005880959E', 'OPZ', 'Opzeggingen', 'STANDAARD', nan]
]
You need to flatten the list of lists:
[item for sublist in data for item in sublist]
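As an alternative to the nested comprehension, the same flattening can be sketched with itertools.chain.from_iterable from the standard library. The rows below are a shortened, hypothetical version of the sample data, used only to keep the example small.

```python
from itertools import chain

# Hypothetical sample rows, shortened from the data above.
rows = [
    ['00000000B42852FA', 'ADM_EIG', 'ADM_EIG'],
    ['000000005880959E', 'OPZ'],
]

# chain.from_iterable yields the items of each sublist in turn.
flat = list(chain.from_iterable(rows))
print(flat)
# ['00000000B42852FA', 'ADM_EIG', 'ADM_EIG', '000000005880959E', 'OPZ']
```

Either form works here; the comprehension needs no import, while chain.from_iterable may read more clearly when the nesting is the only thing being removed.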
The Counter gives you the frequency of each item:
>>> Counter([item for sublist in data for item in sublist])
Counter({'ADM_EIG': 3, '00000000B42852FA': 1, 'Administratief eigenaar': 1, '000000005880959E': 1, 'OPZ': 1, 'Opzeggingen': 1, 'STANDAARD': 1, nan: 1})
Then you can filter for what you need:
counter = Counter([item for sublist in data for item in sublist])
[value for value, count in counter.items() if count > 1]
This gives ['ADM_EIG'].
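The steps above can be wrapped in one small helper. This is only a sketch: the name find_duplicates and the min_count parameter are made up for illustration, and skipping NaN is an added assumption (so that missing values are never reported as duplicates), not something the steps above do.

```python
from collections import Counter
from math import isnan, nan


def find_duplicates(rows, min_count=2):
    """Return values that occur at least min_count times across all rows."""
    counter = Counter(
        item
        for row in rows
        for item in row
        # Assumption: float NaN marks a missing value, so it is skipped.
        if not (isinstance(item, float) and isnan(item))
    )
    return [value for value, count in counter.items() if count >= min_count]


rows = [
    ['00000000B42852FA', 'ADM_EIG', 'Administratief eigenaar', 'ADM_EIG', 'ADM_EIG'],
    ['000000005880959E', 'OPZ', 'Opzeggingen', 'STANDAARD', nan],
]
print(find_duplicates(rows))  # ['ADM_EIG']
```

Without the NaN check, a single shared nan object that appears in several rows would be counted like any other value and could show up in the result.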
Edit, to match the edited question:
To look across all rows, collect all the data and then look for duplicates:
data = []
for key, value in files_dict.items():
    data.extend(value['data'])
counter = Counter([item for sublist in data for item in sublist])
print([value for value, count in counter.items() if count > 1])
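If it also matters which files a duplicate comes from, the loop can record that while counting. The sketch below is runnable on its own and uses a hypothetical files_dict. Its shape (file name mapped to a dict with a list of rows under 'data') is inferred from the loop above, and the file names and rows are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for files_dict; the real structure is only assumed
# to map a file name to a dict holding a list of rows under 'data'.
files_dict = {
    'a.csv': {'data': [['00000000B42852FA', 'ADM_EIG', 'ADM_EIG']]},
    'b.csv': {'data': [['000000005880959E', 'OPZ', 'ADM_EIG']]},
}

counter = Counter()
seen_in = defaultdict(set)  # value -> names of the files that contain it
for name, value in files_dict.items():
    for row in value['data']:
        counter.update(row)  # count every item in the row
        for item in row:
            seen_in[item].add(name)

# Keep only values seen more than once, with the files they came from.
duplicates = {v: sorted(seen_in[v]) for v, c in counter.items() if c > 1}
print(duplicates)  # {'ADM_EIG': ['a.csv', 'b.csv']}
```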