3 Answers
Contributor with 1818 experience points, 7+ upvotes
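# Row by row, intersect the two list columns and keep the shared items
df['check'] = [list(set(x).intersection(set(y)))
               for x, y in zip(df.UF_med, df.UF_cadastral)]
# Count the shared items in each row
df['count'] = df.check.str.len()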
id UF_med UF_cadastral check count
0 1 [SP, SC, PA] [SP, PA] [SP, PA] 2
1 2 [SP] [SP] [SP] 1
2 3 [AM, RJ, PA, RS] [AM, RS] [AM, RS] 2
Or just replace list with len, as follows:
df['amount']=[len(set(x).intersection(set(y))) for x, y in zip(df.UF_med, df.UF_cadastral)]
The result will be:
id UF_med UF_cadastral amount
0 1 [SP, SC, PA] [SP, PA] 2
1 2 [SP] [SP] 1
2 3 [AM, RJ, PA, RS] [AM, RS] 2
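For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample DataFrame is rebuilt from the tables above. Using sorted() instead of list() and .map(len) instead of .str.len() are my own substitutions, made only so the output is deterministic; they are not part of the answer.

```python
import pandas as pd

# Sample data mirroring the tables shown above
df = pd.DataFrame({
    "id": [1, 2, 3],
    "UF_med": [["SP", "SC", "PA"], ["SP"], ["AM", "RJ", "PA", "RS"]],
    "UF_cadastral": [["SP", "PA"], ["SP"], ["AM", "RS"]],
})

# Row-wise set intersection; sorted() fixes the element order,
# which a plain list(set(...)) does not guarantee
df["check"] = [
    sorted(set(x) & set(y))
    for x, y in zip(df["UF_med"], df["UF_cadastral"])
]

# Number of shared elements per row
df["amount"] = df["check"].map(len)

print(df)
```

Building the column with a list comprehension over zip avoids the per-row overhead of df.apply(..., axis=1), which can matter on larger frames.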
Contributor with 1856 experience points, 5+ upvotes
Try changing
df['Detect_Municipio'] = df.apply(lambda x: x['UF_med'] in x['UF_cadastral'], axis=1)
to
df['Detect_Municipio'] = df.apply( lambda x: len(set(x['UF_med']) & set(x['UF_cadastral'])), axis=1)
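The cells in the table hold lists, so you can convert them to sets and intersect them to get the elements the two lists share. len then gives you the number of matches.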
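To see why the original `in` check does not count matches, here is a small plain-Python sketch using the first row's values. The variable names are mine, chosen only for illustration.

```python
uf_med = ["SP", "SC", "PA"]
uf_cadastral = ["SP", "PA"]

# `in` asks whether the whole list uf_med is a single element of
# uf_cadastral, which it is not, so this is False
whole_list_is_member = uf_med in uf_cadastral

# Set intersection compares the individual strings instead
shared = set(uf_med) & set(uf_cadastral)

# Number of strings present in both lists
match_count = len(shared)

print(whole_list_is_member, match_count)
```

The intersection treats each list as a set, so repeated values within a row are counted once.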
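Contributor with 1798 experience points, 7+ upvotes
I'm not sure exactly what you are trying to accomplish, but here are my best guesses:
Find whole lists that appear in both columns: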
df = {'id': [1,2,3],
'UF_med':[['SP', 'SC', 'PA'], ['SP'], ['AM', 'RJ', 'PA', 'RS']],
'UF_cadastral': [['SP', 'PA'], ['SP'], ['AM', 'RS']]}
output = [item for item in df["UF_med"] if item in df["UF_cadastral"]]
#output is [['SP']]
Find strings that overlap across all lists:
df = {'id': [1,2,3],
'UF_med':[['SP', 'SC', 'PA'], ['SP'], ['AM', 'RJ', 'PA', 'RS']],
'UF_cadastral': [['SP', 'PA'], ['SP'], ['AM', 'RS']]}
uf_med = {item for sublist in df["UF_med"] for item in sublist}
uf_cadastral = {item for sublist in df["UF_cadastral"] for item in sublist}
output = [item for item in uf_med if item in uf_cadastral]
#output is ['AM', 'PA', 'RS', 'SP'] (set iteration order may vary)
Find strings that overlap between lists at the same index:
df = {'id': [1,2,3],
'UF_med':[['SP', 'SC', 'PA'], ['SP'], ['AM', 'RJ', 'PA', 'RS']],
'UF_cadastral': [['SP', 'PA'], ['SP'], ['AM', 'RS']]}
output = [{item for item in list1 if item in list2} for list1, list2 in zip(df["UF_med"], df["UF_cadastral"])]
#output is [{'PA', 'SP'}, {'SP'}, {'AM', 'RS'}]
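The set comprehension above discards the original ordering. If the order of UF_med matters, something along these lines might work. This is only a sketch, and ordered_overlap is a hypothetical helper name, not something from the answer or from any library.

```python
def ordered_overlap(first, second):
    """Return the items of `first` that also occur in `second`, in `first`'s order."""
    lookup = set(second)  # set membership checks are fast on average
    return [item for item in first if item in lookup]


data = {
    "id": [1, 2, 3],
    "UF_med": [["SP", "SC", "PA"], ["SP"], ["AM", "RJ", "PA", "RS"]],
    "UF_cadastral": [["SP", "PA"], ["SP"], ["AM", "RS"]],
}

# Pair up lists at the same index and keep the shared strings in order
output = [
    ordered_overlap(list1, list2)
    for list1, list2 in zip(data["UF_med"], data["UF_cadastral"])
]
print(output)  # [['SP', 'PA'], ['SP'], ['AM', 'RS']]
```

Unlike the set version, this keeps duplicates from the first list if there are any, so the counts can differ when a row repeats a value.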