Is it possible to compare 4 DataFrames on 2 columns and, when a row appears in 2 or more of them, get a result that includes the repeated rows? The result should also include the number of occurrences. My DataFrames look like this:

>>> df1
  Circle Division  Power
0   AAAA       AA     25
1   BBBB       BB      5

>>> df2
  Circle Division  Power
0   CCCC       CC     25
1   BBBB       BB     66

>>> df3
  Circle Division  Power
0   DDDD       DD     55
1   FFFF       FF     68
2   AAAA       AA     87

>>> df4
  Circle Division  Power
0   AAAA       AA     45
1   CCCC       CC     56

Expected result:

>>> result_df
  Circle Division Power1 power2 power3 power4 Repeated
0   AAAA       AA     25      -     87     45        3
1   BBBB       BB      5     66      -      -        2
2   CCCC       CC      -     25      -     56        2

I tried merging them pair by pair, but got stuck after that.

m12=pd.merge(df1, df2, on=['Circle','Division'], how='inner',suffixes=('1',' 2'))
m13=pd.merge(df1, df3, on=['Circle','Division'], how='inner',suffixes=('1',' 3'))
m14=pd.merge(df1, df4, on=['Circle','Division'], how='inner',suffixes=('1',' 4'))
m23=pd.merge(df2, df3, on=['Circle','Division'], how='inner',suffixes=('2',' 3'))
m24=pd.merge(df2, df4, on=['Circle','Division'], how='inner',suffixes=('2',' 4'))
m34=pd.merge(df3, df4, on=['Circle','Division'], how='inner',suffixes=('3',' 4'))
1 Answer
MMTTMM
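Use concat together with DataFrame.set_index and the keys parameter to join all the DataFrames side by side, then flatten the resulting MultiIndex columns.
Then create a new column with DataFrame.count to get the number of non-NaN values in each row, and filter on it with boolean indexing: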
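import pandas as pd

dfs = [df1, df2, df3, df4]
# Index every frame by the two key columns so concat can align rows on them
comp = [x.set_index(['Circle', 'Division']) for x in dfs]
# Place the frames side by side; keys 1..4 label each frame's columns
df = pd.concat(comp, axis=1, keys=range(1, len(dfs) + 1))
# Flatten the (key, column) MultiIndex into names such as Power1
df.columns = [f'{b}{a}' for a, b in df.columns]
# Count the non-NaN Power values per row, i.e. how many frames contain the pair
df['Repeat'] = df.count(axis=1)
# Keep only pairs found in two or more frames
df = df[df['Repeat'] > 1]
df = df.reset_index()
print(df)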
  Circle Division  Power1  Power2  Power3  Power4  Repeat
0   AAAA       AA    25.0     NaN    87.0    45.0       3
1   BBBB       BB     5.0    66.0     NaN     NaN       2
2   CCCC       CC     NaN    25.0     NaN    56.0       2
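
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same concat approach. The sample frames are rebuilt from the question. The helper name combine_repeated, its min_count parameter, the extra sort_index call and the final fillna("-") display step are my own additions for illustration and are not part of the answer above.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({"Circle": ["AAAA", "BBBB"],
                    "Division": ["AA", "BB"], "Power": [25, 5]})
df2 = pd.DataFrame({"Circle": ["CCCC", "BBBB"],
                    "Division": ["CC", "BB"], "Power": [25, 66]})
df3 = pd.DataFrame({"Circle": ["DDDD", "FFFF", "AAAA"],
                    "Division": ["DD", "FF", "AA"], "Power": [55, 68, 87]})
df4 = pd.DataFrame({"Circle": ["AAAA", "CCCC"],
                    "Division": ["AA", "CC"], "Power": [45, 56]})


def combine_repeated(dfs, keys=("Circle", "Division"), min_count=2):
    """Hypothetical helper: align frames on `keys` and keep the key pairs
    that have a value in at least `min_count` of the frames."""
    indexed = [d.set_index(list(keys)) for d in dfs]
    # Outer-align the frames column-wise; labels 1..n mark the source frame.
    wide = pd.concat(indexed, axis=1, keys=list(range(1, len(dfs) + 1)))
    # Flatten (frame number, column name) into names such as "Power1".
    wide.columns = [f"{name}{num}" for num, name in wide.columns]
    # Non-NaN values per row = number of frames containing that key pair.
    wide["Repeat"] = wide.count(axis=1)
    wide = wide[wide["Repeat"] >= min_count]
    # Sorting is an addition here, only to make the row order deterministic.
    return wide.sort_index().reset_index()


result = combine_repeated([df1, df2, df3, df4])

# Optional display step (an assumption about the desired look): show missing
# values as "-" like the expected result in the question. This turns the
# Power columns into object dtype, so keep `result` for further numeric work.
shown = result.fillna("-")
print(shown)
```

Note that this sketch assumes each (Circle, Division) pair occurs at most once within a single frame. With duplicate pairs inside one frame, the column-wise concat may raise an error, so such frames would need to be de-duplicated or aggregated first.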