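2 Answers
Contributor with 1796 experience points, received 4+ likes
I suggest adding a helper column g to each DataFrame that numbers the repeated id values with cumcount, and then merging on both id and g: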
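# Number the rows within each id: 0 for the first occurrence, 1 for the second, ...
df1['g'] = df1.groupby('id').cumcount()
df2['g'] = df2.groupby('id').cumcount()
# Merge on both id and the per-id counter
merged_table = pd.merge(df1, df2, on=['id', 'g'], how='outer')
print(merged_table)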
    Name  amount_x    id  g  Category  amount_y
0   John    500.25  GH10  0      Food    500.25
1  Helen   1250.00  GH11  0    Travel   1250.00
2   Adam    432.54  GH11  1      Food    432.54
3  Sarah    567.12  GH12  0       NaN       NaN
Finally, drop the helper column g:
merged_table = pd.merge(df1, df2, on=['id', 'g'], how='outer').drop('g', axis=1)
print(merged_table)
    Name  amount_x    id  Category  amount_y
0   John    500.25  GH10      Food    500.25
1  Helen   1250.00  GH11    Travel   1250.00
2   Adam    432.54  GH11      Food    432.54
3  Sarah    567.12  GH12       NaN       NaN
In detail, this is what the two DataFrames look like once the helper column has been added:
print (df1)
    Name   amount    id  g
0   John   500.25  GH10  0
1  Helen  1250.00  GH11  0
2   Adam   432.54  GH11  1
3  Sarah   567.12  GH12  0
print (df2)
   Category   amount    id  g
0      Food   500.25  GH10  0
1    Travel  1250.00  GH11  0
2      Food   432.54  GH11  1
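To run this answer end to end, a minimal self-contained sketch could look like the following. The construction of df1 and df2 is an assumption: it is reconstructed from the printed tables above and is not the asker's original code. `drop(columns='g')` is used here as an equivalent spelling of `drop('g', axis=1)`.

```python
import pandas as pd

# Assumption: input data reconstructed from the tables printed in this answer.
df1 = pd.DataFrame({
    "Name": ["John", "Helen", "Adam", "Sarah"],
    "amount": [500.25, 1250.00, 432.54, 567.12],
    "id": ["GH10", "GH11", "GH11", "GH12"],
})
df2 = pd.DataFrame({
    "Category": ["Food", "Travel", "Food"],
    "amount": [500.25, 1250.00, 432.54],
    "id": ["GH10", "GH11", "GH11"],
})

# Number the rows within each id: 0 for the first occurrence, 1 for the second, ...
df1["g"] = df1.groupby("id").cumcount()
df2["g"] = df2.groupby("id").cumcount()

# Merging on (id, g) pairs the n-th occurrence of an id in df1
# with the n-th occurrence of the same id in df2.
merged_table = (
    pd.merge(df1, df2, on=["id", "g"], how="outer")
      .drop(columns="g")  # the helper column is no longer needed
)
print(merged_table)
```

Because the pairing is purely positional within each id, this approach relies on the repeated ids appearing in the same order in both DataFrames.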
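Contributor with 1842 experience points, received 12+ likes
You can apply the following to the output (after the merge). It can also be done in a single step, but I suggest you work that out for yourself first. Here is a hint...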
>>> df.drop_duplicates('Name',keep='first')
    Name  amount_x    id  category  amount_y
0   John    500.25  GH10      Food    500.25
1  Helen   1250.00  GH11    Travel      1250
3   Adam    432.54  GH11    Travel      1250
5  Sarah    567.12  GH12
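The hint can be reproduced with a short sketch. It assumes that the merge in question is a plain left merge on id alone and that the input data matches the tables printed in the first answer; neither is stated in this answer, and the column is spelled Category here, as in the first answer.

```python
import pandas as pd

# Assumption: input data reconstructed from the tables in the first answer.
df1 = pd.DataFrame({
    "Name": ["John", "Helen", "Adam", "Sarah"],
    "amount": [500.25, 1250.00, 432.54, 567.12],
    "id": ["GH10", "GH11", "GH11", "GH12"],
})
df2 = pd.DataFrame({
    "Category": ["Food", "Travel", "Food"],
    "amount": [500.25, 1250.00, 432.54],
    "id": ["GH10", "GH11", "GH11"],
})

# Assumption: a left merge on id only. Every GH11 row in df1 matches
# every GH11 row in df2, so the result has 6 rows instead of 4.
plain = pd.merge(df1, df2, on="id", how="left")
print(plain)

# Keep only the first merged row for each Name.
deduped = plain.drop_duplicates(subset="Name", keep="first")
print(deduped)
```

Note that keeping the first match per Name pairs Adam with the Travel row, as in the output shown above, whereas the cumcount approach pairs Adam with Food. Which pairing is wanted depends on the question being answered.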