2 Answers
Contributor with 1803 experience points, received more than 3 likes
Let's find the column with the largest sum for each group, then map it back onto the rows:
max_cols = (df.filter(like='value')           # choose the value columns; df.iloc[:, :4] also works
              .groupby(df['group']).sum()     # calculate the sum per group
              .idxmax(axis=1)                 # find the column with the max sum
           )
df['Column'] = df['group'].map(max_cols)
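For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is reconstructed from the output shown in this answer, and the intermediate name `per_group_sums` is mine, not part of the original code.

```python
import pandas as pd

# Sample frame reconstructed from the output shown in this answer.
df = pd.DataFrame(
    {
        "value1": [10, 5, 6, 1, 6],
        "value2": [2, 4, 7, 2, 1],
        "value3": [3, 3, 8, 2, 8],
        "value4": [4, 2, 9, 4, 11],
        "random string column": [
            "stuff", "other stuff", "other stuff", "yet other stuff", "other stuff",
        ],
        "group": ["group 2", "group 1", "group 1", "group 2", "group 1"],
    },
    index=["index1", "index2", "index3", "index4", "index5"],
)

# One row per group, one column per value column.
per_group_sums = df.filter(like="value").groupby(df["group"]).sum()

# For each group, the label of the column with the largest sum.
max_cols = per_group_sums.idxmax(axis=1)

# Look up each row's group in max_cols to build the new column.
df["Column"] = df["group"].map(max_cols)
print(df)
```

Here `per_group_sums` has sums 17, 12, 19, 22 for group 1 and 11, 4, 5, 8 for group 2, which is why group 1 maps to value4 and group 2 to value1.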
Alternatively, use groupby().transform():
df['Column'] = (df.filter(like='value')
                  .groupby(df['group']).transform('sum')
                  .idxmax(axis=1)
               )
Output:
        value1  value2  value3  value4  random string column    group  Column
index1      10       2       3       4                 stuff  group 2  value1
index2       5       4       3       2           other stuff  group 1  value4
index3       6       7       8       9           other stuff  group 1  value4
index4       1       2       2       4       yet other stuff  group 2  value1
index5       6       1       8      11           other stuff  group 1  value4
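To see why the two variants agree, here is a small sketch that runs both on the same data and compares them. The frame is rebuilt from the output above (the string column is left out because it plays no part), and the names `values`, `via_map` and `via_transform` are just illustrative.

```python
import pandas as pd

# Value columns and group labels taken from the output above.
df = pd.DataFrame(
    {
        "value1": [10, 5, 6, 1, 6],
        "value2": [2, 4, 7, 2, 1],
        "value3": [3, 3, 8, 2, 8],
        "value4": [4, 2, 9, 4, 11],
        "group": ["group 2", "group 1", "group 1", "group 2", "group 1"],
    },
    index=["index1", "index2", "index3", "index4", "index5"],
)

values = df.filter(like="value")

# Variant 1: aggregate to one row per group, then map back onto the rows.
via_map = df["group"].map(values.groupby(df["group"]).sum().idxmax(axis=1))

# Variant 2: transform('sum') repeats each group's sums on every row of that
# group, so idxmax(axis=1) is already aligned with df and no map is needed.
via_transform = values.groupby(df["group"]).transform("sum").idxmax(axis=1)

print(via_map.tolist() == via_transform.tolist())
```

The transform variant does a bit more work, since it materializes a full-size frame of repeated sums, but it skips the separate mapping step. In both variants, if two columns tie for the largest sum, idxmax returns the first of them.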
Contributor with 1851 experience points, received more than 5 likes
Here is a possible solution using map and a dictionary:
# create a dictionary mapping each group to the column with the largest sum
# (numeric_only=True keeps the string column out of the sum)
d = df.groupby('group').sum(numeric_only=True).idxmax(axis=1).to_dict()
# map the dictionary onto the 'group' column to generate a new column
df['Group Identifying Column'] = df['group'].map(d)
Alternatively, you can skip the dictionary and map the resulting Series directly:
df['Group Identifying Column'] = df.group.map(df.groupby('group').sum(numeric_only=True).idxmax(axis=1))
Output:
        value1  value2  ...    group  Group Identifying Column
index1      10       2  ...  group 2                    value1
index2       5       4  ...  group 1                    value4
index3       6       7  ...  group 1                    value4
index4       1       2  ...  group 2                    value1
index5       6       1  ...  group 1                    value4