2 Answers
Contributor with 1873 experience points, more than 9 upvotes
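Use Series.where to replace rank with NaN in every row where result is not 2, then use GroupBy.transform with GroupBy.first to repeat the remaining value across each group, and finally compare rank against it with Series.gt and set result to 6 for the matching rows with DataFrame.loc: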
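# convert to integers so that values such as '10' compare numerically rather than as strings
df[['rank','result']] = df[['rank','result']].astype(int)
# rank of the row with result == 2, repeated for every row of the same group
s = df['rank'].where(df['result'].eq(2)).groupby(df['group']).transform('first')
# rows ranked after that row get result 6
df.loc[df['rank'].gt(s), 'result'] = 6
print(df)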
   group  rank  result
0     g1     1       1
1     g1     2       4
2     g1     3       2
3     g1     4       6
4     g1     5       6
5     g2     1       1
6     g2     2       4
7     g2     3       4
8     g2     4       2
9     g2     5       6
10    g2     6       6
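To make the three steps easier to follow, here is a minimal sketch that keeps each intermediate Series under its own name. The sample data is assumed to mirror the question (already converted to integers), and the names masked and threshold are illustrative only, not part of the original answer.

```python
import pandas as pd

# Sample data assumed to mirror the question, with integer rank and result.
df = pd.DataFrame({
    "group": ["g1"] * 5 + ["g2"] * 6,
    "rank": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6],
    "result": [1, 4, 2, 4, 4, 1, 4, 4, 2, 4, 4],
})

# Step 1: keep rank only where result == 2; every other row becomes NaN.
masked = df["rank"].where(df["result"].eq(2))

# Step 2: broadcast the first non-NaN value of each group to all of its rows.
threshold = masked.groupby(df["group"]).transform("first")

# Step 3: rows whose rank is greater than the group's threshold get result 6.
df.loc[df["rank"].gt(threshold), "result"] = 6

# Show the intermediate values next to each other, then the final frame.
print(pd.DataFrame({"masked": masked, "threshold": threshold}))
print(df)
```

Because a comparison against NaN is always False, a group that contains no row with result 2 would simply be left unchanged by step 3.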
Contributor with 1865 experience points, more than 7 upvotes
This should solve the problem:
import pandas as pd

group = ['g1','g1','g1','g1','g1','g2','g2','g2','g2','g2','g2']
rank = ['1','2','3','4','5','1','2','3','4','5','6']
result = ['1','4','2','4','4','1','4','4','2','4','4']
df = pd.DataFrame({"group": group, "rank": rank, "result": result})

def changeDf(x):
    # all rows that belong to the same group as the current row
    df_gp = df[df['group'] == x['group']]
    # rank of the first row in that group whose result is '2'
    rank_of_2 = df_gp.loc[df_gp['result'] == '2', 'rank'].values[0]
    # values are strings, so convert to int before comparing
    if int(x['rank']) > int(rank_of_2):
        return '6'
    else:
        return x['result']

df['result'] = df.apply(changeDf, axis=1)
print(df)
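One caveat with the row-wise approach: the `.values[0]` lookup raises an IndexError for any group that has no row with result '2', and the group is filtered again for every row. The following is a rough sketch of one way to avoid both, assuming string-typed columns as in this answer. The helper change_result, the rank_of_2 dictionary, and the extra group g3 (which has no '2' row) are hypothetical additions for illustration.

```python
import pandas as pd

# Same data as in the answer, plus a hypothetical group g3 with no '2' row.
df = pd.DataFrame({
    "group": ["g1"] * 5 + ["g2"] * 6 + ["g3"] * 2,
    "rank": ["1", "2", "3", "4", "5", "1", "2", "3", "4", "5", "6", "1", "2"],
    "result": ["1", "4", "2", "4", "4", "1", "4", "4", "2", "4", "4", "1", "4"],
})

# Rank of the first '2' row in each group, computed once instead of once per row.
rank_of_2 = (
    df.loc[df["result"] == "2"]
      .groupby("group")["rank"]
      .first()
      .astype(int)
      .to_dict()
)

def change_result(row):
    # Groups without a '2' row are missing from the dict and stay unchanged.
    threshold = rank_of_2.get(row["group"])
    if threshold is not None and int(row["rank"]) > threshold:
        return "6"
    return row["result"]

df["result"] = df.apply(change_result, axis=1)
print(df)
```

The lookup table is built from a single groupby, so each row only costs a dictionary access.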