Use scipy.stats.zscore:
from scipy.stats import zscore
df['zscore'] = df.groupby('ids')['value'].transform(zscore)
print(df)
   ids  value    zscore
0    1   0.10 -1.135550
1    1   0.20  1.297771
2    1   0.14 -0.162221
3    2   0.22       NaN
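Or, sticking with pandas (transform keeps the original index, so the result can be assigned straight back to the frame):
df['zscore'] = df.groupby('ids')['value'].transform(
    lambda x: (x - x.mean()) / x.std(ddof=0))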
print(df)
   ids  value    zscore
0    1   0.10 -1.135550
1    1   0.20  1.297771
2    1   0.14 -0.162221
3    2   0.22       NaN
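The ddof=0 argument matters here: pandas' Series.std defaults to ddof=1 (sample standard deviation), whereas scipy.stats.zscore defaults to ddof=0 (population standard deviation). A minimal sketch of the difference follows. The DataFrame construction is an assumption, rebuilt from the printed output above, since the original input is not shown.

```python
import numpy as np
import pandas as pd

# Data reconstructed from the printed output; the original input is not shown.
df = pd.DataFrame({"ids": [1, 1, 1, 2],
                   "value": [0.10, 0.20, 0.14, 0.22]})

grouped = df.groupby("ids")["value"]

# Population z-score (ddof=0): matches the default of scipy.stats.zscore.
z_pop = grouped.transform(lambda x: (x - x.mean()) / x.std(ddof=0))

# Sample z-score (ddof=1): what you get if ddof is left at the pandas default.
z_sample = grouped.transform(lambda x: (x - x.mean()) / x.std())

print(pd.DataFrame({"ddof=0": z_pop, "ddof=1": z_sample}))
```

For a group of n rows the two differ by a factor of sqrt((n - 1) / n), so the gap is largest for small groups. A single-row group gives NaN either way.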
If you want an expanding z-score, try groupby + expanding:
g = df.groupby('ids').value.expanding(min_periods=1)
df['zscore'] = (df['value'] - g.mean().values) / g.std(ddof=0).values
print(df)
   ids  value    zscore
0    1   0.10       NaN
1    1   0.20  1.000000
2    1   0.14 -0.162221
3    2   0.22       NaN
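One caveat with the expanding version: .values discards the index, so the result lines up with df only when the rows are already ordered by ids, as they are in this example. A hedged alternative is to align on the original index instead. The sketch below assumes the same sample values (reconstructed from the printed output), deliberately shuffled so that the rows are not grouped by ids.

```python
import pandas as pd

# Same sample values as above, but deliberately not sorted by ids.
df = pd.DataFrame({"ids": [1, 2, 1, 1],
                   "value": [0.10, 0.22, 0.20, 0.14]})

g = df.groupby("ids")["value"].expanding(min_periods=1)

# The expanding result is indexed by (ids, original row label);
# dropping the first level restores the original labels so pandas can align.
mean = g.mean().droplevel(0)
std = g.std(ddof=0).droplevel(0)

df["zscore"] = (df["value"] - mean) / std
print(df)
```

Because the subtraction and division align on row labels rather than position, each row gets the statistics of its own group regardless of row order.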