You can use sort_values + groupby to compute the differences per group, keyed by ID. To be able to subtract the dates, first convert them to datetime with pd.to_datetime:
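import pandas as pd
# Parse the date strings so they can be subtracted
df['value'] = pd.to_datetime(df['value'])
# Order the rows by stage within each opportunity
df = df.sort_values(['Opportunity_ID', 'stage'])
# Each row minus the next row of the same opportunity
df['difference'] = df.groupby('Opportunity_ID')['value'].diff(-1)
print(df)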
     Opportunity_ID  stage       value  difference
 0  0061R00000l43xP    1.0  2018-11-07         NaT
 1  0061R00000lUT5r    1.0  2019-05-02    -20 days
 2  0061R00000lUT5r    2.0  2019-05-22    -12 days
 3  0061R00000lUT5r    3.0  2019-06-03    -99 days
 5  0061R00000lUT5r    5.5  2019-09-10         NaT
 6  0061R00000lXwZL    1.0  2018-12-05   -125 days
 7  0061R00000lXwZL    4.0  2019-04-09    -10 days
 8  0061R00000lXwZL    5.0  2019-04-19      0 days
 9  0061R00000lXwZL    5.5  2019-04-19    -14 days
10  0061R00000lXwZL    8.0  2019-05-03    -67 days
11  0061R00000lXwZL    9.0  2019-07-09    -24 days
12  0061R00000lXwZL   11.0  2019-08-02         NaT
13  0061R00000lY4Vm    1.0  2018-12-06   -294 days
14  0061R00000lY4Vm    2.0  2019-09-26         NaT
15  0061R00000lrOGm    3.0  2019-02-15   -215 days
16  0061R00000lrOGm    4.0  2019-09-18         NaT
 4  80061R0000lUT5r    5.0  2019-06-20         NaT
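For readers who want to try the approach without the original data, here is a minimal self-contained sketch. The small DataFrame is made up for illustration (the IDs "A" and "B" are hypothetical); only the column names and the sort_values + groupby + diff(-1) steps come from the answer above.

```python
import pandas as pd

# Hypothetical sample data, deliberately out of order; column names follow the answer.
df = pd.DataFrame({
    "Opportunity_ID": ["A", "B", "A", "B", "A"],
    "stage": [2.0, 1.0, 1.0, 2.0, 3.0],
    "value": ["2019-05-22", "2018-12-06", "2019-05-02", "2019-09-26", "2019-06-03"],
})

# Parse the strings into datetimes so they can be subtracted.
df["value"] = pd.to_datetime(df["value"])

# Order rows by stage within each opportunity.
df = df.sort_values(["Opportunity_ID", "stage"])

# diff(-1) subtracts the NEXT row of the same group from the current row,
# so the values are negative and the last row of each group is NaT.
df["difference"] = df.groupby("Opportunity_ID")["value"].diff(-1)
print(df)
```

The original index labels are kept after sorting, which is why the index column in the printed output is no longer in ascending order.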
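Sorting may not be convenient for you. You can also compute the differences without sorting first, in which case each row is compared with the next row of the same group in the existing row order. This is the result for your example: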
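# Parse the date strings so they can be subtracted
df['value'] = pd.to_datetime(df['value'])
# Each row minus the next row of the same opportunity, in the existing row order
df['difference'] = df.groupby('Opportunity_ID')['value'].diff(-1)
print(df)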
     Opportunity_ID  stage       value  difference
 0  0061R00000l43xP    1.0  2018-11-07         NaT
 1  0061R00000lUT5r    1.0  2019-05-02    -20 days
 2  0061R00000lUT5r    2.0  2019-05-22    -12 days
 3  0061R00000lUT5r    3.0  2019-06-03    -99 days
 4  80061R0000lUT5r    5.0  2019-06-20         NaT
 5  0061R00000lUT5r    5.5  2019-09-10         NaT
 6  0061R00000lXwZL    1.0  2018-12-05   -125 days
 7  0061R00000lXwZL    4.0  2019-04-09    -10 days
 8  0061R00000lXwZL    5.0  2019-04-19      0 days
 9  0061R00000lXwZL    5.5  2019-04-19    -14 days
10  0061R00000lXwZL    8.0  2019-05-03    -67 days
11  0061R00000lXwZL    9.0  2019-07-09    -24 days
12  0061R00000lXwZL   11.0  2019-08-02         NaT
13  0061R00000lY4Vm    1.0  2018-12-06   -294 days
14  0061R00000lY4Vm    2.0  2019-09-26         NaT
15  0061R00000lrOGm    3.0  2019-02-15   -215 days
16  0061R00000lrOGm    4.0  2019-09-18         NaT
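Because diff(-1) computes the current row minus the next one, the differences come out negative. If you would rather have a positive number of days until the next stage, one possible variant is sketched below. The sample data and the column name days_to_next are hypothetical and not part of the original question.

```python
import pandas as pd

# Hypothetical sample data, already ordered by stage within each ID.
df = pd.DataFrame({
    "Opportunity_ID": ["A", "A", "A", "B", "B"],
    "value": pd.to_datetime(
        ["2019-05-02", "2019-05-22", "2019-06-03", "2018-12-06", "2019-09-26"]
    ),
})

# Date of the next row within the same group (NaT for the last row of a group).
next_value = df.groupby("Opportunity_ID")["value"].shift(-1)

# Next date minus current date, expressed as a number of days.
# Rows without a following stage become NaN.
df["days_to_next"] = (next_value - df["value"]).dt.days
print(df)
```

This is the same calculation with the sign flipped, so negating the result of diff(-1) would work equally well.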