2 Answers
Contributed 1789 experience points, received over 10 likes
IIUC, here is my answer:
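import pandas as pd
df['Date'] = pd.to_datetime(df.Date)
# Days since the previous row of the same ID (NaN for each ID's first row)
df['delta'] = df.groupby('ID')['Date'].diff().dt.days
# 1 while the running total of days within the ID stays under 365, else 0
df['flag'] = (df.groupby('ID').delta.cumsum() < 365).astype(int)
# Start a new block id every time flag changes value
group_ids = df.flag.diff().ne(0).cumsum()
# Running count of flagged rows within each (ID, block) pair
df['count'] = df['flag'].groupby([df['ID'], group_ids]).cumsum()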
Result (irrelevant columns removed):
    ID       Date  count
0  abc 2016-07-12      0
1  abc 2017-02-04      1
2  abc 2017-02-13      2
3  abc 2019-02-16      0
4  xyz 2014-11-03      0
5  xyz 2014-11-06      1
6  xyz 2016-02-17      0
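To see how the intermediate columns lead to that result, here is a minimal self-contained sketch of the same steps. The sample DataFrame is an assumption, reconstructed from the result table above, and `cum_days` is a helper column added only for illustration.

```python
import pandas as pd

# Sample data reconstructed from the result table (an assumption,
# the original question's full DataFrame is not shown here).
df = pd.DataFrame({
    "ID": ["abc", "abc", "abc", "abc", "xyz", "xyz", "xyz"],
    "Date": ["2016-07-12", "2017-02-04", "2017-02-13", "2019-02-16",
             "2014-11-03", "2014-11-06", "2016-02-17"],
})
df["Date"] = pd.to_datetime(df["Date"])

# Days since the previous row of the same ID (NaN on each ID's first row).
df["delta"] = df.groupby("ID")["Date"].diff().dt.days

# Running total of those gaps per ID (illustrative helper column).
df["cum_days"] = df.groupby("ID")["delta"].cumsum()

# NaN < 365 evaluates to False, so each ID's first row gets flag 0.
df["flag"] = (df["cum_days"] < 365).astype(int)

# Consecutive rows with the same flag share a block id.
group_ids = df["flag"].diff().ne(0).cumsum()

# Cumulative count of flagged rows inside each (ID, block) pair.
df["count"] = df["flag"].groupby([df["ID"], group_ids]).cumsum()

print(df[["ID", "Date", "delta", "cum_days", "flag", "count"]])
```

Note that `flag` compares the cumulative number of days since the first row of each ID against 365, rather than looking back 365 days from each row. On this sample the two views agree, but on other data they may differ.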
Contributed 1877 experience points, received over 6 likes
I simply reproduced your logic with slightly modified code:
....
df['Date'] = pd.to_datetime(df.Date)

def lastyear(row):
    # Count rows with the same ID dated strictly within the 365 days before this row
    curr = row.Date
    lastyr = curr - pd.Timedelta(days=365)
    return (df[(df.ID == row.ID) & (df.Date > lastyr) & (df.Date < curr)]).ID.size

df['Count'] = df.apply(lastyear, axis=1)
df
#Out[79]:
#    ID       Date  Count
#0  abc 2016-07-12      0
#1  abc 2017-02-04      1
#2  abc 2017-02-13      2
#3  abc 2019-02-16      0
#4  xyz 2014-11-03      0
#5  xyz 2014-11-06      1
#6  xyz 2016-02-17      0
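The row-wise `apply` above filters the whole DataFrame once per row, which can get slow on large inputs. As a rough sketch of one possible alternative (not part of the original answer), the same strict 365-day lookback can be computed per ID with `np.searchsorted` on sorted dates. The sample data is again an assumption reconstructed from the output, and `counts` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the output above (an assumption).
df = pd.DataFrame({
    "ID": ["abc", "abc", "abc", "abc", "xyz", "xyz", "xyz"],
    "Date": ["2016-07-12", "2017-02-04", "2017-02-13", "2019-02-16",
             "2014-11-03", "2014-11-06", "2016-02-17"],
})
df["Date"] = pd.to_datetime(df["Date"])

counts = pd.Series(0, index=df.index)
for _, dates in df.groupby("ID")["Date"]:
    s = dates.sort_values()
    vals = s.to_numpy()  # datetime64[ns] array, sorted ascending
    # Index of the first date strictly greater than (current - 365 days).
    lo = np.searchsorted(vals, vals - np.timedelta64(365, "D"), side="right")
    # Number of dates strictly earlier than the current date.
    hi = np.searchsorted(vals, vals, side="left")
    # Dates in the open interval (current - 365 days, current).
    counts.loc[s.index] = hi - lo

df["Count"] = counts
print(df)
```

The strict bounds mirror the `>` and `<` comparisons in `lastyear`, so rows sharing the same date do not count each other.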