3 Answers
User contributed 1777 experience points, received 3+ likes
Use groupby:
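def f(s):
    # Number the rows of this category 0, 1, 2, ... in their current order
    s = s.reset_index(drop=True)
    one = s[s.eq(1)]
    # No event in this category: transform broadcasts -1 to every row
    if one.empty:
        return -1
    # Rows until the first event; negative for rows after it
    return -s.index + one.index[0]

df['time_until'] = df.groupby('categories').event.transform(f)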
    categories  dates  event  time_until
0            a      0      0           3
1            b      0      0           1
2            c      0      0          -1
3            a      1      0           2
4            b      1      1           0
5            c      1      0          -1
6            a      2      0           1
7            b      2      0          -1
8            c      2      0          -1
9            a      3      1           0
10           b      3      0          -2
11           c      3      0          -1
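Note that it keeps computing the distance even for rows after the event has occurred. So for the following events you get the following output: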
event = [0, 0, 0, 1, 0, 0]
until = [3, 2, 1, 0, -1, -2]
If you need every negative value to stay at -1, just adjust the column at the end:
df['time_until'] = df.time_until.where(df.time_until >= -1, -1)
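The approach above can be sketched end to end as follows. This is a minimal sketch, not the answerer's exact code: the sample data is reconstructed from the output table (an assumption), the helper name `steps_until_first_event` is hypothetical, and it returns a NumPy array rather than an index. `clip(lower=-1)` is swapped in for the `where` call; for this purpose the two give the same result. Like the original, it counts rows, so it assumes each category is already sorted by date with one row per date.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the output table above (an assumption).
df = pd.DataFrame({
    "categories": ["a", "b", "c"] * 4,
    "dates": [d for d in range(4) for _ in range(3)],
    "event": [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
})

def steps_until_first_event(s):
    """Hypothetical helper: rows until the first event == 1 in one group."""
    values = s.to_numpy()
    hits = np.flatnonzero(values == 1)      # positions of all events
    if hits.size == 0:
        return np.full(len(values), -1)     # no event in this category
    # Position of the first event minus each row's own position:
    # positive before the event, 0 on it, negative after it.
    return hits[0] - np.arange(len(values))

df["time_until"] = (
    df.groupby("categories")["event"].transform(steps_until_first_event)
)

# Cap everything below -1 at -1 so that "event already passed" and
# "no event at all" are reported the same way.
capped = df["time_until"].clip(lower=-1)

print(df.assign(time_until_capped=capped))
```

Row 10 (category b, date 3) is the only row that differs between the two columns: -2 before capping, -1 after.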
User contributed 1873 experience points, received 9+ likes
An alternative solution:
df.sort_values(by=['categories', 'dates'], ascending=[True, False], inplace=True)
df['tmp'] = df.groupby('categories')['event'].transform('cumsum')
df['time_until'] = df.groupby('categories')['tmp'].transform('cumsum') - 1
df.drop(columns='tmp', inplace=True)
df.sort_values(by=['dates', 'categories'], ascending=[True, True], inplace=True)
Output:
    categories  dates  event  time_until
0            a      0      0           3
1            b      0      0           1
2            c      0      0          -1
3            a      1      0           2
4            b      1      1           0
5            c      1      0          -1
6            a      2      0           1
7            b      2      0          -1
8            c      2      0          -1
9            a      3      1           0
10           b      3      0          -1
11           c      3      0          -1
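To see why the double cumulative sum works, here is a self-contained sketch of the same idea without `inplace` or a temporary column. The sample data is reconstructed from the output table (an assumption), and the names `rev`, `seen` and `result` are illustrative. It appears to rely on each category having at most one event: with two or more, the first cumulative sum exceeds 1 and the second one over-counts.

```python
import pandas as pd

# Sample data reconstructed from the output table above (an assumption).
df = pd.DataFrame({
    "categories": ["a", "b", "c"] * 4,
    "dates": [d for d in range(4) for _ in range(3)],
    "event": [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
})

# Walk each category backwards in time (latest date first).
rev = df.sort_values(["categories", "dates"], ascending=[True, False])

# First cumsum: 0 for rows later than the event, 1 from the event row
# back to the earliest date (assuming at most one event per category).
seen = rev.groupby("categories")["event"].cumsum()

# Second cumsum counts those 1s: 1 on the event row, 2 one step earlier,
# and so on. Subtracting 1 gives 0 on the event row and leaves -1 on
# every row where no event has been seen yet.
rev["time_until"] = seen.groupby(rev["categories"]).cumsum() - 1

# Restore the original date-major row order.
result = rev.sort_values(["dates", "categories"])
print(result)
```

Because rows after the event never accumulate anything, this variant yields -1 there directly, with no capping step.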
User contributed 1804 experience points, received 7+ likes
Try something like this:
import pandas as pd
import numpy as np

data = {'categories': ['a', 'b', 'c'] * 4,
        'dates': [i for i in range(4) for j in range(3)],
        'event': [0, 1, 0] * 4}
df = pd.DataFrame(data)
print(df)

# One way
df.loc[df.event == 0, 'Newevents'] = 'Cancelled'
df.loc[df.event != 0, 'Newevents'] = 'Scheduled'

# Another way
conditions = [
    (df['categories'] == "a"),
    (df['categories'] == "b"),
    (df['categories'] == "c")]
choices = ['None', 'Completed', 'Scheduled']
df['NewCategories'] = np.select(conditions, choices, default='black')
print(df)
Output:
    categories  dates  event
0            a      0      0
1            b      0      1
2            c      0      0
3            a      1      0
4            b      1      1
5            c      1      0
6            a      2      0
7            b      2      1
8            c      2      0
9            a      3      0
10           b      3      1
11           c      3      0
    categories  dates  event  Newevents  NewCategories
0            a      0      0  Cancelled           None
1            b      0      1  Scheduled      Completed
2            c      0      0  Cancelled      Scheduled
3            a      1      0  Cancelled           None
4            b      1      1  Scheduled      Completed
5            c      1      0  Cancelled      Scheduled
6            a      2      0  Cancelled           None
7            b      2      1  Scheduled      Completed
8            c      2      0  Cancelled      Scheduled
9            a      3      0  Cancelled           None
10           b      3      1  Scheduled      Completed
11           c      3      0  Cancelled      Scheduled