3 Answers
Contributor: 1846 experience points, 7+ upvotes
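The starting value of each group depends on the last value of the previous group, so I don't think this can be vectorized. It needs some kind of iterative process. I came up with a solution that iterates over the groups of a groupby. Reverse df and assign it to df1. Then process each group of df1 and assign the final list of groups back to the original df.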
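import pandas as pd

# Reverse the frame so that we walk from the last row to the first.
df1 = df[::-1]
# Group key: a new group starts on the row after each BCLOSE/SCLOSE.
s = df1.B.isin(['BCLOSE','SCLOSE']).shift(fill_value=False).cumsum()
grps = df1.groupby(s)
init_val = 100
l = []
for _, grp in grps:
    # Change of each row, as a percentage of the group's starting value.
    s = grp.C * 0.01 * init_val
    s.iloc[0] = init_val
    s = s.cumsum()
    # The last value of this group seeds the next group.
    init_val = s.iloc[-1]
    l.append(s)
df['D'] = pd.concat(l)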
Out[50]:
             A       B      C      D
0    1/05/2019     SIT    0.0  158.6
1    2/05/2019  SCLOSE    1.0  158.6
2    3/05/2019   SHODL   10.0  157.3
3    4/05/2019   SHODL    5.0  144.3
4    5/05/2019   SHODL    6.0  137.8
5    6/05/2019   SHODL   -6.0  130.0
6    7/05/2019   SHODL    6.0  137.8
7    8/05/2019    SELL    0.0  130.0
8    9/05/2019     SIT    0.0  130.0
9   10/05/2019     SIT    0.0  130.0
10  11/05/2019  BCLOSE   -8.0  130.0
11  12/05/2019   BHODL   33.0  138.0
12  13/05/2019   BHODL  -15.0  105.0
13  14/05/2019   BHODL    6.0  120.0
14  15/05/2019   BHODL   -1.0  114.0
15  16/05/2019   BHODL    5.0  115.0
16  17/05/2019   BHODL   10.0  110.0
17  18/05/2019     BUY    0.0  100.0
18  19/05/2019     SIT    0.0  100.0
19  20/05/2019     SIT    0.0  100.0
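If the group key is the unclear part, here is a minimal sketch of how the isin / shift / cumsum chain labels the rows. The six-row column b is made up for illustration and stands for an already reversed B column; it is not the OP's data.

```python
import pandas as pd

# Hypothetical, already reversed column of labels (illustration only).
b = pd.Series(["SIT", "BUY", "BHODL", "BCLOSE", "SIT", "SCLOSE"])

is_close = b.isin(["BCLOSE", "SCLOSE"])
# Shifting by one row makes each close row the LAST row of its group
# instead of the first row of the next one.
group_key = is_close.shift(fill_value=False).cumsum()
print(group_key.tolist())  # [0, 0, 0, 0, 1, 1]
```

Without the shift, the BCLOSE row would open group 1, and its value would be computed from the new base rather than the old one.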
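Contributor: 1864 experience points, 2+ upvotes
The following should help you. It produces the expected output and is relatively fast, because it avoids iterating directly over the rows of the dataframe.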
endpoints = [df.first_valid_index(), df.last_valid_index()]
# occurrences of 'BCLOSE' or 'SCLOSE', from last to first
breakpoints = df.index[(df.B == 'BCLOSE') | (df.B == 'SCLOSE')][::-1]
# remove the endpoints of the dataframe that do not break the structure
breakpoints = breakpoints.drop(endpoints, errors='ignore')
PERCENTAGE_CONST = 100
top = 100  # you can specify any initial value here
for i in range(len(breakpoints) + 1):
    prv = breakpoints[i - 1] - 1 if i else -1  # previous or first breakpoint
    try:
        nex = breakpoints[i] - 1  # next breakpoint
    except IndexError:
        nex = None  # last breakpoint
    # cumulative sum of the 'C' changes over this segment, written to column 'D'
    # (positional slice, so this assumes a default RangeIndex)
    res = top + (df['C'].iloc[prv: nex: -1] * top / PERCENTAGE_CONST).cumsum()[::-1]
    df.loc[res.index, 'D'] = res
    # saving the value that will be the basis for percentage calculations
    # for the next breakpoint
    top = res.iloc[0]
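As a quick sanity check, the loop above can be wrapped in a function and run on the B and C columns from the output table in the first answer. This is only a sketch: fill_d_by_segments and its start parameter are hypothetical names, not part of pandas, and the function assumes a default RangeIndex so that labels and positions coincide.

```python
import pandas as pd

# B and C columns copied from the output table in the first answer (rows 0..19).
B = (["SIT", "SCLOSE"] + ["SHODL"] * 5 + ["SELL", "SIT", "SIT", "BCLOSE"]
     + ["BHODL"] * 6 + ["BUY", "SIT", "SIT"])
C = [0, 1, 10, 5, 6, -6, 6, 0, 0, 0, -8, 33, -15, 6, -1, 5, 10, 0, 0, 0]
df = pd.DataFrame({"B": B, "C": C})

def fill_d_by_segments(df, start=100.0):
    """Hypothetical wrapper around the segment loop; assumes a RangeIndex."""
    out = df.copy()
    out["D"] = float("nan")
    closes = out.index[out["B"].isin(["BCLOSE", "SCLOSE"])][::-1]
    closes = closes.drop([out.index[0], out.index[-1]], errors="ignore")
    top = start
    for i in range(len(closes) + 1):
        prv = closes[i - 1] - 1 if i else -1
        nex = closes[i] - 1 if i < len(closes) else None
        seg = out["C"].iloc[prv:nex:-1]          # segment, last row first
        res = top + (seg * top / 100).cumsum()   # running value of D
        out.loc[res.index, "D"] = res
        top = res.iloc[-1]                       # base for the next segment
    return out

result = fill_d_by_segments(df)
print(result["D"].round(1).tolist())
```

The printed values can be compared row by row against column D in that table.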
Contributor: 1815 experience points, 6+ upvotes
I think there is a more optimized and pythonic way to solve this, but here is a solution with iteration:
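df['D'] = pd.to_numeric(df['D'])
df['C'] = pd.to_numeric(df['C'])
D_val = None
# Walk from the last row up to row 1 (row 0 is not visited by this range).
for i in range(len(df)-1, 0, -1):
    if df.loc[i, 'B'] == 'BUY':
        # Remember the value at the BUY row as the base for the percentages.
        D_val = df.loc[i, 'D']
        continue
    if D_val is None:
        continue
    df.loc[i, 'D'] = df.loc[i+1, 'D'] + (D_val * df.loc[i, 'C']/100)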
Every time you encounter BUY in column B, you update D_val. We could also add a stop condition, as the OP mentioned, such as SCLOSE or BCLOSE.
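One way the SCLOSE / BCLOSE condition could look is sketched below: the rows are walked from last to first, and the base for the percentages is reset at every close row. The name fill_d_with_rebase, the start parameter, and the choice to begin at 100 instead of reading the value from a BUY row are all assumptions made for illustration, and the small demo frame is made up.

```python
import pandas as pd

def fill_d_with_rebase(df, start=100.0):
    """Hypothetical helper: walk rows bottom-up, re-basing at each close row."""
    base = start      # value that the percentages in C refer to
    current = start   # running value of D
    values = []
    for label, pct in zip(df["B"].iloc[::-1], df["C"].iloc[::-1]):
        current += base * pct / 100
        values.append(current)
        if label in ("SCLOSE", "BCLOSE"):
            base = current  # rows above the close use the closed value
    # values were collected bottom-up, so flip them back into row order
    return pd.Series(values[::-1], index=df.index, name="D")

# Made-up data, only to show the re-basing.
demo = pd.DataFrame({
    "B": ["SIT", "SCLOSE", "SHODL", "BCLOSE", "BHODL", "BUY"],
    "C": [0.0, 10.0, 10.0, 20.0, 5.0, 0.0],
})
print(fill_d_with_rebase(demo).tolist())
# [150.0, 150.0, 137.5, 125.0, 105.0, 100.0]
```

In the demo, the 10% on the SHODL row is taken from 125 (the value at BCLOSE), not from the initial 100, which is the behaviour the grouping answer also produces.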