
Calculate row values from previous row values and write the result back as the new row value

Python

aluckdog 2022-12-20 09:55:08

Below is some dummy data that reflects the data I am working with.

import pandas as pd
import numpy as np
from numpy import random

random.seed(30)

# Dummy data that represents a percent change
datelist = pd.date_range(start='1983-01-01', end='1994-01-01', freq='Y')
df1 = pd.DataFrame({
    "P Change_1": np.random.uniform(low=-0.55528, high=0.0396181, size=(11,)),
    "P Change_2": np.random.uniform(low=-0.55528, high=0.0396181, size=(11,)),
})

# This dataframe contains the rows we want to operate on
df2 = pd.DataFrame({
    'Loc1': [None, None, None, None, None, None, None, None, None, None, 2.5415],
    'Loc2': [None, None, None, None, None, None, None, None, None, None, 3.2126],
})

# Set the datetime index
df1 = df1.set_index(datelist)
df2 = df2.set_index(datelist)

df1:
            P Change_1  P Change_2
1984-12-31   -0.172080   -0.231574
1985-12-31   -0.328773   -0.247018
1986-12-31   -0.160834   -0.099079
1987-12-31   -0.457924    0.000266
1988-12-31    0.017374   -0.501916
1989-12-31   -0.349052   -0.438816
1990-12-31    0.034711    0.036164
1991-12-31   -0.415445   -0.415372
1992-12-31   -0.206852   -0.413107
1993-12-31   -0.313341   -0.181030
1994-12-31   -0.474234   -0.118058

df2:
              Loc1    Loc2
1984-12-31     NaN     NaN
1985-12-31     NaN     NaN
1986-12-31     NaN     NaN
1987-12-31     NaN     NaN
1988-12-31     NaN     NaN
1989-12-31     NaN     NaN
1990-12-31     NaN     NaN
1991-12-31     NaN     NaN
1992-12-31     NaN     NaN
1993-12-31     NaN     NaN
1994-12-31  2.5415  3.2126

DataFrame details: Loc1 corresponds to P Change_1, Loc2 corresponds to P Change_2, and so on. Looking at Loc1 first, I want to either fill the DataFrame containing Loc1 and Loc2 with the relevant values, or compute a new DataFrame with columns Calc1 and Calc2.

Calculation: I want to start from the 1994 value of Loc1 and compute a new value for 1993 as Loc1 1993 = Loc1 1994 + (Loc1 1994 * P Change_1 1993). With the values filled in, that is 2.5415 + (-0.313341 * 2.5415), which is approximately 1.74514. This 1.74514 replaces the NaN for 1993, and I then want to use that computed value to get the value for 1992. That means we now compute Loc1 1992 = Loc1 1993 + (Loc1 1993 * P Change_1 1992). I want to do this row by row until it reaches the earliest value in the time series.

What is the best way to implement this row-by-row equation? I hope this makes sense, and any help is greatly appreciated!
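As a quick sanity check on the arithmetic described in the question, the first two backward steps for Loc1 can be reproduced with plain floats. This is only a sketch: the numbers are copied from the printed df1 and df2 above, and the variable names are made up for illustration.

```python
# Values copied from the tables printed in the question.
loc1_1994 = 2.5415
p_change_1_1993 = -0.313341
p_change_1_1992 = -0.206852

# Each earlier year is the later year's value plus that value
# times the earlier year's percent change.
loc1_1993 = loc1_1994 + loc1_1994 * p_change_1_1993
loc1_1992 = loc1_1993 + loc1_1993 * p_change_1_1992

print(round(loc1_1993, 5), round(loc1_1992, 5))
```

The first printed value should agree with the roughly 1.74514 quoted in the question. The second is the value that would replace the NaN for 1992.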


3 Answers

呼啦一阵风


# Merge the two dataframes on the date index
df = pd.merge(df1, df2, how='inner', right_index=True, left_index=True)

# Helper column with the row position, for easy slicing and sorting
df['count'] = range(len(df))

# Split the dataframe in two parts: rows up to and including the first
# non-NaN row, and rows from that row onward. Copies avoid chained-assignment warnings.
first_known = df.dropna().iloc[0]['count']
da1 = df[df['count'] <= first_known].copy()
da2 = df[df['count'] >= first_known].copy()

# Reverse da1 so that iteration starts at the known row and moves back in time
da1.sort_values(by=['count'], ascending=False, inplace=True)

g = [da1, da2]
num_col = len(df1.columns)

for w in range(len(g)):
    count = 0
    # One list of computed values per row
    list_of_col = [list() for i in range(len(g[w]))]
    for item, rows in g[w].iterrows():
        n = []
        if count == 0:
            # The first row is the known row: take its Loc values as they are
            for p in range(1, num_col + 1):
                n.append(rows[f'Loc{p}'])
        else:
            # Previous computed value plus previous value times this row's percent change
            for p in range(1, num_col + 1):
                n.append(list_of_col[count-1][p-1] + list_of_col[count-1][p-1] * rows[f'P Change_{p}'])
        list_of_col[count].extend(n)
        count += 1
    # Transpose the row-wise lists into one list per column
    tmp = [list() for i in range(num_col)]
    for d_ in range(num_col):
        for x_ in range(len(list_of_col)):
            tmp[d_].append(list_of_col[x_][d_])
    for i in range(num_col):
        g[w][f'Loc{i+1}'] = tmp[i]

# Restore chronological order and join the two parts (the known row appears in both)
da1.sort_values(by=['count'], inplace=True)
final_df = pd.concat([da1, da2[1:]])

# Copy the results into a new dataframe with Calc columns
calc_df = pd.DataFrame()
for i in range(num_col):
    calc_df[f'Calc{i+1}'] = final_df[f'Loc{i+1}']
print(calc_df)

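I have tried to explain everything obscure I did in the comments. I have edited my code so that the initial dataframes are left untouched.

[Edited]: I have edited the code to handle any number of columns in the given dataframes.

[Edited]: If the column names in df1 and df2 are arbitrary, run this block before running the code above. It renames the columns using list comprehensions.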

df1.columns = [f'P Change_{i+1}' for i in range(len(df1.columns))]

df2.columns = [f'Loc{i+1}' for i in range(len(df2.columns))]

Answered 2022-12-20

慕仙森


There may be a better or more elegant way to do this, but this works well for me:

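def fill_values(df1, df2, cols1=None, cols2=None):
    # Default to all columns, paired up in order
    if cols1 is None:
        cols1 = df1.columns
    if cols2 is None:
        cols2 = df2.columns
    # Walk backward from the second-to-last row to the first
    for i in reversed(range(df2.shape[0] - 1)):
        for col1, col2 in zip(cols1, cols2):
            j = df2.columns.get_loc(col2)
            if np.isnan(df2.iloc[i, j]):
                # Next row's value plus next row's value times this row's percent change.
                # Writing through df2.iloc avoids chained assignment.
                df2.iloc[i, j] = df2.iloc[i + 1, j] + df2.iloc[i + 1, j] * df1[col1].iloc[i]
    return df1, df2

df1, df2 = fill_values(df1, df2)
print(df2)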

                Loc1      Loc2
1983-12-31  0.140160  0.136329
1984-12-31  0.169291  0.177413
1985-12-31  0.252212  0.235614
1986-12-31  0.300550  0.261526
1987-12-31  0.554444  0.261457
1988-12-31  0.544976  0.524925
1989-12-31  0.837202  0.935388
1990-12-31  0.809117  0.902741
1991-12-31  1.384158  1.544128
1992-12-31  1.745144  2.631024
1993-12-31  2.541500  3.212600

This assumes that the rows in df1 and df2 correspond exactly (I look values up by position, not by index label). Hope this helps!
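If the two dataframes might not be in the same row order, one option is to align them by index label before looping. The sketch below is not part of the answer above: `fill_backward_aligned` and its `mapping` argument are hypothetical names, and the toy data is invented so that the result is easy to check by hand.

```python
import numpy as np
import pandas as pd


def fill_backward_aligned(changes, levels, mapping):
    """Fill NaNs in `levels` backward in time, matching rows by index label.

    `mapping` maps each level column to its percent-change column.
    This helper and its argument names are hypothetical.
    """
    levels = levels.sort_index().astype(float)
    # Align by label: labels missing from `changes` become NaN here.
    changes = changes.reindex(levels.index)
    out = levels.copy()
    for level_col, change_col in mapping.items():
        vals = out[level_col].to_numpy(copy=True)
        chg = changes[change_col].to_numpy()
        # Walk from the second-to-last row up to the first.
        for i in range(len(vals) - 2, -1, -1):
            if np.isnan(vals[i]):
                vals[i] = vals[i + 1] + vals[i + 1] * chg[i]
        out[level_col] = vals
    return out


# Toy data (not the question's): changes are deliberately stored out of order.
idx = pd.to_datetime(["2000-12-31", "2001-12-31", "2002-12-31", "2003-12-31"])
levels = pd.DataFrame({"Loc1": [np.nan, np.nan, np.nan, 100.0]}, index=idx)
changes = pd.DataFrame({"P Change_1": [0.0, 0.2, -0.5, 0.1]}, index=idx[::-1])

result = fill_backward_aligned(changes, levels, {"Loc1": "P Change_1"})
print(result)
```

Because the changes are reindexed onto the levels' index, a percent change that has no matching date turns into NaN and propagates into the earlier rows, which makes a mismatch visible instead of silently pairing the wrong rows.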

Answered 2022-12-20

饮歌长啸


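Just to be clear, what you need is Loc1[year] = Loc1[next_year] + PChange[year] * Loc1[next_year], right? The loop below will do what you are looking for, but it simply assumes that the number of rows in both dataframes is always equal, and so on (rather than matching values in the index). Based on your description, I think that works for your data.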

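loc = df2.columns.get_loc('Loc1')        # column position of Loc1
chg = df1.columns.get_loc('P Change_1')  # the name contains a space, so attribute access won't work
for i in range(df2.shape[0]-2, -1, -1):
    df2.iloc[i, loc] = df2.iloc[i+1, loc] + (df1.iloc[i, chg] * df2.iloc[i+1, loc])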

Hope this helps :)
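Since the recurrence above is just repeated multiplication by (1 + PChange), it can also be written without an explicit loop as a reversed cumulative product. The following is a sketch under the assumption that only the last row of each Loc column is known and every earlier row is to be computed. `backfill_from_last` is a hypothetical helper, and the toy numbers are invented for easy checking.

```python
import numpy as np
import pandas as pd


def backfill_from_last(changes, last_values, columns):
    """Closed form of Loc[i] = Loc[i+1] * (1 + change[i]).

    Assumes only the last row of levels is known. Hypothetical helper.
    """
    growth = 1.0 + changes.to_numpy(dtype=float)
    growth[-1, :] = 1.0  # the last row's change is never used
    # Reverse cumulative product: row i gets the product of rows i..end.
    factors = np.cumprod(growth[::-1], axis=0)[::-1]
    values = factors * np.asarray(last_values, dtype=float)
    return pd.DataFrame(values, index=changes.index, columns=columns)


# Toy data (not the question's).
changes = pd.DataFrame({
    "P Change_1": [0.1, -0.5, 0.2, 0.0],
    "P Change_2": [0.0, 1.0, -0.5, 0.3],
})
calc = backfill_from_last(changes, [100.0, 10.0], ["Calc1", "Calc2"])
print(calc)
```

With the question's data, `changes` would be df1 and `last_values` would be the last row of df2 (for example `df2.iloc[-1]`). The loop versions remain the safer choice when known values are scattered through the column instead of sitting only in the last row.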

Answered 2022-12-20
