2 Answers
Contributor with 1895 experience points, more than 3 upvotes
Use numpy and array slicing:
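import numpy as np
n = 4
# Take the last value of each complete block of n rows, repeat it n times,
# then pad the incomplete trailing block (if any) with NaN.
df['fnew'] = np.concatenate([np.repeat(df.f.values[n-1::n], n),
                             np.repeat(np.nan, len(df) % n)])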
Output:
n = 3
index dtm f fnew
0 0 00:00:00 50.065 50.058
1 1 00:00:01 50.061 50.058
2 2 00:00:02 50.058 50.058
3 3 00:00:03 50.049 50.044
4 4 00:00:04 50.044 50.044
5 5 00:00:05 50.044 50.044
6 6 00:00:06 50.042 NaN
7 7 00:00:07 50.042 NaN
n = 4
index dtm f fnew
0 0 00:00:00 50.065 50.049
1 1 00:00:01 50.061 50.049
2 2 00:00:02 50.058 50.049
3 3 00:00:03 50.049 50.049
4 4 00:00:04 50.044 50.042
5 5 00:00:05 50.044 50.042
6 6 00:00:06 50.042 50.042
7 7 00:00:07 50.042 50.042
n = 5
index dtm f fnew
0 0 00:00:00 50.065 50.044
1 1 00:00:01 50.061 50.044
2 2 00:00:02 50.058 50.044
3 3 00:00:03 50.049 50.044
4 4 00:00:04 50.044 50.044
5 5 00:00:05 50.044 NaN
6 6 00:00:06 50.042 NaN
7 7 00:00:07 50.042 NaN
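For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same slicing idea. The helper name fill_from_block_end is hypothetical (not part of numpy or pandas), and the DataFrame is rebuilt from the values shown in the tables above, with dtm assumed to be plain strings.

```python
import numpy as np
import pandas as pd


def fill_from_block_end(values, n):
    """Hypothetical helper: spread the last value of each complete block of
    n items over that block; an incomplete trailing block becomes NaN."""
    values = np.asarray(values, dtype=float)
    block_ends = values[n - 1::n]           # positions n-1, 2n-1, ...
    filled = np.repeat(block_ends, n)       # one copy per row of the block
    padding = np.full(len(values) % n, np.nan)  # rows left over at the end
    return np.concatenate([filled, padding])


# Sample data copied from the tables above (dtm dtype is an assumption).
df = pd.DataFrame({
    "dtm": ["00:00:00", "00:00:01", "00:00:02", "00:00:03",
            "00:00:04", "00:00:05", "00:00:06", "00:00:07"],
    "f": [50.065, 50.061, 50.058, 50.049, 50.044, 50.044, 50.042, 50.042],
})

df["fnew"] = fill_from_block_end(df["f"].to_numpy(), 3)
print(df)
```

The result always has the same length as the input, because the repeated part covers len(values) // n complete blocks and the padding covers the remaining len(values) % n rows.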
Contributor with 1841 experience points, more than 3 upvotes
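Here is a way to do it without looping over df.
First set n, then build a list of the existing index labels, leaving out the rows whose f values will be repeated: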
n=4
ix = [x for i, x in enumerate(df.index.values) if (i + 1) % n != 0]
print(ix)
[0, 1, 2, 4, 5, 6]
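As a side note, the same selection can be sketched with a boolean mask instead of a list comprehension. This assumes the default 8-row integer index from the example; the names ix_loop and ix_mask are only for illustration.

```python
import numpy as np
import pandas as pd

n = 4
index = pd.RangeIndex(8)  # assumed: the default index of the 8-row example

# List-comprehension form, as in the answer.
ix_loop = [x for i, x in enumerate(index.values) if (i + 1) % n != 0]

# Mask form: keep positions whose 1-based number is not a multiple of n.
positions = np.arange(len(index))
ix_mask = index[(positions + 1) % n != 0]

print(ix_mask.tolist())
```

Both forms work on positions rather than labels, so they should also apply when the index is not 0, 1, 2, ...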
Now set those rows to np.nan and use bfill:
df.loc[ix, 'f'] = np.nan
df['f'] = df.f.bfill()
print(df)
index dtm f
0 0 00:00:00 50.049
1 1 00:00:01 50.049
2 2 00:00:02 50.049
3 3 00:00:03 50.049
4 4 00:00:04 50.042
5 5 00:00:05 50.042
6 6 00:00:06 50.042
7 7 00:00:07 50.042
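If overwriting f is not wanted, a variant of the same idea can write to a separate column. This is only a sketch: the helper name bfill_from_block_end and the column name fnew are assumptions, and it swaps the df.loc assignment for Series.where followed by bfill.

```python
import numpy as np
import pandas as pd


def bfill_from_block_end(series, n):
    """Hypothetical helper: keep only every n-th value (by position), blank
    the rest, then back-fill so each block takes its last value."""
    is_block_end = (np.arange(len(series)) + 1) % n == 0
    # where() keeps values where the mask is True and puts NaN elsewhere.
    return series.where(is_block_end).bfill()


# Sample f values copied from the example above.
df = pd.DataFrame({
    "f": [50.065, 50.061, 50.058, 50.049, 50.044, 50.044, 50.042, 50.042],
})

df["fnew"] = bfill_from_block_end(df["f"], 4)
print(df)
```

Rows in an incomplete trailing block have nothing after them to back-fill from, so they stay NaN, which matches the numpy version in the first answer.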