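2 Answers
User contributed 1827 experience points and received more than 9 likes
You can use pd.merge_asof to match on onset, then mask the rows that fail the second condition (the offset check):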
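import pandas as pd
# For each df1 row, take the last df2 row whose onset is <= the df1 onset
dfm = pd.merge_asof(df1, df2, on='onset', direction='backward', suffixes=('', '_y'))
# Blank the rhythm columns where the df1 window ends after the matched df2 interval
cols = ['rhythm_name', 'rhythm_code']
dfm[cols] = dfm[cols].where(dfm['offset'] <= dfm['offset_y'])
dfm = dfm.drop(columns='offset_y')
print(dfm)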
Output:
onset offset rhythm_name rhythm_code
0 1 200 NSR 100.0
1 201 400 NSR 100.0
2 401 600 NSR 100.0
3 601 800 NSR 100.0
4 801 1000 NSR 100.0
5 1001 1200 NSR 100.0
6 1201 1400 NSR 100.0
7 1401 1600 NSR 100.0
8 1601 1800 NSR 100.0
9 1801 2000 NSR 100.0
10 2001 2200 NSR 100.0
11 2201 2400 NSR 100.0
12 2401 2600 NSR 100.0
13 2601 2800 NaN NaN
14 2801 3000 JUNCTIONAL 4000.0
15 3001 3200 JUNCTIONAL 4000.0
16 3201 3400 JUNCTIONAL 4000.0
17 3401 3600 JUNCTIONAL 4000.0
18 3601 3800 JUNCTIONAL 4000.0
19 3801 4000 NaN NaN
20 4001 4200 NSR 100.0
21 4201 4400 NSR 100.0
22 4401 4600 NSR 100.0
23 4601 4800 NSR 100.0
24 4801 5000 NSR 100.0
25 5001 5200 NSR 100.0
26 5201 5400 NSR 100.0
27 5401 5600 NSR 100.0
28 5601 5800 NSR 100.0
29 5801 6000 NSR 100.0
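For anyone who wants to run this end to end, here is a minimal sketch. The df1 and df2 below are made-up stand-ins (ten 200-sample windows and two rhythm episodes), not the asker's actual data, and the variable name fits is only for illustration. Note that merge_asof expects both frames to be sorted on the key and the key dtypes to match.

```python
import pandas as pd

# Hypothetical stand-in data (assumption, not the original frames):
# df1 = fixed 200-sample windows, df2 = labelled rhythm episodes.
df1 = pd.DataFrame({"onset": list(range(1, 2001, 200))})
df1["offset"] = df1["onset"] + 199
df2 = pd.DataFrame({
    "onset": [1, 1001],
    "offset": [700, 2000],
    "rhythm_name": ["NSR", "JUNCTIONAL"],
    "rhythm_code": [100, 4000],
})

# Step 1: attach the most recent episode that started at or before each window.
dfm = pd.merge_asof(df1, df2, on="onset", direction="backward", suffixes=("", "_y"))

# Step 2: keep the label only if the window also ends inside that episode.
fits = dfm["offset"] <= dfm["offset_y"]
cols = ["rhythm_name", "rhythm_code"]
dfm[cols] = dfm[cols].where(fits)
dfm = dfm.drop(columns="offset_y")

print(dfm)
```

With this toy data, the windows 601-800 and 801-1000 start inside the first episode but end after it, so their rhythm columns come out as NaN.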
User contributed 1770 experience points and received more than 3 likes
If your data is not too large, you can use broadcasting:
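import numpy as np
# Compare every df1 row against every df2 row: shape (len(df1), len(df2))
cond1 = df1['onset'].values[:, None] >= df2['onset'].values
cond2 = df1['offset'].values[:, None] <= df2['offset'].values
mask = cond1 & cond2
# Position of the first matching df2 row, or NaN when no row matches
idx = np.where(mask.any(axis=1), mask.argmax(axis=1), np.nan)
# reindex by position works here because df2 has the default 0..n-1 index
for col in ['rhythm_name', 'rhythm_code']:
    df1[col] = df2[col].reindex(idx).values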
Output:
0 1 200 NSR 100.0
1 201 400 NSR 100.0
2 401 600 NSR 100.0
3 601 800 NSR 100.0
4 801 1000 NSR 100.0
5 1001 1200 NSR 100.0
6 1201 1400 NSR 100.0
7 1401 1600 NSR 100.0
8 1601 1800 NSR 100.0
9 1801 2000 NSR 100.0
10 2001 2200 NSR 100.0
11 2201 2400 NSR 100.0
12 2401 2600 NSR 100.0
13 2601 2800 NaN NaN
14 2801 3000 JUNCTIONAL 4000.0
15 3001 3200 JUNCTIONAL 4000.0
16 3201 3400 JUNCTIONAL 4000.0
17 3401 3600 JUNCTIONAL 4000.0
18 3601 3800 JUNCTIONAL 4000.0
19 3801 4000 NaN NaN
20 4001 4200 NSR 100.0
21 4201 4400 NSR 100.0
22 4401 4600 NSR 100.0
23 4601 4800 NSR 100.0
24 4801 5000 NSR 100.0
25 5001 5200 NSR 100.0
26 5201 5400 NSR 100.0
27 5401 5600 NSR 100.0
28 5601 5800 NSR 100.0
29 5801 6000 NSR 100.0
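To make the "not too large" caveat concrete: the mask has one cell per (df1 row, df2 row) pair, so memory grows with len(df1) * len(df2). Below is a small self-contained sketch on made-up data (df1, df2 and the helper names has_match, first_match and out are assumptions for illustration). It looks values up by position and masks afterwards, which is a slight variation on the reindex trick above, not the same code.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data (assumption, not the original frames).
df1 = pd.DataFrame({"onset": list(range(1, 2001, 200))})
df1["offset"] = df1["onset"] + 199
df2 = pd.DataFrame({
    "onset": [1, 1001],
    "offset": [700, 2000],
    "rhythm_name": ["NSR", "JUNCTIONAL"],
    "rhythm_code": [100, 4000],
})

# [:, None] turns the df1 values into a column so NumPy compares
# every window against every episode at once.
starts_inside = df1["onset"].to_numpy()[:, None] >= df2["onset"].to_numpy()
ends_inside = df1["offset"].to_numpy()[:, None] <= df2["offset"].to_numpy()
mask = starts_inside & ends_inside  # shape: (len(df1), len(df2))

has_match = mask.any(axis=1)        # does any episode contain the window?
first_match = mask.argmax(axis=1)   # position of the first containing episode

out = df1.copy()
for col in ["rhythm_name", "rhythm_code"]:
    picked = pd.Series(df2[col].to_numpy()[first_match], index=out.index)
    # argmax returns 0 when nothing matches, so blank those rows explicitly.
    out[col] = picked.where(has_match)

print(mask.shape)
print(out)
```

The explicit where(has_match) matters because argmax on an all-False row returns 0, which would otherwise silently assign the first episode to unmatched windows.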
Option 2: another (better) approach, using merge_asof:
(pd.merge_asof(df1, df2, on='onset', direction='backward', suffixes=('', '_y'))
   .query('offset <= offset_y')
   .reindex(df1.index)
   .drop('offset_y', axis=1)
   .fillna(df1)
)
You get the same output.
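It may help to see what each link in that chain does. The sketch below splits it into named steps on made-up data (df1, df2, and the names merged, kept, restored, result are assumptions for illustration). It assumes df1 has the default 0..n-1 index, since merge_asof returns a fresh index and reindex(df1.index) relies on the two lining up. The final astype is my addition: reindex introduces NaN, so onset and offset may come back as floats unless they are cast back.

```python
import pandas as pd

# Hypothetical stand-in data (assumption, not the original frames).
df1 = pd.DataFrame({"onset": list(range(1, 2001, 200))})
df1["offset"] = df1["onset"] + 199
df2 = pd.DataFrame({
    "onset": [1, 1001],
    "offset": [700, 2000],
    "rhythm_name": ["NSR", "JUNCTIONAL"],
    "rhythm_code": [100, 4000],
})

merged = pd.merge_asof(df1, df2, on="onset", direction="backward", suffixes=("", "_y"))

# query drops the windows that end after their matched episode.
kept = merged.query("offset <= offset_y")

# reindex brings the dropped rows back, but with NaN in every column.
restored = kept.reindex(df1.index)

# fillna(df1) refills onset/offset from df1; the rhythm columns stay NaN
# because df1 has no such columns to fill from.
result = (
    restored.drop(columns="offset_y")
    .fillna(df1)
    .astype({"onset": df1["onset"].dtype, "offset": df1["offset"].dtype})
)

print(result)
```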