1 Answer
Use drop_duplicates first; after that, the solution can be simplified:
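# remove duplicates by one column
price = price.drop_duplicates('rsp')
# if necessary, remove duplicates by multiple columns instead
#cols = ['code','prod','date_from', 'date_to', 'rsp']
#price = price.drop_duplicates(subset=cols)

# counter of rows within each (code, prod, date_from, date_to) group: 0, 1, ...
g = price.groupby(['code','prod','date_from', 'date_to']).cumcount()
# move the counter into the columns, then order the columns by counter
df1 = (price.set_index(['code','prod','date_from','date_to', g])
            .unstack()
            .sort_index(level=1, axis=1))
# flatten the MultiIndex columns to names like rsp_1, time_from_1, ...
df1.columns = [f'{i}_{j+1}' for i, j in df1.columns]
df1 = df1.reset_index()
print(df1)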
   code prod   date_from     date_to  rsp_1 time_from_1 time_to_1  rsp_2  \
0   123   HS  2018-01-01  2018-01-02   65.0       06:00     05:59    NaN
1   123   HS  2018-01-02  2018-01-03   64.0       06:00     05:59    NaN
2   123   MS  2018-01-01  2018-01-02   75.0       06:00     05:59   76.0
3   123   MS  2018-01-02  2018-01-03   73.0       06:00     05:59    NaN

  time_from_2 time_to_2
0         NaN       NaN
1         NaN       NaN
2       10:00     05:59
3         NaN       NaN
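The answer does not show the input frame, so here is a minimal self-contained sketch for trying the approach. The sample rows are an assumption, reconstructed so that they are consistent with the printed result, plus one exact duplicate row added to show what drop_duplicates removes. It uses the multi-column variant of drop_duplicates, which is probably the safer choice when the same rsp value can legitimately appear in different groups.

```python
import pandas as pd

# Hypothetical input, reconstructed from the printed output above.
# The first two rows are exact duplicates on purpose.
price = pd.DataFrame({
    'code': [123, 123, 123, 123, 123, 123],
    'prod': ['HS', 'HS', 'HS', 'MS', 'MS', 'MS'],
    'date_from': ['2018-01-01', '2018-01-01', '2018-01-02',
                  '2018-01-01', '2018-01-01', '2018-01-02'],
    'date_to': ['2018-01-02', '2018-01-02', '2018-01-03',
                '2018-01-02', '2018-01-02', '2018-01-03'],
    'rsp': [65.0, 65.0, 64.0, 75.0, 76.0, 73.0],
    'time_from': ['06:00', '06:00', '06:00', '06:00', '10:00', '06:00'],
    'time_to': ['05:59', '05:59', '05:59', '05:59', '05:59', '05:59'],
})

keys = ['code', 'prod', 'date_from', 'date_to']

# Drop rows that repeat the same group keys and the same rsp.
price = price.drop_duplicates(subset=keys + ['rsp'])

# Position of each row inside its group: 0 for the first, 1 for the second, ...
g = price.groupby(keys).cumcount()

# Put the counter in the index, pivot it into the columns,
# then order the columns by counter so each set stays together.
df1 = (price.set_index(keys + [g])
            .unstack()
            .sort_index(level=1, axis=1))

# Flatten (name, counter) pairs into names such as rsp_1, time_from_2.
df1.columns = [f'{name}_{n + 1}' for name, n in df1.columns]
df1 = df1.reset_index()
print(df1)
```

Groups with fewer rows than the largest group get NaN in the extra columns, which is why only the MS row for 2018-01-01 has values in the `_2` columns.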