2 Answers
Contributor with 1911 experience points, more than 7 upvotes
I think your solution can be simplified. First, sort by both columns with DataFrame.sort_values:
df = df.sort_values(['Article','Value'])
print (df)
   Article  Name  Value
0     A_01  P_01    360
2     A_01  P_07    360
3     A_01  P_09    370
14    A_01  P_33    480
11    A_01  P_26    500
9     A_02  P_25    750
12    A_02  P_26    750
4     A_02  P_09    847
7     A_02  P_22    935
10    A_03  P_25    600
13    A_03  P_26    600
1     A_03  P_01    625
8     A_03  P_22    625
6     A_03  P_18    650
5     A_03  P_09    685
15    A_03  P_33    750
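Then create a counter Series with GroupBy.cumcount and use boolean indexing to keep the first 3 rows of each group (the 3 smallest values, since the data is already sorted). Add the counter as a second index level to form a MultiIndex, reshape with DataFrame.unstack, and finally flatten the MultiIndex in the columns with an f-string: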
g = df.groupby('Article').cumcount().add(1)
mask = g < 4
df = df[mask].set_index(['Article',g[mask]]).unstack().sort_index(axis=1, level=1)
df.columns = df.columns.map(lambda x: f'Min_{x[1]}_{x[0]}')
df = df.reset_index()
print (df)
  Article Min_1_Name  Min_1_Value Min_2_Name  Min_2_Value Min_3_Name  Min_3_Value
0    A_01       P_01          360       P_07          360       P_09          370
1    A_02       P_25          750       P_26          750       P_09          847
2    A_03       P_25          600       P_26          600       P_01          625
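The steps above can be wrapped in a small function. This is only a sketch: the function name top_n_min_wide, the parameter n, and the shortened sample data (with no tied values) are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def top_n_min_wide(df, n=3):
    """Return one row per Article with its n smallest Values and their Names."""
    ordered = df.sort_values(['Article', 'Value'])
    # 1-based position of each row inside its Article group
    rank = ordered.groupby('Article').cumcount().add(1)
    keep = rank <= n
    wide = (ordered[keep]
            .set_index(['Article', rank[keep]])  # (Article, position) MultiIndex
            .unstack()                           # position moves into the columns
            .sort_index(axis=1, level=1))        # group columns by position
    # Column tuples look like ('Name', 1); flatten them to 'Min_1_Name'
    wide.columns = [f'Min_{k}_{col}' for col, k in wide.columns]
    return wide.reset_index()

# Hypothetical sample: a subset of the question's data without ties
sample = pd.DataFrame({
    'Article': ['A_01', 'A_01', 'A_01', 'A_01', 'A_02', 'A_02', 'A_02'],
    'Name':    ['P_01', 'P_09', 'P_33', 'P_26', 'P_25', 'P_09', 'P_22'],
    'Value':   [360, 370, 480, 500, 750, 847, 935],
})
result = top_n_min_wide(sample, n=3)
print(result)
```

If an Article has fewer than n rows, the missing positions come back as NaN after unstack, so the corresponding Value columns may be converted to float.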
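Contributor with 1810 experience points, more than 4 upvotes
You can achieve this by repeatedly taking each article's minimum and then replacing that minimum with NaN, so the next pass finds the next-smallest value.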
import pandas as pd
import numpy as np
df = pd.DataFrame(
[['A_01', 'P_01', 360],
['A_03', 'P_01', 625],
['A_01', 'P_07', 360],
['A_01', 'P_09', 370],
['A_02', 'P_09', 847],
['A_03', 'P_09', 685],
['A_03', 'P_18', 650],
['A_02', 'P_22', 935],
['A_03', 'P_22', 625],
['A_02', 'P_25', 750],
['A_03', 'P_25', 600],
['A_01', 'P_26', 500],
['A_02', 'P_26', 750],
['A_03', 'P_26', 600],
['A_01', 'P_33', 480],
['A_03', 'P_33', 750]])
df.columns=['Article','Name','Value']
list_articles = df['Article'].drop_duplicates()
list_names = list(df['Name'].drop_duplicates())
pivot_df = df.pivot(index='Article', columns='Name', values='Value').reset_index()
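for i in range(1, 4):
    # Smallest remaining value per article, and the name that holds it
    pivot_df[f'Min_{i}_Value'] = pivot_df[list_names].T.apply(lambda x: x.nsmallest(1).max())
    indices = pivot_df[list_names].T.apply(lambda y: y.nsmallest(1).idxmax())
    pivot_df[f'Min_{i}_Name'] = indices
    # Blank out the minimum just found; .loc avoids chained assignment
    for row, name in enumerate(indices):
        pivot_df.loc[row, name] = np.nan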
ColsToKeep = [x for x in pivot_df.columns.tolist() if x not in list_names]
ColsToKeep = [x for x in ColsToKeep if x[:3] == 'Min']
ColsToKeep.sort()
ColsToKeep = ['Article'] + ColsToKeep
final_df = pivot_df[ColsToKeep]
final_df
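The same replace-the-minimum-with-NaN idea can be sketched more compactly with DataFrame.idxmin and DataFrame.min along the rows. The function name iterative_minima and the shortened sample data are assumptions for illustration. The sketch also assumes every Article has at least n rows and that each (Article, Name) pair occurs only once, since pivot requires unique pairs.

```python
import numpy as np
import pandas as pd

def iterative_minima(df, n=3):
    """Find the n smallest Values per Article by masking each minimum in turn."""
    # One row per Article, one column per Name; missing pairs become NaN
    wide = df.pivot(index='Article', columns='Name', values='Value').astype(float)
    out = pd.DataFrame(index=wide.index)
    for k in range(1, n + 1):
        names = wide.idxmin(axis=1)            # Name holding each row's current minimum
        out[f'Min_{k}_Name'] = names
        out[f'Min_{k}_Value'] = wide.min(axis=1)
        # Mask the minimum just found so the next pass skips it
        for article, name in names.items():
            wide.loc[article, name] = np.nan
    return out.reset_index()

# Hypothetical sample: a subset of the question's data without ties
sample = pd.DataFrame({
    'Article': ['A_01', 'A_01', 'A_01', 'A_01', 'A_02', 'A_02', 'A_02'],
    'Name':    ['P_01', 'P_09', 'P_33', 'P_26', 'P_25', 'P_09', 'P_22'],
    'Value':   [360, 370, 480, 500, 750, 847, 935],
})
minima = iterative_minima(sample, n=3)
print(minima)
```

Here the columns come out already ordered by position (Min_1_Name, Min_1_Value, Min_2_Name, and so on), so no sorting of column names is needed afterwards.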