
How to sort groups so that the first row has the largest number, the second row the smallest, the third row the second largest, and so on

Python

婷婷同学_ 2023-04-18 16:06:17

So I have a df like this:

In [1]:
data = {'Group': ['A','A','A','A','A','A','B','B','B','B'],
        'Name': [' Sheldon Webb',' Traci Dean',' Chad Webster',' Ora Harmon',' Elijah Mendoza',
                 ' June Strickland',' Beth Vasquez',' Betty Sutton',' Joel Gill',' Vernon Stone'],
        'Performance': [33,64,142,116,122,68,95,127,132,80]}

In [2]:
df = pd.DataFrame(data, columns=['Group', 'Name', 'Performance'])

Out[1]:
  Group              Name  Performance
0     A      Sheldon Webb           33
1     A        Traci Dean           64
2     A      Chad Webster          142
3     A        Ora Harmon          116
4     A    Elijah Mendoza          122
5     A   June Strickland           68
6     B      Beth Vasquez           95
7     B      Betty Sutton          127
8     B         Joel Gill          132
9     B      Vernon Stone           80

I want to sort it in an alternating way: within a group, say group "A", the first row should hold the best performer (here "Chad Webster"), and the second row the worst performer (here "Sheldon Webb"). The output I am looking for is:

Out[2]:
  Group              Name  Performance
0     A      Chad Webster          142
1     A      Sheldon Webb           33
2     A    Elijah Mendoza          122
3     A        Traci Dean           64
4     A        Ora Harmon          116
5     A   June Strickland           68
6     B         Joel Gill          132
7     B      Vernon Stone           80
8     B      Betty Sutton          127
9     B      Beth Vasquez           95

You can see that the sequence alternates between the highest and the lowest remaining value within each group.


5 Answers

慕慕森

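Contributed 1856 experience points, received over 17 likes

Take the sorted order and apply a quadratic function to it whose root sits at half the array length (plus a small offset). That way the highest ranks go to the extreme values; the sign of the eps offset decides whether the highest value is ranked above the lowest value or the other way round. I added a small group at the end to show that repeated values and odd group sizes are handled correctly.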

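import numpy as np
import pandas as pd

def extremal_rank(s):
    eps = 10**-4  # positive offset: the high value outranks its low counterpart
    # ascending rank 1..n, centred on the middle rank, then squared
    y = (pd.Series(np.arange(1, len(s)+1), index=s.sort_values().index)
         - (len(s)+1)/2 + eps)**2
    return y.reindex_like(s)

# group_keys=False keeps the original row index so the result aligns with df
df['rnk'] = df.groupby('Group', group_keys=False)['Performance'].apply(extremal_rank)
df = df.sort_values(['Group', 'rnk'], ascending=[True, False])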

   Group              Name  Performance     rnk
2      A      Chad Webster          142  6.2505
0      A      Sheldon Webb           33  6.2495
4      A    Elijah Mendoza          122  2.2503
1      A        Traci Dean           64  2.2497
3      A        Ora Harmon          116  0.2501
5      A   June Strickland           68  0.2499
8      B         Joel Gill          132  2.2503
9      B      Vernon Stone           80  2.2497
7      B      Betty Sutton          127  0.2501
6      B      Beth Vasquez           95  0.2499
11     C                 b          110  9.0006
12     C                 c           68  8.9994
10     C                 a          110  4.0004
13     C                 d           68  3.9996
15     C                 f           70  1.0002
16     C                 g           70  0.9998
14     C                 e           70  0.0000
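To see why the squared, centred rank produces the alternating order, the same idea can be tried on a plain array with NumPy only. This is a minimal sketch, not the answer's code: the helper name `extremal_order` is made up for illustration, and the values are group A's Performance column from the question.

```python
import numpy as np

def extremal_order(values, eps=1e-4):
    """Return positions that visit `values` as max, min, 2nd max, 2nd min, ..."""
    values = np.asarray(values)
    n = len(values)
    # ranks[i] is the 1-based ascending rank of values[i]
    order = np.argsort(values, kind="stable")
    ranks = np.empty(n, dtype=float)
    ranks[order] = np.arange(1, n + 1)
    # Squared distance from the middle rank; the positive eps nudges each
    # high rank slightly above the low rank that mirrors it.
    score = (ranks - (n + 1) / 2 + eps) ** 2
    # Largest score first
    return np.argsort(-score, kind="stable")

perf = [33, 64, 142, 116, 122, 68]  # group A from the question
idx = extremal_order(perf)
print([perf[i] for i in idx])  # [142, 33, 122, 64, 116, 68]
```

With a negative `eps` the pairs flip, so each low value would come before its high counterpart.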

2023-04-18

倚天杖

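Contributed 1828 experience points, received over 3 likes

You can avoid groupby altogether: use sort_values on Performance once in descending and once in ascending order, concat the two sorted DataFrames, then use sort_index and drop_duplicates to get the expected output: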

import numpy as np
import pandas as pd

df_ = (pd.concat([df.sort_values(['Group', 'Performance'], ascending=[True, False])
                    .reset_index(),  # need the original index for the later drop_duplicates
                  df.sort_values(['Group', 'Performance'], ascending=[True, True])
                    .reset_index()
                    .set_index(np.arange(len(df)) + 0.5)],  # half-step index for the later sort_index
                 axis=0)
         .sort_index()  # interleaves the descending and ascending rows
         .drop_duplicates('index', keep='first')
         .reset_index(drop=True)
         [['Group', 'Name', 'Performance']]
      )
print(df_)

  Group              Name  Performance
0     A      Chad Webster          142
1     A      Sheldon Webb           33
2     A    Elijah Mendoza          122
3     A        Traci Dean           64
4     A        Ora Harmon          116
5     A   June Strickland           68
6     B         Joel Gill          132
7     B      Vernon Stone           80
8     B      Betty Sutton          127
9     B      Beth Vasquez           95
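The half-step index is the part of this approach that is easiest to miss, so here is the same trick reduced to a single Series. It is only an illustrative sketch: the variable names are made up, and the values are group A's Performance column.

```python
import numpy as np
import pandas as pd

s = pd.Series([33, 64, 142, 116, 122, 68], name="Performance")

# Descending copy sits at positions 0, 1, 2, ...; the 'index' column keeps
# the original row label of every value.
desc = s.sort_values(ascending=False).reset_index()

# Ascending copy sits at positions 0.5, 1.5, 2.5, ... so that sorting the
# combined index slots each low value right after a high value.
asc = s.sort_values(ascending=True).reset_index()
asc.index = np.arange(len(s)) + 0.5

both = pd.concat([desc, asc]).sort_index()

# Every original row now appears twice; keep only its first appearance.
result = both.drop_duplicates("index", keep="first")["Performance"].tolist()
print(result)  # [142, 33, 122, 64, 116, 68]
```

The second half of `both` is just the mirror image of the first half, which is why dropping duplicates on the original row label leaves exactly one copy of each row.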

2023-04-18

德玛西亚99

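Contributed 1770 experience points, received over 3 likes

For each group, concatenate its nlargest and nsmallest halves and sort the result by index: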

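import pandas as pd

result = (df.groupby('Group')[df.columns[1:]]
            .apply(lambda x:
                   pd.concat([x.nlargest(x.shape[0]//2, 'Performance').reset_index(),
                              x.nsmallest(x.shape[0] - x.shape[0]//2, 'Performance').reset_index()])
                     .sort_index(kind='mergesort')  # stable: the large row stays ahead of the small one
                     .drop(columns='index'))        # positional axis argument was removed in pandas 2.0
            .reset_index()
            .drop(columns='level_1'))
print(result)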

  Group              Name  Performance
0     A      Chad Webster          142
1     A      Sheldon Webb           33
2     A    Elijah Mendoza          122
3     A        Traci Dean           64
4     A        Ora Harmon          116
5     A   June Strickland           68
6     B         Joel Gill          132
7     B      Vernon Stone           80
8     B      Betty Sutton          127
9     B      Beth Vasquez           95
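The split sizes `n // 2` and `n - n // 2` matter when a group has an odd number of rows: the extra row goes to the nsmallest half, so the median ends up last. The following sketch checks this on a single made-up group of five rows (the frame `g` and its values are invented for illustration).

```python
import pandas as pd

# Hypothetical single group with an odd number of rows
g = pd.DataFrame({"Name": list("abcde"),
                  "Performance": [10, 50, 30, 40, 20]})

n = len(g)
top = g.nlargest(n // 2, "Performance").reset_index(drop=True)         # 50, 40
bottom = g.nsmallest(n - n // 2, "Performance").reset_index(drop=True)  # 10, 20, 30

# Both halves are labelled 0, 1, 2, ...; a stable sort on the index keeps
# the row from `top` ahead of the row from `bottom` for each label.
out = (pd.concat([top, bottom])
         .sort_index(kind="mergesort")
         .reset_index(drop=True))
print(out["Performance"].tolist())  # [50, 10, 40, 20, 30]
```

A stable sort kind is used here because the concatenated index contains every label twice, and the order within each pair is what produces the max-then-min pattern.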

2023-04-18

qq_笑_17

Contributed 1818 experience points, received over 7 likes

Just another way, using a custom function built on np.empty:

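import numpy as np
import pandas as pd

def mysort(s):
    arr = s.to_numpy()
    c = np.empty(arr.shape, dtype=arr.dtype)
    # size of the top half; it takes the extra row when the group size is odd
    idx = arr.shape[0]//2 if not arr.shape[0]%2 else arr.shape[0]//2+1
    # even positions: top half as sorted; odd positions: bottom half reversed
    c[0::2], c[1::2] = arr[:idx], arr[idx:][::-1]
    return pd.DataFrame(c, columns=s.columns)

print(df.sort_values("Performance", ascending=False).groupby("Group").apply(mysort))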

        Group              Name Performance
Group
A     0     A      Chad Webster         142
      1     A      Sheldon Webb          33
      2     A    Elijah Mendoza         122
      3     A        Traci Dean          64
      4     A        Ora Harmon         116
      5     A   June Strickland          68
B     0     B         Joel Gill         132
      1     B      Vernon Stone          80
      2     B      Betty Sutton         127
      3     B      Beth Vasquez          95
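The core of this approach is the pair of strided slice assignments. A one-dimensional sketch makes the index arithmetic easier to follow; `interleave_extremes` is a hypothetical helper name, and the input is assumed to be sorted in descending order already, as in the answer.

```python
import numpy as np

def interleave_extremes(desc_sorted):
    """Interleave a descending array as max, min, 2nd max, 2nd min, ..."""
    arr = np.asarray(desc_sorted)
    n = arr.shape[0]
    half = (n + 1) // 2            # top half takes the extra item when n is odd
    out = np.empty_like(arr)
    out[0::2] = arr[:half]         # even slots: largest values, in order
    out[1::2] = arr[half:][::-1]   # odd slots: smallest values, smallest first
    return out

print(interleave_extremes([142, 122, 116, 68, 64, 33]))  # [142  33 122  64 116  68]
print(interleave_extremes([5, 4, 3, 2, 1]))              # [5 1 4 2 3]
```

`(n + 1) // 2` is equivalent to the conditional expression in `mysort`: it equals `n // 2` for even `n` and `n // 2 + 1` for odd `n`, which matches the number of even slots in the output.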

Benchmark:

//img1.sycdn.imooc.com//643e4fb20001065706120463.jpg

2023-04-18

冉冉说

Contributed 1877 experience points, received over 1 like

Let's try detecting the min and max rows with groupby().transform(), then sorting:

groups = df.groupby('Group')['Performance']
mins, maxs = groups.transform('min'), groups.transform('max')

# temp flags the rows that hold their group's minimum or maximum
(df.assign(temp=df['Performance'].eq(mins) | df['Performance'].eq(maxs))
   .sort_values(['Group', 'temp', 'Performance'],
                ascending=[True, False, False])
   .drop('temp', axis=1)
)

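Output (only each group's maximum and minimum are moved to the top; the remaining rows follow in plain descending order, so group A differs from the requested result):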

  Group              Name  Performance
2     A      Chad Webster          142
0     A      Sheldon Webb           33
4     A    Elijah Mendoza          122
3     A        Ora Harmon          116
5     A   June Strickland           68
1     A        Traci Dean           64
8     B         Joel Gill          132
9     B      Vernon Stone           80
7     B      Betty Sutton          127
6     B      Beth Vasquez           95
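A quick way to compare any of these outputs with the order the question asks for is a small pure-Python reference. This is a sketch with made-up names (`alternating_order`, `shown_a`, `shown_b`); the two lists are the Performance values printed above for groups A and B.

```python
def alternating_order(values):
    """Reference order: max, min, 2nd max, 2nd min, ..."""
    s = sorted(values, reverse=True)
    out = []
    lo, hi = 0, len(s) - 1
    while lo <= hi:
        out.append(s[lo])      # next largest remaining value
        lo += 1
        if lo <= hi:
            out.append(s[hi])  # next smallest remaining value
            hi -= 1
    return out

shown_a = [142, 33, 122, 116, 68, 64]  # group A as printed above
shown_b = [132, 80, 127, 95]           # group B as printed above

print(shown_a == alternating_order(shown_a))  # False
print(shown_b == alternating_order(shown_b))  # True
```

Group B matches only because it has four rows: once the maximum and minimum are placed, the remaining two rows in descending order happen to coincide with the alternating order.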

2023-04-18
