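In my opinion your code is fine. It needs only one small change: pass group_keys=False to Series.groupby so that the grouping level is not duplicated in the resulting MultiIndex: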
s = big_plays.groupby(level=0, group_keys=False).nlargest(5)
print(s)
game_date  posteam
2009       NO         72
           PIT        72
           SD         71
           NYG        69
           DAL        68
2018       KC         88
           LA         80
           LAC        77
           TB         73
           CLE        70
Name: a, dtype: int64
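To make the effect of group_keys=False concrete, here is a minimal sketch. The small Series below is a hypothetical stand-in for big_plays (the real data is not shown in the question), and it takes the top 2 per year rather than the top 5 to keep the example short.

```python
import pandas as pd

# Hypothetical stand-in for big_plays: counts indexed by (game_date, posteam).
index = pd.MultiIndex.from_tuples(
    [(2009, "NO"), (2009, "PIT"), (2009, "SD"),
     (2018, "KC"), (2018, "LA"), (2018, "TB")],
    names=["game_date", "posteam"],
)
big_plays = pd.Series([72, 70, 71, 88, 80, 73], index=index, name="a")

# Default behaviour: the group key is prepended as an extra index level,
# so game_date appears twice in the index.
with_keys = big_plays.groupby(level=0).nlargest(2)

# group_keys=False keeps only the original two levels.
without_keys = big_plays.groupby(level=0, group_keys=False).nlargest(2)

print(list(with_keys.index.names))
print(list(without_keys.index.names))
print(without_keys)
```

Comparing the two printed lists of index names should show the extra game_date level in the first result only.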
df = big_plays.groupby(level=0, group_keys=False).nlargest(5).reset_index(name='count')
print(df)
   game_date posteam  count
0       2009      NO     72
1       2009     PIT     72
2       2009      SD     71
3       2009     NYG     69
4       2009     DAL     68
5       2018      KC     88
6       2018      LA     80
7       2018     LAC     77
8       2018      TB     73
9       2018     CLE     70
An alternative, though more involved, is to sort first and then take the first 5 rows of each group:
df = (big_plays.reset_index(name='count')
               .sort_values(['game_date', 'count'], ascending=[True, False])
               .groupby('game_date')
               .head(5))
print(df)
    game_date posteam  count
19       2009      NO     72
24       2009     PIT     72
25       2009      SD     71
20       2009     NYG     69
8        2009     DAL     68
43       2018      KC     88
44       2018      LA     80
45       2018     LAC     77
57       2018      TB     73
35       2018     CLE     70
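The two approaches select the same rows; the sort-based one simply keeps the original row labels instead of a fresh 0..n index. A rough sketch of that comparison follows, again on a hypothetical stand-in for big_plays with distinct counts (ties could be ordered differently by the two methods) and the top 2 per year:

```python
import pandas as pd

# Hypothetical stand-in for big_plays, with no tied counts within a year.
index = pd.MultiIndex.from_tuples(
    [(2009, "NO"), (2009, "PIT"), (2009, "SD"),
     (2018, "KC"), (2018, "LA"), (2018, "TB")],
    names=["game_date", "posteam"],
)
big_plays = pd.Series([72, 70, 71, 88, 80, 73], index=index, name="a")

# Approach 1: nlargest per group, then flatten the index into columns.
top_nlargest = (big_plays.groupby(level=0, group_keys=False)
                         .nlargest(2)
                         .reset_index(name="count"))

# Approach 2: sort by year ascending and count descending, then take
# the first rows of each year. The final reset_index(drop=True) discards
# the original row labels so the two results can be compared directly.
top_sorted = (big_plays.reset_index(name="count")
                       .sort_values(["game_date", "count"], ascending=[True, False])
                       .groupby("game_date")
                       .head(2)
                       .reset_index(drop=True))

print(top_nlargest)
print(top_sorted)
```

Without the last reset_index(drop=True), the second result would keep the pre-sort row labels, which is why the output above shows labels such as 19 and 24.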