import pandas as pd
df = pd.DataFrame({'key1': ['a','a','a','b','b'], 'key2': ['c','d','c','c','d'], 'data': [1,10,2,3,30]})
>>> df
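  key1 key2  data
0    a    c     1
1    a    d    10
2    a    c     2
3    b    c     3
4    b    d    30

Target result:

  key1 key2  data  row_number
0    a    c     1           1
1    a    d    10           1
2    a    c     2           2
3    b    c     3           1
4    b    d    30           1

How should I group by key1 and key2, sort by data within each group, and get each row's sequence number? The following method, which I found by searching, did not work:

df['row_number'] = df['data'].groupby(df['key1','key2']).rank(ascending=True,method='first')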
1 answer
紫衣仙女
def cumsum_seq(v):
    sub = v.sort_values('data')
    sub['seq'] = sub['seq'].cumsum()
    return sub.loc[:, ['data', 'seq']]

df['seq'] = 1
df.groupby(['key1', 'key2']).apply(cumsum_seq).reset_index().drop(columns='level_2')
Result:
| | key1 | key2 | data | seq |
---|---|---|---|---|
0 | a | c | 1 | 1 |
1 | a | c | 2 | 2 |
2 | a | d | 10 | 1 |
3 | b | c | 3 | 1 |
4 | b | d | 30 | 1 |
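
Note that the result above comes back ordered by group, while the target in the question keeps the original row order. As far as I can tell, the attempt in the question fails only because `df['key1','key2']` is looked up as a single column labelled with the tuple `('key1', 'key2')`, which does not exist. Below is a minimal sketch of two alternatives that keep the original order, assuming the same `df` as in the question. The column name `row_number_alt` is only there for illustration.

```python
import pandas as pd

df = pd.DataFrame({'key1': ['a', 'a', 'a', 'b', 'b'],
                   'key2': ['c', 'd', 'c', 'c', 'd'],
                   'data': [1, 10, 2, 3, 30]})

# Option 1: rank 'data' within each (key1, key2) group.
# method='first' breaks ties by order of appearance; rank returns floats,
# so cast to int to get plain row numbers.
df['row_number'] = (
    df.groupby(['key1', 'key2'])['data']
      .rank(ascending=True, method='first')
      .astype(int)
)

# Option 2: sort by 'data', then number the rows inside each group.
# cumcount starts at 0, hence the +1. Assigning back aligns on the index,
# so the original row order of df is preserved.
df['row_number_alt'] = (
    df.sort_values('data')
      .groupby(['key1', 'key2'])
      .cumcount() + 1
)

print(df)
```

Both columns should match the `row_number` column in the target result from the question.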