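I have a data frame as shown below, and I need to write a function that gives me the following result.

Input parameters: a country, for example 'INDIA', and an age group, for example 'Student'.

My input data frame looks like this:

        Card Name  Country    Age      Code         Amount
    0   AAA        INDIA      Young    House        100
    1   AAA        Australia  Old      Hardware     200
    2   AAA        INDIA      Student  House        300
    3   AAA        US         Young    Hardware     600
    4   AAA        INDIA      Student  Electricity  200
    5   BBB        Australia  Young    Electricity  100
    6   BBB        INDIA      Student  Electricity  200
    7   BBB        Australia  Young    House        450
    8   BBB        INDIA      Old      House        150
    9   CCC        Australia  Old      Hardware     200
    10  CCC        Australia  Young    House        350
    11  CCC        INDIA      Old      Electricity  400
    12  CCC        US         Young    House        200

The expected output would be:

       Code         Total Amount  Frequency  Average
    0  Electricity  400           2          200
    1  House        300           1          300

That is, the top 10 codes ranked by the sum of Amount for the given country (= INDIA) and age (= Student). In this example only 2 codes exist for that combination, so only 2 rows come back. In addition, the result should have a new "Frequency" column that counts the number of records in each group, and an "Average" column equal to Total Amount / Frequency.

I tried:

    df.groupby(['Country','Age','Code']).agg({'Amount': sum})['Amount'].groupby(level=0, group_keys=False).nlargest(10)

which produces:

    Country    Age      Code
    Australia  Young    House          800
               Old      Hardware       400
               Young    Electricity    100
    INDIA      Old      Electricity    400
               Student  Electricity    400
                        House          300
               Old      House          150
               Young    House          100
    US         Young    Hardware       600
                        House          200
    Name: Amount, dtype: int64

Unfortunately, this is different from the expected output: it ranks the codes within each country across all ages, does not filter on the requested country and age, and has no Frequency or Average columns.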
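One possible way to get the requested result is to filter on country and age first, then aggregate per Code. The sketch below is only an illustration under a few assumptions: the function name `top_codes` and its parameter `n` are hypothetical (not from the question), the column names are taken from the table in the question, and ties in Total Amount are resolved in whatever order `nlargest` returns them.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Card Name": ["AAA"] * 5 + ["BBB"] * 4 + ["CCC"] * 4,
    "Country": ["INDIA", "Australia", "INDIA", "US", "INDIA",
                "Australia", "INDIA", "Australia", "INDIA",
                "Australia", "Australia", "INDIA", "US"],
    "Age": ["Young", "Old", "Student", "Young", "Student",
            "Young", "Student", "Young", "Old",
            "Old", "Young", "Old", "Young"],
    "Code": ["House", "Hardware", "House", "Hardware", "Electricity",
             "Electricity", "Electricity", "House", "House",
             "Hardware", "House", "Electricity", "House"],
    "Amount": [100, 200, 300, 600, 200, 100, 200, 450, 150,
               200, 350, 400, 200],
})


def top_codes(df, country, age, n=10):
    """Return the top-n codes by total Amount for one country and age.

    `top_codes` and `n` are hypothetical names chosen for this sketch.
    """
    # Keep only the rows for the requested country and age group.
    subset = df[(df["Country"] == country) & (df["Age"] == age)]

    # Named aggregation: sum, row count, and mean of Amount per Code.
    # The mean equals Total Amount / Frequency.
    summary = subset.groupby("Code").agg(**{
        "Total Amount": ("Amount", "sum"),
        "Frequency": ("Amount", "count"),
        "Average": ("Amount", "mean"),
    })

    # Rank by total and turn Code back into an ordinary column.
    return summary.nlargest(n, "Total Amount").reset_index()


result = top_codes(df, "INDIA", "Student")
print(result)
```

For 'INDIA' and 'Student' this returns two rows, Electricity (total 400, frequency 2, average 200.0) followed by House (total 300, frequency 1, average 300.0). Filtering before grouping keeps the group key to just `Code`, so no multi-level index has to be unpacked afterwards.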