3 回答
![?](http://img1.sycdn.imooc.com/54584f6d0001759002200220-100-100.jpg)
TA贡献1815条经验 获得超13个赞
如果数据框中有很多行,我会尽量避免使用传递axis=1
.
s1 = np.where(df['Team 2'] > df['Team 1'], df['Team 1'], df['Team 2'])
s2 = np.where(df['Team 2'] < df['Team 1'], df['Team 1'], df['Team 2'])
df['Team 1'] = s1
df['Team 2'] = s2
df = df.groupby(['Team 1', 'Team 2', 'Winner'])['winner_count'].sum().reset_index()
df
Out[1]:
Team 1 Team 2 Winner winner_count
0 Afghanistan Australia Australia 2
![?](http://img1.sycdn.imooc.com/54584dc4000118d302200220-100-100.jpg)
TA贡献1868条经验 获得超4个赞
试穿这款尺码:
import pandas as pd
# Initialise test dataframe
df = pd.DataFrame({'Team 1': ['A', 'B'], 'Team 2': ['B', 'A'], 'winner_count': [1, 1]})
# Get the smallest team name (alphabetically)
df['First'] = df[['Team 1', 'Team 2']].min(axis=1)
# Get the largest team name (alphabetically)
df['Second'] = df[['Team 1', 'Team 2']].max(axis=1)
# Groupby the teams to sum winner_counts
df = df[['First', 'Second', 'winner_count']].groupby(['First', 'Second']).sum().reset_index()
这似乎是一个效率低下的解决方案,所以看看其他人是否想出更好的方法。
![?](http://img1.sycdn.imooc.com/54584ee0000179f302200220-100-100.jpg)
TA贡献1911条经验 获得超7个赞
IMO 该解决方案应该处理数据框中有两个以上团队的可能性,例如:
df = pd.DataFrame({"Team1": ["Afghanistan", "Australia", "Australia", "Belgium", "Afghanistan"],
"Team2": ["Australia", "Afghanistan", "Afghanistan", "Afghanistan","Belgium"],
"Winner": ["Afghanistan", "Australia", "Australia", "Afghanistan", "Afghanistan"],
"winner_count": [1, 1, 1, 1, 1]
})
我首先会通过新的“团队”列来识别每个团队组合:
df.loc[:,"Teams"] = df.apply(lambda x: ", ".join(sorted([x["Team1"],x["Team2"]])), axis=1)
然后可以按团队组合进行分组,并将获胜团队放在每个组的顶部:
group = df.groupby(by=["Teams","Winner"]).agg({
"Teams":"first",
"Team1":"first",
"Team2":"first",
"winner_count":"sum"}).sort_values(by=["winner_count"], ascending=False)
IMO 已经有了“团队”列,两个团队中哪一个是团队 1 或团队 2 并不重要。重复项将被删除:
group.drop_duplicates(subset="Teams", inplace=True)
还可以通过以下方式删除 MultiIndex ["Teams", "Winner"]:
group.reset_index(level=1, drop=True, inplace=True)
结果数据框组(没有索引“Teams”):
+-------------+------------------------+-----------+-------------+--------------+
| Winner | Teams | Team1 | Team2 | winner_count |
+-------------+------------------------+-----------+-------------+--------------+
| | | | | |
| Australia | Afghanistan, Australia | Australia | Afghanistan | 2 |
| Afghanistan | Afghanistan, Belgium | Belgium | Afghanistan | 2 |
+-------------+------------------------+-----------+-------------+--------------+
添加回答
举报