I have a DataFrame, and I want to merge the rows that contain the same values.

    toy = [[10, 11], [21, 22], [11, 15], [22, 23], [15, 33]]
    toy = pd.DataFrame(toy, columns=['ID1', 'ID2'])

       ID1  ID2
    0   10   11
    1   21   22
    2   11   15
    3   22   23
    4   15   33

What I would like to end up with is:

        0   1   2     3
    0  10  11  15  33.0
    1  21  22  23   NaN

In other words, any rows that share a value should be merged. My own solution is very inelegant, and I am looking for the proper way to do this... recursion? Group by...? Hmm.

    #### Feel Free to NOT read this... ###
    for k in range(100):
        print(k)
        merge_df = []
        merged_indices = []
        for i, row in toy.iterrows():
            if i in merged_indices:
                continue
            cp = toy.copy()
            merge_rows = cp[cp.isin(row.values)].dropna(how="all")
            merged_indices = merged_indices + list(merge_rows.index)
            merge_rows = np.array(toy.iloc[merge_rows.index]).flatten()
            merge_rows = np.unique(merge_rows)
            merge_df.append(merge_rows)
        if toy.shape[0] == len(merge_df):
            break
        toy = pd.DataFrame(merge_df).copy()

1 Answer
素胚勾勒不出你
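This sounds like a network (graph) problem, so I used networkx. Each row is treated as an edge between ID1 and ID2, and the merged rows are the connected components of the resulting graph.

    import networkx as nx
    G = nx.from_pandas_edgelist(toy, 'ID1', 'ID2')
    l = list(nx.connected_components(G))
    newdf = pd.DataFrame(l)
    newdf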
    Out[896]:
        0   1   2     3
    0  33  10  11  15.0
    1  21  22  23   NaN
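
Note that in the output above 33 comes first in row 0: connected components are returned as sets, so the order of values inside each row is not guaranteed. If networkx is not available, or a deterministic order is wanted, the same grouping can be sketched with a small union-find (disjoint-set) structure in plain Python. This is a swapped-in technique, not the answerer's method, and the function name `merge_connected` is hypothetical, chosen only for this illustration.

```python
def merge_connected(pairs):
    """Group values that are linked, directly or transitively, by any pair."""
    parent = {}

    def find(x):
        # Walk up to the root of x's group, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        # Each row joins the groups of its two values.
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for value in list(parent):
        groups.setdefault(find(value), []).append(value)

    # Sort within and across groups so the result is deterministic.
    return sorted(sorted(g) for g in groups.values())


# Same data as the toy DataFrame in the question.
rows = [[10, 11], [21, 22], [11, 15], [22, 23], [15, 33]]
groups = merge_connected(rows)
print(groups)
```

With the DataFrame from the question, the input could be taken as `toy[['ID1', 'ID2']].values.tolist()`, and `pd.DataFrame(groups)` turns the result back into a table, padding the shorter rows with NaN.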