Create combinations of numbers and count the distinct combinations using Python

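I have df1, which contains a specific set of IDs in a column, and df2, where each row contains a mix of IDs (as shown below). I want to create a dataframe containing every distinct combination of df1 IDs that occurs in a row of df2, along with the count of each distinct combination.

df1 = pd.DataFrame({'Id': ["181", "456", "235", "653", "987", "5", "300"]})

df2 = pd.DataFrame({'Tag Id': ["213,435,181,954,987", "456", "215,435,181,754,987", "213,12,432,300,653,987"]})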


2 answers

慕森卡


Here is a faster approach using a list comprehension and collections.Counter:

from collections import Counter

import pandas as pd

# Build the vocabulary of valid IDs (a set makes membership tests fast)
vocab = set(df1['Id'].astype(int))

# For each row of df2, keep only the IDs that appear in df1 (as a hashable tuple)
filtered = [tuple(int(j) for j in i.split(',') if int(j) in vocab) for i in df2['Tag Id']]

# Count the combinations and display them as a dataframe
counts = Counter(filtered).most_common()

pd.DataFrame(counts, columns=['Combinations', 'Counts'])

      Combinations  Counts
0       (181, 987)       2
1           (456,)       1
2  (300, 653, 987)       1
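For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea, using the sample data from the question. The helper name `count_combinations` is hypothetical, and sorting the IDs inside each row is an assumption on my part: it treats `181,987` and `987,181` as the same combination. Rows that contain none of the df1 IDs are skipped.

```python
from collections import Counter

import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({'Id': ["181", "456", "235", "653", "987", "5", "300"]})
df2 = pd.DataFrame({'Tag Id': ["213,435,181,954,987", "456",
                               "215,435,181,754,987", "213,12,432,300,653,987"]})


def count_combinations(ids, tag_rows):
    """Hypothetical helper: count distinct combinations of known IDs per row."""
    vocab = {int(i) for i in ids}
    counter = Counter()
    for row in tag_rows:
        # Keep only known IDs; sort so order within a row does not matter (assumption)
        combo = tuple(sorted(int(t) for t in row.split(',') if int(t) in vocab))
        if combo:  # skip rows that contain none of the known IDs
            counter[combo] += 1
    return pd.DataFrame(counter.most_common(), columns=['Combinations', 'Counts'])


result = count_combinations(df1['Id'], df2['Tag Id'])
print(result)
```

Tuples are used instead of lists because `Counter` needs hashable keys.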

2023-03-30

江户川乱折腾


Let's split and explode Tag Id in df2, then merge with df1 and count:

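# Split each row into one tag per line, keeping the original row number in 'index'
s = (df2['Tag Id'].str.split(',')
     .explode()
     .reset_index()
     )

# Keep only tags present in df1, rebuild each row's combination, then count
(df1.merge(s, left_on='Id', right_on='Tag Id')
    .sort_values('Tag Id')
    .groupby('index')
    .agg(Combination=('Id', ','.join))
    ['Combination']
    .value_counts().reset_index()
)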

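Output (older pandas labels the columns as shown; pandas 2.0 and later label them Combination and count). Because sort_values('Tag Id') sorts the IDs as strings, the last row's combination reads 300,653,987:

         index  Combination
0      181,987            2
1  300,653,987            1
2          456            1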
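As a variation on the approach above, the sketch below filters with `isin` instead of `merge` and names the result columns explicitly, so the labels do not depend on how a given pandas version names `value_counts` results. The column names `Combination` and `Count` are my own choice, the IDs are sorted numerically rather than as strings (an assumption), and the sample frames are rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({'Id': ["181", "456", "235", "653", "987", "5", "300"]})
df2 = pd.DataFrame({'Tag Id': ["213,435,181,954,987", "456",
                               "215,435,181,754,987", "213,12,432,300,653,987"]})

# One entry per single tag; the index still points at the original df2 row
tags = df2['Tag Id'].str.split(',').explode()

# Keep only tags that are listed in df1
known = tags[tags.isin(df1['Id'])]

# Rebuild one comma-separated combination per original row, IDs sorted numerically
combos = known.groupby(level=0).agg(lambda s: ','.join(sorted(s, key=int)))

# Count each distinct combination and give the columns explicit names
result = (combos.value_counts()
                .rename_axis('Combination')
                .reset_index(name='Count'))
print(result)
```

Rows of df2 with no matching ID drop out at the `isin` step, so they never appear as an empty combination.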

2023-03-30
