Pandas: create a table of "dummy variables" from another table

Suppose I have these DataFrames:

DataFrame A (Products)

Cod | Product   | Cost | Date
-------------------------------
18  | Product01 | 3.4  | 21/04
22  | Product02 | 7.2  | 12/08
33  | Product03 | 8.4  | 17/01
55  | Product04 | 0.6  | 13/07
67  | Product05 | 1.1  | 09/09

DataFrame B (Operations)

id | codoper | CodProd | valor
-------------------------------
1  | 00001   | 55      | 45000
2  | 00001   | 18      | 45000
3  | 00002   | 33      | 53000
1  | 00001   | 55      | 45000

The idea is to obtain a "DataFrame C" in which the products referenced in DataFrame B become columns:

DataFrame C (result)

id | codoper | Product_18 | Product_22 | Product_33 | Product_55 | Product_67 | valor
----------------------------------------------------------------------------------
1  | 00001   | 1          | 0          | 0          | 1          | 0          | 45000
2  | 00002   | 0          | 0          | 1          | 0          | 0          | 53000

So far I have only managed to do this using DataFrame B alone:

pd.get_dummies(df, columns=['CodProd']).groupby(['codoper'], as_index=False).min()

Note: not all products from DataFrame A appear in the Operations DataFrame.


2 Answers

月关宝盒


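You need to combine the dummies derived from Products (DataFrame A) with the dummies derived from Operations (DataFrame B). First, define the output columns using a prefix: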

columns = ['id', 'codoper'] + [f"Product_{cod}" for cod in A['Cod'].unique()] + ['valor']

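Then use get_dummies as you did above, but pass the same prefix so the column names match. Group by all columns that are perfectly collinear, i.e. id, codoper, and valor. If they are not perfectly collinear, you need to decide how to aggregate them to the codoper level. Finally, reindex with the output columns defined earlier, filling missing values with zero.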

pd.get_dummies(B, columns=['CodProd'], prefix='Product').groupby(['id', 'codoper', 'valor'], as_index=False).sum().reindex(columns=columns, fill_value=0)

   id codoper  Product_18  Product_22  Product_33  Product_55  Product_67  valor
0   1   00001           0           0           0           2           0  45000
1   2   00001           1           0           0           0           0  45000
2   3   00002           0           0           1           0           0  53000
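The output above keeps one row per id and counts the repeated product 55 twice. As a minimal sketch of one way to collapse this to the codoper level, assuming that valor is constant within each codoper and that a 0/1 indicator is wanted rather than a count, the sample frames from the question can be rebuilt and aggregated with max instead of sum. Dropping the original id and renumbering the rows is an assumption made here to match the result shown in the question.

```python
import pandas as pd

# Sample data copied from the question; A is Products, B is Operations.
A = pd.DataFrame({
    "Cod": [18, 22, 33, 55, 67],
    "Product": ["Product01", "Product02", "Product03", "Product04", "Product05"],
    "Cost": [3.4, 7.2, 8.4, 0.6, 1.1],
    "Date": ["21/04", "12/08", "17/01", "13/07", "09/09"],
})
B = pd.DataFrame({
    "id": [1, 2, 3, 1],
    "codoper": ["00001", "00001", "00002", "00001"],
    "CodProd": [55, 18, 33, 55],
    "valor": [45000, 45000, 53000, 45000],
})

# One indicator column per product known to A, including unused ones.
product_cols = [f"Product_{cod}" for cod in A["Cod"].unique()]

C = (
    pd.get_dummies(B.drop(columns="id"), columns=["CodProd"],
                   prefix="Product", dtype=int)
    # max() turns repeated products into a single 1 instead of a count.
    .groupby(["codoper", "valor"], as_index=False)
    .max()
    # Add the products that never occur in B, filled with 0.
    .reindex(columns=["codoper"] + product_cols + ["valor"], fill_value=0)
)

# Assumption: the id in the desired result is just a fresh row number.
C.insert(0, "id", range(1, len(C) + 1))
print(C)
```

If valor can differ between rows of the same codoper, grouping by both columns would produce several rows per operation, so a separate aggregation rule for valor would be needed.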

2022-08-02

喵喵时光机


This is a combination of merge and pivot_table, with a few adjustments:

(Products.merge(Operations,
                left_on='Cod',
                right_on='CodProd',
                how='left')
         .pivot_table(index=['codoper','valor'],
                      values='Product',
                      columns='Cod',
                      fill_value=0,
                      aggfunc='any')
         .reindex(Products.Cod.unique(),
                  fill_value=False,
                  axis=1)
         .astype(int)
         .add_prefix('Product_')
         .reset_index()
)

Output:

Cod codoper    valor  Product_18  Product_22  Product_33  Product_55  Product_67
0     00001  45000.0           1           0           0           1           0
1     00002  53000.0           0           0           1           0           0
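For comparison, a similar table can probably be built without the merge by cross-tabulating Operations directly. The following is only a sketch under assumptions: it swaps in pd.crosstab in place of merge and pivot_table, uses the names A and B for Products and Operations, and rebuilds only the columns it needs from the question's sample data. Working from B alone also keeps valor as an integer, because no missing values are introduced by a left join.

```python
import pandas as pd

# Only the columns needed here, copied from the question's sample data.
A = pd.DataFrame({"Cod": [18, 22, 33, 55, 67]})
B = pd.DataFrame({
    "codoper": ["00001", "00001", "00002", "00001"],
    "CodProd": [55, 18, 33, 55],
    "valor": [45000, 45000, 53000, 45000],
})

C = (
    # Count how often each product occurs per (codoper, valor) pair.
    pd.crosstab([B["codoper"], B["valor"]], B["CodProd"])
    # Turn counts into 0/1 indicators.
    .clip(upper=1)
    # Add the products from A that never occur in B.
    .reindex(columns=A["Cod"].unique(), fill_value=0)
    .add_prefix("Product_")
    # Drop the leftover column-axis label so it does not show in the header.
    .rename_axis(columns=None)
    .reset_index()
)
print(C)
```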

2022-08-02
