
Pandas version of "if true, VLOOKUP here; if false, VLOOKUP somewhere else"

Python

芜湖不芜 2021-11-23 19:10:12

I am trying to move a large dataset and its processing from Excel to Python/Pandas, and I am stuck implementing the Pandas equivalent of "IF(col A = x, VLOOKUP(col B in table Y), otherwise VLOOKUP(col A in table Z))". I have created two separate dictionaries to serve as the Pandas versions of tables Y and Z, but I have not been able to find a construct that tells Pandas to look up the value from column B in a dictionary.

Here is my attempt with Pandas:

import pandas as pd

# Created a function to map the values from
# PROD_TYPE to the prod_dict.
def map_values(row, prod_dict):
    return prod_dict[row]

# Created the dictionaries / old VLOOKUP tables.
prod_dict = {'PK': 'Packaging', 'ML': 'Mix', 'CM': 'Textile', 'NK': 'Metallic'}
pack_dict = {'PK3': 'Misc Packaging', 'PK4': 'Mix Packaging', 'PK9': 'Textile Packaging'}

df = pd.DataFrame({'PROD_TYPE': ['PK', 'ML', 'ML', 'CM'],
                   'PKG_TYPE': ['PK3', 'PK4', 'PK4', 'PK9'],
                   'VALUE': [1000, 900, 800, 700]})

# Apply the map_values function.
df['ITEM'] = df['PROD_TYPE'].apply(map_values, args=(prod_dict,))

What I get:

  PROD_TYPE PKG_TYPE  VALUE       ITEM
0        PK      PK3   1000  Packaging
1        ML      PK4    900        Mix
2        ML      PK4    800        Mix
3        CM      PK9    700    Textile

What I am looking for:

  PROD_TYPE PKG_TYPE  VALUE            ITEM
0        PK      PK3   1000  Misc Packaging
1        ML      PK4    900             Mix
2        ML      PK4    800             Mix
3        CM      PK9    700         Textile

Or, more simply: if PROD_TYPE is 'PK', look up the value from the PKG_TYPE column in pack_dict. Otherwise, look up PROD_TYPE in prod_dict.

Any help would be greatly appreciated!


3 Answers

慕无忌1623718

Contributed 1744 experience points, received more than 4 likes

Here is how I would solve this:

import numpy as np
import pandas as pd

# First, turn each dictionary into a two-column dataframe with DataFrame.melt
df2 = pd.DataFrame(prod_dict, index=[0])
df3 = pd.DataFrame(pack_dict, index=[0])
df2 = df2.melt(var_name='PROD_TYPE', value_name='ITEM')
df3 = df3.melt(var_name='PKG_TYPE', value_name='ITEM')

# df2
#   PROD_TYPE       ITEM
# 0        PK  Packaging
# 1        ML        Mix
# 2        CM    Textile
# 3        NK   Metallic

# df3
#   PKG_TYPE               ITEM
# 0      PK3     Misc Packaging
# 1      PK4      Mix Packaging
# 2      PK9  Textile Packaging

# Now merge the lookup tables onto df using the key columns PROD_TYPE and PKG_TYPE
df_final = pd.merge(df, df2, on='PROD_TYPE')
df_final = pd.merge(df_final, df3, on='PKG_TYPE')

#   PROD_TYPE PKG_TYPE  VALUE     ITEM_x             ITEM_y
# 0        PK      PK3   1000  Packaging     Misc Packaging
# 1        ML      PK4    900        Mix      Mix Packaging
# 2        ML      PK4    800        Mix      Mix Packaging
# 3        CM      PK9    700    Textile  Textile Packaging

# Finally, use np.where to pick the value we need for each row
df_final['ITEM'] = np.where(df_final['PROD_TYPE'] == 'PK', df_final['ITEM_y'], df_final['ITEM_x'])

# Drop the helper columns that are not needed in the output
df_final = df_final.drop(columns=['ITEM_x', 'ITEM_y'])

Output

  PROD_TYPE PKG_TYPE  VALUE            ITEM
0        PK      PK3   1000  Misc Packaging
1        ML      PK4    900             Mix
2        ML      PK4    800             Mix
3        CM      PK9    700         Textile
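One caveat with the merge approach: pd.merge defaults to an inner join, so a row whose key is missing from a lookup table is dropped, whereas VLOOKUP keeps the row and shows #N/A. The sketch below compares the two joins. It assumes a hypothetical extra row whose PKG_TYPE ('PK7') has no entry in pack_dict, and the names prod_table, pack_table, inner and left are illustrative only.

```python
import numpy as np
import pandas as pd

prod_dict = {'PK': 'Packaging', 'ML': 'Mix', 'CM': 'Textile', 'NK': 'Metallic'}
pack_dict = {'PK3': 'Misc Packaging', 'PK4': 'Mix Packaging', 'PK9': 'Textile Packaging'}

# Hypothetical data: the last row's PKG_TYPE ('PK7') is not in pack_dict.
df = pd.DataFrame({'PROD_TYPE': ['PK', 'ML', 'CM', 'ML'],
                   'PKG_TYPE': ['PK3', 'PK4', 'PK9', 'PK7'],
                   'VALUE': [1000, 900, 700, 600]})

# Build the lookup tables directly from the dictionaries' items.
prod_table = pd.DataFrame(list(prod_dict.items()), columns=['PROD_TYPE', 'ITEM'])
pack_table = pd.DataFrame(list(pack_dict.items()), columns=['PKG_TYPE', 'ITEM'])

# Default (inner) join: the 'PK7' row disappears.
inner = df.merge(prod_table, on='PROD_TYPE').merge(pack_table, on='PKG_TYPE')

# Left join: every row of df is kept; unmatched lookups become NaN.
left = (df.merge(prod_table, on='PROD_TYPE', how='left')
          .merge(pack_table, on='PKG_TYPE', how='left',
                 suffixes=('_prod', '_pack')))

left['ITEM'] = np.where(left['PROD_TYPE'] == 'PK',
                        left['ITEM_pack'], left['ITEM_prod'])

print(len(inner), len(left))
print(left[['PROD_TYPE', 'PKG_TYPE', 'VALUE', 'ITEM']])
```

Whether an unmatched row should be kept or dropped depends on the spreadsheet being ported, so treat how='left' as one option rather than a required change.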

np.where comes from the numpy module and works as follows:

np.where(condition, value_if_true, value_if_false)
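To make the three arguments concrete, here is a minimal sketch on plain arrays. The names codes, if_true and if_false and their contents are made up for illustration. Note that, unlike a row-by-row IF, both value arrays are fully computed first and np.where only selects between them element by element.

```python
import numpy as np

codes = np.array(['PK', 'ML', 'CM'])

# Two candidate results, one entry per element of `codes`.
if_true = np.array(['from pack table'] * 3)
if_false = np.array(['from prod table'] * 3)

# Where the condition holds take if_true, elsewhere take if_false.
result = np.where(codes == 'PK', if_true, if_false)
print(result.tolist())

# Scalars are broadcast, so constant branches also work.
flags = np.where(codes == 'PK', 'pack', 'prod')
print(flags.tolist())
```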

2021-11-23

烙印99

Contributed 1829 experience points, received more than 13 likes

Similar to @Erfan's answer, this uses numpy.where but skips the melt and uses pd.Series.map() instead. Using the variables from the question:

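In []: import numpy as np

In []: df['ITEM'] = np.where(df.PROD_TYPE == "PK",
                             df.PKG_TYPE.map(pack_dict),
                             df.PROD_TYPE.map(prod_dict))

In []: df
Out[]:
  PROD_TYPE PKG_TYPE  VALUE            ITEM
0        PK      PK3   1000  Misc Packaging
1        ML      PK4    900             Mix
2        ML      PK4    800             Mix
3        CM      PK9    700         Textile

Note that although pandas loads numpy itself, the pd.np alias was deprecated in pandas 1.0 and is no longer available in pandas 2.0, so import numpy directly rather than using pd.np.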
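A point worth knowing about Series.map with a dictionary: a key that is not in the dictionary yields NaN rather than raising KeyError, which is close to how VLOOKUP returns #N/A. The sketch below assumes hypothetical rows with unknown codes ('PK7' and 'XX') and a made-up 'Unknown' placeholder; neither appears in the question.

```python
import numpy as np
import pandas as pd

prod_dict = {'PK': 'Packaging', 'ML': 'Mix', 'CM': 'Textile', 'NK': 'Metallic'}
pack_dict = {'PK3': 'Misc Packaging', 'PK4': 'Mix Packaging', 'PK9': 'Textile Packaging'}

# Hypothetical data: 'PK7' is missing from pack_dict, 'XX' from prod_dict.
df = pd.DataFrame({'PROD_TYPE': ['PK', 'PK', 'ML', 'XX'],
                   'PKG_TYPE': ['PK3', 'PK7', 'PK4', 'PK9']})

from_pack = df['PKG_TYPE'].map(pack_dict)   # NaN where the key is missing
from_prod = df['PROD_TYPE'].map(prod_dict)  # NaN where the key is missing

# Wrap the np.where result in a Series so that fillna is available.
item = pd.Series(np.where(df['PROD_TYPE'] == 'PK', from_pack, from_prod),
                 index=df.index)

# 'Unknown' is an illustrative placeholder for failed lookups.
df['ITEM'] = item.fillna('Unknown')
print(df)
```

Leaving the NaN values in place instead of filling them is equally reasonable if failed lookups need to be found and fixed later.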

2021-11-23

吃鸡游戏

Contributed 1829 experience points, received more than 7 likes

One approach is:

df["ITEM"] = [pack_dict[row["PKG_TYPE"]]
              if row["PROD_TYPE"] == "PK"
              else prod_dict[row["PROD_TYPE"]]
              for _, row in df.iterrows()]

In my timing, this was about 10 times faster than Erfan's solution.
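Timings like this depend heavily on the size of the frame, the machine, and the pandas version, so it is worth measuring on data of realistic size. The sketch below is one way to do that with timeit. It assumes a hypothetical 10,000-row frame built by repeating the question's four rows, and the function names with_iterrows and with_map are illustrative. No timings are quoted here because they will differ from run to run.

```python
import timeit

import numpy as np
import pandas as pd

prod_dict = {'PK': 'Packaging', 'ML': 'Mix', 'CM': 'Textile', 'NK': 'Metallic'}
pack_dict = {'PK3': 'Misc Packaging', 'PK4': 'Mix Packaging', 'PK9': 'Textile Packaging'}

base = pd.DataFrame({'PROD_TYPE': ['PK', 'ML', 'ML', 'CM'],
                     'PKG_TYPE': ['PK3', 'PK4', 'PK4', 'PK9'],
                     'VALUE': [1000, 900, 800, 700]})

# Hypothetical larger frame: the four sample rows repeated 2,500 times.
big = pd.concat([base] * 2500, ignore_index=True)


def with_iterrows(frame):
    """Row-by-row lookup with a list comprehension over iterrows."""
    return [pack_dict[row['PKG_TYPE']] if row['PROD_TYPE'] == 'PK'
            else prod_dict[row['PROD_TYPE']]
            for _, row in frame.iterrows()]


def with_map(frame):
    """Vectorized lookup with Series.map and np.where."""
    return np.where(frame['PROD_TYPE'] == 'PK',
                    frame['PKG_TYPE'].map(pack_dict),
                    frame['PROD_TYPE'].map(prod_dict))


# Both approaches must agree before their speed is compared.
assert with_iterrows(big) == list(with_map(big))

t_loop = timeit.timeit(lambda: with_iterrows(big), number=3)
t_vec = timeit.timeit(lambda: with_map(big), number=3)
print(f"iterrows: {t_loop:.3f}s  map + np.where: {t_vec:.3f}s")
```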

2021-11-23
