
Replace values in a DataFrame based on values from another DataFrame

Python

四季花海 2022-10-18 16:06:05

How can I replace values in one DataFrame based on a second lookup DataFrame? This is DataFrame A, whose values I want to replace:

  InfoType IncidentType DangerType
0      NaN            A        NaN
1      NaN            C        NaN
2      NaN            B          C
3      NaN            B        NaN

This is the lookup table:

   ID     ParamCode ParamValue ParmDesc1 ParamDesc2  SortOrder  ParamStatus
0   1  IncidentType          A       ABC        DEF          1            1
1   2  IncidentType          B       GHI        JKL          2            1
2   3  IncidentType          C       MNO        PQR          7            1
2   3    DangerType          C       STU        VWX          6            1

Expected output:

  InfoType IncidentType DangerType
0      NaN          ABC        NaN
1      NaN          MNO        NaN
2      NaN          GHI        STU
3      NaN          GHI        NaN

Note that ParamCode holds the column name, and I need to replace the values in the corresponding column of DataFrame A with ParmDesc1. Any column in DataFrame A may contain NaN; I don't intend to drop those, just ignore them. This is what I tried:

ntf_cols = ['InfoType', 'IncidentType', 'DangerType']
for c in ntf_cols:
    if (c in ntf.columns) & (c in param['ParamCode'].values):
        paramValue = param['ParamValue'].unique()
        for idx, pv in enumerate(paramValue):
            ntf['NewIncidentType'] = pd.np.where(ntf.IncidentType.str.contains(pv), param['ParmDesc1'].values, "whatever")

Error:

ValueError: operands could not be broadcast together with shapes (25,) (13,) ()


2 Answers

青春有我

Contributed 1784 experience points, received 8+ likes

Edit: Lambda's answer gave me an idea for how to do this across many columns that need the same logic applied:

import pandas as pd

# Sample data frame whose coded values should be replaced.
df1 = pd.DataFrame(dict(
    InfoType = [None, None, None, None],
    IncidentType = 'A C B B'.split(),
    DangerType = [None, None, 'C', None],
))

# Lookup table: ParamCode names the target column in df1.
df2 = pd.DataFrame(dict(
    ParamCode = 'IncidentType IncidentType IncidentType DangerType'.split(),
    ParamValue = 'A B C C'.split(),
    ParmDesc1 = 'ABC GHI MNO STU'.split(),
))

for col in df1.columns[1:]:
    # Build a {ParamValue: ParmDesc1} mapping from the rows for this column.
    dict_map = dict(
        df2[df2.ParamCode == col][['ParamValue','ParmDesc1']].to_records(index=False)
    )
    df1[col] = df1[col].replace(dict_map)

print(df1)

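This assumes that every column after the first in df1 needs updating, and that the name of each column to update appears as a value in the 'ParamCode' column of df2.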
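If the columns to update are not simply "everything after the first", the loop above can instead be driven by the names that actually occur in ParamCode. The following is only a sketch under that assumption, using the same sample data; the helper name replace_from_lookup is made up for illustration and is not part of pandas. It uses Series.map rather than Series.replace, so values without a match are restored from the original column.

```python
import pandas as pd

df1 = pd.DataFrame(dict(
    InfoType=[None, None, None, None],
    IncidentType='A C B B'.split(),
    DangerType=[None, None, 'C', None],
))
df2 = pd.DataFrame(dict(
    ParamCode='IncidentType IncidentType IncidentType DangerType'.split(),
    ParamValue='A B C C'.split(),
    ParmDesc1='ABC GHI MNO STU'.split(),
))

def replace_from_lookup(data, lookup):
    """Hypothetical helper: return a copy of `data` with codes replaced by descriptions."""
    result = data.copy()
    # One group of lookup rows per column name found in ParamCode.
    for col, group in lookup.groupby('ParamCode'):
        if col not in result.columns:
            continue  # lookup rows for a column this frame does not have
        mapping = dict(zip(group['ParamValue'], group['ParmDesc1']))
        # Series.map yields NaN for unmatched values, so fall back to the original.
        result[col] = result[col].map(mapping).fillna(result[col])
    return result

out = replace_from_lookup(df1, df2)
print(out)
```

Because the helper works on a copy, df1 itself is left unchanged, which makes it easier to compare before and after.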


This problem can be solved with a couple of custom functions and pandas.Series.apply():

import pandas as pd

def find_incident_type(x):
    # Leave missing values untouched.
    if pd.isna(x):
        return x
    # Look up the description for this IncidentType code in df2.
    return df2[
        (df2['ParamCode'] == 'IncidentType') & (df2['ParamValue']==x)
    ]["ParmDesc1"].values[0]

def find_danger_type(x):
    # Leave missing values untouched.
    if pd.isna(x):
        return x
    # Look up the description for this DangerType code in df2.
    return df2[
        (df2['ParamCode'] == 'DangerType') & (df2['ParamValue']==x)
    ]["ParmDesc1"].values[0]

df1 = pd.DataFrame(dict(
    InfoType = [None, None, None, None],
    IncidentType = 'A C B B'.split(),
    DangerType = [None, None, 'C', None],
))

df2 = pd.DataFrame(dict(
    ParamCode = 'IncidentType IncidentType IncidentType DangerType'.split(),
    ParamValue = 'A B C C'.split(),
    ParmDesc1 = 'ABC GHI MNO STU'.split(),
))

df1['IncidentType'] = df1['IncidentType'].apply(find_incident_type)
df1['DangerType'] = df1['DangerType'].apply(find_danger_type)

print(df1)


There is very likely a more efficient way to do this. Hopefully someone who knows one will share it.

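Also, the reference to df2 from the outer scope is hard-coded into the custom functions, so they only work when a variable with that name exists in the outer scope. If you don't want the functions to depend on that reference, pass the lookup table in through the args parameter of pandas.Series.apply.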
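A rough sketch of that args variant, assuming the same sample data: the function name find_desc and its lookup and code parameters are hypothetical, chosen only for this example. Unlike the two functions above, which raise an IndexError when a code has no row in the lookup table, this version returns the original value in that case.

```python
import pandas as pd

def find_desc(x, lookup, code):
    """Hypothetical generalization of find_incident_type / find_danger_type."""
    if pd.isna(x):
        return x  # leave missing values untouched
    match = lookup[(lookup['ParamCode'] == code) & (lookup['ParamValue'] == x)]
    # Keep the original value when the lookup table has no matching row.
    return match['ParmDesc1'].iloc[0] if len(match) else x

df1 = pd.DataFrame(dict(
    InfoType=[None, None, None, None],
    IncidentType='A C B B'.split(),
    DangerType=[None, None, 'C', None],
))
df2 = pd.DataFrame(dict(
    ParamCode='IncidentType IncidentType IncidentType DangerType'.split(),
    ParamValue='A B C C'.split(),
    ParmDesc1='ABC GHI MNO STU'.split(),
))

# args supplies the extra positional arguments that follow the element itself.
for col in ['IncidentType', 'DangerType']:
    df1[col] = df1[col].apply(find_desc, args=(df2, col))

print(df1)
```

Passing the column name as an argument also means one function can serve every column instead of one function per column.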

2022-10-18

繁花不似锦

Contributed 1851 experience points, received 4+ likes

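Build a dict from the lookup table, then use it to replace the column values of the original DataFrame. Assume the original DataFrame is df1 and the lookup table is df2: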

...

dict_map = dict(zip(df2.ParamCode + "-" + df2.ParamValue, df2.ParmDesc1))
df1['IncidentType'] = ("IncidentType" + '-' + df1.IncidentType).replace(dict_map)
df1['DangerType'] = ("DangerType" + '-' + df1.DangerType).replace(dict_map)

...
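One thing to watch with the composite-key idea: a value that has no entry in dict_map would come back with its "ColumnName-" prefix still attached. The sketch below is one possible variation, not the answerer's code. It assumes the same sample df1 and df2 as the other answer and looks each key up with dict.get, so unmatched and missing values pass through unchanged.

```python
import pandas as pd

df1 = pd.DataFrame(dict(
    InfoType=[None, None, None, None],
    IncidentType='A C B B'.split(),
    DangerType=[None, None, 'C', None],
))
df2 = pd.DataFrame(dict(
    ParamCode='IncidentType IncidentType IncidentType DangerType'.split(),
    ParamValue='A B C C'.split(),
    ParmDesc1='ABC GHI MNO STU'.split(),
))

# Composite "ParamCode-ParamValue" keys, as in the answer above.
dict_map = dict(zip(df2['ParamCode'] + '-' + df2['ParamValue'], df2['ParmDesc1']))

for col in ['IncidentType', 'DangerType']:
    # Build the composite key per element; return the value itself when no key matches.
    df1[col] = df1[col].map(lambda v, col=col: dict_map.get(f'{col}-{v}', v))

print(df1)
```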

2022-10-18
