1 Answer
Add a new column C and assign it the ID 1 or 2 depending on which regular expression each row of the DataFrame matches.
In [17]: df
Out[17]:
     A                                                B
0  NaN                               this has Color:Red
1  NaN                           Color: Blue,red, green
2  NaN                                    Color: Yellow
3  NaN  This has many colors. Color: green, red, Yellow
4  NaN             Filter oil Type: Synthetic Motor oil
5  NaN                Oil Type : High Mileage Motor oil
Construct the two conditions as boolean Series:
In [18]: one = (df['B'].str.match('.*Color:.*') | df['B'].str.match('.*colorFUL:.*')) & df.A.isnull()
In [19]: one
Out[19]:
0 True
1 True
2 True
3 True
4 False
5 False
dtype: bool
In [20]: two = (df['B'].str.match('.*Type:.*')) & df.A.isnull()
In [21]: two
Out[21]:
0 False
1 False
2 False
3 False
4 True
5 False
dtype: bool
Here is one way to create the new column: initialize it with a default value of 0.
In [22]: df['C'] = 0
Then use the boolean Series with df.loc to assign a value to the rows where each condition holds.
In [23]: df.loc[one,'C'] = 1
In [24]: df.loc[two,'C'] = 2
In [25]: df
Out[25]:
     A                                                B  C
0  NaN                               this has Color:Red  1
1  NaN                           Color: Blue,red, green  1
2  NaN                                    Color: Yellow  1
3  NaN  This has many colors. Color: green, red, Yellow  1
4  NaN             Filter oil Type: Synthetic Motor oil  2
5  NaN                Oil Type : High Mileage Motor oil  0
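Row 5 keeps the default 0 because the pattern '.*Type:.*' requires the colon to follow "Type" immediately, and that row contains "Type :" with a space. If that row is meant to get ID 2 (an assumption, the question does not say so), the pattern can allow optional whitespace. The sketch below rebuilds the example data as a standalone script and swaps in two things that are not part of the answer above: str.contains instead of str.match with leading and trailing '.*', and numpy.select to fill C in one step. The names a_missing, is_color and is_type are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the session above.
df = pd.DataFrame({
    "A": [np.nan] * 6,
    "B": [
        "this has Color:Red",
        "Color: Blue,red, green",
        "Color: Yellow",
        "This has many colors. Color: green, red, Yellow",
        "Filter oil Type: Synthetic Motor oil",
        "Oil Type : High Mileage Motor oil",
    ],
})

a_missing = df["A"].isnull()

# str.contains searches anywhere in the string, so no '.*' padding is needed.
is_color = df["B"].str.contains(r"Color:|colorFUL:", regex=True) & a_missing

# Assumption: "Type :" should count too, so allow whitespace before the colon.
is_type = df["B"].str.contains(r"Type\s*:", regex=True) & a_missing

# np.select picks the value of the first condition that is True, else the default.
df["C"] = np.select([is_color, is_type], [1, 2], default=0)

print(df[["B", "C"]])
```

With the relaxed pattern, rows 0 to 3 get 1 and rows 4 and 5 both get 2. Note that np.select gives priority to the earlier condition when a row matches both, whereas the two df.loc assignments above let the later one win.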
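If df is the input DataFrame and fd is the output DataFrame holding the rows that match the pattern, here is how to assign the ID directly to fd without a further boolean check: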
fd = df.loc[one].copy()  # take a copy so the assignment below does not trigger SettingWithCopyWarning
fd['C'] = 1
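As a possible variant, the filter and the ID assignment can be chained with DataFrame.assign, which returns a new DataFrame and leaves df untouched. This is only a sketch on a shortened version of the example data; the three-row frame is an illustration, not the original input.

```python
import numpy as np
import pandas as pd

# Shortened, illustrative version of the example data.
df = pd.DataFrame({
    "A": [np.nan] * 3,
    "B": [
        "Color: Yellow",
        "Filter oil Type: Synthetic Motor oil",
        "this has Color:Red",
    ],
})

# Same kind of condition as `one` above: the literal text "Color:" and a missing A.
one = df["B"].str.contains("Color:", regex=False) & df["A"].isnull()

# Filter first, then add the constant ID column to the filtered result only.
fd = df.loc[one].assign(C=1)

print(fd)
```

Here fd contains only the matching rows (original index labels 0 and 2) with C set to 1, and df itself gains no C column.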