
Speeding up Pandas: finding all columns that satisfy a set of conditions

Python

侃侃尔雅 2022-01-05 20:37:56

I have data stored in a Pandas DataFrame with the following columns:

| id | entity | name | value | location |

Here id is an integer, entity is an integer, name is a string, value is an integer, and location is a string (for example the US, Canada, the UK).

Now I want to add a new column, "flag", to this DataFrame, with values assigned as follows:

for idx, d in df.iterrows():
    if d.entity == 10 and d.value != 1000 and d.location == "CA":
        df.at[idx, "flag"] = "A"
    elif d.entity != 10 and d.entity != 0 and d.value == 1000 and d.location == "US":
        df.at[idx, "flag"] = "C"
    elif d.entity == 0 and d.value == 1000 and d.location == "US":
        df.at[idx, "flag"] = "B"
    else:
        print("Different case")

Is there a way to speed this up by using built-in functions instead of a for loop?


3 Answers

郎朗坤

Contributed 1921 experience points, received over 9 likes

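Use np.select: you pass it a list of conditions and a matching list of choices, and each row gets the choice belonging to the first condition it satisfies. You can also specify a default value for rows that satisfy none of the conditions.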

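import numpy as np

# Each condition is a boolean Series computed over the whole DataFrame
conditions = [
    (df.entity == 10) & (df.value != 1000) & (df.location == 'CA'),
    (df.entity != 10) & (df.entity != 0) & (df.value == 1000) & (df.location == 'US'),
    (df.entity == 0) & (df.value == 1000) & (df.location == 'US')
]
choices = ["A", "C", "B"]

# Rows matching none of the conditions receive the default
df['flag'] = np.select(conditions, choices, default="Different case")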
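As a minimal sketch of the approach above, here it is on a small made-up DataFrame. The sample rows are hypothetical and invented only for illustration; the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the question), one row per branch
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "entity": [10, 5, 0, 10],
    "name": ["a", "b", "c", "d"],
    "value": [500, 1000, 1000, 1000],
    "location": ["CA", "US", "US", "US"],
})

# Conditions are checked in order; the first one that holds decides the flag
conditions = [
    (df["entity"] == 10) & (df["value"] != 1000) & (df["location"] == "CA"),
    (df["entity"] != 10) & (df["entity"] != 0) & (df["value"] == 1000) & (df["location"] == "US"),
    (df["entity"] == 0) & (df["value"] == 1000) & (df["location"] == "US"),
]
choices = ["A", "C", "B"]

df["flag"] = np.select(conditions, choices, default="Different case")
print(df[["id", "flag"]])
```

The last row (entity 10, value 1000, US) matches none of the three conditions, so it falls through to the default.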

Answered 2022-01-05

LEATH

Contributed 1936 experience points, received over 6 likes

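Wrap each comparison in parentheses and replace `and` with the bitwise `&`, so that the conditions work element-wise with numpy.select: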

import numpy as np

# One boolean mask per flag, in priority order
m = [
    (df.entity == 10) & (df.value != 1000) & (df.location == 'CA'),
    (df.entity != 10) & (df.entity != 0) & (df.value == 1000) & (df.location == 'US'),
    (df.entity == 0) & (df.value == 1000) & (df.location == 'US')
]

df['flag'] = np.select(m, ["A", "C", "B"], default="Different case")
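To illustrate why `and` has to become `&`, here is a small sketch on two made-up Series (the names s and v and their values are hypothetical). `and` asks pandas for a single truth value of an entire Series, which raises a ValueError, whereas `&` combines the two boolean Series element by element.

```python
import pandas as pd

# Hypothetical stand-ins for the entity and value columns
s = pd.Series([10, 0, 10])
v = pd.Series([1000, 1000, 5])

# `and` needs one True/False for the whole Series, which pandas refuses to give
try:
    mask = (s == 10) and (v != 1000)
    and_failed = False
except ValueError:
    and_failed = True

# `&` works element-wise; the parentheses are needed because
# `&` binds more tightly than `==` and `!=`
mask = (s == 10) & (v != 1000)
print(and_failed, mask.tolist())
```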

Answered 2022-01-05

绝地无双

Contributed 1946 experience points, received over 4 likes

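You wrote "finding all columns that satisfy a set of conditions", but your code shows that you are actually trying to add a new column whose value in each row is computed from the values of other columns in that same row.

If that is the case, you can use df.apply and give it a function that computes the value for a single row: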

def flag_value(row):
    # row is a Series holding one DataFrame row, so plain `and` is fine here
    if row.entity == 10 and row.value != 1000 and row.location == "CA":
        return "A"
    elif row.entity != 10 and row.entity != 0 and row.value == 1000 and row.location == "US":
        return "C"
    elif row.entity == 0 and row.value == 1000 and row.location == "US":
        return "B"
    else:
        return "Different case"

# axis=1 calls flag_value once per row
df['flag'] = df.apply(flag_value, axis=1)
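As a rough check, the sketch below runs the row-wise function and a vectorized np.select version on the same made-up data and compares the results. The sample rows are hypothetical. Note that df.apply with axis=1 still calls a Python function once per row, so on large frames it is typically slower than the vectorized form, even though it avoids an explicit for loop.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, one row per branch of the function
df = pd.DataFrame({
    "entity": [10, 5, 0, 10],
    "value": [500, 1000, 1000, 1000],
    "location": ["CA", "US", "US", "US"],
})

def flag_value(row):
    # Plain Python logic applied to a single row
    if row["entity"] == 10 and row["value"] != 1000 and row["location"] == "CA":
        return "A"
    elif row["entity"] not in (10, 0) and row["value"] == 1000 and row["location"] == "US":
        return "C"
    elif row["entity"] == 0 and row["value"] == 1000 and row["location"] == "US":
        return "B"
    return "Different case"

via_apply = df.apply(flag_value, axis=1)

# The same rules expressed as whole-column boolean masks
via_select = np.select(
    [
        (df["entity"] == 10) & (df["value"] != 1000) & (df["location"] == "CA"),
        ~df["entity"].isin([10, 0]) & (df["value"] == 1000) & (df["location"] == "US"),
        (df["entity"] == 0) & (df["value"] == 1000) & (df["location"] == "US"),
    ],
    ["A", "C", "B"],
    default="Different case",
)

same = via_apply.tolist() == via_select.tolist()
print(same)
```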

See this related question for more information.

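If you really do want to find everything that meets certain conditions (with a boolean mask this selects rows rather than columns), the usual way to do it with a Pandas DataFrame is df.loc with boolean indexing: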

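# Parentheses are required: & binds more tightly than == and !=
only_a_cases = df.loc[(df.entity == 10) & (df.value != 1000) & (df.location == "CA")]

# or, with a callable that receives the DataFrame:
only_a_cases = df.loc[lambda df: (df.entity == 10) & (df.value != 1000) & (df.location == "CA")]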
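A short sketch of this kind of row filtering on made-up data (the sample rows are hypothetical). Each comparison is parenthesized, because `&` binds more tightly than `==` and `!=`; without the parentheses the expression is parsed as a chained comparison and fails. DataFrame.query is shown as an alternative spelling of the same filter.

```python
import pandas as pd

# Hypothetical sample data; only the first row is an "A" case
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "entity": [10, 5, 0, 10],
    "value": [500, 1000, 1000, 1000],
    "location": ["CA", "US", "US", "US"],
})

# Boolean-mask indexing with df.loc
only_a_cases = df.loc[
    (df["entity"] == 10) & (df["value"] != 1000) & (df["location"] == "CA")
]

# The same filter written as a query string
only_a_query = df.query("entity == 10 and value != 1000 and location == 'CA'")

print(only_a_cases["id"].tolist(), only_a_query["id"].tolist())
```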

Answered 2022-01-05
