How to match strings and substrings against dictionary keys and values

I have the following function to detect direction strings in my data. I combine the dictionary's keys and values because I want to find both forms. I added ^ and $ because I only want exact matches.

The function:

    import pandas as pd

    def check_direction(df):
        # dict for all direction and their abbreviation
        direction = {
            '^Northwest$': '^NW$',
            '^Northeast$': '^NE$',
            '^Southeast$': '^SE$',
            '^Southwest$': '^SW$',
            '^North$': '^N$',
            '^East$': '^E$',
            "^South$": '^S$',
            "^West$": "^W$"}
        # combining all the dict pairs into one for str match
        all_direction = direction.keys() | direction.values()
        all_direction = '|'.join(all_direction)
        df = df.astype(str)
        df = pd.DataFrame(df.str.contains(all_direction, case = False))
        return df

I tested it on the following series and it works as expected:

    tmp = pd.Series(['Monday', 'Tuesday', 'Wednesday', 'Thursday'])
    check_direction(tmp)

    0 False
    1 False
    2 False
    3 False

    tmp = pd.Series(['SOUTH', 'NORTHEAST', 'WEST'])
    check_direction(tmp)

    0 True
    1 True
    2 True

But I run into a problem here:

    tmp = pd.Series(['32 Street NE', 'Ogden Road SE'])
    check_direction(tmp)

    0 False
    1 False

Both rows come back False when they should be True because of NE and SE. How can I modify my code to achieve this?


1 Answer

慕码人2483693


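I think you have misunderstood what ^ and $ mean.

^ matches the start of the whole string,
$ matches the end of the whole string.

For example, 'Ogden Road SE' does not match the pattern ^SE$, because the string does not start with SE.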
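To see the anchors in action, here is a small sketch using Python's standard re module (the variable names are only illustrative). It checks the anchored pattern against the full address and against a string that consists of SE alone:

```python
import re

address = 'Ogden Road SE'

# ^SE$ only matches when the entire string is exactly "SE"
anchored_match = re.search(r'^SE$', address)

# The same pattern does match a string that is nothing but "SE"
exact_match = re.search(r'^SE$', 'SE')

print(anchored_match)           # None
print(exact_match is not None)  # True
```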

You probably meant to use word boundaries, i.e. \b.

So you should change ^SE$ to \bSE\b, and so on.
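As a rough illustration of why the word boundary matters, the sketch below compares \bNE\b with a plain substring check. The sample strings come from the question; re.IGNORECASE stands in for the case = False argument used there.

```python
import re

# \b matches at the edge of a word, so NE must stand alone as a word
pattern = re.compile(r'\bNE\b', flags=re.IGNORECASE)

found_in_address = bool(pattern.search('32 Street NE'))
found_in_weekday = bool(pattern.search('Wednesday'))

# Without any boundary, "ne" would be found inside "Wednesday"
plain_substring = 'ne' in 'Wednesday'.lower()

print(found_in_address)  # True
print(found_in_weekday)  # False
print(plain_substring)   # True
```

This is also why simply dropping ^ and $ would not be enough: the unanchored abbreviations would then match inside ordinary words.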

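You can make this less tedious and more readable by keeping the dictionary free of regex syntax and adding the boundaries when you build the pattern: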

    direction = {
        'Northwest': 'NW',
        'Northeast': 'NE',
        'Southeast': 'SE',
        'Southwest': 'SW',
        'North': 'N',
        'East': 'E',
        'South': 'S',
        'West': 'W'}

    all_direction = direction.keys() | direction.values()
    all_direction = '|'.join(r'\b{}\b'.format(d) for d in all_direction)
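Putting the pieces together, a complete version of check_direction might look like the sketch below. It is one possible arrangement rather than the only one: it returns a boolean Series instead of wrapping the result in a DataFrame as the question does, and the parameter name series is my own choice.

```python
import pandas as pd

def check_direction(series):
    # Directions and their abbreviations, with no regex syntax in the dict
    direction = {
        'Northwest': 'NW',
        'Northeast': 'NE',
        'Southeast': 'SE',
        'Southwest': 'SW',
        'North': 'N',
        'East': 'E',
        'South': 'S',
        'West': 'W'}

    # Combine keys and values, then wrap each one in word boundaries
    all_direction = direction.keys() | direction.values()
    pattern = '|'.join(r'\b{}\b'.format(d) for d in all_direction)

    # Case-insensitive regex search on every element
    return series.astype(str).str.contains(pattern, case=False)

weekdays = check_direction(pd.Series(['Monday', 'Tuesday', 'Wednesday', 'Thursday']))
full_names = check_direction(pd.Series(['SOUTH', 'NORTHEAST', 'WEST']))
addresses = check_direction(pd.Series(['32 Street NE', 'Ogden Road SE']))

print(weekdays.tolist())    # [False, False, False, False]
print(full_names.tolist())  # [True, True, True]
print(addresses.tolist())   # [True, True]
```

Keep in mind that the single-letter abbreviations will also match any standalone N, E, S or W in the text, for example the E in 'Unit E', so the pattern may need tightening depending on your data.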

Answered 2023-06-27
