3 Answers
Contributed 1848 experience points; received 2+ upvotes
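Let's break the answer down into two simple steps.
1. Split the whole string into a list of candidate couple names.
2. Keep only the entries that match the requested pattern.
We are interested in couple names that follow one of these patterns:
<Name1> and <Name2> <Last-name> <May-or-may-not-be-words-separated-by-spaces>
<Name1> and <Name2> <Last-name>
From each matching string, however, we only want the <Name1> and <Name2> <Last-name> part. Now that we have defined what we want to do, here is the code for it.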
import re

testStr = """Rob and Amber Mariano, Heather Robinson,
Jane and John Smith, Kiwan and Nichols Brady John,
Jimmy Nichols, Melanie Carbone, Jim Green and Nancy Brown,
Todd and Sana Clegg with Tatiana Perkin
"""

# Pattern definition for the match (raw string, so the backslashes reach the regex engine intact).
# Group 1 is "<Name1> and <Name2> <Last-name>"; the optional tail allows extra words after it.
regExpr = re.compile(r"^(\w+\sand\s\w+\s\w+)(\s\w+)*")

# Split on commas and strip the whitespace and newlines left around each entry
coupleList = [s.strip() for s in testStr.split(',')]

# Try to match every entry; match() returns None for entries that do not fit the pattern
matchedList = [regExpr.match(s) for s in coupleList]

# Keep group 1, which holds the part we want, from every entry that matched
result = [m.group(1) for m in matchedList if m is not None]
print(result)
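The two steps above can also be wrapped into a small reusable function. This is only a sketch and not part of the original answer: the names `extract_couples`, `COUPLE_RE` and `sample` are made up for illustration, and the input is assumed to be comma-separated.

```python
import re

# Hypothetical helper; the pattern is the group-1 part of the regex used in this answer.
COUPLE_RE = re.compile(r"^(\w+\sand\s\w+\s\w+)")

def extract_couples(text):
    """Return the '<Name1> and <Name2> <Last-name>' part of each matching entry."""
    # Step 1: split into entries and strip surrounding whitespace
    entries = [entry.strip() for entry in text.split(",")]
    # Step 2: keep group 1 of every entry that matches from its start
    matches = (COUPLE_RE.match(entry) for entry in entries)
    return [m.group(1) for m in matches if m is not None]

sample = ("Rob and Amber Mariano, Heather Robinson, "
          "Jim Green and Nancy Brown, Todd and Sana Clegg with Tatiana Perkin")
print(extract_couples(sample))  # ['Rob and Amber Mariano', 'Todd and Sana Clegg']
```

"Jim Green and Nancy Brown" is skipped because the second word of the entry is not "and", so the pattern cannot match from the start of that entry.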
Contributed 1804 experience points; received 2+ upvotes
A bit late, but this is possibly the simplest regex:
import re

regex = r"(?:, |^)(\w+\sand\s\w+\s\w+)"
test_str = "Rob and Amber Mariano, Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady, John, Jimmy Nichols, Melanie Carbone, Jim Green and Nancy Brown, Todd and Sana Clegg with Tatiana Perkin"
matches = re.finditer(regex, test_str, re.MULTILINE)
for matchNum, match in enumerate(matches, start=1):
    # Print every capturing group of the match (this pattern has exactly one)
    for groupNum in range(1, len(match.groups()) + 1):
        print(match.group(groupNum))
Output:
Rob and Amber Mariano
Jane and John Smith
Kiwan and Nichols Brady
Todd and Sana Clegg
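Because this pattern contains exactly one capturing group, `re.findall` returns the captured strings directly, which avoids the explicit loop. A minimal sketch on a shortened input (the shortened `test_str` and the name `couples` are chosen here for illustration only):

```python
import re

regex = r"(?:, |^)(\w+\sand\s\w+\s\w+)"
test_str = ("Rob and Amber Mariano, Heather Robinson, "
            "Jane and John Smith, Jim Green and Nancy Brown")

# With a single capturing group, findall yields a list of that group's strings
couples = re.findall(regex, test_str)
print(couples)  # ['Rob and Amber Mariano', 'Jane and John Smith']
```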
Contributed 1851 experience points; received 3+ upvotes
Try this; it works as expected:
(,\s|^)([A-Z][a-z]+\sand\s[A-Z][a-z]+(\s[A-Z][a-z]+)+)
Test script:
import re

# Raw string so that \s is passed to the regex engine unchanged
pattern = r"(,\s|^)([A-Z][a-z]+\sand\s[A-Z][a-z]+(\s[A-Z][a-z]+)+)"
text = "Rob and Amber Mariano, Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady John, Jimmy Nichols, Melanie Carbone, Jim Green and Nancy Brown, Todd and Sana Clegg with Tatiana Perkin"
a = re.findall(pattern, text)
print(a)
Output:
[('', 'Rob and Amber Mariano', ' Mariano'), (', ', 'Jane and John Smith', ' Smith'), (', ', 'Kiwan and Nichols Brady John', ' John'), (', ', 'Todd and Sana Clegg', ' Clegg')]
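Each result is a tuple because the pattern has three capturing groups, and `re.findall` returns one item per group. If only the names are wanted, the helper groups can be made non-capturing with `(?:...)`. The sketch below is a variation on this answer rather than the author's own code; the variable names are illustrative, and the second pattern assumes that only a single word should follow the second first name.

```python
import re

text = ("Rob and Amber Mariano, Heather Robinson, Jane and John Smith, "
        "Kiwan and Nichols Brady John, Jimmy Nichols, "
        "Todd and Sana Clegg with Tatiana Perkin")

# Same pattern as in the answer, but only the full name is captured
multi_word = r"(?:,\s|^)([A-Z][a-z]+\sand\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)+)"
names = re.findall(multi_word, text)
print(names)

# Assumed variant: exactly one capitalized word after the second first name
single_word = r"(?:,\s|^)([A-Z][a-z]+\sand\s[A-Z][a-z]+\s[A-Z][a-z]+)"
short_names = re.findall(single_word, text)
print(short_names)
```

The first pattern keeps "Kiwan and Nichols Brady John" in full, while the second stops at "Kiwan and Nichols Brady", which is what the other two answers return.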