
How do I extract a specific string between two substrings with a Python regular expression?

Python

海绵宝宝撒 2023-03-16 17:44:56

Here is an example:

review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom

I want to extract the text between review: and ... so for the input above the extracted strings should be:

I love you very much
I hate you very much
sky is pink and i

I tried this regular expression, but it failed:

re.findall("review(.*)...",string)

It extracted this result instead:

I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i


5 Answers

潇潇雨雨


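This also works, and it is simple:

import re

# Named text rather than str so the built-in str type is not shadowed.
text = "review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom"

matches = re.findall(r'review:(.+?)\.\.\.', text)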

2023-03-16

德玛西亚99


Use

re.findall(r'\breview:\s*(.*?)\s*\.\.\.', string)

See the proof. Python test:

import re

regex = r"\breview:\s*(.*?)\s*\.\.\."

string = "review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom"

print(re.findall(regex, string))

Output: ['I love you very much', 'I hate you very much', 'sky is pink and i']

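Note that the r"..." prefix denotes a raw string literal: "\b" is a backspace character, not a word boundary, whereas r"\b" is a word boundary.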
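To make the raw-string point concrete, the following minimal sketch compares the two spellings of the same pattern. The variable names are illustrative and are not part of the answer above.

```python
import re

text = "review: I love you very much... reviewer:jackson"

# Without the r prefix, Python turns "\b" into one backspace character (U+0008)
# before the regex engine ever sees it.
backspace_pattern = "\breview:"

# With the r prefix, the backslash and the b both reach the regex engine,
# which reads them as a word boundary.
boundary_pattern = r"\breview:"

no_matches = re.findall(backspace_pattern, text)
matches = re.findall(boundary_pattern, text)

print(len("\b"), len(r"\b"))  # lengths of the two literals
print(no_matches)
print(matches)
```

The first pattern finds nothing because the input contains no backspace character; the second finds the single review: at the start of the text.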

Explanation

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  review:                  'review:'
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount possible))
--------------------------------------------------------------------------------
  \.\.\.                   '...'
--------------------------------------------------------------------------------
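The same breakdown can be written into the pattern itself with re.VERBOSE, which ignores unescaped whitespace in the pattern and allows # comments. This is only a sketch of that idea; the names REVIEW_PATTERN and sample are illustrative and do not come from the answer.

```python
import re

REVIEW_PATTERN = re.compile(
    r"""
    \b review:   # the literal text review: starting at a word boundary
    \s*          # optional whitespace after the colon
    (.*?)        # group 1: the review text, as short as possible
    \s*          # optional whitespace before the dots
    \.\.\.       # three literal dots
    """,
    re.VERBOSE,
)

sample = (
    "review: I love you very much... reviewer:jackson "
    "review: I hate you very much... reviewer:madden "
    "review: sky is pink and i ... reviewer: tom"
)

reviews = REVIEW_PATTERN.findall(sample)
print(reviews)
```

Keeping the comments next to each token makes the pattern easier to maintain than a separate explanation table, at the cost of a longer definition.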

2023-03-16

largeQ


You can use the following pattern, which relies on a lookbehind and a lookahead:

(?<=review:\s).*?(?=\.\.\.)

import re

inp = "review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom"

matches = re.findall(r'(?<=review:\s).*?(?=\.\.\.)', inp)

print(matches)
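Two practical details of the lookaround version are easier to see in code. The pattern has no capturing group, so re.findall returns the whole match, including any whitespace before the dots. Python's re module also accepts only fixed-width lookbehinds, which is why the pattern requires exactly one whitespace character after review:. The sketch below illustrates both points; the names cleaned and lookbehind_ok are illustrative.

```python
import re

inp = (
    "review: I love you very much... reviewer:jackson "
    "review: I hate you very much... reviewer:madden "
    "review: sky is pink and i ... reviewer: tom"
)

matches = re.findall(r"(?<=review:\s).*?(?=\.\.\.)", inp)

# Whitespace before the dots is part of the match, so strip it if unwanted.
cleaned = [m.strip() for m in matches]
print(cleaned)

# A variable-width lookbehind such as (?<=review:\s*) is rejected by the re module.
try:
    re.compile(r"(?<=review:\s*).*?(?=\.\.\.)")
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False
print(lookbehind_ok)
```

If the amount of whitespace after review: can vary, a capturing group with \s* outside the group is the simpler choice.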

2023-03-16

当年话下


Use re.findall with the pattern \breview:\s*(.*?)\.\.\.\s*(?=\breviewer:|$):

import re

inp = "review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom"

matches = re.findall(r'\breview:\s*(.*?)\.\.\.\s*(?=\breviewer:|$)', inp)

print(matches)

This prints:

['I love you very much', 'I hate you very much', 'sky is pink and i ']
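The trailing lookahead changes the result when a review itself contains an ellipsis. The sketch below uses a made-up input (it is not from the question) to compare this pattern with a shorter one that stops at the first ... it meets.

```python
import re

# Hypothetical input: the review text contains "..." in the middle.
sample = "review: well... it was fine... reviewer:bob"

# Stops at the first "..." after review:
first_dots = re.findall(r"\breview:\s*(.*?)\s*\.\.\.", sample)

# Stops only at a "..." that is followed by reviewer: or the end of the input.
before_reviewer = re.findall(
    r"\breview:\s*(.*?)\.\.\.\s*(?=\breviewer:|$)", sample
)

print(first_dots)       # ['well']
print(before_reviewer)  # ['well... it was fine']
```

Which behaviour is right depends on the data: the lookahead version assumes every review is followed by reviewer: or by the end of the input.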

2023-03-16

慕虎7371278


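Sorry, I forgot to add \ in front of each .

The correct version is: re.findall(r"review:\s*(.*?)\.\.\.", string)

This time the ? after .* matters: it makes the match lazy, so each capture stops at the first ... instead of running on to the last one.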
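The pattern in the question fails for two separate reasons: an unescaped . matches any character, and a greedy .* extends to the last possible ending. A minimal sketch of the difference, with illustrative variable names:

```python
import re

string = (
    "review: I love you very much... reviewer:jackson "
    "review: I hate you very much... reviewer:madden "
    "review: sky is pink and i ... reviewer: tom"
)

# Greedy .* grabs as much as it can, and each unescaped . matches any character,
# so the whole input collapses into a single match.
greedy = re.findall("review(.*)...", string)

# Lazy .*? with escaped dots: each capture stops at the first literal "...".
lazy = re.findall(r"review:\s*(.*?)\s*\.\.\.", string)

print(len(greedy))
print(lazy)
```

Escaping the dots alone is not enough here; without the lazy quantifier the capture would still run to the last ... in the input.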

2023-03-16
