
How do I get a specific string between two substrings with a Python regular expression?


海绵宝宝撒 2023-03-16 17:44:56
Here is an example:

review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom

I want to extract the text between review: and ... so for the input above the extracted strings should be:

I love you very much
I hate you very much
sky is pink and i

I tried this regular expression, but it failed:

re.findall("review(.*)...",string)

It returned this instead:

I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i

5 Answers

潇潇雨雨


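This also works, and it is quite simple:

import re

text = "review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom"

# Lazy match: capture everything after "review:" up to the first "..."
# (the captured strings keep their surrounding whitespace)
matches = re.findall(r'review:(.+?)\.\.\.', text)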
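Because the group starts right after the colon, the captures from this pattern keep the space that follows review:. A minimal sketch of one way to tidy them, assuming plain trimming with str.strip() is acceptable for your data (the names text, raw and cleaned are only illustrative):

```python
import re

text = ("review: I love you very much... reviewer:jackson "
        "review: I hate you very much... reviewer:madden "
        "review: sky is pink and i ... reviewer: tom")

# The lazy group (.+?) begins immediately after "review:", so each capture
# starts with the space that follows the colon.
raw = re.findall(r'review:(.+?)\.\.\.', text)

# Trim leading and trailing whitespace from every captured review.
cleaned = [m.strip() for m in raw]
print(cleaned)
```

Trimming afterwards keeps the pattern short; the alternative is to absorb the whitespace in the pattern itself with \s* on both sides of the group.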


Answered 2023-03-16
德玛西亚99


Use:

re.findall(r'\breview:\s*(.*?)\s*\.\.\.', string)

Python demo:

import re

regex = r"\breview:\s*(.*?)\s*\.\.\."

string = "review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom"

print(re.findall(regex, string))

Output: ['I love you very much', 'I hate you very much', 'sky is pink and i']


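Note that the r in r"..." is the raw string literal prefix. In an ordinary string, "\b" is the backspace character, not a word boundary; to get a word boundary you have to write r"\b".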
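The difference is easy to check directly. A small sketch, using a made-up sample string (sample is not from the question):

```python
import re

sample = "review: nice... reviewer:amy"  # hypothetical sample input

# In an ordinary string literal, "\b" is the single backspace character.
assert "\b" == "\x08" and len("\b") == 1

# In a raw string literal, r"\b" is two characters (backslash + "b"),
# which the regex engine interprets as a word boundary.
assert len(r"\b") == 2

with_raw = re.findall(r"\breview:", sample)    # word boundary: finds "review:"
without_raw = re.findall("\breview:", sample)  # looks for a backspace: finds nothing
print(with_raw, without_raw)
```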


Explanation


NODE                     EXPLANATION
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  review:                  'review:'
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount possible))
--------------------------------------------------------------------------------
  \.\.\.                   '...'
--------------------------------------------------------------------------------
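If you want this breakdown to live next to the pattern in your code, one option is the re.VERBOSE flag, which lets the regex carry its own comments. This is only a sketch of the same pattern written that way; the name REVIEW_RE is illustrative:

```python
import re

# Same pattern, written with re.VERBOSE. In verbose mode, unescaped
# whitespace and "#" comments inside the pattern are ignored.
REVIEW_RE = re.compile(r"""
    \b          # word boundary, so "preview:" would not match
    review:     # the literal text 'review:'
    \s*         # optional whitespace after the colon
    (.*?)       # group 1: the review text, as few characters as possible
    \s*         # optional whitespace before the dots
    \.\.\.      # three literal dots
""", re.VERBOSE)

string = ("review: I love you very much... reviewer:jackson "
          "review: I hate you very much... reviewer:madden "
          "review: sky is pink and i ... reviewer: tom")

result = REVIEW_RE.findall(string)
print(result)
```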


Answered 2023-03-16
largeQ


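You can use the following pattern, which relies on lookarounds (a lookbehind for the prefix and a lookahead for the dots):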


(?<=review:\s).*?(?=\.\.\.)

import re

inp = "review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom"

matches = re.findall(r'(?<=review:\s).*?(?=\.\.\.)', inp)

print(matches)
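One caveat: Python's built-in re module only accepts fixed-width lookbehinds, so (?<=review:\s) needs exactly one whitespace character after the colon and cannot be loosened to \s*. A rough sketch of that limitation, with a capturing group as a possible workaround (the inputs here are shortened, made-up samples):

```python
import re

inp = "review: I love you very much... reviewer:jackson"

# Fixed-width lookbehind: exactly one whitespace character after "review:".
fixed = re.findall(r'(?<=review:\s).*?(?=\.\.\.)', inp)
print(fixed)

# A variable-width lookbehind is rejected by the standard `re` module.
try:
    re.compile(r'(?<=review:\s*).*?(?=\.\.\.)')
    variable_width_ok = True
except re.error as exc:
    variable_width_ok = False
    print("re.error:", exc)

# A capturing group avoids the restriction and tolerates any spacing.
grouped = re.findall(r'\breview:\s*(.*?)\.\.\.', "review:no space... reviewer:amy")
print(grouped)
```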


Answered 2023-03-16
当年话下


Use re.findall with the pattern \breview:\s*(.*?)\.\.\.\s*(?=\breviewer:|$):


import re

inp = "review: I love you very much... reviewer:jackson review: I hate you very much... reviewer:madden review: sky is pink and i ... reviewer: tom"

matches = re.findall(r'\breview:\s*(.*?)\.\.\.\s*(?=\breviewer:|$)', inp)

print(matches)

This prints:


['I love you very much', 'I hate you very much', 'sky is pink and i ']
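The trailing lookahead (?=\breviewer:|$) starts to matter when a review itself contains "...". The sketch below compares it with a plain lazy pattern on a made-up input (not from the question) where the first review has dots in the middle:

```python
import re

# Hypothetical input: the first review contains "..." in the middle.
inp = "review: well... it was fine... reviewer:bob review: great... reviewer:amy"

# Plain lazy pattern: stops at the first "..." after each "review:".
simple = re.findall(r'\breview:\s*(.*?)\.\.\.', inp)

# With the lookahead: only accepts a "..." that is followed by
# "reviewer:" or by the end of the string.
anchored = re.findall(r'\breview:\s*(.*?)\.\.\.\s*(?=\breviewer:|$)', inp)

print(simple)
print(anchored)
```

On the original sample both patterns find the same reviews, so the extra lookahead is mainly a safeguard for messier text.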


Answered 2023-03-16
慕虎7371278


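Sorry, I forgot to put a \ in front of each . in the pattern.

The corrected call is: re.findall(r"review:\s*(.*?)\.\.\.", string)

This time the ? after .* matters: it makes the match lazy, so it stops at the first ... rather than the last.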
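To see what escaping the dots and a lazy ? each change, here is a small comparison. It is only a sketch on a shortened, made-up input, and the variable names are illustrative:

```python
import re

string = "review: good... reviewer:amy review: bad... reviewer:bob"  # hypothetical sample

# Unescaped "." matches any character, and greedy (.*) runs as far as it can,
# so the three dots just swallow the last three characters of the string.
unescaped_greedy = re.findall(r"review:(.*)...", string)

# Escaped dots, still greedy: the capture runs up to the LAST "..." in the string.
escaped_greedy = re.findall(r"review:(.*)\.\.\.", string)

# Escaped dots, lazy: the capture stops at the FIRST "..." after each "review:".
escaped_lazy = re.findall(r"review:\s*(.*?)\.\.\.", string)

print(unescaped_greedy)
print(escaped_greedy)
print(escaped_lazy)
```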


Answered 2023-03-16