How do I capture multiple paragraphs in a capture group?

I am using this pattern:

(?i)(?<!\d)Item.*?1A.*?Risk.*?Factors.*?\n*(.+?)\n*Item.*?1B

to extract the following text:

ITEM 1A. RISK FACTORS
In addition to other information in this Form 10-K, the following risk factors should be carefully considered in evaluating us and our business because these factors currently have a significant impact or

In addition to other information in this Form 10-K, the following risk factors should be carefully considered in evaluating us and our business because these factors currently have a significant impact or
ITEM 1B.

But nothing ends up in the capture group unless the text is a single paragraph, like this:

ITEM 1A. RISK FACTORS
In addition to other information in this Form 10-K, the following risk factors should be carefully considered in evaluating us and our business because these factors currently have a significant impact or
ITEM 1B.

2 Answers

拉莫斯之舞

Try:

(?i)(?<!\d)Item.*?1A.*?Risk.*?Factors.*?\n*((.*\n*)+)\n*Item.*?1B

For your future regex headaches, an incredible resource: https://regex101.com

Cheers

2022-06-14

心有法竹

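Your regex matches any number of newlines, then any amount of text on a single line, then any number of newlines. It only ever finds one "paragraph" between the newlines, because . does not match a newline and so cannot capture across lines.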
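To make the failure concrete, here is a minimal Python sketch that runs the original pattern against both inputs from the question. The choice of Python and the exact line breaks in the sample text are assumptions (the question does not show them); the variable names are only for illustration.

```python
import re

# The asker's original pattern, unchanged.
ORIGINAL = r"(?i)(?<!\d)Item.*?1A.*?Risk.*?Factors.*?\n*(.+?)\n*Item.*?1B"

# Sample paragraph taken from the question.
PARA = (
    "In addition to other information in this Form 10-K, the following risk "
    "factors should be carefully considered in evaluating us and our business "
    "because these factors currently have a significant impact or"
)

# Assumed layout: heading, paragraph(s) separated by a blank line, closing heading.
one_paragraph = f"ITEM 1A. RISK FACTORS\n{PARA}\nITEM 1B."
two_paragraphs = f"ITEM 1A. RISK FACTORS\n{PARA}\n\n{PARA}\nITEM 1B."

single = re.search(ORIGINAL, one_paragraph)
double = re.search(ORIGINAL, two_paragraphs)

# The single paragraph sits on one line, so (.+?) can hold all of it.
print("one paragraph matched:", single is not None)
# (.+?) cannot step over the blank line, so the whole search fails.
print("two paragraphs matched:", double is not None)
```

With a single paragraph the group holds the whole line; with two paragraphs there is no way for (.+?) to reach the closing heading, so the search returns no match at all rather than a partial capture.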

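Try replacing it with something like [\s\S], which matches any character at all: newlines, text, spaces, anything. In particular, this will capture any number of paragraphs, with any amount of whitespace between them.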

(?i)(?<!\d)Item.*?1A.*?Risk.*?Factors\n*([\s\S]*?)\n*Item.*?1B

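(?i)(?<!\d)Item.*?1A.*?Risk.*?Factors matches through the end of "Risk Factors".
\n* matches any newlines between the heading and the first paragraph.
([\s\S]*?) captures anything, across any number of lines (lazily).
\n* matches any newlines between the last paragraph and the closing heading.
Item.*?1B matches the rest. (This does not match the trailing "."; did you intend it to? If so, add \. to the end.)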
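The breakdown above can be checked with a short Python sketch. It applies the revised pattern to the two-paragraph sample and then splits the captured text on blank lines. Python, the line breaks in the sample, and the blank-line split are assumptions added for illustration; they are not part of the original answer.

```python
import re

# The revised pattern from this answer, unchanged.
FIXED = r"(?i)(?<!\d)Item.*?1A.*?Risk.*?Factors\n*([\s\S]*?)\n*Item.*?1B"

# Sample paragraph taken from the question.
PARA = (
    "In addition to other information in this Form 10-K, the following risk "
    "factors should be carefully considered in evaluating us and our business "
    "because these factors currently have a significant impact or"
)

# Assumed layout: two paragraphs separated by one blank line.
text = f"ITEM 1A. RISK FACTORS\n{PARA}\n\n{PARA}\nITEM 1B."

match = re.search(FIXED, text)
body = match.group(1) if match else ""

# Hypothetical post-processing step: treat runs of blank lines as separators.
paragraphs = [p for p in re.split(r"\n\s*\n", body) if p.strip()]

print("paragraphs captured:", len(paragraphs))
```

Because the group is lazy, it stops at the first place where Item ... 1B can match on a single line. If the section body itself mentions "Item 1B" on one line, the capture would end there, so a stricter closing anchor may be needed for real filings.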

2022-06-14
