I need to use the BeautifulSoup module to find and count every occurrence of the words "python" and "c++" as substrings in the HTML code. In Wikipedia, these words occur 1 and 9 times, respectively. Why does my code print 0 and 0?
from urllib.request import urlopen, urlretrieve
from bs4 import BeautifulSoup
resp = urlopen("https://stepik.org/media/attachments/lesson/209717/1.html")
html = resp.read().decode('utf8')
soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table', attrs = {'class' : 'wikitable sortable'})
cnt = 0
for tr in soup.find_all("python"):
    cnt += 1
print(cnt)
cnt1 = 0
for tr in soup.find_all("c++"):
    cnt += 1
print(cnt)
1 Answer
慕码人8056858
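You are doing it wrong. A positional argument such as soup.find_all("python") searches for tags named python, not for text, so it finds nothing. To search for text, you need to use the string argument.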
import re
# A plain string only matches a text node that is exactly "Python", e.g. <b>Python</b>
soup.find_all(string="Python")
# It does not match <b>python</b> or <b>Python is best</b>.
# A compiled regex is searched inside each text node, so substring cases work too
soup.find_all(string=re.compile(r"[cC]\+\+"))
soup.find_all(string=re.compile(r"[Pp]ython"))
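To turn those calls into counts, something like the following sketch may help. It runs on a small made-up HTML snippet (SAMPLE_HTML is a stand-in, not the real page), and the helper names count_text_nodes and count_occurrences are hypothetical. Note that find_all(string=...) returns one result per matching text node, so a node that contains the word twice is still counted once; if every occurrence should be counted, running re.findall over soup.get_text() is one possible alternative.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<table class="wikitable sortable">
  <tr><td><b>Python</b></td><td>python is popular</td></tr>
  <tr><td>C++</td><td>c++ and C++ again</td></tr>
  <tr><td>Java</td><td>no match here</td></tr>
</table>
"""


def count_text_nodes(soup, pattern):
    """Count text nodes that contain at least one match of pattern."""
    return len(soup.find_all(string=re.compile(pattern)))


def count_occurrences(soup, pattern):
    """Count every match of pattern in the text extracted from the page."""
    return len(re.findall(pattern, soup.get_text()))


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

python_nodes = count_text_nodes(soup, r"[Pp]ython")  # nodes mentioning python
cpp_nodes = count_text_nodes(soup, r"[cC]\+\+")      # nodes mentioning c++
cpp_hits = count_occurrences(soup, r"[cC]\+\+")      # individual c++ matches

print(python_nodes, cpp_nodes, cpp_hits)
```

On the real page, the same helpers could be called on the soup object built from the downloaded HTML. Which of the two counts matches the expected 1 and 9 depends on how the task defines an occurrence, so it is worth checking both.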