Use this code:
from bs4 import BeautifulSoup
import urllib.request

html_url = 'https://www.nwk.usace.army.mil/Locations/District-Lakes/Pomme-de-Terre-Lake/Daily-Lake-Info-2/'
html_doc = urllib.request.urlopen(html_url).read()
soup = BeautifulSoup(html_doc, 'html.parser')

# Labels to report; each must match the text of a <strong> tag exactly.
wanted = ("24 Hr. Change:", "Yesterday's High:", "Date: ", "Lake Surface Temperature:")

for strong_tag in soup.find_all('strong'):
    if strong_tag.text in wanted:
        # The value is the node that immediately follows the label.
        print(strong_tag.text, strong_tag.next_sibling)
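The if statement should pick out the lines you want. I tried this code in a Jupyter notebook and it worked. The only problem is that the word "Date" is followed by some extra whitespace on the page, so the exact comparison fails and the date line is not printed as the code stands.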
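To hard-code the date case (printing the first <strong> tag unconditionally, on the assumption that it is the date), use this code instead: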
from bs4 import BeautifulSoup
import urllib.request

html_url = 'https://www.nwk.usace.army.mil/Locations/District-Lakes/Pomme-de-Terre-Lake/Daily-Lake-Info-2/'
html_doc = urllib.request.urlopen(html_url).read()
soup = BeautifulSoup(html_doc, 'html.parser')

# Labels to report; each must match the text of a <strong> tag exactly.
wanted = ("24 Hr. Change:", "Yesterday's High:", "Lake Surface Temperature:")

date = True  # stays True until the first <strong> tag has been handled
for strong_tag in soup.find_all('strong'):
    if date:
        # Print the first <strong> tag unconditionally; it is taken to be the date line.
        print(strong_tag.text, strong_tag.next_sibling)
        date = False
    if strong_tag.text in wanted:
        print(strong_tag.text, strong_tag.next_sibling)
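As an alternative to special-casing the first tag, you could normalize the whitespace in each label before comparing, so that "Date:" matches no matter how many spaces follow it. The following is only a sketch and I have not run it against the live page: the names WANTED, normalize, select_fields and fields_from_html are made up for illustration, and it assumes each value is still the node right after its <strong> label.

```python
# Hypothetical helper names; only the BeautifulSoup calls come from the answer above.
WANTED = {"Date:", "24 Hr. Change:", "Yesterday's High:", "Lake Surface Temperature:"}


def normalize(label):
    """Collapse runs of whitespace and trim both ends of a label."""
    return " ".join(label.split())


def select_fields(pairs, wanted=WANTED):
    """Keep the (label, value) pairs whose normalized label is in `wanted`."""
    selected = []
    for label, value in pairs:
        key = normalize(label)
        if key in wanted:
            # A missing sibling (None) becomes an empty string.
            text = "" if value is None else str(value).strip()
            selected.append((key, text))
    return selected


def fields_from_html(html_doc):
    """Pair each <strong> label with the node that follows it, then filter."""
    # Imported here so the two helpers above can be used without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html_doc, "html.parser")
    pairs = [(tag.get_text(), tag.next_sibling) for tag in soup.find_all("strong")]
    return select_fields(pairs)
```

With html_doc fetched as in the code above, `for label, value in fields_from_html(html_doc): print(label, value)` should print the same lines, including the date, without relying on the date being the first <strong> tag.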