Parsing the information in an HTML span tag with Python's BeautifulSoup library

I am writing a Python web scraper that fetches the price of a particular stock. At the end of the program there are some print statements showing that the HTML is parsed correctly, so I can reach the span tag that holds the stock's price. My question is: how do I pull the price itself out of that tag? So far I have the correct HTML span tag. I thought I could simply slice the string, but the stock's price changes constantly, so I don't think that approach is a good fit for this problem. I only started using BeautifulSoup recently, so any help is appreciated.

import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

# webscraping reference http://altitudelabs.com/blog/web-scraping-with-python-and-beautiful-soup/
my_url = 'https://quotes.wsj.com/GRPS/options'

# opens up a web connection and "downloads" a copy of the desired webpage
uClient = uReq(my_url)

# dumps the information read on the webpage into a variable for later use/parsing
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "lxml")

# find the html location for the price of the stock
# <span id="quote_val">0.0008</span>
all_stock_info = page_soup.find("section",{"class":"sector cr_section_1"})
find_spans = all_stock_info.find("span",{"id":"quote_val"})
price = page_soup.findAll("span",{"id":"quote_val"})

# sanity checks to make sure the scraper is finding the correct info
print(all_stock_info)
print(len(all_stock_info))
print(len(price))
print(price)  # this gives me the right span, I just need to be able to parse
              # the price of the stock between here (in this case 0.0008) no
              # matter what the price is
print(all_stock_info.span)
print(find_spans)


1 Answer

一只甜甜圈


You can use .find together with the .text attribute to get the value you need.

Example:

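from bs4 import BeautifulSoup

# sample markup from the question; in the scraper this would be page_html
html = '<span id="quote_val">0.0008</span>'

page_soup = BeautifulSoup(html, "lxml")

# .find returns the first matching tag; .text gives the string inside it
price = page_soup.find("span",{"id":"quote_val"}).text

print( price )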

Output:

0.0008
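If the price is needed as a number rather than a string, something along these lines might work. This is only a sketch: the helper name extract_price and the SAMPLE_HTML string are made up for illustration, the markup is copied from the question rather than fetched from the live page, and it uses the built-in "html.parser" so that lxml is not required. It also guards against the span being absent, since .find returns None when nothing matches.

```python
from bs4 import BeautifulSoup

# Sample markup based on the question; the live page may look different.
SAMPLE_HTML = (
    '<section class="sector cr_section_1">'
    '<span id="quote_val">0.0008</span>'
    '</section>'
)


def extract_price(html, span_id="quote_val"):
    """Return the span's text as a float, or None if it is missing or not numeric.

    extract_price is a hypothetical helper, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", {"id": span_id})
    if span is None:
        # .find returns None when no tag matches, so .text would raise here.
        return None
    # Strip surrounding whitespace and drop thousands separators such as "1,234.50".
    text = span.get_text(strip=True).replace(",", "")
    try:
        return float(text)
    except ValueError:
        return None


price = extract_price(SAMPLE_HTML)
print(price)
```

In the scraper from the question, page_html would be passed in place of SAMPLE_HTML. Because the lookup goes by the span's id and not by character positions, it does not matter how many digits the price has.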

Answered 2021-04-27
