I want to read the integers stored in the count tags. This is the code I wrote:
import xml.etree.ElementTree as ET
import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
url = 'http://py4e-data.dr-chuck.net/comments_42.xml'
content1 = urllib.request.urlopen(url, context = ctx).read()
soup = BeautifulSoup(content1, 'html.parser')
tree = ET.fromstring(soup)
tags = tree.findall('count')
print(tags)
It throws an error:
Traceback (most recent call last):
  File "C:\Users\Name\Desktop\Py4e\Me\Assi_15_01.py", line 15, in <module>
    tree = ET.fromstring(soup)
  File "C:\Users\Name\AppData\Local\Programs\Python\Python38-32\lib\xml\etree\ElementTree.py", line 1320, in XML
    parser.feed(text)
TypeError: a bytes-like object is required, not 'BeautifulSoup'
What can I do? More information: http://py4e-data.dr-chuck.net/comments_42.xml
2 Answers
SMILET
Contributed 1796 experience points, received over 4 likes
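There is no need for xml.etree; BeautifulSoup alone can select all the <count> tags:
import requests
from bs4 import BeautifulSoup
url = 'http://py4e-data.dr-chuck.net/comments_42.xml'
# Download the document and hand the raw bytes to BeautifulSoup
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
# select() takes a CSS selector; 'count' matches every <count> tag
for c in soup.select('count'):
    print(int(c.text))
Prints: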
97
97
90
90
88
87
87
80
79
79
78
76
76
72
72
66
66
65
65
64
61
61
59
58
57
57
54
51
49
47
40
38
37
36
36
32
25
24
22
21
19
18
18
14
12
12
9
7
3
2
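As a side note on the TypeError in the question: ET.fromstring expects a str or bytes, not a BeautifulSoup object, so if you would rather stay with xml.etree, the downloaded bytes can be passed to it directly. The following is only a minimal sketch under that assumption. The inline `sample` is a hypothetical stand-in for the downloaded file, and its nesting (count elements sitting below the root rather than directly under it) is assumed, which is why the path is './/count' instead of 'count'.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample standing in for the downloaded bytes;
# the nesting of <count> below the root is an assumption.
sample = b"""<commentinfo>
  <comments>
    <comment><name>A</name><count>97</count></comment>
    <comment><name>B</name><count>90</count></comment>
    <comment><name>C</name><count>88</count></comment>
  </comments>
</commentinfo>"""

# fromstring accepts str or bytes, so no BeautifulSoup step is needed.
root = ET.fromstring(sample)

# './/count' searches all descendants; a bare 'count' would only
# match direct children of the root element.
counts = [int(el.text) for el in root.findall('.//count')]
print(counts)
```

In the original script this would correspond to calling ET.fromstring(content1) instead of ET.fromstring(soup).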
白衣非少年
Contributed 1155 experience points, received over 0 likes
I don't think you need ElementTree. Just switch BeautifulSoup to the lxml parser (change 'html.parser' to 'lxml') and call the find_all method on the soup instead of on the tree (i.e. soup.find_all('count')).
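A rough sketch of what this suggestion could look like is below. The inline `sample` is a hypothetical stand-in for the downloaded document so the snippet runs offline, and the fallback to 'html.parser' is an addition for machines where the optional lxml package is not installed. Note that the BeautifulSoup method is spelled find_all.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical inline sample standing in for the downloaded document.
sample = """<commentinfo>
  <comments>
    <comment><name>A</name><count>97</count></comment>
    <comment><name>B</name><count>90</count></comment>
    <comment><name>C</name><count>88</count></comment>
  </comments>
</commentinfo>"""

try:
    # 'lxml' requires the third-party lxml package.
    soup = BeautifulSoup(sample, 'lxml')
except FeatureNotFound:
    # Fall back to the parser that ships with Python.
    soup = BeautifulSoup(sample, 'html.parser')

# find_all returns every matching tag, however deeply nested.
counts = [int(tag.text) for tag in soup.find_all('count')]
print(counts)
```

With the real file, `sample` would be replaced by the bytes returned from the HTTP request, as in the question.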