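Decode HTML entities in a Python string?

I'm parsing some HTML with Beautiful Soup 3, but it contains HTML entities that Beautiful Soup 3 does not automatically decode for me:

>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup("<p>&pound;682m</p>")
>>> text = soup.find("p").string
>>> print text
&pound;682m

How can I decode the HTML entities in text to get "£682m" instead of "&pound;682m"?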
4 Answers
慕村9548890
Contributed 1884 experience points · received over 4 likes
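Beautiful Soup can convert the entities for you.

Beautiful Soup 3

Pass the convertEntities argument to the BeautifulSoup constructor:

>>> from BeautifulSoup import BeautifulSoup
>>> BeautifulSoup("<p>&pound;682m</p>",
...               convertEntities=BeautifulSoup.HTML_ENTITIES)
<p>£682m</p>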
Beautiful Soup 4

Entities are decoded automatically:

>>> from bs4 import BeautifulSoup
>>> BeautifulSoup("<p>&pound;682m</p>")
<html><body><p>£682m</p></body></html>
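
If the text has already been pulled out of the markup, the standard library may be enough on its own. A minimal sketch, assuming Python 3.4 or later (where `html.unescape` is available) rather than the Python 2 shown in the sessions above; the variable names are only illustrative:

```python
import html

# Entity-encoded text, as in the question.
raw = "&pound;682m"

# html.unescape converts named and numeric character references
# into the corresponding Unicode characters.
text = html.unescape(raw)
print(text)
```

This avoids constructing a parser when only a plain string needs decoding.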
千万里不及你
Contributed 1784 experience points · received over 9 likes
In [202]: from w3lib.html import replace_entities

In [203]: replace_entities("&pound;682m")
Out[203]: u'\xa3682m'

In [204]: print replace_entities("&pound;682m")
£682m
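
To see roughly what an entity-replacing helper has to do, here is a small hand-rolled sketch for Python 3. It is not w3lib's implementation; `decode_entities` is a hypothetical name, and the sketch only covers references that end in a semicolon.

```python
import re
from html.entities import name2codepoint

# Matches &name;, decimal &#163; and hexadecimal &#xa3; references.
_ENTITY_RE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def decode_entities(text):
    """Hypothetical helper: replace HTML entities with Unicode characters."""

    def _replace(match):
        body = match.group(1)
        try:
            if body[:2] in ("#x", "#X"):
                return chr(int(body[2:], 16))
            if body.startswith("#"):
                return chr(int(body[1:]))
        except (ValueError, OverflowError):
            # Code point out of range: leave the reference as written.
            return match.group(0)
        codepoint = name2codepoint.get(body)
        # Unknown entity names are left untouched.
        return chr(codepoint) if codepoint is not None else match.group(0)

    return _ENTITY_RE.sub(_replace, text)


print(decode_entities("&pound;682m"))
```

In practice a maintained function such as `replace_entities` is the safer choice, since it also deals with edge cases this sketch ignores.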