from bs4 import BeautifulSoup, CData
txt = '''<description><![CDATA[ <p>This is a test post with a few emotes <img src="https://sjc5.discourse-cdn.com/try/images/emoji/twitter/grin.png?v=9" title=":grin:" class="emoji" alt=":grin:"> <img src="https://sjc5.discourse-cdn.com/try/images/emoji/twitter/heart.png?v=9" title=":heart:" class="emoji" alt=":heart:"></p> ]]></description>'''
# load main soup:
soup = BeautifulSoup(txt, 'html.parser')
# find CDATA inside <description>, make new soup
soup2 = BeautifulSoup(soup.find('description').find(string=lambda t: isinstance(t, CData)), 'html.parser')
# replace <img> with their alt=...
for img in soup2.select('img'):
    img.replace_with(img['alt'])
# print text
print(soup2.p.text)
Prints:
This is a test post with a few emotes :grin: :heart: