1 Answer
You can try filtering by class:
posters = soup.find_all("img", {"class": "lazyloaded"})
for poster in posters:
    print(poster["src"])
See the documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#searching-by-css-class
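As a minimal sketch of how this class filter behaves: Beautiful Soup treats `class` as a multi-valued attribute, so the filter matches any tag whose class list contains `lazyloaded`, even when other classes are present. The inline markup and the file names `logo.jpg` and `poster.jpg` below are made up for illustration. The `class_` keyword and the CSS selector are alternative spellings of the same filter.

```python
from bs4 import BeautifulSoup

# Hypothetical inline markup standing in for the real page.
html = """
<img class="logo" src="logo.jpg">
<img class="poster lazyload lazyloaded" src="poster.jpg">
"""

soup = BeautifulSoup(html, "html.parser")

# Three ways to match <img> tags whose class list contains "lazyloaded".
by_dict = soup.find_all("img", {"class": "lazyloaded"})
by_keyword = soup.find_all("img", class_="lazyloaded")
by_selector = soup.select("img.lazyloaded")

srcs = [img["src"] for img in by_dict]
print(srcs)  # ['poster.jpg']
```

The logo image is skipped because its class list does not include `lazyloaded`.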
Edit: more explanation
Suppose you have the following file, demo.html:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Title</title>
</head>
<body>
<img class="logo" src="https://image.tmdb.org/t/p/w94_and_h141_bestv2/3qlQM9KP1cyvNfPChA9rASASdHr.jpg">
<img class="poster lazyload lazyloaded"
data-src="https://image.tmdb.org/t/p/w94_and_h141_bestv2/3qlQM9KP1cyvNfPChA9rASASdHr.jpg"
data-srcset="https://image.tmdb.org/t/p/w94_and_h141_bestv2/3qlQM9KP1cyvNfPChA9rASASdHr.jpg 1x, https://image.tmdb.org/t/p/w188_and_h282_bestv2/3qlQM9KP1cyvNfPChA9rASASdHr.jpg 2x"
alt="Hitman"
src="https://image.tmdb.org/t/p/w94_and_h141_bestv2/3qlQM9KP1cyvNfPChA9rASASdHr.jpg"
srcset="https://image.tmdb.org/t/p/w94_and_h141_bestv2/3qlQM9KP1cyvNfPChA9rASASdHr.jpg 1x, https://image.tmdb.org/t/p/w188_and_h282_bestv2/3qlQM9KP1cyvNfPChA9rASASdHr.jpg 2x"
data-loaded="true">
</body>
</html>
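You can parse the "poster" images like this:
import io
from bs4 import BeautifulSoup

# Read the sample page and parse it with Python's built-in HTML parser.
with io.open("demo.html", encoding="utf8") as fd:
    soup = BeautifulSoup(fd.read(), features="html.parser")

# Keep only <img> tags whose class list includes "lazyloaded".
posters = soup.find_all("img", {"class": "lazyloaded"})
for poster in posters:
    print(poster["src"])
You get: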
https://image.tmdb.org/t/p/w94_and_h141_bestv2/3qlQM9KP1cyvNfPChA9rASASdHr.jpg
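One caveat, offered as an assumption rather than something the answer states: lazy-loading scripts typically add the `lazyloaded` class and the `src` attribute only after JavaScript has run. If the HTML was fetched without a browser, those may be missing and the URL may sit only in `data-src`. The sketch below falls back to `data-src` in that case. The helper `poster_url` and the example.com URLs are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first image has not been lazy-loaded yet,
# so it has no src attribute and no "lazyloaded" class.
html = """
<img class="poster lazyload" data-src="https://example.com/a.jpg">
<img class="poster lazyload lazyloaded"
     data-src="https://example.com/b.jpg"
     src="https://example.com/b.jpg">
"""

soup = BeautifulSoup(html, "html.parser")


def poster_url(img):
    """Return src if present, otherwise fall back to data-src (or None)."""
    return img.get("src") or img.get("data-src")


# Match on the "poster" class, which is present in both states.
urls = [poster_url(img) for img in soup.find_all("img", class_="poster")]
print(urls)  # ['https://example.com/a.jpg', 'https://example.com/b.jpg']
```

Using `img.get(...)` instead of `img[...]` avoids a `KeyError` when an attribute is absent.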