I am using the code below to scrape customer reviews. Everything works the way I want except for the ratings: I can't find the right class or attribute for them, so the Ratings column always comes back blank. Can someone help me find the correct attribute and fix the ratings line?

from bs4 import BeautifulSoup
import requests
import pandas as pd
import json

print('all imported successfully')

# Initialize an empty dataframe
df = pd.DataFrame()

for x in range(1, 37):
    names = []
    headers = []
    bodies = []
    ratings = []
    published = []
    updated = []
    reported = []

    link = f'https://www.trustpilot.com/review/fabfitfun.com?page={x}'
    print(link)
    req = requests.get(link)
    content = req.content
    soup = BeautifulSoup(content, "lxml")
    articles = soup.find_all('article', {'class': 'review'})

    for article in articles:
        names.append(article.find('div', attrs={'class': 'consumer-information__name'}).text.strip())
        headers.append(article.find('h2', attrs={'class': 'review-content__title'}).text.strip())
        try:
            bodies.append(article.find('p', attrs={'class': 'review-content__text'}).text.strip())
        except:
            bodies.append('')

        try:
            #ratings.append(article.find('div', attrs={'class':'star-rating star-rating--medium'}).text.strip())
            ratings.append(article.find('div', attrs={'class': 'star-rating star-rating--medium'})['alt'])
        except:
            ratings.append('')

        dateElements = article.find('div', attrs={'class': 'review-content-header__dates'}).text.strip()
        jsonData = json.loads(dateElements)
        published.append(jsonData['publishedDate'])
        updated.append(jsonData['updatedDate'])
        reported.append(jsonData['reportedDate'])
1 Answer
交互式爱情
Just change this one line in your code:
ratings.append(article.find_all("img", alt=True)[0]["alt"])
df.Rating then outputs:
0 1 star: Bad
1 5 stars: Excellent
2 5 stars: Excellent
3 5 stars: Excellent
4 5 stars: Excellent
5 5 stars: Excellent
6 5 stars: Excellent
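It seems easier to find the img tag inside the article and read its alt text. The original line asked the star-rating div for an 'alt' attribute, which that div apparently does not have, so the bare except swallowed the error and appended an empty string every time.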
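The fix can be tried offline with a minimal sketch like the one below. The sample markup is an assumption modelled on what the answer implies (a star-rating div wrapping an img whose alt holds the rating text), not a copy of the real page, and extract_rating is a hypothetical helper name. It uses find instead of find_all(...)[0], so an article without any image yields an empty string rather than raising IndexError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: assumed structure, not taken from the live site.
SAMPLE_HTML = """
<article class="review">
  <div class="star-rating star-rating--medium">
    <img src="stars-1.svg" alt="1 star: Bad">
  </div>
</article>
<article class="review">
  <div class="star-rating star-rating--medium">
    <img src="stars-5.svg" alt="5 stars: Excellent">
  </div>
</article>
<article class="review">
  <p>No rating image here</p>
</article>
"""


def extract_rating(article):
    """Return the alt text of the first img that has an alt attribute, or ''."""
    img = article.find("img", alt=True)  # alt=True matches any img carrying an alt attribute
    return img["alt"] if img is not None else ""


# html.parser ships with Python, so lxml is not required for this sketch.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
ratings = [extract_rating(a) for a in soup.find_all("article", class_="review")]
print(ratings)  # ['1 star: Bad', '5 stars: Excellent', '']
```

In the original loop this would amount to replacing the try/except around the ratings line with ratings.append(extract_rating(article)). If the page ever places another img with an alt attribute (an avatar, for example) ahead of the stars, the search would need to be narrowed to the star-rating div first.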