AttributeError: 'NoneType' object has no attribute 'get_text' in BS4 while web scraping Amazon

慕森王 2022-08-02 10:52:19
!pip install requests
!pip install bs4

import requests
from bs4 import BeautifulSoup

url = "https://www.amazon.in/Apple-iPhone-Pro-Max-256GB/dp/B07XVLH744/ref=sr_1_1_sspa?crid=2VCKZNOH3H6SR&keywords=apple+iphone+11+pro+max&qid=1582043410&sprefix=apple+iphone%2Caps%2C388&sr=8-1-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEyVjdZSE83TzU4UUMmZW5jcnlwdGVkSWQ9QTAyNTI1ODZJUzZOVUwxWDNIUlAmZW5jcnlwdGVkQWRJZD1BMDkxNDg4MzFLMFpVT1M5OFM5Q0smd2lkZ2V0TmFtZT1zcF9hdGYmYWN0aW9uPWNsaWNrUmVkaXJlY3QmZG9Ob3RMb2dDbGljaz10cnVl"
headers = {"User-Agent": "in this section im adding my user agent after typing my user agent in google search"}

page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, "html.parser")
print(soup.prettify())

title = soup.find(id = "productTitle").get_text()
price = soup.find(id = "priceblock_ourprice").get_text()

converted_price = price[0:8]
print(converted_price)
print(title)

I am working in Google Colab, and when I run this code I get this error:

AttributeError                            Traceback (most recent call last)
<ipython-input-15-14696d9dc778> in <module>()
     16 print(soup.prettify())
     17
---> 18 title = soup.find(id = "productTitle").get_text()
     19 price = soup.find(id = "priceblock_ourprice").get_text()
     20

AttributeError: 'NoneType' object has no attribute 'get_text'

I tried searching the internet but did not find an answer that solves my problem. I am trying to get the price of the iPhone 11 Pro Max. When I run this code, I get the error shown above.

4 Answers

白衣非少年

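soup.find(id = "productTitle") is returning None because it cannot find an element with id = "productTitle". Make sure you are searching for the right element.

For find calls, I suggest always writing an if condition to avoid and handle errors like this.

title = soup.find(id = "productTitle")
if title:
    title = title.get_text()
else:
    title = "default_title"

price = soup.find(id = "priceblock_ourprice").get_text()

You can do the same for price.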
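
If you end up repeating that if/else for every field, it may be easier to wrap it once. This is only a sketch: text_by_id is a name I made up, not part of BeautifulSoup, and it assumes soup is a BeautifulSoup object (anything with a find(id=...) method that returns None when nothing matches).

```python
def text_by_id(soup, element_id, default=None):
    """Return the text of the element with this id, or `default` if it is missing.

    `text_by_id` is a hypothetical helper, not a BeautifulSoup API.
    """
    element = soup.find(id=element_id)
    if element is None:
        # find() found nothing, so calling get_text() here would raise AttributeError.
        return default
    # strip=True trims the leading and trailing whitespace around the text.
    return element.get_text(strip=True)
```

With this, title = text_by_id(soup, "productTitle", "default_title") and price = text_by_id(soup, "priceblock_ourprice", "default_price") both fall back to the default instead of raising.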


Answered 2022-08-02

交互式爱情

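Well, I tested your code here and it works fine. However, when you request the same link repeatedly within a short period, Amazon responds with a 503 code...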


<html>
 <head>
  <title>
   503 - Service Unavailable Error
  </title>
 </head>
 <body bgcolor="#FFFFFF" text="#000000">
  <!--
        To discuss automated access to Amazon data please contact api-services-support@amazon.com.
        For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.in/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.in/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
  <center>
   <a href="https://www.amazon.in/ref=cs_503_logo/">
    <img alt="Amazon.in" border="0" height="45" src="https://images-eu.ssl-images-amazon.com/images/G/31/x-locale/communities/people/logo.gif" width="200"/>
   </a>
   <p align="center">
    <font face="Verdana,Arial,Helvetica">
     <font color="#CC6600" size="+2">
      <b>
       Oops!
      </b>
     </font>
     <br/>
     <b>
      It's rush hour and traffic is piling up on that page. Please try again in a short while.
      <br/>
      If you were trying to place an order, it will not have been processed at this time.
     </b>
     <p>
      <img alt="*" border="0" height="9" src="https://images-eu.ssl-images-amazon.com/images/G/02/x-locale/common/orange-arrow.gif" width="10"/>
      <b>
       <a href="https://www.amazon.in/ref=cs_503_link/">
        Go to the Amazon.in home page to continue shopping
       </a>
      </b>
     </p>
    </font>
   </p>
  </center>
 </body>
</html>

Wait a little while and then try again, or at least leave more time between requests...
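
To make the "wait and try again" idea concrete, here is a rough sketch. The function name fetch_with_retry, its parameters, and the doubling delay are all my own assumptions, not something Amazon or requests prescribes. The get argument is any callable shaped like requests.get, which returns an object with a status_code attribute.

```python
import time


def fetch_with_retry(get, url, headers=None, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call get(url, headers=headers) until the status is not 503 or attempts run out.

    Hypothetical helper: `get` is expected to behave like requests.get.
    """
    response = None
    for attempt in range(attempts):
        response = get(url, headers=headers)
        if response.status_code != 503:
            return response
        if attempt < attempts - 1:
            # Wait longer after each 503: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
    # Every attempt returned 503; hand back the last response so the caller can inspect it.
    return response
```

In the question's code this would be called as page = fetch_with_retry(requests.get, url, headers=headers), followed by a check of page.status_code before passing page.content to BeautifulSoup. Retrying does not guarantee that the site will serve the page to an automated client.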


Answered 2022-08-02

慕盖茨4494581

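You get that error when you try to extract data from an object whose value is None. If you see it on line 18, it means soup.find(id = "productTitle") did not match anything and returned None.

You need to break the processing into multiple steps. Check the returned value before accessing it. So...

title_info = soup.find(id = "productTitle")
if title_info:
    title = title_info.text
else:
    # Handle the situation here, for example log it or fall back to a default.
    title = None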
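
One possible way to "handle the situation" is to stop with a message that says what was missing, rather than letting the AttributeError surface later. This is just a sketch under my own assumptions: require_text is a made-up name, and the choice of LookupError is mine.

```python
def require_text(soup, element_id):
    """Return the text of the element with this id, or raise a descriptive error.

    `require_text` is a hypothetical helper, not a BeautifulSoup API.
    """
    element = soup.find(id=element_id)
    if element is None:
        # Nothing matched: the id may be wrong, or the server may have
        # returned a different page (for example an error page).
        raise LookupError(f"no element with id={element_id!r} in the parsed page")
    return element.get_text(strip=True)
```

Calling title = require_text(soup, "productTitle") then either gives you the text or tells you exactly which id was not found.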


Answered 2022-08-02

繁星淼淼

Try this code as well:


title = soup.find(id="productTitle")
if title:
    title = title.get_text()
else:
    title = "default_title"

price = soup.find(id="priceblock_ourprice")
if price:
    # Keep the tag itself; it is converted to a string below.
    price = price
else:
    price = "default_title"

# converted_price = price[0:8]
# Slice the price out of the tag's string form. The offsets depend on the exact markup.
convert = str(price)
con = convert[-18:-11]

print(con)
print(title)
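
Fixed slices such as price[0:8] or convert[-18:-11] only work while the string has exactly the expected length. As a possible alternative, and only as a sketch, a regular expression can pull the number out of the text. parse_price is a name I made up, and the sample string in the note below is invented for illustration, not taken from the real page.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Return the first number found in `text` as a Decimal, or None if there is none.

    Hypothetical helper: commas are treated as thousands separators and dropped.
    """
    # A digit, then any run of digits and commas, then an optional decimal part.
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return Decimal(match.group(0).replace(",", ""))
```

For an invented string such as "₹ 1,17,100.00" this gives Decimal("117100.00"), and for text without digits, such as "default_title", it gives None.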

Try using a different IDE.

Create a new repl on repl.it (https://repl.it) and run the code there.


Answered 2022-08-02