
Before I removed the try, the spider only crawled one page; after removing it, I get the error below. Experts, please help!

craw 1 : http://baike.baidu.com/item/Python/407313
Traceback (most recent call last):
  File "D:/pythonlesson1/baike_spider/spider_main.py", line 33, in <module>
    obj_spider.craw(root_url)
  File "D:/pythonlesson1/baike_spider/spider_main.py", line 19, in craw
    html_cont = self.downloader.download(new_url)
  File "D:\pythonlesson1\baike_spider\html_downloader.py", line 9, in download
    response = urllib.request.urlopen(url)
  File "D:\anaconda3.7\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "D:\anaconda3.7\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "D:\anaconda3.7\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "D:\anaconda3.7\lib\urllib\request.py", line 563, in error
    result = self._call_chain(*args)
  File "D:\anaconda3.7\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "D:\anaconda3.7\lib\urllib\request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "D:\anaconda3.7\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "D:\anaconda3.7\lib\urllib\request.py", line 548, in _open
    'unknown_open', req)
  File "D:\anaconda3.7\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "D:\anaconda3.7\lib\urllib\request.py", line 1387, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: https>


3 answers

Alternatively, in 'html_parser', change the link search to:

links = soup.find_all('a', href=re.compile(r"/item/.*"))
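
To show which links that pattern picks up, here is a minimal sketch that uses only the standard library (`html.parser` in place of BeautifulSoup, so it runs without extra installs). The class name `ItemLinkCollector`, the helper `collect_item_links`, and the sample HTML in the usage note are made up for illustration; they are not part of the course code.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

ITEM_LINK = re.compile(r"/item/.*")  # same pattern as in the answer above


class ItemLinkCollector(HTMLParser):
    """Collect absolute URLs of <a href> values that match ITEM_LINK."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # search() rather than match(): the pattern may appear anywhere in href
        if href and ITEM_LINK.search(href):
            # hrefs on the page are relative, so resolve them against the page URL
            self.urls.add(urljoin(self.page_url, href))


def collect_item_links(page_url, html):
    """Return the set of absolute /item/ URLs found in the given HTML."""
    collector = ItemLinkCollector(page_url)
    collector.feed(html)
    return collector.urls
```

For example, feeding it `<a href="/item/Guido">G</a><a href="/view/123.htm">old</a>` with the page URL from the question keeps only the `/item/Guido` link, resolved to an absolute URL.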



Try adding the following to 'html_downloader':

import ssl

ssl._create_default_https_context = ssl._create_unverified_context
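
One caveat worth checking first: the override above turns off certificate verification, which helps with certificate errors. The message in the traceback, `unknown url type: https`, is what urllib raises when no HTTPS handler is registered at all, and that usually points to the `ssl` module not loading in that interpreter. A quick check, as a sketch (the function name `https_supported` is made up here):

```python
import urllib.request


def https_supported():
    """Return True if this interpreter's urllib can open https:// URLs.

    urllib.request only defines HTTPSHandler when the ssl module imported
    successfully, so its absence explains "unknown url type: https".
    """
    return hasattr(urllib.request, "HTTPSHandler")


if __name__ == "__main__":
    if https_supported():
        print("HTTPS handler available")
    else:
        try:
            import ssl  # noqa: F401  (imported only to surface the error)
        except ImportError as exc:
            print("ssl module failed to import:", exc)
```

If this reports that `ssl` cannot be imported on an Anaconda install, running the script from an activated conda environment is a commonly reported remedy, because the OpenSSL DLLs need to be on PATH. Treat that as a lead to verify, not a confirmed diagnosis of this setup.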

https://github.com/DaddySheng/Python_craw_test1/blob/master/Python3_craw_code.py

The whole point of the try is to skip pages that cannot be crawled, so naturally things break once you take it out.

The code at the link above can crawl the current version of the site, and it is written for Python 3.
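
A middle ground between removing the try and letting it hide the cause is to keep it but record why each URL failed. The sketch below is an assumption about how that could look, not the course's `spider_main.py`: `crawl_all` and its `download` parameter are hypothetical names, and `download` stands in for something like `self.downloader.download`.

```python
import urllib.error


def crawl_all(urls, download):
    """Try every URL; return (pages, failures) instead of stopping early.

    `download` is any callable that takes a URL and returns its content.
    """
    pages, failures = {}, {}
    for count, url in enumerate(urls, start=1):
        print("craw %d : %s" % (count, url))
        try:
            pages[url] = download(url)
        except urllib.error.URLError as exc:
            # Keep going, but remember the reason so the error is not hidden.
            failures[url] = str(exc.reason)
    return pages, failures
```

With this shape, a run that "only crawls one page" would still finish, and `failures` would show a reason such as `unknown url type: https` for each URL that could not be fetched.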
