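Do you know why you only got one record in the output? Look closely at the suffix of the Baidu Baike link, [/view/21087.htm]: it ends in .htm, not .html. Raise your hand if you made this mistake.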
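To make the point concrete, here is a minimal sketch of the difference. It assumes the crawler filters links with a regular expression; the pattern and variable names below are illustrative, not taken from anyone's posted code.

```python
import re

# Entry links on the page end in ".htm", so the filter must match that suffix.
correct_pattern = re.compile(r"/view/\d+\.htm$")
# The mistaken variant: it looks for ".html", which these links never have.
wrong_pattern = re.compile(r"/view/\d+\.html$")

link = "/view/21087.htm"

matches_correct = bool(correct_pattern.search(link))
matches_wrong = bool(wrong_pattern.search(link))

print(matches_correct)  # True
print(matches_wrong)    # False
```

With the ".html" pattern no link on the page matches, so no new URLs are queued and the crawl stops after the root page.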
2016-10-02
Getting a web page's source in Python 3.x
import urllib.request

url = 'http://www.baidu.com'
# urlopen returns a response object, not a request
response = urllib.request.urlopen(url)
# read() returns bytes; decode them into text
html = response.read().decode('utf8')
print(html)
2016-10-02
Accepted answer / NoBB_
If you use Eclipse, open it, go to Help -> Eclipse Marketplace, search for PyDev, and install it. Or skip Eclipse and just download PyCharm, which I find works well too.
2016-10-01
File "E:/download/untitled/baike/spider_main.py", line 33, in <module>
obj_spider.craw(root_url)
File "E:/download/untitled/baike/spider_main.py", line 20, in craw
new_urls,new_data=self.parser.parse(new_url,html_cont)
TypeError: 'NoneType' object is not iterable
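One likely cause of this error, offered as a guess since the parser code was not posted: `parse` returned `None` on some path (for example a bare `return` when the downloaded page is `None`, or a missing `return` at the end), and the caller then tried to unpack that `None` into two names. The sketch below reproduces the pattern with hypothetical functions `parse_broken` and `parse_safe`; they are stand-ins, not the real `parser.parse`.

```python
def parse_broken(page_url, html_cont):
    # Hypothetical stand-in: a bare return yields None, which cannot be unpacked.
    if page_url is None or html_cont is None:
        return
    return set(), {"url": page_url}


def parse_safe(page_url, html_cont):
    # Always return a 2-tuple so the caller can unpack it safely.
    if page_url is None or html_cont is None:
        return set(), None
    return set(), {"url": page_url}


# Unpacking None raises TypeError, as in the traceback above.
caught = None
try:
    new_urls, new_data = parse_broken("http://example.com", None)
except TypeError as exc:
    caught = exc

# The safe version unpacks cleanly even when the download failed.
safe_urls, safe_data = parse_safe("http://example.com", None)
print(type(caught).__name__, safe_urls, safe_data)
```

If this matches your case, also check why `html_cont` is `None` in the first place, for instance whether the downloader returns nothing when the request fails.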