The program only prints the root URL and then "craw failed".
I've compared my code against the teacher's and it is identical.
2018-11-18
I'm getting the same error as you. After removing the try block, the traceback shows that the call to get_text() in html_parser is what fails:
Traceback (most recent call last):
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\spider_main.py", line 41, in <module>
    obj_spider.craw(root_url) # start the crawler
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\spider_main.py", line 23, in craw
    new_urls, new_data =self.parser.parse(new_url,html_cont)
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\html_parser.py", line 40, in parse
    new_data = self._get_new_data(page_url,soup)
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\html_parser.py", line 27, in _get_new_data
    res_data['title'] =title_node.get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'
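
The error means title_node is None, so the lookup that produced it matched nothing on the page. BeautifulSoup's find() returns None in that case. One possible cause (a guess, since the selector isn't shown here) is that the page's HTML no longer matches the selector used in the course. Below is a minimal sketch of guarding against the None before calling get_text(). The names safe_text and fill_title are hypothetical helpers, not part of the course code:

```python
def safe_text(node, default=None):
    """Return node.get_text(), or `default` when the lookup matched nothing.

    BeautifulSoup's find() returns None when no element matches, and calling
    get_text() on None raises the AttributeError shown in the traceback.
    """
    if node is None:
        return default
    return node.get_text()


def fill_title(res_data, title_node, page_url):
    """Store the title in res_data, reporting pages where it was not found.

    Hypothetical helper: it mirrors the failing assignment in _get_new_data
    but does not crash when title_node is None.
    """
    title = safe_text(title_node)
    if title is None:
        # The selector did not match; compare it with the page's current HTML.
        print("no title node found on %s" % page_url)
    res_data['title'] = title
    return res_data
```

In _get_new_data, the failing line could then become something like fill_title(res_data, title_node, page_url). This only stops the crash. To actually get the title back, the selector that produces title_node still has to be checked against the HTML the page serves today.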