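The run crawled only one page and then stopped. After I removed the try block, it reported the error below.
I get the same error as you. After removing the try block, the traceback points at the get_text() call in html_parser: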
Traceback (most recent call last):
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\spider_main.py", line 41, in <module>
    obj_spider.craw(root_url)  # start the crawler
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\spider_main.py", line 23, in craw
    new_urls, new_data = self.parser.parse(new_url, html_cont)
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\html_parser.py", line 40, in parse
    new_data = self._get_new_data(page_url, soup)
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\html_parser.py", line 27, in _get_new_data
    res_data['title'] = title_node.get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'
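
The last line of the traceback says title_node is None at the point where get_text() is called. With BeautifulSoup, find() returns None when no tag matches, so the most likely cause is that the selector used to locate the title no longer matches the page that was downloaded (this is a guess, since the selector itself is not shown in the thread). A minimal sketch of a guard is below. The helper name node_text and the FakeNode class are made up for illustration and are not part of the original spider or of BeautifulSoup:

```python
def node_text(node, default=None):
    """Return node.get_text(), or `default` when the lookup found nothing."""
    if node is None:
        # find() returns None when no tag matches; avoid calling a method on it
        return default
    return node.get_text()


class FakeNode:
    """Hypothetical stand-in for a BeautifulSoup Tag, used only in this demo."""

    def __init__(self, text):
        self._text = text

    def get_text(self):
        return self._text


res_data = {}
title_node = None  # simulates a find() call that matched nothing
res_data['title'] = node_text(title_node)
print(res_data)

# With a node that was found, the text comes through unchanged
print(node_text(FakeNode("Python")))
```

In _get_new_data this would replace the direct title_node.get_text() call. The guard only stops the crash. To actually get titles back, print or save html_cont and check which tag and class currently hold the title, then update the selector to match.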