The crawl fails with an error involving the spider_main and html_outputer modules.
The output is as follows:
craw 1 :https://baike.baidu.com/item/Python/407313?fr=aladdin
Traceback (most recent call last):
  File "C:\Users\Administrator\workspace\python_spider\spider_main.py", line 48, in <module>
    obj_spider.craw(root_url)
  File "C:\Users\Administrator\workspace\python_spider\spider_main.py", line 37, in craw
    self.outputer.output_html()
  File "C:\Users\Administrator\workspace\python_spider\html_outputer.py", line 28, in output_html
    fout.write("<td>%s</td>"% data['summary'])
UnicodeEncodeError: 'gbk' codec can't encode character u'\xa0' in position 14: illegal multibyte sequence
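
Reading the last line of the traceback: the page summary contains a non-breaking space (u'\xa0'), which GBK cannot represent, so the write fails when the text is encoded as GBK. One likely fix is to open the output file as UTF-8 instead. The sketch below is only an illustration under assumptions: `write_summary`, the sample text, and the temporary `output.html` path are hypothetical and are not taken from html_outputer.py.

```python
import io
import os
import tempfile


def write_summary(path, summary):
    # Hypothetical helper, not the original output_html():
    # io.open with an explicit encoding behaves the same on Python 2 and 3,
    # and UTF-8 can represent u'\xa0' where GBK cannot.
    with io.open(path, "w", encoding="utf-8") as fout:
        fout.write(u"<td>%s</td>" % summary)


# Sample text containing the character named in the traceback (assumed data).
text = u"Python\xa0is a language"

# Reproduce the failure: encoding this text as GBK raises UnicodeEncodeError.
try:
    text.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError:
    gbk_ok = False

# Writing the same text through a UTF-8 file handle succeeds.
out_path = os.path.join(tempfile.mkdtemp(), "output.html")
write_summary(out_path, text)

with io.open(out_path, encoding="utf-8") as fin:
    written = fin.read()
```

If the file is written as UTF-8, the generated HTML probably also needs a `<meta charset="utf-8">` tag in its head so that browsers decode it correctly. If the output really has to stay GBK, replacing the character first, for example `summary.replace(u'\xa0', u' ')`, is another option.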