In the scrapy-redis framework, everything under the `xxx:requests` key stored in redis has been crawled, but the program keeps running. How can the program be stopped automatically instead of idling forever?

```
2017-07-03 09:17:06 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-07-03 09:18:06 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
```

[For reference only] You can stop the crawl by calling `engine.close_spider(spider, 'reason')`. For example, in scrapy_redis's scheduler.py, close the spider when the queue pop returns nothing:

```python
# scheduler.py
def next_request(self):
    block_pop_timeout = self.idle_before_close
    request = self.queue.pop(block_pop_timeout)
    if request and self.stats:
        self.stats.inc_value('scheduler/dequeued/redis', spider=self.spider)
    if request is None:
        self.spider.crawler.engine.close_spider(self.spider, 'queue is empty')
    return request
```

Of course, you can also do the same in the `next_requests` method of the spiders.py module in scrapy_redis (the quoted snippet was cut off mid-loop; the loop body below is completed from the scrapy_redis source, with the close_spider call as the suggested addition at the point where the queue turns out to be empty):

```python
# spiders.py
def next_requests(self):
    """Returns a request to be scheduled or none."""
    use_set = self.settings.getbool('REDIS_START_URLS_AS_SET',
                                    defaults.START_URLS_AS_SET)
    fetch_one = self.server.spop if use_set else self.server.lpop
    # XXX: Do we need to use a timeout here?
    found = 0
    # TODO: Use redis pipeline execution.
    while found < self.redis_batch_size:
        data = fetch_one(self.redis_key)
        if not data:
            # Queue is empty: stop the spider instead of idling (added line).
            self.crawler.engine.close_spider(self, 'queue is empty')
            break
        req = self.make_request_from_data(data)
        if req:
            yield req
            found += 1
```
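Note that the scheduler patch only fires when `self.queue.pop(block_pop_timeout)` returns `None`; `idle_before_close` is read from scrapy_redis's `SCHEDULER_IDLE_BEFORE_CLOSE` setting (0 by default) and passed as the pop timeout above. How long the pop actually waits depends on the queue class in your version; the list-based FIFO/LIFO queues use redis blocking pops. A minimal configuration sketch, assuming the stock scrapy_redis scheduler:

```python
# settings.py -- a sketch, assuming the stock scrapy_redis scheduler.
# A non-zero SCHEDULER_IDLE_BEFORE_CLOSE gives the queue a grace period to
# receive new requests before pop() returns None, which is what triggers
# the close_spider call in the patched next_request above.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
SCHEDULER_IDLE_BEFORE_CLOSE = 10  # seconds to wait for new requests
```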
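If you would rather not edit the installed scrapy_redis package, the same idea can live in your own spider instead. A minimal sketch, not from the original answer, assuming a hypothetical spider `MySpider` reading from the hypothetical key `myspider:start_urls`:

```python
# myspider.py -- sketch: subclass RedisSpider and close the spider once the
# redis key stops yielding data, instead of patching scrapy_redis itself.
from scrapy_redis.spiders import RedisSpider


class MySpider(RedisSpider):
    name = 'myspider'                  # hypothetical spider name
    redis_key = 'myspider:start_urls'  # hypothetical redis key

    def parse(self, response):
        # Normal parsing logic would go here.
        pass

    def next_requests(self):
        found = 0
        # Delegate to the stock generator first.
        for req in super(MySpider, self).next_requests():
            found += 1
            yield req
        # Nothing was fetched and the key no longer exists in redis:
        # close the spider instead of idling forever.
        if found == 0 and not self.server.exists(self.redis_key):
            self.crawler.engine.close_spider(self, 'redis queue is empty')
```

Either way the shutdown is graceful: in-flight requests are allowed to finish, and the reason string you pass to `close_spider` appears in the spider's closing log line.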