2 Answers
Contributor with 1799 experience points, more than 6 upvotes
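You are passing the wrong function as the callback. Your self.parse function only works on the login page, so the request for the next page should use self.start_scraping instead: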
if next_page is not None:
    yield response.follow(next_page, callback=self.start_scraping)
Contributor with 1869 experience points, more than 4 upvotes
This is from your execution log:
File "C:\Users\Robert\Documents\Demos\vstoolbox\scrapytest\scrapytest\spiders\quotes_spider.py", line 15, in parse
yield scrapy.FormRequest(url=self.login_url,formdata={
File "C:\Users\Robert\anaconda3\envs\condatest\lib\site-packages\scrapy\http\request\form.py", line 31, in __init__
querystr = _urlencode(items, self.encoding)
File "C:\Users\Robert\anaconda3\envs\condatest\lib\site-packages\scrapy\http\request\form.py", line 71, in _urlencode
values = [(to_bytes(k, enc), to_bytes(v, enc))
File "C:\Users\Robert\anaconda3\envs\condatest\lib\site-packages\scrapy\http\request\form.py", line 71, in <listcomp>
values = [(to_bytes(k, enc), to_bytes(v, enc))
File "C:\Users\Robert\anaconda3\envs\condatest\lib\site-packages\scrapy\utils\python.py", line 104, in to_bytes
raise TypeError('to_bytes must receive a str or bytes '
TypeError: to_bytes must receive a str or bytes object, got NoneType
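In short, it tells you that one of the values in the formdata argument is None, where "a str or bytes object" is expected. Your formdata has three fields and only one of them is a variable, so token must have come back empty.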
...
token = response.css('input[name="csrf_token"]::attr(value)').extract_first()
yield scrapy.FormRequest(url=self.login_url, formdata={
    'csrf_token': token,
    'username': 'roberthng',
    'password': 'dsadsadsa'
}, callback=self.start_scraping)
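A None value like this can be caught before the request is built. The following is only a minimal sketch in plain Python: the helper name missing_form_fields is hypothetical and not part of Scrapy, and the None value stands in for a selector that matched nothing.

```python
def missing_form_fields(formdata):
    """Return the names of form fields whose value is None."""
    return [name for name, value in formdata.items() if value is None]


# Same shape as the formdata above; None stands in for a selector
# that found no csrf_token input on the page.
formdata = {
    "csrf_token": None,
    "username": "roberthng",
    "password": "dsadsadsa",
}

missing = missing_form_fields(formdata)
if missing:
    # Report the problem here instead of letting to_bytes raise later.
    print("Fields with no value:", ", ".join(missing))
```

Inside a spider, a check of this kind could run just before the FormRequest is yielded, which makes it obvious which page the callback was actually invoked on.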
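On the login page itself, however, your selector does return the value correctly. My assumption is that when you define the request for the next page, you set its callback to your parse method (or you don't set it at all, in which case parse is used by default). I say "assumption" because you didn't post that part of the code. Your code sample stops here: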
#Go to Next Page:
next_page = response.css('li.next a::attr(href)').get()
if next_page is not None:
So make sure the request you create after this point has the correct callback function set.
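To see why reusing parse on a later page yields None, here is a rough illustration using only the standard library's html.parser as a stand-in for the CSS selector. The class and function names and both page bodies are made up for this sketch; they are not taken from the question or from Scrapy.

```python
from html.parser import HTMLParser


class CsrfTokenFinder(HTMLParser):
    """Records the value of the first <input name="csrf_token"> it sees."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_token_input = tag == "input" and attributes.get("name") == "csrf_token"
        if is_token_input and self.token is None:
            self.token = attributes.get("value")


def find_csrf_token(html):
    """Return the csrf_token value from the page, or None if there is no such input."""
    finder = CsrfTokenFinder()
    finder.feed(html)
    return finder.token


# Hypothetical page bodies, invented for illustration only.
LOGIN_PAGE = '<form><input type="hidden" name="csrf_token" value="abc123"></form>'
NEXT_PAGE = '<ul><li class="next"><a href="/page/2/">Next</a></li></ul>'

print(find_csrf_token(LOGIN_PAGE))  # abc123
print(find_csrf_token(NEXT_PAGE))   # None
```

The second page has no login form, so the lookup returns None. That is the value that would end up in formdata if the login callback ran on that page.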