URLs containing Chinese characters cannot be downloaded
No URL that contains Chinese characters can be downloaded. How can this be solved?


2019-01-12
import urllib.request
from urllib.parse import quote
import string

class HtmlDownloader(object):
    def download(self, url):
        if url is None:
            return None
        # Percent-encode non-ASCII characters; printable ASCII is left as-is
        s = quote(url, safe=string.printable)
        response = urllib.request.urlopen(s)
        if response.getcode() != 200:
            return None
        return response.read()
urllib.quote: passing Chinese parameters to a URL in Python
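To make the effect of the `quote` call concrete, here is a minimal sketch in Python 3. The helper name `encode_url` is made up for illustration; the URL is the Baidu Baike one quoted elsewhere in this thread.

```python
from urllib.parse import quote
import string

def encode_url(url):
    # Hypothetical helper: percent-encode non-ASCII characters as UTF-8 bytes,
    # leaving every printable ASCII character (including '/', ':' and '%') untouched.
    return quote(url, safe=string.printable)

raw = "https://baike.baidu.com/item/阿姆斯特丹/2259975"
encoded = encode_url(raw)
print(encoded)

# Because '%' is in the safe set, an already-encoded URL is not encoded twice.
print(encode_url(encoded) == encoded)
```

The encoded string is plain ASCII, so it can be passed to `urllib.request.urlopen` without the encoding error that a raw Chinese URL triggers.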
    def _get_new_urls(self, page_url, soup):
        new_urls = set()
        # <a target="_blank" href="/item/%E9%98%BF%E5%A7%86%E6%96%AF%E7%89%B9%E4%B8%B9/2259975" data-lemmaid="2259975">阿姆斯特丹</a>
        # https://baike.baidu.com/item/阿姆斯特丹/2259975
        links = soup.find_all('a', href=re.compile(r"/item/"))
        for link in links:
            new_url = '/item/' + link.get_text()
            new_full_url = urlparse.urljoin(page_url, new_url)
            new_urls.add(new_full_url)
        return new_urls

I wrote it the same way. Is there something wrong with it?
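One thing that may be worth checking, as a sketch rather than a definitive diagnosis: building the path from the link text drops the numeric id and leaves the Chinese characters unencoded, whereas the `href` attribute in the sample `<a>` tag is already percent-encoded. The snippet below only uses `urljoin` from the standard library; `page_url` is a hypothetical starting page, and the `href` and text values are copied from the tag in the comment above.

```python
from urllib.parse import urljoin

page_url = "https://baike.baidu.com/item/Python"  # hypothetical page being parsed
href = "/item/%E9%98%BF%E5%A7%86%E6%96%AF%E7%89%B9%E4%B8%B9/2259975"  # value of the href attribute
text = "阿姆斯特丹"  # what link.get_text() would return for that tag

# Built from the link text: raw Chinese characters, and the "/2259975" id is lost.
from_text = urljoin(page_url, "/item/" + text)

# Built from the href attribute: already percent-encoded, id preserved.
from_href = urljoin(page_url, href)

print(from_text)
print(from_href)
```

With BeautifulSoup, the attribute is available as `link['href']`, so joining that value instead of `link.get_text()` would yield the second form.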