
Unable to download a web page in .NET


C#
料青山看我应如是 2022-11-13 15:03:19
I wrote a batch job that parses HTML pages from gearbest.com to extract item data (example link). It worked until 2-3 weeks ago, when the site was updated. Since then I cannot download the pages I need to parse, and I don't know why.

Before the update, I made the request with HtmlAgilityPack using the following code:

HtmlWeb web = new HtmlWeb();
HtmlDocument doc = null;
doc = web.Load(url); // this is now the point where the exception is thrown

I also tried without the library, adding some headers to the request:

HttpWebRequest request = (HttpWebRequest) WebRequest.Create("https://it.gearbest.com/tv-box/pp_009940949913.html");
request.Credentials = CredentialCache.DefaultCredentials;
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36";
request.ContentType = "text/html; charset=UTF-8";
request.CookieContainer = new CookieContainer();
request.Headers.Add("accept-language", "it-IT,it;q=0.9,en-US;q=0.8,en;q=0.7");
request.Headers.Add("accept-encoding", "gzip, deflate, br");
request.Headers.Add("upgrade-insecure-requests", "1");
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8";
WebResponse response = request.GetResponse(); // exception

The exceptions are:

IOException: Unable to read data from the transport connection.
SocketException: The connection could not be established.

If I request the home page (https://it.gearbest.com) it works. What do you think the problem is?

2 Answers

九州编程


For some reason, the server does not like the supplied user agent. If you omit setting UserAgent, everything works:


HttpWebRequest request = (HttpWebRequest) WebRequest.Create("https://it.gearbest.com/tv-box/pp_009940949913.html");

request.Credentials = CredentialCache.DefaultCredentials;

//request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36";

request.ContentType = "text/html; charset=UTF-8";

Another solution is to keep the user agent and set request.Connection to an arbitrary string (anything except keep-alive or close):


request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36";

request.Connection = "random value";

This also works, but I can't explain why.
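For readers who want to try both workarounds side by side, here is a minimal sketch that wraps them in one helper. The names PageDownloader, BuildRequest and Download, and the userAgent and connectionToken parameters, are hypothetical and not part of the answer. The AutomaticDecompression line is also an addition: it is there only so that a compressed response comes back as readable text. Whether either workaround still succeeds against the site is not something this sketch can guarantee.

```csharp
using System;
using System.IO;
using System.Net;

static class PageDownloader
{
    // Builds the request only; no network traffic happens in this method.
    // userAgent: pass null to leave the User-Agent header unset (first workaround).
    // connectionToken: pass a token other than "keep-alive" or "close" to set
    // the Connection header (second workaround), or null to leave it alone.
    public static HttpWebRequest BuildRequest(string url, string userAgent, string connectionToken)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = new CookieContainer();
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

        // Assumption, not from the answer: let the framework decompress
        // gzip/deflate responses transparently.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;

        if (userAgent != null)
        {
            request.UserAgent = userAgent;
        }
        if (connectionToken != null)
        {
            // Throws ArgumentException if the value contains keep-alive or close.
            request.Connection = connectionToken;
        }
        return request;
    }

    // Sends the request and returns the response body as text.
    public static string Download(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling PageDownloader.BuildRequest(url, null, null) corresponds to the first workaround, and PageDownloader.BuildRequest(url, someUserAgent, "random value") to the second; pass the result to PageDownloader.Download to fetch the page.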


翻翻过去那场雪


Maybe it's worth trying this, on the same request object as in the question:


request.KeepAlive = false;

request.ProtocolVersion = HttpVersion.Version10;
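As a rough illustration of how this suggestion could be tested, the sketch below applies the two settings and reports what comes back. The class name LegacyStyleRequest and the methods Create and TryGetStatus are hypothetical. The WebException handling is an addition meant only to distinguish a transport failure (like the one in the question) from an HTTP error status.

```csharp
using System;
using System.Net;

static class LegacyStyleRequest
{
    // Builds a request that does not reuse the connection and speaks HTTP/1.0.
    // No network traffic happens in this method.
    public static HttpWebRequest Create(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.KeepAlive = false;
        request.ProtocolVersion = HttpVersion.Version10;
        return request;
    }

    // Returns the HTTP status code, or null when the transport fails before
    // any HTTP response arrives.
    public static HttpStatusCode? TryGetStatus(HttpWebRequest request)
    {
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // ex.Status tells a connection-level failure apart from a protocol error.
            Console.Error.WriteLine(ex.Status + ": " + ex.Message);

            // ex.Response is non-null only when the server sent an error status.
            using (var errorResponse = ex.Response as HttpWebResponse)
            {
                return errorResponse != null ? errorResponse.StatusCode : (HttpStatusCode?)null;
            }
        }
    }
}
```

A null result from TryGetStatus means the connection itself failed, which is the situation described in the question; a non-null error code would instead point to the server rejecting the request at the HTTP level.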

