2 Answers
Contributor with 1828 experience points, more than 3 upvotes
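You are right: the problem comes from newline characters. As you can see, your Printf call does not add any \n, yet the output has one at the beginning and one at the end.
You can use strings.Trim to remove these newlines. Here is an example that uses the sitemap you are trying to parse. Once the string is trimmed, you can pass it to http.Get without an error.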
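func main() {
	// bytes holds the sitemap index body, read as in the question's code.
	var s SitemapIndex
	xml.Unmarshal(bytes, &s)
	for _, Location := range s.Locations {
		// Strip the newlines that surround the URL inside <loc>.
		loc := strings.Trim(Location.Loc, "\n")
		fmt.Printf("Location: %s\n", loc)
	}
}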
As expected, this code prints the locations without any stray newlines:
Location: https://www.washingtonpost.com/news-sitemaps/politics.xml
Location: https://www.washingtonpost.com/news-sitemaps/opinions.xml
Location: https://www.washingtonpost.com/news-sitemaps/local.xml
Location: https://www.washingtonpost.com/news-sitemaps/sports.xml
Location: https://www.washingtonpost.com/news-sitemaps/national.xml
Location: https://www.washingtonpost.com/news-sitemaps/world.xml
Location: https://www.washingtonpost.com/news-sitemaps/business.xml
Location: https://www.washingtonpost.com/news-sitemaps/technology.xml
Location: https://www.washingtonpost.com/news-sitemaps/lifestyle.xml
Location: https://www.washingtonpost.com/news-sitemaps/entertainment.xml
Location: https://www.washingtonpost.com/news-sitemaps/goingoutguide.xml
The newlines in the Location.Loc field come from the XML that this URL returns. Its entries have the following form:
<sitemap>
<loc>
https://www.washingtonpost.com/news-sitemaps/goingoutguide.xml
</loc>
</sitemap>
As you can see, the content of the loc element has a newline before and after it.
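To make the effect visible without a network call, here is a minimal sketch that unmarshals an inline copy of such an entry. The struct definitions and tags, the `sample` constant, and the helpers `parseLocs` and `cleanLoc` are assumptions made for illustration; the question's own type definitions are not shown in this thread.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Location and SitemapIndex are assumed shapes, not the asker's actual types.
type Location struct {
	Loc string `xml:"loc"`
}

type SitemapIndex struct {
	Locations []Location `xml:"sitemap"`
}

// sample mimics the layout of the entry shown above (hypothetical inline copy).
const sample = `<sitemapindex>
<sitemap>
<loc>
https://www.washingtonpost.com/news-sitemaps/goingoutguide.xml
</loc>
</sitemap>
</sitemapindex>`

// parseLocs unmarshals a sitemap index and returns the raw <loc> values.
func parseLocs(data []byte) ([]string, error) {
	var s SitemapIndex
	if err := xml.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	locs := make([]string, 0, len(s.Locations))
	for _, l := range s.Locations {
		locs = append(locs, l.Loc)
	}
	return locs, nil
}

// cleanLoc removes the leading and trailing newlines, as in the answer above.
func cleanLoc(raw string) string {
	return strings.Trim(raw, "\n")
}

func main() {
	locs, err := parseLocs([]byte(sample))
	if err != nil {
		fmt.Println("unmarshal error:", err)
		return
	}
	for _, raw := range locs {
		// %q prints escape sequences, so the newlines become visible.
		fmt.Printf("raw:     %q\n", raw)
		fmt.Printf("trimmed: %q\n", cleanLoc(raw))
	}
}
```

Printing with %q instead of %s is a handy way to spot this kind of problem, because whitespace shows up as escape sequences rather than as blank output.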
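Contributor with 1859 experience points, more than 6 upvotes
See the comments embedded in the modified code below; they describe the problem and the fix.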
func main() {
	// Error handling here uses the standard "log" package in addition to the question's imports.
	resp, err := http.Get("https://www.washingtonpost.com/news-sitemaps/index.xml")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	bytes, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	var s SitemapIndex
	if err := xml.Unmarshal(bytes, &s); err != nil {
		log.Fatal(err)
	}
	for _, Location := range s.Locations {
		// Note: the parentheses around %v show that Location.Loc really does begin and end with a newline.
		fmt.Printf("Location: (%v)", Location.Loc)
		// Solution: use strings.TrimSpace to remove the newlines from Location.Loc.
		resp, err := http.Get(strings.TrimSpace(Location.Loc))
		fmt.Println("resp", resp)
		fmt.Println("err", err)
	}
}
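The two answers trim differently: strings.Trim(s, "\n") removes only newline characters, while strings.TrimSpace removes all leading and trailing whitespace. The sketch below compares them on a made-up value with Windows line endings and indentation, which is not what this sitemap actually returns. The helper names `trimNewlines`, `trimAllSpace`, and `parses` are hypothetical, and url.Parse is used here only as a quick local check in place of a real http.Get.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// trimNewlines mirrors the first answer: only '\n' is stripped.
func trimNewlines(s string) string {
	return strings.Trim(s, "\n")
}

// trimAllSpace mirrors the second answer: all surrounding whitespace is stripped.
func trimAllSpace(s string) string {
	return strings.TrimSpace(s)
}

// parses reports whether net/url accepts s as a URL.
func parses(s string) bool {
	_, err := url.Parse(s)
	return err == nil
}

func main() {
	// Hypothetical <loc> content with CRLF line endings and indentation.
	raw := "\r\n\t https://www.washingtonpost.com/news-sitemaps/world.xml\r\n"

	a := trimNewlines(raw)
	b := trimAllSpace(raw)

	fmt.Printf("strings.Trim:      %q parses=%t\n", a, parses(a))
	fmt.Printf("strings.TrimSpace: %q parses=%t\n", b, parses(b))
}
```

For the sitemap in this question either function works, since only plain newlines surround the URL. TrimSpace is the more forgiving choice if the feed's formatting ever changes.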