3 Answers

Contributor with 2051 experience points, over 10 upvotes
Document document = Jsoup.parse("<div id=\"AE9D7F630640426F8457A661607D2B8E-5-2\" style=\"display: none;\" class=\"kbcontent\"> 19040172b-1 <br>SQL Server开发 <br> <font title=\"老师\">郑尚</font> <br> <font title=\"周次(节次)\">3-5,7-14(周)</font> <br> <font title=\"教室\">东区综合楼D-101</font> <br> </div>");
System.out.println(document.text());
Output: 19040172b-1 SQL Server开发 郑尚 3-5,7-14(周) 东区综合楼D-101
Not sure whether this meets the OP's needs?
Document document = Jsoup.parse("<div id=\"AE9D7F630640426F8457A661607D2B8E-5-2\" style=\"display: none;\" class=\"kbcontent\"> 19040172b-1 <br>SQL Server开发 <br> <font title=\"老师\">郑尚</font> <br> <font title=\"周次(节次)\">3-5,7-14(周)</font> <br> <font title=\"教室\">东区综合楼D-101</font> <br> </div>");
Element div = document.getElementById("AE9D7F630640426F8457A661607D2B8E-5-2");
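// childNode(0) is the text before the first <br>; trim() drops the surrounding spaces
TextNode n1 = (TextNode) div.childNode(0);
System.out.println(n1.text().trim()); // 19040172b-1
// childNode(1) is the <br> element, so the next text node sits at index 2
TextNode n2 = (TextNode) div.childNode(2);
System.out.println(n2.text().trim()); // SQL Server开发
// ...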
If the OP's format is fixed, parsing the HTML directly as shown above is the better option; no regex is needed.

Contributor with 1815 experience points, over 10 upvotes
String html = "<div id=\"AE9D7F630640426F8457A661607D2B8E-5-2\" style=\"display: none;\" class=\"kbcontent\"> 19040172b-1 <br>SQL Server Develop <br> <font title=\"teacher\">zheng</font> <br> <font title=\"week\">3-5,7-14</font> <br> <font title=\"classroom\">D-101</font> <br> </div> ";
// Swap each <br> for a marker that survives tag stripping (replace() is literal, no regex involved)
html = html.replace("<br>", "#~#");
Document doc = Jsoup.parse(html);
// text() drops the remaining tags and keeps the markers
String text = doc.text();
String[] ary = text.split("#~#");
for (int i = 0; i < ary.length; i++) {
    System.out.println(ary[i]);
}
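The pieces produced by the split above still carry the spaces that surrounded each `<br>`. Below is a minimal sketch of tidying them in plain Java, with no Jsoup dependency. `FieldSplitter` and `splitFields` are hypothetical names, and the sample string in `main` is only an assumed stand-in for what `doc.text()` returns after the marker replacement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup: cleans up the marker-separated text.
class FieldSplitter {

    /** Splits text on a literal marker, trims each piece, and skips blank pieces. */
    static List<String> splitFields(String text, String marker) {
        List<String> fields = new ArrayList<>();
        // String.split takes a regex, so Pattern.quote keeps the marker literal.
        for (String piece : text.split(Pattern.quote(marker))) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                fields.add(trimmed);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Assumed sample input: text with "#~#" where each <br> used to be.
        String text = "19040172b-1 #~#SQL Server Develop #~# zheng #~# 3-5,7-14 #~# D-101 #~#";
        for (String field : splitFields(text, "#~#")) {
            System.out.println(field);
        }
    }
}
```

Returning a list rather than printing directly makes it easy to address fields by position, which suits the fixed layout the answers assume.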