1 Answer
This is awkward, because when you call loadHTML() it tries to repair the HTML of the original document. The resulting structure is not what you might expect.
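To see what the parser actually built, a small sketch like the one below can print the element tree after loading. The helper name dumpTree and the two-document sample input are assumptions made for illustration, not part of the original answer, and the exact tree may differ between libxml2 versions.

```php
<?php
// Hypothetical helper (not from the original answer):
// returns the element names of a DOM tree, indented by nesting depth.
function dumpTree(DOMNode $node, int $depth = 0): string
{
    $out = '';
    if ($node->nodeType === XML_ELEMENT_NODE) {
        $out .= str_repeat('  ', $depth) . $node->nodeName . "\n";
        $depth++;
    }
    // Only descend into nodes that can actually have children
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            $out .= dumpTree($child, $depth);
        }
    }
    return $out;
}

// Illustrative input: two complete documents concatenated into one string
$input = '<html><body><div>A</div></body></html>
<html><body><div>B</div></body></html>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($input);
libxml_clear_errors();

echo dumpTree($doc);
echo $doc->getElementsByTagName('body')->length, " body element(s)\n";
```

Comparing the printed tree with the input shows how the repeated html and body tags were handled on your particular installation.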
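However, if you know the basic outline of the documents, the following copies the contents of each <body> tag into a new document (see the comments in the code):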
$html = '<html><body><div>Content1</div></body></html>
<html><body><div>Content2</div></body></html>
<html><body><div>Content3</div></body></html>';

// Collect libxml parse warnings internally instead of emitting them
libxml_use_internal_errors(true);

// Source document holding the concatenated input
$newDom = new DOMDocument();
// Note: the 'HTML-ENTITIES' target encoding is deprecated as of PHP 8.2
$newDom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'));

// New document that will hold the final markup, set up from a basic template
$newBody = new DOMDocument();
$newBody->loadHTML("<html><body /></html>");

// Find where to add any new content
$addBody = $newBody->getElementsByTagName("body")[0];
// Find the existing content to add
$bodyTags = $newDom->getElementsByTagName("body");
foreach ($bodyTags as $body) {
    // Add all of the contents of the <body> tag into the new document
    foreach ($body->childNodes as $node) {
        // Import (deep-copy) the node into the new document and append it
        $addBody->appendChild($newBody->importNode($node, true));
    }
}
echo $newBody->saveHTML();
This gives...
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><div>Content1</div><div>Content2</div><div>Content3</div></body></html>
The limitation is that anything outside the <body> tags, and any attributes on the <body> tags themselves, is not preserved.
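If the <body> attributes matter, one possible extension is to copy them onto the new <body> before appending the children. This is only a sketch under stated assumptions: the sample attributes, the variable names $source, $target and $targetBody, and the "first value seen wins" policy are all illustrative choices, and how libxml2 treats repeated <body> tags may vary between versions.

```php
<?php
// Illustrative input: the <body> tags carry attributes we would like to keep
$html = '<html><body class="first" id="main"><div>Content1</div></body></html>
<html><body class="second"><div>Content2</div></body></html>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$source = new DOMDocument();
$source->loadHTML($html);
libxml_clear_errors();

// Target document built from a minimal template
$target = new DOMDocument();
$target->loadHTML('<html><body></body></html>');
$targetBody = $target->getElementsByTagName('body')->item(0);

foreach ($source->getElementsByTagName('body') as $body) {
    // Copy attributes; the first value seen for a name is kept (an arbitrary policy)
    foreach ($body->attributes as $attr) {
        if (!$targetBody->hasAttribute($attr->nodeName)) {
            $targetBody->setAttribute($attr->nodeName, $attr->nodeValue);
        }
    }
    // Copy the children exactly as in the answer above
    foreach ($body->childNodes as $node) {
        $targetBody->appendChild($target->importNode($node, true));
    }
}

echo $target->saveHTML();
```

Content outside the <body> tags (for example anything in a <head>) is still dropped by this sketch; it would need its own copy loop.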