Fixing Unclosed HTML Tags
I am working on a blog layout and I need to create an abstract of each post (say, the 15 latest) to show on the homepage. The content I use is already formatted in html t
Solution 1:
There are lots of methods that can be used:
- Use a proper HTML parser, like DOMDocument
- Use PHP Tidy to repair the unclosed tags
- Some would suggest HTML Purifier
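For the Tidy option in the list above, here is a minimal sketch. It assumes the tidy extension is loaded; the helper name repair_fragment_with_tidy is made up for illustration and is not part of any library.

```php
<?php
// Hypothetical helper: repair an HTML fragment with the tidy extension.
// Returns null when tidy is not available or the repair fails.
function repair_fragment_with_tidy(string $html): ?string
{
    if (!function_exists('tidy_repair_string')) {
        return null; // tidy extension not loaded
    }

    $config = [
        'show-body-only' => true, // return only the body content, not a full document
        'wrap'           => 0,    // do not hard-wrap long lines
    ];

    $fixed = tidy_repair_string($html, $config, 'utf8');

    return $fixed === false ? null : $fixed;
}

// Two unclosed <p> tags and an unclosed <b>
$fixed = repair_fragment_with_tidy('<p>one<p>two <b>bold');
echo $fixed ?? 'tidy extension not available';
```

Returning null rather than throwing keeps the caller free to fall back to DOMDocument when Tidy is missing.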
Solution 2:
As ajreal said, DOMDocument is a solution.
Example:
$str = "
<html><head><title>test</title></head><body><p>error</i></body></html>
";
$doc = new DOMDocument();
@$doc->loadHTML($str);
echo $doc->saveHTML();
Advantage: the DOM extension is enabled by default in PHP, whereas Tidy has to be enabled separately.
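If you would rather not use the @ operator, a possible variant is to let libxml collect the parse errors instead. This is only a sketch of the same example; the variable names $previous and $errors are mine.

```php
<?php
$str = '<html><head><title>test</title></head><body><p>error</i></body></html>';

// Collect libxml parse errors internally instead of emitting PHP warnings
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($str);

// Keep the errors for inspection or logging, then reset libxml's state
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

echo $doc->saveHTML();
```

Unlike @, this only silences the parser's complaints about the markup, and it leaves them available in $errors if you want to log them.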
Solution 3:
You can use DOMDocument to do it, but be careful of string encoding issues. Also, you'll have to use a complete HTML document, then extract the components you want. Here's an example:
function make_excerpt($rawHtml, $length = 500) {
    // Truncate, then append an ellipsis and a "More" link
    $content = substr($rawHtml, 0, $length)
        . '… <a href="/link-to-somewhere">More ></a>';

    // Detect the string encoding
    $encoding = mb_detect_encoding($content);

    // Pass it to the DOMDocument constructor
    $doc = new DOMDocument('', $encoding);

    // Must include the content-type/charset meta tag with $encoding.
    // Bad HTML will trigger warnings, suppress those.
    @$doc->loadHTML('<html><head>'
        . '<meta http-equiv="content-type" content="text/html; charset='
        . $encoding . '"></head><body>' . trim($content) . '</body></html>');

    // Extract the components we want
    $nodes = $doc->getElementsByTagName('body')->item(0)->childNodes;
    $html = '';
    $len = $nodes->length;
    for ($i = 0; $i < $len; $i++) {
        $html .= $doc->saveHTML($nodes->item($i));
    }
    return $html;
}
$html = "<p>.......................</p>
<p>...........
<p>............</p>
<p>...........| 500 chars";
// Output the fixed HTML
echo make_excerpt($html, 500);
Outputs:
<p>.......................</p><p>...........
</p><p>............</p><p>...........| 500 chars… <a href="/link-to-somewhere">More ></a></p>
If you are using WordPress, you should wrap the substr() invocation in a call to wpautop, i.e. wpautop(substr(...)). You may also wish to test the length of the $rawHtml passed to the function, and skip appending the "More" link if it isn't long enough.
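The length test could look something like the sketch below. The function name truncate_with_more and the $moreUrl parameter are made up for illustration; its return value would take the place of $content inside make_excerpt.

```php
<?php
// Hypothetical helper: truncate only when the post is longer than $length,
// and append the "More" link only in that case.
function truncate_with_more(string $rawHtml, int $length = 500, string $moreUrl = '/link-to-somewhere'): string
{
    // strlen()/substr() count bytes, matching the substr() call in make_excerpt
    if (strlen($rawHtml) <= $length) {
        return $rawHtml; // short post: no truncation, no link
    }

    return substr($rawHtml, 0, $length)
        . '… <a href="' . htmlspecialchars($moreUrl) . '">More &gt;</a>';
}

echo truncate_with_more('<p>short post</p>'), "\n";
echo truncate_with_more(str_repeat('x', 600)), "\n";
```

Note that cutting by bytes can split a multibyte character in half; if your posts are UTF-8, swapping in mb_strlen() and mb_substr() is probably the safer choice.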