Commit 95a0978
Fix hasSingleTagInsideElement method
It would fail for e.g. `<div> <p>foo</p> </div>`. mozilla/readability uses `children` for the tag lookup, which returns only elements. PHP's DOM does not have a `children` property, so b580cf2 mistakenly used `childNodes` instead, but that can return any node type. Let’s filter the children ourselves. Also add comments from mozilla/readability’s `_hasSingleTagInsideElement`.
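
To illustrate the failure described above, here is a minimal sketch using plain `DOMDocument` (not the project's `JSLikeHTMLElement`, which is an assumption made only to keep the example self-contained). It shows that `childNodes` for the example markup contains whitespace text nodes around the `<p>`, and that filtering on `DOMElement` leaves only the element.

```php
<?php
// Sketch only: plain DOMDocument stands in for the library's own node classes.
$doc = new DOMDocument();
$doc->loadXML('<div> <p>foo</p> </div>');
$div = $doc->documentElement;

// childNodes yields every node type, including whitespace-only text nodes.
$childNodes = iterator_to_array($div->childNodes);
$nodeNames = array_map(fn (DOMNode $n) => $n->nodeName, $childNodes);

// Keep only element nodes, approximating the DOM `children` property.
$elements = array_values(
    array_filter($childNodes, fn (DOMNode $n) => $n instanceof DOMElement)
);

echo implode(',', $nodeNames), "\n"; // #text,p,#text
echo count($elements), "\n";         // 1
```

With the old check, `childNodes->length` is 3 here, so the method returned false even though the only element child is a `<p>`.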
1 parent 8d17a88 commit 95a0978

1 file changed: +13 -6 lines changed

src/Readability.php

Lines changed: 13 additions & 6 deletions
@@ -1536,16 +1536,23 @@ private function isPhrasingContent($node): bool
         );
     }

+    /**
+     * Checks if `$node` has only whitespace and a single element with `$tag` for the tag name.
+     * Returns false if `$node` contains non-empty text nodes
+     * or if it contains no element with given tag or more than 1 element.
+     */
     private function hasSingleTagInsideElement(JSLikeHTMLElement $node, string $tag): bool
     {
-        if (1 !== $node->childNodes->length || $node->childNodes->item(0)->nodeName !== $tag) {
-            return false;
+        $childNodes = iterator_to_array($node->childNodes);
+        $children = array_filter($childNodes, fn ($childNode) => $childNode instanceof \DOMElement);
+
+        // There should be exactly 1 element child with given tag
+        if (1 !== \count($children) || $children[0]->nodeName !== $tag) {
+            return false;
         }

-        $a = array_filter(
-            iterator_to_array($node->childNodes),
-            fn ($childNode) => $childNode instanceof \DOMText && preg_match($this->regexps['hasContent'], $this->getInnerText($childNode))
-        );
+        // And there should be no text nodes with real content
+        $a = array_filter($childNodes, fn ($childNode) => $childNode instanceof \DOMText && preg_match($this->regexps['hasContent'], $this->getInnerText($childNode)));

         return 0 === \count($a);
     }
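
The check can also be tried outside the class. The following is a hedged, standalone sketch rather than the library's code: it takes a plain `DOMElement`, and a `trim()` test stands in for `$this->regexps['hasContent']` and `getInnerText()`, both of which are assumptions made for illustration. One caveat worth noting: `array_filter` preserves the original keys, so for `<div> <p>foo</p> </div>` the single element sits at key 1, not 0. The sketch reindexes with `array_values` before reading offset 0; that reindexing is an addition here, not part of this commit.

```php
<?php
// Hypothetical standalone variant of the method; not the library's API.
function hasSingleTagInsideElement(DOMElement $node, string $tag): bool
{
    $childNodes = iterator_to_array($node->childNodes);

    // array_filter keeps the original keys, so reindex before using offset 0.
    $children = array_values(
        array_filter($childNodes, fn (DOMNode $n) => $n instanceof DOMElement)
    );

    // There should be exactly one element child, and it must have the given tag.
    if (1 !== count($children) || $children[0]->nodeName !== $tag) {
        return false;
    }

    // No text node may carry real content (trim() is a stand-in for the regexp).
    $texts = array_filter(
        $childNodes,
        fn (DOMNode $n) => $n instanceof DOMText && '' !== trim($n->textContent)
    );

    return 0 === count($texts);
}

// Small helper for the usage lines below (also hypothetical).
function rootOf(string $xml): DOMElement
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    return $doc->documentElement;
}

var_dump(hasSingleTagInsideElement(rootOf('<div> <p>foo</p> </div>'), 'p'));    // bool(true)
var_dump(hasSingleTagInsideElement(rootOf('<div>bar<p>foo</p></div>'), 'p'));   // bool(false)
var_dump(hasSingleTagInsideElement(rootOf('<div><p>a</p><p>b</p></div>'), 'p')); // bool(false)
```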