Fix handling of malformed/unusual HTML (#34201)

Claire 2025-03-18 15:50:41 +01:00 committed by KMY
parent b84d2fc860
commit 21812a0a78
4 changed files with 38 additions and 8 deletions

@@ -16,7 +16,15 @@ class PlainTextFormatter
     if local?
       text
     else
-      node = Nokogiri::HTML5.fragment(insert_newlines)
+      begin
+        node = Nokogiri::HTML5.fragment(insert_newlines)
+      rescue ArgumentError
+        # This can happen if one of the Nokogumbo limits is encountered
+        # Unfortunately, it does not use a more precise error class
+        # nor allows more graceful handling
+        return ''
+      end
+
       # Elements that are entirely removed with our Sanitize config
       node.xpath('.//iframe|.//math|.//noembed|.//noframes|.//noscript|.//plaintext|.//script|.//style|.//svg|.//xmp').remove
       node.text.chomp