|
Hi, thanks for reporting this issue. The problem is caused by some irritating behaviour of the JSoup library.
More specifically, we call JSoup#isValid() which, during parsing, unfortunately removes any tags that are not expected directly beneath a body tag. As td tags are supposed to be wrapped by a table tag, they're removed before validation runs, so the validation "succeeds", which of course is not what we want.
To fix the issue, we can parse the input as a fragment and add it to an empty document shell ourselves. This keeps the complete structure of the original input, so validation of your example input fails, just as expected. I'm about to prepare a pull request.
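
For illustration, the idea could look roughly like the sketch below. This is not the actual patch: the class and method names (`FragmentValidation`, `isValidFragment`) are made up for this example, the XML fragment parser is assumed as one way of keeping every element where it is, and `Safelist` assumes a recent jsoup release (older releases call the same type `Whitelist`).

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.nodes.Document;
import org.jsoup.nodes.Node;
import org.jsoup.parser.Parser;
import org.jsoup.safety.Cleaner;
import org.jsoup.safety.Safelist;

public class FragmentValidation {

    // Hypothetical helper for illustration only, not the code from the pull request.
    static boolean isValidFragment(String html, Safelist safelist) {
        // Assumption: the XML fragment parser is used because it does not apply
        // HTML nesting rules, so a stray <td> is kept instead of being dropped.
        // The result is copied into a new list because appending a node to the
        // shell detaches it from the document the parser created.
        List<Node> nodes = new ArrayList<>(Parser.parseXmlFragment(html, ""));

        // Empty html/head/body skeleton that receives the parsed nodes unchanged.
        Document shell = Document.createShell("");
        for (Node node : nodes) {
            shell.body().appendChild(node);
        }

        // Check the body of the shell against the safelist.
        return new Cleaner(safelist).isValid(shell);
    }

    public static void main(String[] args) {
        System.out.println(isValidFragment("<td>1234qwer</td>", Safelist.basic()));
        System.out.println(isValidFragment("<b>1234qwer</b>", Safelist.basic()));
    }
}
```

With the basic safelist, which does not permit td, the first input should now be reported as invalid, while the second one still passes.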
|