On 2024-07-07 22:43 +0200, Tom Lane wrote:
> As far as the errcontext changes go: I think we have to just bite
> the bullet and accept them. It looks like 2.13 has a completely
> different mechanism than prior versions for deciding when to issue
> XML_ERR_NOT_WELL_BALANCED. And it's not even clear that it's wrong;
> for example, in our first failing case
>
> DETAIL: line 1: xmlParseEntityRef: no name
> <invalidentity>&</invalidentity>
> ^
> -line 1: chunk is not well balanced
> -<invalidentity>&</invalidentity>
> - ^
>
> it's kind of hard to argue that the chunk isn't well-balanced.
>
> So we can either suppress errdetails from the expected output,
> or set up an additional expected-file. I'm leaning to the
> "\set VERBOSITY terse" solution.

+1 for \set VERBOSITY terse as a last resort.
But it looks to me as if "chunk is not well balanced" is just noise
because libxml2 reports more specific errors before that. For example:
SELECT xmlparse(content '<twoerrors>&idontexist;</unbalanced>');
ERROR: invalid XML content
DETAIL: line 1: Entity 'idontexist' not defined
<twoerrors>&idontexist;</unbalanced>
^
line 1: Opening and ending tag mismatch: twoerrors line 1 and unbalanced
<twoerrors>&idontexist;</unbalanced>
^
line 1: chunk is not well balanced
<twoerrors>&idontexist;</unbalanced>
^
Here, "Opening and ending tag mismatch" already covers the unbalanced
closing tag.
So how about just ignoring XML_ERR_NOT_WELL_BALANCED, as in the
attached patch? The patch also adds a test case for an unclosed tag
because I wanted to see whether I could trigger "chunk is not well
balanced" on its own, but without success.
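
To illustrate the idea of dropping the generic message while keeping the
more specific ones, here is a minimal standalone sketch. It is not the
attached patch and does not use the libxml2 or PostgreSQL APIs: the
sketch_* names, the enum, and the struct are all hypothetical stand-ins
for libxml2's error codes and for whatever buffer collects the DETAIL
text.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for libxml2 error codes (assumption, not the real enum). */
enum sketch_error_code
{
    SKETCH_ERR_UNDECLARED_ENTITY,
    SKETCH_ERR_TAG_NAME_MISMATCH,
    SKETCH_ERR_NOT_WELL_BALANCED
};

/* Hypothetical record of one reported parser error. */
struct sketch_error
{
    enum sketch_error_code code;
    const char *message;
};

/*
 * Copy every message except the generic "not well balanced" one into buf,
 * one message per line.  Returns the number of messages kept.  If buf is
 * too small, collection stops before the message that does not fit.
 */
static size_t
sketch_collect_detail(const struct sketch_error *errors, size_t nerrors,
                      char *buf, size_t buflen)
{
    size_t      kept = 0;
    size_t      used = 0;

    if (buflen == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < nerrors; i++)
    {
        int         n;

        /* Skip the generic error; the specific ones carry the information. */
        if (errors[i].code == SKETCH_ERR_NOT_WELL_BALANCED)
            continue;

        n = snprintf(buf + used, buflen - used, "%s\n", errors[i].message);
        if (n < 0 || (size_t) n >= buflen - used)
        {
            buf[used] = '\0';   /* drop the partially written message */
            break;
        }
        used += (size_t) n;
        kept++;
    }
    return kept;
}
```

Fed the three errors from the <twoerrors> example above, this would keep
the entity and tag mismatch messages and drop the third.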
SELECT xmlparse(content '<unclosed>');
ERROR: invalid XML content
DETAIL: line 1: Premature end of data in tag unclosed line 1
<unclosed>
^
line 1: chunk is not well balanced
<unclosed>
^
libxml2 2.13 doesn't report "chunk is not well balanced" here either.
There's also this more explicit test case for unbalanced tags:
<parent><child></parent></child>
But I'm not sure if that's really necessary if we already have:
<twoerrors>&idontexist;</unbalanced>
The error messages are the same, except for the additional entity error.
--
Erik