Re: Fix XML handling with DOCTYPE - Mailing list pgsql-hackers

From Chapman Flack
Subject Re: Fix XML handling with DOCTYPE
Msg-id 5C8ECAA4.3090301@anastigmatix.net
In response to Re: Fix XML handling with DOCTYPE  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fix XML handling with DOCTYPE
List pgsql-hackers
On 03/17/19 15:06, Tom Lane wrote:
> The error message issue is indeed a concern, but I don't see why it's
> complicated if you agree that
> 
>> If the query asked for CONTENT, any error result should be one you could get
>> when parsing as CONTENT.
> 
> That just requires us to save the first error message and be sure to issue
> that one not the DOCUMENT one, no?

I confess I haven't looked hard yet at how to do that. The way errors come
out of libxml is more involved than "here's a message", so there's a choice:
(a) try to copy off that struct in a way that's sure to survive
re-executing the parser, and then use the copy, or (b) generate a message
right away from the structured information and save that. I guess (b)
might not be too bad. As for (a), it might not be too bad either, or it
might slide right back into the kind of libxml-behavior-assumptions you
want to avoid.
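
To make option (b) concrete, here is a minimal sketch, not taken from the
patch. The parser_error struct is a hypothetical stand-in for a parser's
structured error (it is NOT libxml's own error struct), and saved_error and
save_first_error are names invented for illustration. The point is only
that the text is rendered into storage we own at the moment the error is
reported, so a second parse can't disturb it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a parser's structured error report.
 * The message pointer is assumed to belong to the parser and may be
 * overwritten or freed as soon as the parser runs again.
 */
typedef struct
{
    int         line;
    int         code;
    const char *message;
} parser_error;

/* Storage we own: survives any number of later parser runs. */
typedef struct
{
    int         have_error;
    char        text[256];
} saved_error;

/*
 * Render the error to text immediately and keep only the first one.
 * Later calls are ignored, so the message from the first parse attempt
 * is the one that remains available for reporting.
 */
static void
save_first_error(saved_error *slot, const parser_error *err)
{
    assert(slot != NULL && err != NULL);

    if (slot->have_error)
        return;

    snprintf(slot->text, sizeof(slot->text), "line %d: %s (code %d)",
             err->line,
             err->message ? err->message : "unknown error",
             err->code);
    slot->have_error = 1;
}
```

A caller would zero a saved_error before the first parse attempt, pass it
to the error callback for both attempts, and report slot.text if both fail.
The fixed 256-byte buffer is an arbitrary choice for the sketch; snprintf
truncates rather than overflowing it.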

Meanwhile, here is a patch along the lines I proposed earlier, with a
pre-check. Any performance hit that it could entail (which I'd really
expect to be de minimis, though I haven't benchmarked) ought to be
compensated by the strlen I changed to strnlen in parse_xml_decl (as
there's really no need to run off and count the whole rest of the input
just to know if 1, 2, 3, or 4 bytes are available to decode a UTF-8 char).
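
For illustration only, a sketch of the strnlen idea, not the code in the
patch. The helper names utf8_expected_len and utf8_char_available are
invented here, and the lead-byte check is deliberately simplified (it does
not validate continuation bytes). What it shows is that strnlen(p, 4) stops
after at most four bytes, where strlen(p) would walk the whole remaining
input.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L     /* strnlen is POSIX.1-2008 */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Number of bytes a UTF-8 sequence should occupy, judged only from its
 * lead byte. Returns 0 for a byte that cannot start a sequence.
 */
static size_t
utf8_expected_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 0;
}

/*
 * Returns 1 if every byte of the sequence starting at p is present
 * before the terminating NUL, 0 otherwise. strnlen examines at most
 * 4 bytes no matter how long the rest of the string is.
 */
static int
utf8_char_available(const char *p)
{
    size_t      need;
    size_t      have;

    assert(p != NULL);

    need = utf8_expected_len((unsigned char) *p);
    have = strnlen(p, 4);

    return need != 0 && have >= need;
}
```

With strlen the cost of this check grows with the size of the document that
follows; with strnlen it is constant.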

... and, yes, I know that could be an independent patch, and then the
performance effect here should be measured from there. But it was near
what I was doing anyway, so I included it here.

Attaching both still-outstanding patches (this one and docfix) so the
CF app doesn't lose one.

Regards,
-Chap
