Re: Fix XML handling with DOCTYPE - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Fix XML handling with DOCTYPE
Date
Msg-id 25228.1552849567@sss.pgh.pa.us
In response to Re: Fix XML handling with DOCTYPE  (Chapman Flack <chap@anastigmatix.net>)
Responses Re: Fix XML handling with DOCTYPE
List pgsql-hackers
Chapman Flack <chap@anastigmatix.net> writes:
> On 03/17/19 13:16, Tom Lane wrote:
>> Do we need a pre-scan at all?

> Without it, we double the time to a failure result in every case that
> should actually fail, as well as in this one corner case that we want to
> see succeed, and the question you posed earlier about which error message
> to return becomes thornier.

I have absolutely zero concern about whether it takes twice as long to
detect bad input; nobody should be sending bad input if they're concerned
about performance.  (The costs of the ensuing transaction abort would
likely dwarf xml_in's runtime in any case.)  Besides, with what we're
talking about doing here,

(1) the extra runtime is consumed only in cases that would fail up to now,
so nobody can complain about a performance regression;
(2) doing a pre-scan *would* be a performance regression for cases that
work today; not a large one we hope, but still...

The error message issue is indeed a concern, but I don't see why it's
complicated if you agree that

> If the query asked for CONTENT, any error result should be one you could get
> when parsing as CONTENT.

That just requires us to save the first error message and be sure to issue
that one, not the DOCUMENT one, no?  That's what we'd want to do from a
backwards-compatibility standpoint anyhow, since that's the error message
wording you'd get with today's code.
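
To make the no-pre-scan idea concrete, here is a minimal sketch in Python, not the actual xml_in code. It tries the CONTENT parse first, falls back to a DOCUMENT parse only on failure, and if both fail it reports the saved CONTENT error. The names XmlParseError, parse_with_fallback, toy_parse_content and toy_parse_document are all hypothetical, and the toy parsers are crude stand-ins for the real libxml2-based ones.

```python
class XmlParseError(Exception):
    """Hypothetical error type standing in for a parser failure."""


def parse_with_fallback(text, parse_content, parse_document):
    """Parse as CONTENT; on failure retry as DOCUMENT.

    If both attempts fail, re-raise the first (CONTENT) error, so the
    caller sees the same message wording as before the retry existed.
    """
    try:
        return parse_content(text)
    except XmlParseError as first_error:
        # Save it: Python unbinds the "as" name when the except block ends.
        saved = first_error
    try:
        return parse_document(text)
    except XmlParseError:
        # Discard the DOCUMENT-mode error and report the CONTENT one.
        raise saved from None


def toy_parse_content(text):
    # Assumption for illustration: the CONTENT parser rejects a DOCTYPE.
    if "<!DOCTYPE" in text or text.count("<") != text.count(">"):
        raise XmlParseError("invalid XML content")
    return ("content", text)


def toy_parse_document(text):
    # Assumption for illustration: only checks that angle brackets balance.
    if text.count("<") != text.count(">"):
        raise XmlParseError("invalid XML document")
    return ("document", text)
```

With these stand-ins, input that parses as CONTENT never reaches the second parser, so the extra work is paid only by input that fails today.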

            regards, tom lane

