Re: Patch Review: Bugfix for XPATH() if text or attribute nodes are selected - Mailing list pgsql-hackers

From Radosław Smogura
Subject Re: Patch Review: Bugfix for XPATH() if text or attribute nodes are selected
Date
Msg-id 86d1764e9c7e6020bf2872d3e98789c3@mail.softperience.eu
Whole thread Raw
In response to Re: Patch Review: Bugfix for XPATH() if text or attribute nodes are selected  (Florian Pflug <fgp@phlo.org>)
Responses Re: Patch Review: Bugfix for XPATH() if text or attribute nodes are selected  (Florian Pflug <fgp@phlo.org>)
List pgsql-hackers
On Tue, 12 Jul 2011 11:45:59 +0200, Florian Pflug wrote:
> On Jul12, 2011, at 11:00 , Radosław Smogura wrote:
>> On Sun, 10 Jul 2011 17:06:22 -0500, Robert Haas wrote:
>>> Unless I am missing something, Florian  is clearly correct here.
>> For me not, because this should be fixed internally by making xml 
>> type sefe
>
> Huh??. Making the xml type safe is *exactly* what I'm trying to do 
> here...
>
>> currently xml type may be used to keep proper XMLs and any kind of 
>> data, as well.
>
> As I pointed out before, that simply isn't true. Try storing
> non-well-formed data into an XML column (there *are* ways to do
> that, i.e. there are bugs, one if which I'm trying to fix here!)
> and then dump and (try to) reload your database. Ka-wooooom!
>
>> If I ask, by any means select xpath(/text(...))..... I want to get 
>> text.
>
> And I want '3' || '4' to return the integer 34. Though luck! The fact 
> that
> XPATH() is declared to return XML, *not* TEXT means you don't get 
> what you
> want. Period. Feel free to provide a patch that adds a function 
> XPATH_TEXT
> if you feel this is an issue.
>
> XML *is* *not* simply an alias for TEXT! It's a distinct type, which 
> its
> down distinct rules about what constitutes a valid value and what 
> doesn't.
>
>> 1) How I should descape node in client application (if it's part of 
>> xml I don't have header), bear in mind XML must give support for 
>> streaming processing too.
>
> Huh?
>
>> 2) Why I should differntly treat text() then select from varchar in 
>> both I ask for xml, driver can't make this, because it doesn't know if 
>> it gets scalar, text, comment, element, or maybe document.
>
>> 3) What about current applications, folks probably uses this and are 
>> happy they get text, and will not see, that next release of PostgreSQL 
>> will break their applications.
>
> That, and *only* that, I recognize as a valid concern. However, and 
> *again*
> as I have pointer out before a *multiple* of times, backwards 
> compatibility
> is no excuse not to fix bugs. Plus, there might just as well be 
> applications
> which feed the contents of XML columns directly into a XML parser (as 
> they
> have every right to!) and don't expect that parser to throw an error. 
> Which,
> as it stands, we cannot guarantee. Having to deal with an error there 
> is akin
> to having to deal with integer columns containing 'foobar'!
Bugs must be resolved in smart way, especially if they changes behaviour, with consideration of impact change will
produce,removing support for xml resolves this bug as well. I've said problem should be resolved in different way.
 

>> There is of course disadvantage of current behaviour as it may lead 
>> to inserting badly xmls (in one case), but I created example when auto 
>> escaping will create double escaped xmls, and may lead to insert 
>> inproper data (this is about 2nd patch where Florian add escaping, 
>> too).
>>
>> SELECT XMLELEMENT(name root, XMLATTRIBUTES(foo.namespace AS sth)) 
>> FROM (SELECT
>> (XPATH('namespace-uri(/*)', x))[1] AS namespace FROM (VALUES 
>> (XMLELEMENT(name
>> "root", XMLATTRIBUTES('<n' AS xmlns, '<v' AS value),'<t'))) v(x)) as 
>> foo;
>>
>>       xmlelement
>> -------------------------
>> <root sth="&lt;n"/>
>
> Radosław, you've raised that point before, and I refuted it. The 
> crucial
> difference is that double-escaped values are well-formed, where as 
> un-escaped
> ones are not.
>
> Again, as I said before, the double-escaping done by XMLATTRIBUTES 
> there is
> not pretty. But its *not* XPATH()'s fault!. To see that, simply 
> replace your
> XPATH() expression with '<n'::xml to see that.
>
> And in fact
>
>> It can't be resolved without storing type in xml or adding xmltext 
>> or adding pseudo xmlany element, which will be returned by xpath.
>
> Huh?
>
> Frankly, Radosław, I get the feeling that you're not trying to 
> understand my
> answers to your objections, but instead keep repeating the same 
> assertions
> over and over again. Even though at least some of them, like XML
> being able to
> store arbitrary values, are simply wrong! And I'm getting pretty
> tired of this...
> So far, you also don't seem to have taken a single look at the actual
> implementation of the patch, even though code review is an supposed 
> to be an
> integral part of the patch review process. I therefore don't believe
> that we're
> getting anywhere here.So far, you don't know if I taken a single look, your suspicious are wrong, and You try to
blameme. All of your sentences about "do not understanding" I may sent to you, and blame you with your words.
 

> So please either start reviewing the actual implementation, and leave
> the considerations about whether we want this or not to the eventual
> committer.
> Or, if you don't want to do that for one reason or another, pleaser 
> consider
> letting somebody else take over this review, i.e. consider removing 
> your name
> from the "Reviewer" field.
If I do review I may put my comments, but "I get the feeling that you're not trying to understand myanswers to your
objections,but instead keep repeating the same assertions over and over again." - and in patch there is review of
code.Soplease either "stop" responding to my objections "and leave the considerations about whether we want this or not
tothe eventual committer."
 
And consider this new assertions...I store XML document in XML (escaped text, why not?), then recreate it by xpath and
insert,with Your patch, I will not insert document but escaped text, which is wrong XML, and ALL your objections about
currentbuggy not-escaping will FULLY apply to this situation.It's ping-pong, and this patch moves problem from one
cornerto other.
 
For me this discussion is over. I putted my objections and suggestions. Full review is available in archives, and why
tonot escape is putted in review of your 2nd patch, about scalar values.
 
Regards,Radek



pgsql-hackers by date:

Previous
From: Alban Hertroys
Date:
Subject: Re: [GENERAL] Creating temp tables inside read only transactions
Next
From: Robert Haas
Date:
Subject: Re: dropping table in testcase alter_table.sql