Re: BUG #8469: Xpath behaviour unintuitive / arguably wrong - Mailing list pgsql-bugs

From Bruce Momjian
Subject Re: BUG #8469: Xpath behaviour unintuitive / arguably wrong
Date
Msg-id 20131002161946.GB5960@momjian.us
In response to BUG #8469: Xpath behaviour unintuitive / arguably wrong  (dennis.noordsij@helsinki.fi)
Responses Re: BUG #8469: Xpath behaviour unintuitive / arguably wrong  (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>)
List pgsql-bugs
On Tue, Sep 24, 2013 at 06:43:19PM +0000, dennis.noordsij@helsinki.fi wrote:
> The following bug has been logged on the website:
>
> Bug reference:      8469
> Logged by:          Dennis
> Email address:      dennis.noordsij@helsinki.fi
> PostgreSQL version: 9.3.0
> Operating system:   FreeBSD 9.2-RC4
> Description:
>
> Hi,
>
>
> After upgrading an 8.1 version to 9.3.0 I am suddenly seeing text fields
> containing "&amp;" where they are populated from XML. This may be a
> coincidence and the problem may have existed earlier, in any case, now I
> noticed.
>
>
> I extract the text content of XML nodes using xpath, from something like:
>
>
> <name>Jones &amp; Smith</name>
>
>
> The reason I end up with "&amp;" is the IMHO rather odd xpath behaviour:
>
>
> # select xpath('/a/text()', (select xmlelement(name "a", 'A & B')));
>
>
>      xpath
> ---------------
>  {"A &amp; B"}
>
>
> The canonical contents of "a" is "A & B". At first search I've found some
> rather heated debates about this with bits of name calling; I certainly do
> not want to get into that and I apologize in advance to those who feel very
> strongly about this.
>
>
> I've seen one "fix" describe the problem as:
>
>
> ""DESCRIPTION: Submitter invokes following statement:
> SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1].
> He expect (escaped) result "&lt;", but gets "<"
> """
>
>
> With respect, this "bug" makes no sense as this produces in fact the right
> result. The actual value of <root> is "<", it's just escaped when serialized
> to XML. If <root> were to actually contain "&lt;", it'd be serialized as
> "&amp;lt;". It should not be possible to be blindly cast to a text type, but
> explicitly serialized as such.
>
>
> At least the reviewer at:
>
>
> http://www.postgresql.org/message-id/201106291934.23089.rsmogura@softperience.eu
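
The distinction the quoted report draws, between the value of a text node and its serialized XML form, can be illustrated outside PostgreSQL. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. It shows only the general XML data model and makes no claim about how PostgreSQL's xpath() is implemented; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Parse a document whose text node is written with an escaped ampersand.
root = ET.fromstring("<a>A &amp; B</a>")

# The value of the text node is the unescaped string.
value = root.text

# Serializing the element back to XML escapes the ampersand again.
serialized = ET.tostring(root, encoding="unicode")

# Same idea for the "<" case mentioned in the quoted description.
lt_root = ET.fromstring("<root>&lt;</root>")
lt_value = lt_root.text
lt_serialized = ET.tostring(lt_root, encoding="unicode")

print(repr(value), repr(serialized))
print(repr(lt_value), repr(lt_serialized))
```

In this sketch the escaped form appears only when a node is serialized as XML, which is the behaviour the reporter argues for when extracting text content.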

There are two other similar bug reports on this from February and March
of this year:

    http://www.postgresql.org/message-id/E1U1FKL-0002rD-RO@wrihigleys.postgresql.org
    http://www.postgresql.org/message-id/E1UHyUw-0001oj-HE@wrigleys.postgresql.org

Someone who knows XML needs to take leadership on this and propose a
patch.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
