Thread: BUG #5434: select from xpath_table gives wrong results

BUG #5434: select from xpath_table gives wrong results

From
"Oleg"
Date:
The following bug has been logged online:

Bug reference:      5434
Logged by:          Oleg
Email address:      sacramento_fan@mail.ru
PostgreSQL version: 8.4.3
Operating system:   KUbuntu
Description:        select from xpath_table gives wrong results
Details:

Here is a code:

create table tmp_xml(rowId SERIAL UNIQUE NOT NULL,
                  xml_context xml,
                  CONSTRAINT PK_tmp_xml PRIMARY KEY (rowId));

insert into tmp_xml (xml_context)
values('<root>'
         '<obj obj_id="1" obj_name="First Object"/>'
         '<obj obj_id="2" obj_name="Second Object" obj_master="1"/>'
         '<obj obj_id="3" obj_name="Third Object"/>'
         '<obj obj_id="4" obj_name="Fourth Object" obj_master="1"/>'
       '</root>');

SELECT t.OBJ_ID, t.OBJ_NAME, t.OBJ_MASTER
FROM xpath_table('rowId',
                 'xml_context',
                'tmp_xml',

'/root/obj/@obj_id|/root/obj/@obj_name|/root/obj/@obj_master|',
                 'true')
        AS t(row_id integer,
             OBJ_ID integer,
             OBJ_NAME varchar(255),
             OBJ_MASTER INT
             );

I was sure, that result of this query should be like this:

obj_id | obj_name        | obj_master
----------------------------------------
   1   |   First Object  |
   2   |   Second Object |    1
   3   |   Third Object  |
   4   |   Fourth Object |    1

But instead of I got this result:

obj_id | obj_name        | obj_master
----------------------------------------
   1   |   First Object  |    1
   2   |   Second Object |    1
   3   |   Third Object  |
   4   |   Fourth Object |

Why is this so? Maybe I use xpath_table wrong way?

Re: BUG #5434: select from xpath_table gives wrong results

From
Robert Haas
Date:
On Thu, Apr 22, 2010 at 4:45 AM, Oleg <sacramento_fan@mail.ru> wrote:
>
> The following bug has been logged online:
>
> Bug reference: =A0 =A0 =A05434
> Logged by: =A0 =A0 =A0 =A0 =A0Oleg
> Email address: =A0 =A0 =A0sacramento_fan@mail.ru
> PostgreSQL version: 8.4.3
> Operating system: =A0 KUbuntu
> Description: =A0 =A0 =A0 =A0select from xpath_table gives wrong results
> Details:
>
> Here is a code:
>
> create table tmp_xml(rowId SERIAL UNIQUE NOT NULL,
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0xml_context xml,
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0CONSTRAINT PK_tmp_xml PRIMARY KEY (row=
Id));
>
> insert into tmp_xml (xml_context)
> values('<root>'
> =A0 =A0 =A0 =A0 '<obj obj_id=3D"1" obj_name=3D"First Object"/>'
> =A0 =A0 =A0 =A0 '<obj obj_id=3D"2" obj_name=3D"Second Object" obj_master=
=3D"1"/>'
> =A0 =A0 =A0 =A0 '<obj obj_id=3D"3" obj_name=3D"Third Object"/>'
> =A0 =A0 =A0 =A0 '<obj obj_id=3D"4" obj_name=3D"Fourth Object" obj_master=
=3D"1"/>'
> =A0 =A0 =A0 '</root>');
>
> SELECT t.OBJ_ID, t.OBJ_NAME, t.OBJ_MASTER
> FROM xpath_table('rowId',
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 'xml_context',
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0'tmp_xml',
>
> '/root/obj/@obj_id|/root/obj/@obj_name|/root/obj/@obj_master|',
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 'true')
> =A0 =A0 =A0 =A0AS t(row_id integer,
> =A0 =A0 =A0 =A0 =A0 =A0 OBJ_ID integer,
> =A0 =A0 =A0 =A0 =A0 =A0 OBJ_NAME varchar(255),
> =A0 =A0 =A0 =A0 =A0 =A0 OBJ_MASTER INT
> =A0 =A0 =A0 =A0 =A0 =A0 );
>
> I was sure, that result of this query should be like this:
>
> obj_id | obj_name =A0 =A0 =A0 =A0| obj_master
> ----------------------------------------
> =A0 1 =A0 | =A0 First Object =A0|
> =A0 2 =A0 | =A0 Second Object | =A0 =A01
> =A0 3 =A0 | =A0 Third Object =A0|
> =A0 4 =A0 | =A0 Fourth Object | =A0 =A01
>
> But instead of I got this result:
>
> obj_id | obj_name =A0 =A0 =A0 =A0| obj_master
> ----------------------------------------
> =A0 1 =A0 | =A0 First Object =A0| =A0 =A01
> =A0 2 =A0 | =A0 Second Object | =A0 =A01
> =A0 3 =A0 | =A0 Third Object =A0|
> =A0 4 =A0 | =A0 Fourth Object |
>
> Why is this so? Maybe I use xpath_table wrong way?

xpath_table() doesn't understand that the xpaths you're giving it are
interrelated.  It evaluates them independently of each other.  I
suppose /root/obj/@obj_master only matches twice, and
/root/obj/@obj_id matches four times.  xpath_table doesn't know or
care that the second match for the first expression and the second
match for the last expression are matching different parts of the
input text.

There has been some discussion of how to design an interface that
would allow users to avoid this rather surprising result, but I'm not
aware that anyone is working on it at present.

...Robert

Re: BUG #5434: select from xpath_table gives wrong results

From
Tom Lane
Date:
"Oleg" <sacramento_fan@mail.ru> writes:
> I was sure, that result of this query should be like this:

> obj_id | obj_name        | obj_master
> ----------------------------------------
>    1   |   First Object  |
>    2   |   Second Object |    1
>    3   |   Third Object  |
>    4   |   Fourth Object |    1

> But instead of I got this result:

> obj_id | obj_name        | obj_master
> ----------------------------------------
>    1   |   First Object  |    1
>    2   |   Second Object |    1
>    3   |   Third Object  |
>    4   |   Fourth Object |

Yeah, this is a known issue:
http://archives.postgresql.org/pgsql-hackers/2010-02/msg02424.php

It's not clear how to persuade libxml to do what we want here.

            regards, tom lane