Re: [HACKERS] patch: function xmltable - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: [HACKERS] patch: function xmltable
Date
Msg-id CAFj8pRCM5Fq4mShcTypcsyLtatejUf=fiO1NmQnVcWgykxJB7A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] patch: function xmltable  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: [HACKERS] patch: function xmltable  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Hi

2017-03-02 1:12 GMT+01:00 Alvaro Herrera <alvherre@2ndquadrant.com>:

I've been giving this a look.  I started by tweaking the docs once
again, and while verifying that the example works as expected, I
replayed what I have in sgml:

... begin SGML paste ...
    <para>
     For example, given the following XML document:
  <screen><![CDATA[
<ROWS>
  <ROW id="1">
    <COUNTRY_ID>AU</COUNTRY_ID>
    <COUNTRY_NAME>Australia</COUNTRY_NAME>
  </ROW>
  <ROW id="5">
    <COUNTRY_ID>JP</COUNTRY_ID>
    <COUNTRY_NAME>Japan</COUNTRY_NAME>
    <PREMIER_NAME>Sinzo Abe</PREMIER_NAME>
  </ROW>
  <ROW id="6">
    <COUNTRY_ID>SG</COUNTRY_ID>
    <COUNTRY_NAME>Singapore</COUNTRY_NAME>
    <SIZE unit="km">791</SIZE>
  </ROW>
</ROWS>
]]></screen>

     the following query produces the result shown below:

<screen><![CDATA[
SELECT  xmltable.*
  FROM (SELECT data FROM xmldata) x,
        LATERAL xmltable('//ROWS/ROW'
                         PASSING data
                         COLUMNS id int PATH '@id',
                                 ordinality FOR ORDINALITY,
                                 country_name text PATH 'COUNTRY_NAME',
                                 country_id text PATH 'COUNTRY_ID',
                                 size float PATH 'SIZE[@unit = "km"]/text()',
                                 unit text PATH 'SIZE/@unit',
                                 premier_name text PATH 'PREMIER_NAME' DEFAULT 'not specified');
... end SGML paste ...


But the query doesn't actually return a table, but instead it fails with
this error:
  ERROR:  invalid input syntax for type double precision: ""
This is because of the "size" column (if I remove SIZE from the COLUMNS
clause, the query returns correctly).  Apparently, for the rows where
SIZE is not given, we try to inssert an empty string instead of a NULL
value, which is what I expected.

I'm using your v44 code, but trimmed both the XML document used in SGML
as well as modified the query slightly to show additional features.  But
those changes should not cause the above error ...

The example in doc is obsolete. Following example works without problems.

 SELECT xmltable.*
  FROM (SELECT data FROM xmldata) x,       LATERAL xmltable('//ROWS/ROW'                        PASSING data                        COLUMNS id int PATH '@id',                                ordinality FOR ORDINALITY,                                country_name text PATH 'COUNTRY_NAME',                                country_id text PATH 'COUNTRY_ID',                                size float PATH 'SIZE[@unit = "km"]',                                unit text PATH 'SIZE/@unit',                                premier_name text PATH 'PREMIER_NAME' DEFAULT 'not specified');

It is related to older variants of this patch, where I explicitly mapped empty strings to NULL. 

Now, I don't do it - I use libxml2 result with following mapping

No tag ... NULL
empty tag ... empty string

Important question is about mapping empty tags to Postgres. I prefer current behave, because I have a possibility to differ between these states on application level. If we returns NULL for empty tag, then there will not be possible detect if XML has tag (although empty) or not. The change is simple - just one row - but I am thinking so current behave is better. There is possible risk of using /text() somewhere - it enforce a empty tag with all negative impacts.

I prefer to fix doc in conformance with regress tests and append note about mapping these corner cases from XML to relations.

What do you think about it?

Regards

Pavel









--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pgsql-hackers by date:

Previous
From: Corey Huinker
Date:
Subject: Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands:\quit_if, \quit_unless)
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] user mapping messages