Thread: Using Lateral

Using Lateral

From
Johann Spies
Date:
In the past I could use this in a query:

    SELECT
        DISTINCT ut,
        CASE
            WHEN xpath_exists ('//t:address_spec/t:country/text()',
                q.address_spec,
                p.ns)
            THEN unnest (xpath ('//t:address_spec/t:country/text()',
                    q.address_spec,
                    p.ns))::citext
            ELSE NULL
        END country,

No longer.  The error message suggests I should use a lateral query.
But I could not figure out in the documentation how to get the same
result using a "lateral" construct.

Just selecting "unnest(...)" gives the wrong result because if the
xpath does not exist all the other parts of the query (like 'ut' in
this case) also are empty.

Regards
Johann

-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: Using Lateral

From
Paul Jungwirth
Date:
On 03/27/2018 03:22 AM, Johann Spies wrote:
> In the past I could use this in a query:
> 
>      SELECT
>          DISTINCT ut,
>          CASE
>              WHEN xpath_exists ('//t:address_spec/t:country/text()',
>                  q.address_spec,
>                  p.ns)
>              THEN unnest (xpath ('//t:address_spec/t:country/text()',
>                      q.address_spec,
>                      p.ns))::citext
>              ELSE NULL
>          END country,
> 
> No longer.  The error message suggests I should use a lateral query.
> But I could not figure out in the documentation how to get the same
> result using a "lateral" construct.
> 
> Just selecting "unnest(...)" gives the wrong result because if the
> xpath does not exist all the other parts of the query (like 'ut' in
> this case) also are empty.

It is hard to suggest something without seeing your whole query (e.g. 
how are you joining q & p?). But it sounds like you basically want a 
left join to the unnested xpath result. It could be a lateral join or not.

It is common to use UNNEST with an implicit lateral join, like this:

     SELECT  ...
     FROM    q, UNNEST(xpath('...', q.address_spec))

But that gives you an inner join. To get an outer join you need to be 
more explicit. Something like this:

     SELECT  ...
     FROM    q, p
     LEFT OUTER JOIN LATERAL (
       SELECT  *
       FROM    unnest(xpath('//t:address_spec/t:country/text()',
                        q.address_spec, p.ns))::citext
     ) x(country)
     ON true

(Presumably you would do something more restrictive to connect q & p 
though.)

Yours,

-- 
Paul              ~{:-)
pj@illuminatedcomputing.com


Re: Using Lateral

From
Johann Spies
Date:
Thanks Paul.

I was hesitant to post my full query.  It is a long and complicated
query.  But here it is now:

WITH p AS (
    SELECT
        ARRAY [ ARRAY [ 't',
'http://scientific.thomsonreuters.com/schema/wok5.4/public/FullRecord'
] ] AS ns),
uts AS (
    SELECT
        s.ut
    FROM
        wos_source.core_2015 s
    WHERE
        s.id BETWEEN 999900
        AND 100000
             ),
utsb AS (
    SELECT
        b.ut
    FROM
        wos_2017_1.belongs2 b,
        uts
    WHERE
        b.ut = uts.ut), q AS (
    SELECT
        s.ut,
        unnest (xpath
('//t:static_data/t:fullrecord_metadata/t:addresses/t:address_name/t:address_spec',
                xml,
                ns)) AS address_spec
    FROM
        p,
        uts a
    LEFT JOIN utsb b ON b.ut = a.ut
    LEFT JOIN wos_source.core_2015 s ON s.ut = a.ut
WHERE
    b.ut IS NULL), r AS (
    SELECT
        s.ut,
        unnest (xpath
('//t:static_data/t:item/t:reprint_contact/t:address_spec',
                xml,
                ns)) AS raddress_spec
    FROM
        p,
        wos_2017_1.publication l,
        uts a
    LEFT JOIN utsb b ON b.ut = a.ut
    LEFT JOIN wos_source.core_2015 s ON s.ut = a.ut
WHERE
    b.ut IS NULL
    AND xpath_exists
('//t:static_data/t:item/t:reprint_contact/t:address_spec', xml, ns)
    AND s.ut = l.ut
    AND l.pubyear < 1998), qd AS (
    SELECT
        DISTINCT ut,
        CASE
            WHEN xpath_exists ('//t:address_spec/t:country/text()',
                q.address_spec,
                p.ns)
            THEN unnest (xpath ('//t:address_spec/t:country/text()',
                    q.address_spec,
                    p.ns))::citext
            ELSE NULL
        END country,
        CASE
            WHEN xpath_exists ('//t:address_spec/t:city/text()',
                q.address_spec,
                p.ns)
            THEN unnest (xpath ('//t:address_spec/t:city/text()',
                    q.address_spec,
                    p.ns))::citext
            ELSE NULL
        END city,
        CASE
            WHEN xpath_exists ('//t:organizations/t:organization/text()',
                q.address_spec,
                ns)
            THEN unnest (xpath ('//t:organizations/t:organization/text()',
                    q.address_spec,
                    ns))::citext
            ELSE NULL
        END organisation,
        CASE
            WHEN xpath_exists
('//t:organizations/t:organization[@pref="Y"]/text()',
                q.address_spec,
                ns)
            THEN unnest (xpath
('//t:organizations/t:organization[@pref="Y"]/text()',
                    q.address_spec,
                    ns))::citext
            ELSE NULL
        END AS prefname,
        CASE
            WHEN xpath_exists ('//t:suborganizations/t:suborganization/text()',
                q.address_spec,
                ns)
            THEN unnest (xpath ('//t:suborganizations/t:suborganization/text()',
                    q.address_spec,
                    ns))::citext
            ELSE NULL
        END suborgname,
        CASE
            WHEN xpath_exists ('/t:address_spec/@addr_no',
                q.address_spec,
                ns)
            THEN (xpath ('/t:address_spec/@addr_no',
                    q.address_spec,
                    ns))
            [ 1 ]::text::INTEGER
            ELSE NULL
        END addr_no
    FROM
        p,
        q),
rd AS (
    SELECT
        DISTINCT ut,
        CASE
            WHEN xpath_exists ('//t:address_spec/t:country/text()',
                r.raddress_spec,
                p.ns)
            THEN unnest (xpath ('//t:address_spec/t:country/text()',
                    r.raddress_spec,
                    p.ns))::citext
            ELSE NULL
        END country,
        CASE
            WHEN xpath_exists ('//t:address_spec/t:city/text()',
                r.raddress_spec,
                p.ns)
            THEN unnest (xpath ('//t:address_spec/t:city/text()',
                    r.raddress_spec,
                    p.ns))::citext
            ELSE NULL
        END city,
        CASE
            WHEN xpath_exists ('//t:organizations/t:organization/text()',
                r.raddress_spec,
                ns)
            THEN unnest (xpath ('//t:organizations/t:organization/text()',
                    r.raddress_spec,
                    ns))::citext
            ELSE NULL
        END organisation,
        CASE
            WHEN xpath_exists
('//t:organizations/t:organization[@pref="Y"]/text()',
                r.raddress_spec,
                ns)
            THEN unnest (xpath
('//t:organizations/t:organization[@pref="Y"]/text()',
                    r.raddress_spec,
                    ns))::citext
            ELSE NULL
        END AS prefname,
        CASE
            WHEN xpath_exists ('//t:suborganizations/t:suborganization/text()',
                r.raddress_spec,
                ns)
            THEN unnest (xpath ('//t:suborganizations/t:suborganization/text()',
                    r.raddress_spec,
                    ns))::citext
            ELSE NULL
        END suborgname,
        CASE
            WHEN xpath_exists ('/t:address_spec/@addr_no',
                r.raddress_spec,
                ns)
            THEN (xpath ('/t:address_spec/@addr_no',
                    r.raddress_spec,
                    ns))
            [ 1 ]::text::INTEGER
            ELSE NULL
        END reprint_addr_no
    FROM
        p,
        r
    WHERE
        r.raddress_spec IS NOT NULL), uq AS (
    SELECT
        DISTINCT qd.ut,
        qd.addr_no::INTEGER,
        0 AS reprint_addr_no,
        c.uuid city_id,
        y.uuid country_id,
        o.uuid organisation_id,
        u.uuid suborg_id,
        p.uuid pref_name_id
    FROM
        qd
    LEFT JOIN wos_2017_1.city c ON (c.city = qd.city)
    LEFT JOIN wos_2017_1.country_alias y ON (y.country = qd.country)
    LEFT JOIN wos_2017_1.organisation o ON (o.organisation = regexp_replace (
         regexp_replace (
             regexp_replace (qd.organisation, '<', '<', 'g'),
        '&', '&', 'g'),
    '>', '>', 'g'))
    LEFT JOIN wos_2017_1.org_pref_name p ON (p.name = qd.prefname)
    LEFT JOIN wos_2017_1.suborg u ON (u.suborgname = regexp_replace (
         regexp_replace (
             regexp_replace (qd.suborgname, '<', '<', 'g'),
        '&', '&', 'g'),
    '>', '>', 'g'))),
ur AS (
    SELECT
        DISTINCT rd.ut,
        0 AS addr_no,
        rd.reprint_addr_no::INTEGER,
        c.uuid city_id,
        y.uuid country_id,
        o.uuid organisation_id,
        u.uuid suborg_id,
        p.uuid pref_name_id
    FROM
        r,
        rd
    LEFT JOIN wos_2017_1.city c ON (c.city = rd.city)
    LEFT JOIN wos_2017_1.country_alias y ON (y.country = rd.country)
    LEFT JOIN wos_2017_1.organisation o ON (o.organisation = regexp_replace (
         regexp_replace (
             regexp_replace (rd.organisation, '<', '<', 'g'),
        '&', '&', 'g'),
    '>', '>', 'g'))
    LEFT JOIN wos_2017_1.org_pref_name p ON (p.name = rd.prefname)
    LEFT JOIN wos_2017_1.suborg u ON (u.suborgname = regexp_replace (
         regexp_replace (
             regexp_replace (rd.suborgname, '<', '<', 'g'),
        '&', '&', 'g'),
    '>', '>', 'g'))
WHERE
    r.raddress_spec IS NOT NULL), qr AS (
    SELECT
        *
    FROM
        uq
    UNION
    SELECT
        ur.*
    FROM
        ur)
SELECT
    DISTINCT ON (qr.ut,
        qr.addr_no,
        qr.reprint_addr_no,
        a.uuid,
        qr.organisation_id,
        qr.suborg_id,
        qr.pref_name_id)
    qr.ut,
    qr.addr_no,
    qr.reprint_addr_no,
    a.uuid AS address_id,
    qr.organisation_id,
    qr.suborg_id,
    qr.pref_name_id,
    uuid_generate_v1 ()
    uuid
FROM
    qr
    LEFT JOIN wos_2017_1.address a ON (a.city_id = qr.city_id
        AND a.country_id = qr.country_id)




Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: Using Lateral

From
Paul Jungwirth
Date:
On 03/28/2018 05:22 AM, Johann Spies wrote:
> Thanks Paul.
> 
> I was hesitant to post my full query.  It is a long and complicated
> query.

Ha ha, you aren't joking. :-)

With something that long I don't think I'd want to split every xpath 
call into a new join. I guess you could try it and see though.

Stepping back, your original query is prepared for xpath to give 0 
results or 1 result, but not 2+, and the problem is that you're getting 
a 2+ result from the address_spec element. What is the right behavior 
there? To get two rows in the end result? Just use the first/last 
address_spec?

If you want to force 0/1 results, you can keep your structure and do this:

SELECT  DISTINCT ut,
         (xpath('//t:address_spec/t:country/text()', q.address_spec, 
p.ns))[1]::citext AS country,
         (xpath('//t:address_spec/t:city/text()', q.address_spec, 
p.ns))[1]::citext AS city,
         (xpath('//t:organizations/t:organization/text()', 
q.address_spec, p.ns))[1]::citext AS organisation,
          (xpath('//t:organizations/t:organization[@pref="Y"]/text()', 
q.address_spec, p.ns))[1]::citext AS prefname,
          (xpath ('//t:suborganizations/t:suborganization/text()', 
q.address_spec, p.ns))[1]::citext AS suborgname,
          (xpath ('/t:address_spec/@addr_no', q.address_spec, 
p.ns))[1]::text::integer AS addr_no

(Actually I see you are already doing that for addr_no. And an aside: 
did you intend `/t:address_spec` instead of `//t:address_spec` there?)

If you would prefer to get multiple rows back, then you'll probably need 
a subquery to give one row per `xpath('//t:address_spec')`, so that you 
can keep the cities/countries/addr_nos together.

Anyway, I hope that's enough to get you started on the right path! The 
root cause is that UNNEST is giving you two rows where it only makes 
sense to have one, so you need to restrict that or pull it into a 
context where several rows are allowed (like a join).

-- 
Paul              ~{:-)
pj@illuminatedcomputing.com


Using Lateral

From
"David G. Johnston"
Date:
On Tuesday, March 27, 2018, Johann Spies <johann.spies@gmail.com> wrote:
In the past I could use this in a query:

    SELECT
        DISTINCT ut,
        CASE
            WHEN xpath_exists ('//t:address_spec/t:country/text()',
                q.address_spec,
                p.ns)
            THEN unnest (xpath ('//t:address_spec/t:country/text()',
                    q.address_spec,
                    p.ns))::citext
            ELSE NULL
        END country,
[...]
Just selecting "unnest(...)" gives the wrong result because if the
xpath does not exist all the other parts of the query (like 'ut' in
this case) also are empty.

You should be able to solve the empty-set-in-target-list problem via a scalar subquery instead of a case construct.

Select distinct ut, (select unnest(...)) as country from ...

The subselect wil return null if fed zero rows. Though you will still have to solve an issue if the unrest returns 1+ rows.

In lieu of the inline scalar subquery I would advise writing a function and just calling it directly in the target-list.  But it should not return setof, I'd return an array if you need to accept the possibility of 1+ matches, and return an empty array for zero matches.

David J.

Re: Using Lateral

From
Johann Spies
Date:
Thanks David and Paul,

You have helped me a lot.

Regards
Johann.

On 28 March 2018 at 20:49, David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Tuesday, March 27, 2018, Johann Spies <johann.spies@gmail.com> wrote:
>>
>> In the past I could use this in a query:
>>
>>     SELECT
>>         DISTINCT ut,
>>         CASE
>>             WHEN xpath_exists ('//t:address_spec/t:country/text()',
>>                 q.address_spec,
>>                 p.ns)
>>             THEN unnest (xpath ('//t:address_spec/t:country/text()',
>>                     q.address_spec,
>>                     p.ns))::citext
>>             ELSE NULL
>>         END country,
>> [...]
>> Just selecting "unnest(...)" gives the wrong result because if the
>> xpath does not exist all the other parts of the query (like 'ut' in
>> this case) also are empty.
>
>
> You should be able to solve the empty-set-in-target-list problem via a
> scalar subquery instead of a case construct.
>
> Select distinct ut, (select unnest(...)) as country from ...
>
> The subselect wil return null if fed zero rows. Though you will still have
> to solve an issue if the unrest returns 1+ rows.
>
> In lieu of the inline scalar subquery I would advise writing a function and
> just calling it directly in the target-list.  But it should not return
> setof, I'd return an array if you need to accept the possibility of 1+
> matches, and return an empty array for zero matches.
>
> David J.



-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)