Thread: Using Lateral
In the past I could use this in a query: SELECT DISTINCT ut, CASE WHEN xpath_exists ('//t:address_spec/t:country/text()', q.address_spec, p.ns) THEN unnest (xpath ('//t:address_spec/t:country/text()', q.address_spec, p.ns))::citext ELSE NULL END country, No longer. The error message suggests I should use a lateral query. But I could not figure out in the documentation how to get the same result using a "lateral" construct. Just selecting "unnest(...)" gives the wrong result because if the xpath does not exist all the other parts of the query (like 'ut' in this case) also are empty. Regards Johann -- Because experiencing your loyal love is better than life itself, my lips will praise you. (Psalm 63:3)
On 03/27/2018 03:22 AM, Johann Spies wrote: > In the past I could use this in a query: > > SELECT > DISTINCT ut, > CASE > WHEN xpath_exists ('//t:address_spec/t:country/text()', > q.address_spec, > p.ns) > THEN unnest (xpath ('//t:address_spec/t:country/text()', > q.address_spec, > p.ns))::citext > ELSE NULL > END country, > > No longer. The error message suggests I should use a lateral query. > But I could not figure out in the documentation how to get the same > result using a "lateral" construct. > > Just selecting "unnest(...)" gives the wrong result because if the > xpath does not exist all the other parts of the query (like 'ut' in > this case) also are empty. It is hard to suggest something without seeing your whole query (e.g. how are you joining q & p?). But it sounds like you basically want a left join to the unnested xpath result. It could be a lateral join or not. It is common to use UNNEST with an implicit lateral join, like this: SELECT ... FROM q, UNNEST(xpath('...', q.address_spec)) But that gives you an inner join. To get an outer join you need to be more explicit. Something like this: SELECT ... FROM q, p LEFT OUTER JOIN LATERAL ( SELECT * FROM unnest(xpath('//t:address_spec/t:country/text()', q.address_spec, p.ns))::citext ) x(country) ON true (Presumably you would do something more restrictive to connect q & p though.) Yours, -- Paul ~{:-) pj@illuminatedcomputing.com
Thanks Paul. I was hesitant to post my full query. It is a long and complicated query. But here it is now: WITH p AS ( SELECT ARRAY [ ARRAY [ 't', 'http://scientific.thomsonreuters.com/schema/wok5.4/public/FullRecord' ] ] AS ns), uts AS ( SELECT s.ut FROM wos_source.core_2015 s WHERE s.id BETWEEN 999900 AND 100000 ), utsb AS ( SELECT b.ut FROM wos_2017_1.belongs2 b, uts WHERE b.ut = uts.ut), q AS ( SELECT s.ut, unnest (xpath ('//t:static_data/t:fullrecord_metadata/t:addresses/t:address_name/t:address_spec', xml, ns)) AS address_spec FROM p, uts a LEFT JOIN utsb b ON b.ut = a.ut LEFT JOIN wos_source.core_2015 s ON s.ut = a.ut WHERE b.ut IS NULL), r AS ( SELECT s.ut, unnest (xpath ('//t:static_data/t:item/t:reprint_contact/t:address_spec', xml, ns)) AS raddress_spec FROM p, wos_2017_1.publication l, uts a LEFT JOIN utsb b ON b.ut = a.ut LEFT JOIN wos_source.core_2015 s ON s.ut = a.ut WHERE b.ut IS NULL AND xpath_exists ('//t:static_data/t:item/t:reprint_contact/t:address_spec', xml, ns) AND s.ut = l.ut AND l.pubyear < 1998), qd AS ( SELECT DISTINCT ut, CASE WHEN xpath_exists ('//t:address_spec/t:country/text()', q.address_spec, p.ns) THEN unnest (xpath ('//t:address_spec/t:country/text()', q.address_spec, p.ns))::citext ELSE NULL END country, CASE WHEN xpath_exists ('//t:address_spec/t:city/text()', q.address_spec, p.ns) THEN unnest (xpath ('//t:address_spec/t:city/text()', q.address_spec, p.ns))::citext ELSE NULL END city, CASE WHEN xpath_exists ('//t:organizations/t:organization/text()', q.address_spec, ns) THEN unnest (xpath ('//t:organizations/t:organization/text()', q.address_spec, ns))::citext ELSE NULL END organisation, CASE WHEN xpath_exists ('//t:organizations/t:organization[@pref="Y"]/text()', q.address_spec, ns) THEN unnest (xpath ('//t:organizations/t:organization[@pref="Y"]/text()', q.address_spec, ns))::citext ELSE NULL END AS prefname, CASE WHEN xpath_exists ('//t:suborganizations/t:suborganization/text()', q.address_spec, ns) THEN unnest (xpath ('//t:suborganizations/t:suborganization/text()', q.address_spec, ns))::citext ELSE NULL END suborgname, CASE WHEN xpath_exists ('/t:address_spec/@addr_no', q.address_spec, ns) THEN (xpath ('/t:address_spec/@addr_no', q.address_spec, ns)) [ 1 ]::text::INTEGER ELSE NULL END addr_no FROM p, q), rd AS ( SELECT DISTINCT ut, CASE WHEN xpath_exists ('//t:address_spec/t:country/text()', r.raddress_spec, p.ns) THEN unnest (xpath ('//t:address_spec/t:country/text()', r.raddress_spec, p.ns))::citext ELSE NULL END country, CASE WHEN xpath_exists ('//t:address_spec/t:city/text()', r.raddress_spec, p.ns) THEN unnest (xpath ('//t:address_spec/t:city/text()', r.raddress_spec, p.ns))::citext ELSE NULL END city, CASE WHEN xpath_exists ('//t:organizations/t:organization/text()', r.raddress_spec, ns) THEN unnest (xpath ('//t:organizations/t:organization/text()', r.raddress_spec, ns))::citext ELSE NULL END organisation, CASE WHEN xpath_exists ('//t:organizations/t:organization[@pref="Y"]/text()', r.raddress_spec, ns) THEN unnest (xpath ('//t:organizations/t:organization[@pref="Y"]/text()', r.raddress_spec, ns))::citext ELSE NULL END AS prefname, CASE WHEN xpath_exists ('//t:suborganizations/t:suborganization/text()', r.raddress_spec, ns) THEN unnest (xpath ('//t:suborganizations/t:suborganization/text()', r.raddress_spec, ns))::citext ELSE NULL END suborgname, CASE WHEN xpath_exists ('/t:address_spec/@addr_no', r.raddress_spec, ns) THEN (xpath ('/t:address_spec/@addr_no', r.raddress_spec, ns)) [ 1 ]::text::INTEGER ELSE NULL END reprint_addr_no FROM p, r WHERE r.raddress_spec IS NOT NULL), uq AS ( SELECT DISTINCT qd.ut, qd.addr_no::INTEGER, 0 AS reprint_addr_no, c.uuid city_id, y.uuid country_id, o.uuid organisation_id, u.uuid suborg_id, p.uuid pref_name_id FROM qd LEFT JOIN wos_2017_1.city c ON (c.city = qd.city) LEFT JOIN wos_2017_1.country_alias y ON (y.country = qd.country) LEFT JOIN wos_2017_1.organisation o ON (o.organisation = regexp_replace ( regexp_replace ( regexp_replace (qd.organisation, '<', '<', 'g'), '&', '&', 'g'), '>', '>', 'g')) LEFT JOIN wos_2017_1.org_pref_name p ON (p.name = qd.prefname) LEFT JOIN wos_2017_1.suborg u ON (u.suborgname = regexp_replace ( regexp_replace ( regexp_replace (qd.suborgname, '<', '<', 'g'), '&', '&', 'g'), '>', '>', 'g'))), ur AS ( SELECT DISTINCT rd.ut, 0 AS addr_no, rd.reprint_addr_no::INTEGER, c.uuid city_id, y.uuid country_id, o.uuid organisation_id, u.uuid suborg_id, p.uuid pref_name_id FROM r, rd LEFT JOIN wos_2017_1.city c ON (c.city = rd.city) LEFT JOIN wos_2017_1.country_alias y ON (y.country = rd.country) LEFT JOIN wos_2017_1.organisation o ON (o.organisation = regexp_replace ( regexp_replace ( regexp_replace (rd.organisation, '<', '<', 'g'), '&', '&', 'g'), '>', '>', 'g')) LEFT JOIN wos_2017_1.org_pref_name p ON (p.name = rd.prefname) LEFT JOIN wos_2017_1.suborg u ON (u.suborgname = regexp_replace ( regexp_replace ( regexp_replace (rd.suborgname, '<', '<', 'g'), '&', '&', 'g'), '>', '>', 'g')) WHERE r.raddress_spec IS NOT NULL), qr AS ( SELECT * FROM uq UNION SELECT ur.* FROM ur) SELECT DISTINCT ON (qr.ut, qr.addr_no, qr.reprint_addr_no, a.uuid, qr.organisation_id, qr.suborg_id, qr.pref_name_id) qr.ut, qr.addr_no, qr.reprint_addr_no, a.uuid AS address_id, qr.organisation_id, qr.suborg_id, qr.pref_name_id, uuid_generate_v1 () uuid FROM qr LEFT JOIN wos_2017_1.address a ON (a.city_id = qr.city_id AND a.country_id = qr.country_id) Johann -- Because experiencing your loyal love is better than life itself, my lips will praise you. (Psalm 63:3)
On 03/28/2018 05:22 AM, Johann Spies wrote: > Thanks Paul. > > I was hesitant to post my full query. It is a long and complicated > query. Ha ha, you aren't joking. :-) With something that long I don't think I'd want to split every xpath call into a new join. I guess you could try it and see though. Stepping back, your original query is prepared for xpath to give 0 results or 1 result, but not 2+, and the problem is that you're getting a 2+ result from the address_spec element. What is the right behavior there? To get two rows in the end result? Just use the first/last address_spec? If you want to force 0/1 results, you can keep your structure and do this: SELECT DISTINCT ut, (xpath('//t:address_spec/t:country/text()', q.address_spec, p.ns))[1]::citext AS country, (xpath('//t:address_spec/t:city/text()', q.address_spec, p.ns))[1]::citext AS city, (xpath('//t:organizations/t:organization/text()', q.address_spec, p.ns))[1]::citext AS organisation, (xpath('//t:organizations/t:organization[@pref="Y"]/text()', q.address_spec, p.ns))[1]::citext AS prefname, (xpath ('//t:suborganizations/t:suborganization/text()', q.address_spec, p.ns))[1]::citext AS suborgname, (xpath ('/t:address_spec/@addr_no', q.address_spec, p.ns))[1]::text::integer AS addr_no (Actually I see you are already doing that for addr_no. And an aside: did you intend `/t:address_spec` instead of `//t:address_spec` there?) If you would prefer to get multiple rows back, then you'll probably need a subquery to give one row per `xpath('//t:address_spec')`, so that you can keep the cities/countries/addr_nos together. Anyway, I hope that's enough to get you started on the right path! The root cause is that UNNEST is giving you two rows where it only makes sense to have one, so you need to restrict that or pull it into a context where several rows are allowed (like a join). -- Paul ~{:-) pj@illuminatedcomputing.com
On Tuesday, March 27, 2018, Johann Spies <johann.spies@gmail.com> wrote:
In the past I could use this in a query:
SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END country,
[...]
Just selecting "unnest(...)" gives the wrong result because if the
xpath does not exist all the other parts of the query (like 'ut' in
this case) also are empty.
You should be able to solve the empty-set-in-target-list problem via a scalar subquery instead of a case construct.
Select distinct ut, (select unnest(...)) as country from ...
The subselect wil return null if fed zero rows. Though you will still have to solve an issue if the unrest returns 1+ rows.
In lieu of the inline scalar subquery I would advise writing a function and just calling it directly in the target-list. But it should not return setof, I'd return an array if you need to accept the possibility of 1+ matches, and return an empty array for zero matches.
David J.
Thanks David and Paul, You have helped me a lot. Regards Johann. On 28 March 2018 at 20:49, David G. Johnston <david.g.johnston@gmail.com> wrote: > On Tuesday, March 27, 2018, Johann Spies <johann.spies@gmail.com> wrote: >> >> In the past I could use this in a query: >> >> SELECT >> DISTINCT ut, >> CASE >> WHEN xpath_exists ('//t:address_spec/t:country/text()', >> q.address_spec, >> p.ns) >> THEN unnest (xpath ('//t:address_spec/t:country/text()', >> q.address_spec, >> p.ns))::citext >> ELSE NULL >> END country, >> [...] >> Just selecting "unnest(...)" gives the wrong result because if the >> xpath does not exist all the other parts of the query (like 'ut' in >> this case) also are empty. > > > You should be able to solve the empty-set-in-target-list problem via a > scalar subquery instead of a case construct. > > Select distinct ut, (select unnest(...)) as country from ... > > The subselect wil return null if fed zero rows. Though you will still have > to solve an issue if the unrest returns 1+ rows. > > In lieu of the inline scalar subquery I would advise writing a function and > just calling it directly in the target-list. But it should not return > setof, I'd return an array if you need to accept the possibility of 1+ > matches, and return an empty array for zero matches. > > David J. -- Because experiencing your loyal love is better than life itself, my lips will praise you. (Psalm 63:3)