Using Lateral

Started by Johann Spiesabout 8 years ago6 messagesgeneral
Jump to latest
#1Johann Spies
johann.spies@gmail.com

In the past I could use this in a query:

SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END country,

No longer. The error message suggests I should use a lateral query.
But I could not figure out in the documentation how to get the same
result using a "lateral" construct.

Just selecting "unnest(...)" gives the wrong result because if the
xpath does not exist all the other parts of the query (like 'ut' in
this case) also are empty.

Regards
Johann

--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)

#2Paul Jungwirth
pj@illuminatedcomputing.com
In reply to: Johann Spies (#1)
Re: Using Lateral

On 03/27/2018 03:22 AM, Johann Spies wrote:

In the past I could use this in a query:

SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END country,

No longer. The error message suggests I should use a lateral query.
But I could not figure out in the documentation how to get the same
result using a "lateral" construct.

Just selecting "unnest(...)" gives the wrong result because if the
xpath does not exist all the other parts of the query (like 'ut' in
this case) also are empty.

It is hard to suggest something without seeing your whole query (e.g.
how are you joining q & p?). But it sounds like you basically want a
left join to the unnested xpath result. It could be a lateral join or not.

It is common to use UNNEST with an implicit lateral join, like this:

SELECT ...
FROM q, UNNEST(xpath('...', q.address_spec))

But that gives you an inner join. To get an outer join you need to be
more explicit. Something like this:

SELECT ...
FROM q, p
LEFT OUTER JOIN LATERAL (
SELECT *
FROM unnest(xpath('//t:address_spec/t:country/text()',
q.address_spec, p.ns))::citext
) x(country)
ON true

(Presumably you would do something more restrictive to connect q & p
though.)

Yours,

--
Paul ~{:-)
pj@illuminatedcomputing.com

#3Johann Spies
johann.spies@gmail.com
In reply to: Paul Jungwirth (#2)
Re: Using Lateral

Thanks Paul.

I was hesitant to post my full query. It is a long and complicated
query. But here it is now:

WITH p AS (
SELECT
ARRAY [ ARRAY [ 't',
'http://scientific.thomsonreuters.com/schema/wok5.4/public/FullRecord'
] ] AS ns),
uts AS (
SELECT
s.ut
FROM
wos_source.core_2015 s
WHERE
s.id BETWEEN 999900
AND 100000
),
utsb AS (
SELECT
b.ut
FROM
wos_2017_1.belongs2 b,
uts
WHERE
b.ut = uts.ut), q AS (
SELECT
s.ut,
unnest (xpath
('//t:static_data/t:fullrecord_metadata/t:addresses/t:address_name/t:address_spec',
xml,
ns)) AS address_spec
FROM
p,
uts a
LEFT JOIN utsb b ON b.ut = a.ut
LEFT JOIN wos_source.core_2015 s ON s.ut = a.ut
WHERE
b.ut IS NULL), r AS (
SELECT
s.ut,
unnest (xpath
('//t:static_data/t:item/t:reprint_contact/t:address_spec',
xml,
ns)) AS raddress_spec
FROM
p,
wos_2017_1.publication l,
uts a
LEFT JOIN utsb b ON b.ut = a.ut
LEFT JOIN wos_source.core_2015 s ON s.ut = a.ut
WHERE
b.ut IS NULL
AND xpath_exists
('//t:static_data/t:item/t:reprint_contact/t:address_spec', xml, ns)
AND s.ut = l.ut
AND l.pubyear < 1998), qd AS (
SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END country,
CASE
WHEN xpath_exists ('//t:address_spec/t:city/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:city/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END city,
CASE
WHEN xpath_exists ('//t:organizations/t:organization/text()',
q.address_spec,
ns)
THEN unnest (xpath ('//t:organizations/t:organization/text()',
q.address_spec,
ns))::citext
ELSE NULL
END organisation,
CASE
WHEN xpath_exists
('//t:organizations/t:organization[@pref="Y"]/text()',
q.address_spec,
ns)
THEN unnest (xpath
('//t:organizations/t:organization[@pref="Y"]/text()',
q.address_spec,
ns))::citext
ELSE NULL
END AS prefname,
CASE
WHEN xpath_exists ('//t:suborganizations/t:suborganization/text()',
q.address_spec,
ns)
THEN unnest (xpath ('//t:suborganizations/t:suborganization/text()',
q.address_spec,
ns))::citext
ELSE NULL
END suborgname,
CASE
WHEN xpath_exists ('/t:address_spec/@addr_no',
q.address_spec,
ns)
THEN (xpath ('/t:address_spec/@addr_no',
q.address_spec,
ns))
[ 1 ]::text::INTEGER
ELSE NULL
END addr_no
FROM
p,
q),
rd AS (
SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
r.raddress_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
r.raddress_spec,
p.ns))::citext
ELSE NULL
END country,
CASE
WHEN xpath_exists ('//t:address_spec/t:city/text()',
r.raddress_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:city/text()',
r.raddress_spec,
p.ns))::citext
ELSE NULL
END city,
CASE
WHEN xpath_exists ('//t:organizations/t:organization/text()',
r.raddress_spec,
ns)
THEN unnest (xpath ('//t:organizations/t:organization/text()',
r.raddress_spec,
ns))::citext
ELSE NULL
END organisation,
CASE
WHEN xpath_exists
('//t:organizations/t:organization[@pref="Y"]/text()',
r.raddress_spec,
ns)
THEN unnest (xpath
('//t:organizations/t:organization[@pref="Y"]/text()',
r.raddress_spec,
ns))::citext
ELSE NULL
END AS prefname,
CASE
WHEN xpath_exists ('//t:suborganizations/t:suborganization/text()',
r.raddress_spec,
ns)
THEN unnest (xpath ('//t:suborganizations/t:suborganization/text()',
r.raddress_spec,
ns))::citext
ELSE NULL
END suborgname,
CASE
WHEN xpath_exists ('/t:address_spec/@addr_no',
r.raddress_spec,
ns)
THEN (xpath ('/t:address_spec/@addr_no',
r.raddress_spec,
ns))
[ 1 ]::text::INTEGER
ELSE NULL
END reprint_addr_no
FROM
p,
r
WHERE
r.raddress_spec IS NOT NULL), uq AS (
SELECT
DISTINCT qd.ut,
qd.addr_no::INTEGER,
0 AS reprint_addr_no,
c.uuid city_id,
y.uuid country_id,
o.uuid organisation_id,
u.uuid suborg_id,
p.uuid pref_name_id
FROM
qd
LEFT JOIN wos_2017_1.city c ON (c.city = qd.city)
LEFT JOIN wos_2017_1.country_alias y ON (y.country = qd.country)
LEFT JOIN wos_2017_1.organisation o ON (o.organisation = regexp_replace (
regexp_replace (
regexp_replace (qd.organisation, '&lt;', '<', 'g'),
'&amp;', '&', 'g'),
'&gt;', '>', 'g'))
LEFT JOIN wos_2017_1.org_pref_name p ON (p.name = qd.prefname)
LEFT JOIN wos_2017_1.suborg u ON (u.suborgname = regexp_replace (
regexp_replace (
regexp_replace (qd.suborgname, '&lt;', '<', 'g'),
'&amp;', '&', 'g'),
'&gt;', '>', 'g'))),
ur AS (
SELECT
DISTINCT rd.ut,
0 AS addr_no,
rd.reprint_addr_no::INTEGER,
c.uuid city_id,
y.uuid country_id,
o.uuid organisation_id,
u.uuid suborg_id,
p.uuid pref_name_id
FROM
r,
rd
LEFT JOIN wos_2017_1.city c ON (c.city = rd.city)
LEFT JOIN wos_2017_1.country_alias y ON (y.country = rd.country)
LEFT JOIN wos_2017_1.organisation o ON (o.organisation = regexp_replace (
regexp_replace (
regexp_replace (rd.organisation, '&lt;', '<', 'g'),
'&amp;', '&', 'g'),
'&gt', '>', 'g'))
LEFT JOIN wos_2017_1.org_pref_name p ON (p.name = rd.prefname)
LEFT JOIN wos_2017_1.suborg u ON (u.suborgname = regexp_replace (
regexp_replace (
regexp_replace (rd.suborgname, '&lt;', '<', 'g'),
'&amp;', '&', 'g'),
'&gt;', '>', 'g'))
WHERE
r.raddress_spec IS NOT NULL), qr AS (
SELECT
*
FROM
uq
UNION
SELECT
ur.*
FROM
ur)
SELECT
DISTINCT ON (qr.ut,
qr.addr_no,
qr.reprint_addr_no,
a.uuid,
qr.organisation_id,
qr.suborg_id,
qr.pref_name_id)
qr.ut,
qr.addr_no,
qr.reprint_addr_no,
a.uuid AS address_id,
qr.organisation_id,
qr.suborg_id,
qr.pref_name_id,
uuid_generate_v1 ()
uuid
FROM
qr
LEFT JOIN wos_2017_1.address a ON (a.city_id = qr.city_id
AND a.country_id = qr.country_id)

Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)

#4Paul Jungwirth
pj@illuminatedcomputing.com
In reply to: Johann Spies (#3)
Re: Using Lateral

On 03/28/2018 05:22 AM, Johann Spies wrote:

Thanks Paul.

I was hesitant to post my full query. It is a long and complicated
query.

Ha ha, you aren't joking. :-)

With something that long I don't think I'd want to split every xpath
call into a new join. I guess you could try it and see though.

Stepping back, your original query is prepared for xpath to give 0
results or 1 result, but not 2+, and the problem is that you're getting
a 2+ result from the address_spec element. What is the right behavior
there? To get two rows in the end result? Just use the first/last
address_spec?

If you want to force 0/1 results, you can keep your structure and do this:

SELECT DISTINCT ut,
(xpath('//t:address_spec/t:country/text()', q.address_spec,
p.ns))[1]::citext AS country,
(xpath('//t:address_spec/t:city/text()', q.address_spec,
p.ns))[1]::citext AS city,
(xpath('//t:organizations/t:organization/text()',
q.address_spec, p.ns))[1]::citext AS organisation,
(xpath('//t:organizations/t:organization[@pref="Y"]/text()',
q.address_spec, p.ns))[1]::citext AS prefname,
(xpath ('//t:suborganizations/t:suborganization/text()',
q.address_spec, p.ns))[1]::citext AS suborgname,
(xpath ('/t:address_spec/@addr_no', q.address_spec,
p.ns))[1]::text::integer AS addr_no

(Actually I see you are already doing that for addr_no. And an aside:
did you intend `/t:address_spec` instead of `//t:address_spec` there?)

If you would prefer to get multiple rows back, then you'll probably need
a subquery to give one row per `xpath('//t:address_spec')`, so that you
can keep the cities/countries/addr_nos together.

Anyway, I hope that's enough to get you started on the right path! The
root cause is that UNNEST is giving you two rows where it only makes
sense to have one, so you need to restrict that or pull it into a
context where several rows are allowed (like a join).

--
Paul ~{:-)
pj@illuminatedcomputing.com

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Johann Spies (#1)

On Tuesday, March 27, 2018, Johann Spies <johann.spies@gmail.com> wrote:

In the past I could use this in a query:

SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END country,
[...]
Just selecting "unnest(...)" gives the wrong result because if the
xpath does not exist all the other parts of the query (like 'ut' in
this case) also are empty.

You should be able to solve the empty-set-in-target-list problem via a
scalar subquery instead of a case construct.

Select distinct ut, (select unnest(...)) as country from ...

The subselect wil return null if fed zero rows. Though you will still have
to solve an issue if the unrest returns 1+ rows.

In lieu of the inline scalar subquery I would advise writing a function and
just calling it directly in the target-list. But it should not return
setof, I'd return an array if you need to accept the possibility of 1+
matches, and return an empty array for zero matches.

David J.

#6Johann Spies
johann.spies@gmail.com
In reply to: David G. Johnston (#5)
Re: Using Lateral

Thanks David and Paul,

You have helped me a lot.

Regards
Johann.

On 28 March 2018 at 20:49, David G. Johnston <david.g.johnston@gmail.com> wrote:

On Tuesday, March 27, 2018, Johann Spies <johann.spies@gmail.com> wrote:

In the past I could use this in a query:

SELECT
DISTINCT ut,
CASE
WHEN xpath_exists ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns)
THEN unnest (xpath ('//t:address_spec/t:country/text()',
q.address_spec,
p.ns))::citext
ELSE NULL
END country,
[...]
Just selecting "unnest(...)" gives the wrong result because if the
xpath does not exist all the other parts of the query (like 'ut' in
this case) also are empty.

You should be able to solve the empty-set-in-target-list problem via a
scalar subquery instead of a case construct.

Select distinct ut, (select unnest(...)) as country from ...

The subselect wil return null if fed zero rows. Though you will still have
to solve an issue if the unrest returns 1+ rows.

In lieu of the inline scalar subquery I would advise writing a function and
just calling it directly in the target-list. But it should not return
setof, I'd return an array if you need to accept the possibility of 1+
matches, and return an empty array for zero matches.

David J.

--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)