Non-Materialized CTE bug?

Started by greigwiseover 5 years ago2 messagesbugs
Jump to latest
#1greigwise
greigwise@comcast.net

create table types (type varchar);
CREATE TABLE
insert into types values ('a');

with test (i, j) as
(values ( 1, '{"key": "a", "value": "true"}'::json), (2, '{"key": "b",
"value": "xxx"}'::json)),
test2 (i, j) as (select i, (j->>'value')::boolean
from test a
join types b on b.type = a.j->>'key') select * from test2 where j = true;

ERROR: invalid input syntax for type boolean: "xxx"

Why would this result in an error here? It seems like the second row from
the "test" CTE should be excluded since the key value "b" will not join with
anything in the "types" table. If I materialize the "test2" CTE, then it
works as I'd expect without error. Is this a bug? It seems like the
optimizer should be smart enough to know that it has to materialize that
CTE. This is on version 12.4.

Thanks for the help.
Greig Wise

--
Sent from: https://www.postgresql-archive.org/PostgreSQL-bugs-f2117394.html

#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: greigwise (#1)
Re: Non-Materialized CTE bug?

Hello.

At Tue, 27 Oct 2020 17:43:08 -0700 (MST), greigwise <greigwise@comcast.net> wrote in

create table types (type varchar);
CREATE TABLE
insert into types values ('a');

with test (i, j) as
(values ( 1, '{"key": "a", "value": "true"}'::json), (2, '{"key": "b",
"value": "xxx"}'::json)),
test2 (i, j) as (select i, (j->>'value')::boolean
from test a
join types b on b.type = a.j->>'key') select * from test2 where j = true;

ERROR: invalid input syntax for type boolean: "xxx"

Why would this result in an error here? It seems like the second row from
the "test" CTE should be excluded since the key value "b" will not join with
anything in the "types" table. If I materialize the "test2" CTE, then it
works as I'd expect without error. Is this a bug? It seems like the
optimizer should be smart enough to know that it has to materialize that
CTE. This is on version 12.4.

test2 needs to evaluate all the output columns *before* the join with
types, which leads to the error. You need to make sure that the
::boolean is evaluated after the join. MATERIALIZE prevents the CTE
from being merged into upper queries.

The following rewrites work.

with
test (i, j) as
(values ( 1, '{"key": "a", "value": "true"}'::json),
(2, '{"key": "b", "value": "xxx"}'::json)),
test2 (i, j) as MATERIALIZED
(select i, j->>'value' as j
from test a
join types b on b.type = a.j->>'key')
select i from test2 where j::boolean = true;

with
test (i, j) as
(values ( 1, '{"key": "a", "value": "true"}'::json),
(2, '{"key": "b", "value": "xxx"}'::json)),
test2 (i, j) as MATERIALIZED
(select i, j->>'value' as j
from test a
join types b on b.type = a.j->>'key'),
test3 (i, j) as
(select i, j::boolean as j from test2)
select i from test3 where j = true;

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center