BUG #17985: Inconsistent results of SELECT comparing two CASE WHEN clause
The following bug has been logged on the website:
Bug reference: 17985
Logged by: Zuming Jiang
Email address: zuming.jiang@inf.ethz.ch
PostgreSQL version: 16beta1
Operating system: Ubuntu 20.04
Description:
My fuzzer finds a correctness bug in Postgres, which makes Postgres return
inconsistent results. This bug can be reproduced even after applying the
fixing patches for
/messages/by-id/17976-4b638b525e9a983b@postgresql.org
and
/messages/by-id/17978-12f3d93a55297266@postgresql.org
--- Set up database ---
create table t0 (c2 text);
create table t2 (c10 text);
create table t5 (vkey int4, pkey int4, c27 text, c28 text, c29 text, c30
text);
insert into t0 values ('');
insert into t2 values ('');
insert into t5 values (1, 2, 'a', 'a', 'a', 'a'), (0, 1, '', '', 'a',
'L');
---
The fuzzer generates Test case 1:
--- Test case 1 ---
select * from t5
where (t5.pkey >= t5.vkey) <> (t5.c30 = (
select
t5.c29 as c_0
from
(t2 as ref_0
inner join t0 as ref_1
on (ref_0.c10 = ref_1.c2))
where ((case when (((ref_0.c10 like 'z~%')
and (not (ref_0.c10 like 'z~%')))
and ((ref_0.c10 like 'z~%') is not null)) then
t5.c28 else t5.c28 end)
= (case when (((ref_1.c2 not like '_%%')
and (not (ref_1.c2 not like '_%%')))
and ((ref_1.c2 not like '_%%') is not null)) then
t5.c29 else t5.c27 end))
order by c_0 desc limit 1));
---
Because the then branch and else branch of the CASE WHEN expression '((case
when (((ref_0.c10 like 'z~%') and (not (ref_0.c10 like 'z~%'))) and
((ref_0.c10 like 'z~%') is not null)) then t5.c28 else t5.c28 end)' are the
same (both are t5.c28), I simplify this CASE WHEN expression by replacing it
with t5.c28, and get Test case 2:
--- Test case 2 ---
select * from t5
where (t5.pkey >= t5.vkey) <> (t5.c30 = (
select
t5.c29 as c_0
from
(t2 as ref_0
inner join t0 as ref_1
on (ref_0.c10 = ref_1.c2))
where (t5.c28
= (case when (((ref_1.c2 not like '_%%')
and (not (ref_1.c2 not like '_%%')))
and ((ref_1.c2 not like '_%%') is not null)) then
t5.c29 else t5.c27 end))
order by c_0 desc limit 1));
---
--- Expected behavior ---
Test case 1 and Test case 2 return the same results.
--- Actual behavior ---
Test case 1 returns 0 rows, while Test case 2 returns 1 row.
Output of Test case 1:
vkey | pkey | c27 | c28 | c29 | c30
------+------+-----+-----+-----+-----
(0 rows)
Output of Test case 2:
vkey | pkey | c27 | c28 | c29 | c30
------+------+-----+-----+-----+-----
0 | 1 | | | a | L
(1 row)
--- Postgres version ---
Github commit: efeb12ef0bfef0b5aa966a56bb4dbb1f936bda0c
Version: PostgreSQL 16beta1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
--- Platform information ---
Platform: Ubuntu 20.04
Kernel: Linux 5.4.0-147-generic
PG Bug reporting form <noreply@postgresql.org> writes:
My fuzzer finds a correctness bug in Postgres, which makes Postgres return
inconsistent results.
Huh. This is a very ancient bug, dating seemingly to commit d24d75ff1.
When rescanning a hash join, we'll skip rebuilding the hash table if
the inner subplan contains no updated Param values. However, the
inner hash key expressions can themselves contain Param references,
which evidently are not caught by that test. A change in the value
of such a Param necessitates rebuilding the hash table, and this
example shows that we're not doing that.
One way to fix it is as attached. I wonder though if this isn't
telling us that there's a bug in the planner's assignment of
allParams bits for Hash nodes. The dangerous Param is present
in the Hash node's hashkeys field, so why isn't the existing
chgParam test adequate?
regards, tom lane
Attachments:
quick-fix-for-bug-17985.patchtext/x-diff; charset=us-ascii; name=quick-fix-for-bug-17985.patchDownload+91-4
I wrote:
One way to fix it is as attached. I wonder though if this isn't
telling us that there's a bug in the planner's assignment of
allParams bits for Hash nodes. The dangerous Param is present
in the Hash node's hashkeys field, so why isn't the existing
chgParam test adequate?
Yeah, so looking at that, the planner is just ignoring Hash.hashkeys,
apparently figuring that incorporating Param IDs from HashJoin.hashclauses
into the allParams for the parent HashJoin is sufficient. But as we
see here, it isn't. The attached seems like a better fix.
regards, tom lane