UNION removes almost all rows (not duplicates) - in fresh build of pg17!

Started by hubert depesz lubaczewskialmost 2 years ago4 messagesbugs

Jump to latest

hubert depesz lubaczewski

depesz@depesz.com

almost 2 years ago

So, test case is trivial:

#v+
$ select count(*) from pg_class where relkind = 'r';
count
───────
68
(1 row)

$ select count(*) from pg_class where relkind = 'i';
count
───────
164
(1 row)

$ select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' );
count
───────
1
(1 row)
#v-

explain shows unexpected:

#v+
$ explain (analyze) select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' );
QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Aggregate (cost=42.43..42.44 rows=1 width=8) (actual time=0.108..0.109 rows=1 loops=1)
-> Unique (cost=0.00..39.53 rows=232 width=236) (actual time=0.013..0.105 rows=1 loops=1)
-> Append (cost=0.00..39.53 rows=232 width=236) (actual time=0.012..0.099 rows=232 loops=1)
-> Seq Scan on pg_class (cost=0.00..19.19 rows=68 width=263) (actual time=0.012..0.058 rows=68 loops=1)
Filter: (relkind = 'r'::"char")
Rows Removed by Filter: 348
-> Seq Scan on pg_class pg_class_1 (cost=0.00..19.19 rows=164 width=263) (actual time=0.002..0.030 rows=164 loops=1)
Filter: (relkind = 'i'::"char")
Rows Removed by Filter: 252
Planning Time: 0.161 ms
Execution Time: 0.129 ms
(11 rows)
#v-

It seems that I get 232 correct rows from Append, but then Unique
removes all of them, except for one?

For whatever it's worth returned row is:

#v+
$ select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' \gx
─[ RECORD 1 ]───────┬───────────────────────
oid │ 2619
relname │ pg_statistic
relnamespace │ 11
reltype │ 10029
reloftype │ 0
relowner │ 10
relam │ 2
relfilenode │ 2619
reltablespace │ 0
relpages │ 19
reltuples │ 410
relallvisible │ 19
reltoastrelid │ 2840
relhasindex │ t
relisshared │ f
relpersistence │ p
relkind │ r
relnatts │ 31
relchecks │ 0
relhasrules │ f
relhastriggers │ f
relhassubclass │ f
relrowsecurity │ f
relforcerowsecurity │ f
relispopulated │ t
relreplident │ n
relispartition │ f
relrewrite │ 0
relfrozenxid │ 730
relminmxid │ 1
relacl │ {pgdba=arwdDxtm/pgdba}
reloptions │ [null]
relpartbound │ [null]
#v-

As for test environment:

Debian testing on amd64, self-compiled from git HEAD at d2a04470aa6401c1938cc107e0b2c56c22a2321f

Did I do something wrong?

Best regards,

depesz

Tom Lane

tgl@sss.pgh.pa.us

almost 2 years ago

In reply to: hubert depesz lubaczewski (#1)

Re: UNION removes almost all rows (not duplicates) - in fresh build of pg17!

hubert depesz lubaczewski <depesz@depesz.com> writes:

$ select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' );
count
───────
1
(1 row)

Ugh, that is awful. Bisecting shows it broke at

commit 66c0185a3d14bbbf51d0fc9d267093ffec735231
Author: David Rowley <drowley@postgresql.org>
Date: Mon Mar 25 14:31:14 2024 +1300

Allow planner to use Merge Append to efficiently implement UNION

I've not looked at the patch, but somehow it's totally forgetting
what it's supposed to be unique-ifying on.

regards, tom lane

David Rowley

dgrowleyml@gmail.com

almost 2 years ago

In reply to: hubert depesz lubaczewski (#1)

Re: UNION removes almost all rows (not duplicates) - in fresh build of pg17!

On Tue, 21 May 2024 at 04:00, hubert depesz lubaczewski
<depesz@depesz.com> wrote:

$ explain (analyze) select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' );
QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Aggregate (cost=42.43..42.44 rows=1 width=8) (actual time=0.108..0.109 rows=1 loops=1)
-> Unique (cost=0.00..39.53 rows=232 width=236) (actual time=0.013..0.105 rows=1 loops=1)
-> Append (cost=0.00..39.53 rows=232 width=236) (actual time=0.012..0.099 rows=232 loops=1)
-> Seq Scan on pg_class (cost=0.00..19.19 rows=68 width=263) (actual time=0.012..0.058 rows=68 loops=1)
Filter: (relkind = 'r'::"char")
Rows Removed by Filter: 348
-> Seq Scan on pg_class pg_class_1 (cost=0.00..19.19 rows=164 width=263) (actual time=0.002..0.030 rows=164 loops=1)
Filter: (relkind = 'i'::"char")
Rows Removed by Filter: 252

Thanks for the report.

It looks like it's a very simple fix. The problem is I wasn't setting
groupList when the grouping_is_sortable() returned false. In the
prior version, make_union_unique() always set that.

The attached should apply cleanly up to d2a04470a.

David

hubert depesz lubaczewski

depesz@depesz.com

almost 2 years ago

In reply to: David Rowley (#3)

Re: UNION removes almost all rows (not duplicates) - in fresh build of pg17!

On Tue, May 21, 2024 at 09:54:25AM +1200, David Rowley wrote:

Thanks for the report.
It looks like it's a very simple fix. The problem is I wasn't setting
groupList when the grouping_is_sortable() returned false. In the
prior version, make_union_unique() always set that.
The attached should apply cleanly up to d2a04470a.

Hi,
tested, and can confirm that applying this patch on top of
d2a04470aa6401c1938cc107e0b2c56c22a2321f solved the problem.

Best regards,

depesz

UNION removes almost all rows (not duplicates) - in fresh build of pg17!

Attachments: