UNION removes almost all rows (not duplicates) - in fresh build of pg17!

Started by hubert depesz lubaczewskialmost 2 years ago4 messagesbugs
Jump to latest

So, test case is trivial:

#v+
$ select count(*) from pg_class where relkind = 'r';
count
───────
68
(1 row)

$ select count(*) from pg_class where relkind = 'i';
count
───────
164
(1 row)

$ select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' );
count
───────
1
(1 row)
#v-

explain shows unexpected:

#v+
$ explain (analyze) select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' );
QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Aggregate (cost=42.43..42.44 rows=1 width=8) (actual time=0.108..0.109 rows=1 loops=1)
-> Unique (cost=0.00..39.53 rows=232 width=236) (actual time=0.013..0.105 rows=1 loops=1)
-> Append (cost=0.00..39.53 rows=232 width=236) (actual time=0.012..0.099 rows=232 loops=1)
-> Seq Scan on pg_class (cost=0.00..19.19 rows=68 width=263) (actual time=0.012..0.058 rows=68 loops=1)
Filter: (relkind = 'r'::"char")
Rows Removed by Filter: 348
-> Seq Scan on pg_class pg_class_1 (cost=0.00..19.19 rows=164 width=263) (actual time=0.002..0.030 rows=164 loops=1)
Filter: (relkind = 'i'::"char")
Rows Removed by Filter: 252
Planning Time: 0.161 ms
Execution Time: 0.129 ms
(11 rows)
#v-

It seems that I get 232 correct rows from Append, but then Unique
removes all of them, except for one?

For whatever it's worth returned row is:

#v+
$ select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' \gx
─[ RECORD 1 ]───────┬───────────────────────
oid │ 2619
relname │ pg_statistic
relnamespace │ 11
reltype │ 10029
reloftype │ 0
relowner │ 10
relam │ 2
relfilenode │ 2619
reltablespace │ 0
relpages │ 19
reltuples │ 410
relallvisible │ 19
reltoastrelid │ 2840
relhasindex │ t
relisshared │ f
relpersistence │ p
relkind │ r
relnatts │ 31
relchecks │ 0
relhasrules │ f
relhastriggers │ f
relhassubclass │ f
relrowsecurity │ f
relforcerowsecurity │ f
relispopulated │ t
relreplident │ n
relispartition │ f
relrewrite │ 0
relfrozenxid │ 730
relminmxid │ 1
relacl │ {pgdba=arwdDxtm/pgdba}
reloptions │ [null]
relpartbound │ [null]
#v-

As for test environment:

Debian testing on amd64, self-compiled from git HEAD at d2a04470aa6401c1938cc107e0b2c56c22a2321f

Did I do something wrong?

Best regards,

depesz

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: hubert depesz lubaczewski (#1)
Re: UNION removes almost all rows (not duplicates) - in fresh build of pg17!

hubert depesz lubaczewski <depesz@depesz.com> writes:

$ select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' );
count
───────
1
(1 row)

Ugh, that is awful. Bisecting shows it broke at

commit 66c0185a3d14bbbf51d0fc9d267093ffec735231
Author: David Rowley <drowley@postgresql.org>
Date: Mon Mar 25 14:31:14 2024 +1300

Allow planner to use Merge Append to efficiently implement UNION

I've not looked at the patch, but somehow it's totally forgetting
what it's supposed to be unique-ifying on.

regards, tom lane

#3David Rowley
dgrowleyml@gmail.com
In reply to: hubert depesz lubaczewski (#1)
Re: UNION removes almost all rows (not duplicates) - in fresh build of pg17!

On Tue, 21 May 2024 at 04:00, hubert depesz lubaczewski
<depesz@depesz.com> wrote:

$ explain (analyze) select count(*) from ( select * from pg_class where relkind = 'r' union select * from pg_class where relkind = 'i' );
QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Aggregate (cost=42.43..42.44 rows=1 width=8) (actual time=0.108..0.109 rows=1 loops=1)
-> Unique (cost=0.00..39.53 rows=232 width=236) (actual time=0.013..0.105 rows=1 loops=1)
-> Append (cost=0.00..39.53 rows=232 width=236) (actual time=0.012..0.099 rows=232 loops=1)
-> Seq Scan on pg_class (cost=0.00..19.19 rows=68 width=263) (actual time=0.012..0.058 rows=68 loops=1)
Filter: (relkind = 'r'::"char")
Rows Removed by Filter: 348
-> Seq Scan on pg_class pg_class_1 (cost=0.00..19.19 rows=164 width=263) (actual time=0.002..0.030 rows=164 loops=1)
Filter: (relkind = 'i'::"char")
Rows Removed by Filter: 252

Thanks for the report.

It looks like it's a very simple fix. The problem is I wasn't setting
groupList when the grouping_is_sortable() returned false. In the
prior version, make_union_unique() always set that.

The attached should apply cleanly up to d2a04470a.

David

Attachments:

fix_union_planning.patchapplication/octet-stream; name=fix_union_planning.patchDownload+10-7
In reply to: David Rowley (#3)
Re: UNION removes almost all rows (not duplicates) - in fresh build of pg17!

On Tue, May 21, 2024 at 09:54:25AM +1200, David Rowley wrote:

Thanks for the report.
It looks like it's a very simple fix. The problem is I wasn't setting
groupList when the grouping_is_sortable() returned false. In the
prior version, make_union_unique() always set that.
The attached should apply cleanly up to d2a04470a.

Hi,
tested, and can confirm that applying this patch on top of
d2a04470aa6401c1938cc107e0b2c56c22a2321f solved the problem.

Best regards,

depesz