Possible regression in PG18 beta1
Dear all,
I was testing PG18 beta 1 new features (notably, AIO and index skip
scan), and I came up with this example:
===========
drop table if exists t;
create TEMP table t(i,j,k)
as select n,n,n
from generate_series(1,10_000_000) as n;
analyze t;
create index on t(i,j,k);
explain (analyze,buffers,costs off,timing off)
select * from t where k = 1;
===========
On my laptop (MacBook Air M3), Postgres 17.5 runs the query over 2x faster
than 18 beta1. The former uses index-only scan, while the latter uses seq
scan:
=== PG 17.5 =====
QUERY PLAN
----------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Buffers: local hit=1 read=38317
Planning Time: 0.315 ms
Execution Time: 242.711 ms
(6 rows)
=== PG 18 beta 1 =====
QUERY PLAN
------------------------------------------
Seq Scan on t (actual rows=1.00 loops=1)
Filter: (k = 1)
Rows Removed by Filter: 9999999
Buffers: local hit=1 read=54079
Planning Time: 0.115 ms
Execution Time: 520.465 ms
(6 rows)
If I turn off enable_seqscan on the latter, it uses the index-only
scan, with query time similar to PG17 (yet a tad slower):
=== PG 18 beta 1, with enable_seqscan disabled =====
QUERY PLAN
-------------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Index Searches: 1
Buffers: local hit=1 read=38317
Planning Time: 0.200 ms
Execution Time: 281.742 ms
(7 rows)
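For reference, disabling the seq scan above is just the usual session
setting (exact statement assumed):
```
set enable_seqscan = off;
explain (analyze,buffers,costs off,timing off)
select * from t where k = 1;
```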
* Also, I noticed "explain analyze" reports actual rows as a decimal number
(1.00) as opposed to an integer (1); not sure if that's intentional.
* Changing the table from TEMP to UNLOGGED makes both versions use
"Parallel Seq Scan", with PG18 being ~25% faster (190ms vs. 150ms); the
UNLOGGED setup is shown below.
Best Regards,
Sadeq Dousti
Hi,
You're right: adding SETTINGS to EXPLAIN revealed that PG17 had
random_page_cost = 1.1 configured.
Setting the same value on PG18 switches the plan to the index-only scan;
however, PG18 is still somewhat (~15%) slower. Both plans follow.
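The change amounts to something like the following (shown here as a
session-level SET; ALTER SYSTEM plus a reload would work as well):
```
set random_page_cost = 1.1;
explain (analyze, buffers, settings, costs off, timing off)
select * from t where k = 1;
```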
============ PG17.5 =================
QUERY PLAN
----------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Buffers: local read=38318 written=424
Settings: random_page_cost = '1.1'
Planning:
Buffers: shared hit=29, local read=1 written=1
Planning Time: 0.098 ms
Execution Time: 137.209 ms
(9 rows)
============ PG18 Beta 1 =================
QUERY PLAN
-------------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Index Searches: 1
Buffers: local read=38318 written=443
Settings: random_page_cost = '1.1'
Planning:
Buffers: shared hit=30, local read=1
Planning Time: 0.097 ms
Execution Time: 160.595 ms
(10 rows)
So, one mystery solved (no planner regression), still three questions:
* Somewhat slower execution in PG18
* Planner differences in TEMP vs. UNLOGGED
* Actual rows with decimal (1.00) vs. integer (1)
Best Regards,
Sadeq Dousti
On Sat, May 17, 2025 at 6:18 PM Christophe Courtois
<christophe.courtois@dalibo.com> wrote:
Hi,
- I cannot reproduce your regression; I always get a Seq Scan (PG 13,
17, 18, default config, latest versions). I suggest that you add a
SETTINGS clause in EXPLAIN, and run \d+ and \di+ beforehand, just in case.
- I see the difference between TEMP and UNLOGGED too (since PG13);
it is funny, but I have no explanation. Something to do with the access
to shared buffers, I suppose.
Yours,
Christophe Courtois
On Sat, May 17, 2025 at 9:38 AM Sadeq Dousti <msdousti@gmail.com> wrote:
So, one mystery solved (no planner regression), still three questions:
* Somewhat slower execution in PG18
* Planner differences in TEMP vs. UNLOGGED
* Actual rows with decimal (1.00) vs. integer (1)
Regarding the last one, I believe that's intentional, due to a change to
EXPLAIN output [1]. No comment on the others.
Thanks,
Maciek
[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=95dbd827f2edc4d10bebd7e840a0bd6782cf69b7
* Also, I noticed "explain analyze" reports actual rows as a decimal
number (1.00) as opposed to an integer (1); not sure if that's intentional.
Yes, this is intentional as of 95dbd827f2edc
* Changing the table from TEMP to UNLOGGED makes both versions use
"Parallel Seq Scan", with PG18 being ~25% faster. (190ms vs. 150ms).
I'll be curious about tests with a normal table as well with a
sufficiently large
shared_buffers.
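(To be clear, I mean a change along these lines; the 4GB value is only an
example, and it needs a server restart to take effect:)
```
alter system set shared_buffers = '4GB';
-- after restarting the server, verify:
show shared_buffers;
```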
--
Sami Imseih
Amazon Web Services (AWS)
Yes, this is intentional as of 95dbd827f2edc
Thanks!
* Changing the table from TEMP to UNLOGGED makes both versions use
"Parallel Seq Scan", with PG18 being ~25% faster (190ms vs. 150ms).
I'll be curious about tests with a normal table as well with a
sufficiently large shared_buffers.
Here are results for a normal table with default shared_buffers (128 MB)
and large shared_buffers (4GB):
==== PG17.5, shared_buffers=128MB ======
QUERY PLAN
------------------------------------------------------
Gather (actual rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=3584 read=50496
-> Parallel Seq Scan on t (actual rows=0 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=3584 read=50496
Settings: random_page_cost = '1.1'
Planning Time: 0.119 ms
Execution Time: 141.374 ms
(11 rows)
==== PG17.5, shared_buffers=4GB ======
select * from t where k = 1;
QUERY PLAN
------------------------------------------------------
Gather (actual rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=3872 read=50208
-> Parallel Seq Scan on t (actual rows=0 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=3872 read=50208
Settings: random_page_cost = '1.1'
Planning Time: 0.118 ms
Execution Time: 141.846 ms
(11 rows)
==== PG18 beta1, shared_buffers=128MB ======
QUERY PLAN
---------------------------------------------------------
Gather (actual rows=1.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=15986 read=38094
-> Parallel Seq Scan on t (actual rows=0.33 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=15986 read=38094
Settings: random_page_cost = '1.1'
Planning Time: 0.123 ms
Execution Time: 110.650 ms
(11 rows)
==== PG18 beta1, shared_buffers=4GB ======
QUERY PLAN
---------------------------------------------------------
Gather (actual rows=1.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=15991 read=38089
-> Parallel Seq Scan on t (actual rows=0.33 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=15991 read=38089
Settings: random_page_cost = '1.1'
Planning Time: 0.095 ms
Execution Time: 109.073 ms
(11 rows)
Best Regards,
Sadeq Dousti
I'll be curious about tests with a normal table as well with a
sufficiently large shared_buffers.
Here are results for a normal table with default shared_buffers (128 MB) and large shared_buffers (4GB):
Thanks. I don't see a regression for a normal table, at least for this test.
In terms of your original test, I tried it out on my Ubuntu machine
and with your test as-is, I see 2.8 seconds on 17.5 and 3.3 seconds
on HEAD if the plan performs a seq scan without parallelism.
However, the test as you have it is indexing all columns
on the table. If I just index on the filtered column
```
create index on t(k);
explain (analyze,buffers,costs off,timing off)
select k from t where k = 1;
```
I see similar behavior between HEAD
```
test=# explain (analyze,buffers,costs off,timing off)
select k from t where k = 1;
QUERY PLAN
---------------------------------------------------------------
Index Only Scan using t_k_idx on t (actual rows=1.00 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Index Searches: 1
Buffers: local hit=4
Planning Time: 0.088 ms
Execution Time: 0.059 ms
```
and 17.5
```
test=# explain (analyze,buffers,costs off,timing off)
select k from t where k = 1;
QUERY PLAN
------------------------------------------------------------
Index Only Scan using t_k_idx on t (actual rows=1 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Buffers: local hit=4
Planning Time: 0.084 ms
Execution Time: 0.053 ms
(6 rows)
```
--
Sami
Thanks. I don't see a regression for a normal table, at least for this test.
No, there isn't. I just added them as per your request ;)
In terms of your original test, I tried it out on my Ubuntu machine
and with your test as-is, I see 2.8 seconds on 17.5 and 3.3 seconds
on HEAD if the plan performs a seq scan without parallelism.
Which is unexpected, no?
However, the test as you have it is indexing all columns
on the table. If I just index on the filtered column
Yes, I agree. Changing the indexing setup will diminish the difference.
However, given the sub-optimal index, PG18 seems to be slower.
Also, there's a meaningful difference in the plans for TEMP table vs.
UNLOGGED, which is interesting.
Best regards,
Sadeq Dousti
On Sat, May 17, 2025 at 12:38 PM Sadeq Dousti <msdousti@gmail.com> wrote:
So, one mystery solved (no planner regression), still three questions:
* Somewhat slower execution in PG18
I cannot recreate the problem.
The fastest plan for this query is a parallel sequential scan -- the
t_i_j_k_idx index is useless. The only reason it's used for an
index-only scan when random_page_cost is reduced to 1.1 is because it
happens to have no fragmentation (meaning the index itself can be read
in physical order).
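(One way to verify the lack of fragmentation, assuming the pgstattuple
extension is available, is something like:)
```
-- leaf_fragmentation near 0 means the leaf pages are in physical order
create extension if not exists pgstattuple;
select leaf_fragmentation from pgstatindex('t_i_j_k_idx');
```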
Even if I force an index-only scan, I still see no significant
performance differences when I compare REL_17_STABLE to master/18.
Here's what I see on master/18:
regression=# explain (analyze,buffers,costs off,timing off)
select * from t where k = 1;
QUERY PLAN
-------------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1)
Index Cond: (k = 1)
Heap Fetches: 0
Index Searches: 1
Buffers: shared hit=38318
Planning Time: 0.025 ms
Execution Time: 152.502 ms
(7 rows)
Here's the same query on current REL_17_STABLE:
regression=# explain (analyze,buffers,costs off,timing off)
select * from t where k = 1;
QUERY PLAN
----------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1)
Index Cond: (k = 1)
Heap Fetches: 0
Buffers: shared hit=38318
Planning Time: 0.042 ms
Execution Time: 155.890 ms
(6 rows)
I used a regular/logged table for this. I repeatedly executed the
query, to minimize noise (I'm showing the last execution for each of
master and REL_17_STABLE).
For what it's worth, there *are* known regressions with more
complicated cases, most of which involve multiple inequality/range
conditions on multiple high cardinality columns. This was discussed
extensively over the course of work on skip scan -- they were deemed
acceptable. But I see no reason to believe that this particular query
will run any slower on 18.
FWIW, similar queries that don't have to scan the full index (which
makes an index-only scan attractive to the planner) are much faster on
Postgres 18, compared to 17. Here's one executed on 18:
regression=# explain (analyze,buffers,costs off,timing off)
select * from t where i between 1 and 500_000 and k = 1;
QUERY PLAN
-------------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1)
Index Cond: ((i >= 1) AND (i <= 500000) AND (k = 1))
Heap Fetches: 0
Index Searches: 1
Buffers: shared hit=1919
Planning:
Buffers: shared hit=4
Planning Time: 0.039 ms
Execution Time: 7.141 ms
(9 rows)
Here's the same query executed on Postgres 17:
regression=# explain (analyze,buffers,costs off,timing off)
select * from t where i between 1 and 500_000 and k = 1;
QUERY PLAN
----------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1)
Index Cond: ((i >= 1) AND (i <= 500000) AND (k = 1))
Heap Fetches: 0
Buffers: shared hit=1919
Planning:
Buffers: shared hit=4
Planning Time: 0.039 ms
Execution Time: 10.771 ms
(8 rows)
--
Peter Geoghegan
On Sat, May 17, 2025 at 2:00 PM Sadeq Dousti <msdousti@gmail.com> wrote:
thanks. I don't see regression for a normal table, at least for this test.
No, there isn't. I just added them as per your request ;)
In terms of your original test, I tried it out on my Ubuntu machine
and with your test as-is, I see 2.8 seconds on 17.5 and 3.3 seconds
on HEAD if the plan performs a seq scan without parallelism.
Which is unexpected, no?
For the temp table test, it seems like I can attribute almost all of the
extra time to the fact that checksums are enabled by default in 18,
due to 04bec894a04c.
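For anyone reproducing, a quick way to check whether a given cluster has
data checksums enabled (this is a read-only setting):
```
show data_checksums;
```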
I ran the below script which runs the select 100 times against
the temp table on HEAD
```
drop table if exists t;
create TEMP table t(i,j,k)
as select n,n,n
from generate_series(1,10_000_000) as n;
analyze t;
create index on t(i,j,k);
SELECT 'select * from t where k = 1;' FROM generate_series(1, 100)
\gexec
```
and looked at perf top while it ran. On a cluster created with initdb
without any flags, pg_checksum_block shows up at the top:
```
12.12% postgres [.] pg_checksum_block
11.97% postgres [.] ExecInterpExpr
10.57% postgres [.] slot_deform_heap_tuple_internal
5.90% postgres [.] fetch_att
4.18% postgres [.] heapgettup_pagemode
```
and explain analyze for a single execution
```
test=# EXPLAIN (ANALYZE, timing on) select * from t where k = 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..179080.00 rows=1 width=12) (actual
time=0.065..3375.125 rows=1.00 loops=1)
Filter: (k = 1)
Rows Removed by Filter: 9999999
Buffers: local read=54080
Planning Time: 0.090 ms
Execution Time: 3375.149 ms
(6 rows)
```
Now, with initdb and --no-data-checksums
```
13.32% postgres [.] ExecInterpExpr
12.44% postgres [.] slot_deform_heap_tuple_internal
6.64% postgres [.] fetch_att
4.70% postgres [.] heapgettup_pagemode
4.22% postgres [.] slot_deform_heap_tuple
3.75% postgres [.] TupleDescCompactAttr
```
and explain for a single execution
```
test=# EXPLAIN (ANALYZE, timing on) select * from t where k = 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..179080.00 rows=1 width=12) (actual
time=0.043..2939.101 rows=1.00 loops=1)
Filter: (k = 1)
Rows Removed by Filter: 9999999
Buffers: local read=54080
Planning Time: 0.087 ms
Execution Time: 2939.125 ms
(6 rows)
```
v18 with --no-data-checksums gives me performance close to v17.
Can you try the same test (with --no-data-checksums) on your Mac
and see if that makes a difference?
The docs [0] say that temporary files are not protected by checksums
("... structures and temporary files are not.").
Is what is observed with temp files being protected by checksums correct?
[0]: https://www.postgresql.org/docs/current/checksums.html
--
Sami
Thanks, very interesting insights!
Can you try the same test (with --no-data-checksums) on your Mac
and see if that makes a difference?
I disabled checksums on PG18, and retried the tests, with and without
modifying random_page_cost, and for TEMP tables only.
When random_page_cost is the default value (=4), both versions do Seq Scan,
and PG18 is indeed a bit faster (313 ms vs. 347 ms).
However, when random_page_cost = 1.1, both versions do Index Only Scan, and
PG 17.5 is faster (135 ms vs. 152 ms).
Another counter-intuitive part is that when I run the explain again, both
versions show slower results (both in the 170-180 ms range).
Using pg_prewarm to load both the table and the index into the shared
buffers does not change anything.
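(Something like the following is what I mean by prewarming; exact
invocation assumed, and the extension has to be installed first:)
```
create extension if not exists pg_prewarm;
select pg_prewarm('t');            -- table
select pg_prewarm('t_i_j_k_idx');  -- index
```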
To summarize:
* The test setup still shows the difference for a TEMP table with
random_page_cost = 1.1; data checksums did not play a role
* Both versions are fast initially and slow down on subsequent runs
* PG 17.5 is faster, but much more noticeably on the first run
I'm now thinking maybe the compilation options for PG 17.5 have been
different. I'm using the default version that comes with Homebrew, but for
PG 18, I compiled it myself. Here are the results for `select version();`
on both:
PostgreSQL 17.5 (Homebrew) on aarch64-apple-darwin24.4.0,
compiled by Apple clang version 17.0.0 (clang-1700.0.13.3), 64-bit
vs.
PostgreSQL 18beta1 on aarch64-darwin, compiled by clang-17.0.0, 64-bit
Best Regards,
Sadeq Dousti
I'm now thinking maybe the compilation options for PG 17.5 have been
different. I'm using the default version that comes with Homebrew, but for
PG 18, I compiled it myself. Here are the results for `select version();`
on both:
PostgreSQL 17.5 (Homebrew) on aarch64-apple-darwin24.4.0,
compiled by Apple clang version 17.0.0 (clang-1700.0.13.3), 64-bit
In my tests I build from source from both the 17 stable and head branches
and use the same build options.
Why temp tables use checksums, which also happens on 17 and presumably
earlier versions, is still not clear to me.
—
Sami
In my tests I build from source from both the 17 stable and head branches
and use the same build options.
OK, I built 17 from REL_17_STABLE, and got the same behavior as the
Homebrew PG17:
* First run: 135 ms
* Subsequent runs: 169 ms, 181 ms, 187 ms, 177 ms, 170 ms
For comparison, with PG18:
* First run: 150 ms
* Subsequent runs: 183 ms, 182 ms, 185 ms, 187 ms, 180 ms
On the first and second runs, PG17 is consistently faster.
It's also strange to me that subsequent runs are consistently slower than
the first run.
Best Regards,
Sadeq Dousti