PgSQL 15.3: Execution plan not using index as expected

Started by Dürr Softwareover 2 years ago6 messagesgeneral
Jump to latest
#1Dürr Software
info@fduerr.de

Dear list,

i have a strange problem when migrating a DB from version 9.3.4 to 15.3:
An index which seems perfect for the query and is used in 9.3.4 as
expected is not used in 15.3.
I just wonder, whether the bug is on my side or on PgSQL's..
The details:
===========================================================================
PostgreSQL 9.3.4:

test=# select version();
version
----------------------------------------------------------------------------------------------
 PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.8.3-2) 4.8.3, 64-bit
(1 row)

test=> \d client_session
                                            Table "client_session"
    Column     |              Type |                             Modifiers
---------------+--------------------------------+-------------------------------------------------------------------
 id            | bigint                         | not null default
nextval('client_session_id_seq'::regclass)
 tstamp_start  | timestamp(3) without time zone | not null default now()
 permit_id     | character varying(63)          | not null default
"current_user"()
 user_id       | character varying(63)          | not null default
"session_user"()
Indexes:
    "client_session_pkey" PRIMARY KEY, btree (id)
    "client_session_user_id_idx" btree (user_id, tstamp_start DESC)

vdws=# explain analyze SELECT permit_id FROM client_session WHERE
user_id=SESSION_USER::VARCHAR ORDER BY tstamp_start DESC LIMIT 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..2.37 rows=1 width=23) (actual time=0.134..0.134
rows=1 loops=1)
   ->  Index Scan using client_session_user_id_idx on client_session 
(cost=0.57..52337.99 rows=29181 width=23) (actual time=0.133..0.133
rows=1 loops=1)
         Index Cond: ((user_id)::text = (("session_user"())::character
varying)::text)
 Total runtime: 0.165 ms
(4 rows)

===========================================================================
PostgreSQL 15.3:

test=# select version();
version
---------------------------------------------------------------------------------------------------------------------
 PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 Zeile)

test=# \d client_session
                                                  Tabelle »client_session«
    Spalte     |              Typ               | Sortierfolge | NULL
erlaubt? |                   Vorgabewert
---------------+--------------------------------+--------------+---------------+--------------------------------------------------
 id            | bigint                         |              | not
null      | nextval('client_session_id_seq'::regclass)
 tstamp_start  | timestamp(3) without time zone |              | not
null      | now()
 permit_id     | character varying(63)          |              | not
null      | "current_user"()
 user_id       | character varying(63)          |              | not
null      | "session_user"()
Indexe:
    "client_session_pkey" PRIMARY KEY, btree (id)
    "client_session_user_id_idx" btree (user_id, tstamp_start DESC)

test=# explain analyze SELECT permit_id FROM client_session WHERE
user_id=SESSION_USER::VARCHAR ORDER BY tstamp_start DESC LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3778568.38..3778568.50 rows=1 width=152) (actual
time=8431.320..8437.169 rows=1 loops=1)
   ->  Gather Merge  (cost=3778568.38..3853392.64 rows=641306
width=152) (actual time=8383.774..8389.622 rows=1 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=3777568.36..3778369.99 rows=320653 width=152)
(actual time=8372.263..8372.263 rows=0 loops=3)
               Sort Key: tstamp_start DESC
               Sort Method: quicksort  Memory: 25kB
               Worker 0:  Sort Method: quicksort  Memory: 25kB
               Worker 1:  Sort Method: quicksort  Memory: 25kB
               ->  Parallel Seq Scan on client_session
(cost=0.00..3775965.09 rows=320653 width=152) (actual
time=6150.412..8372.191 rows=1 loops=3)
                     Filter: ((user_id)::text =
((SESSION_USER)::character varying)::text)
                     Rows Removed by Filter: 51303778
 Planning Time: 0.203 ms
 JIT:
   Functions: 13
   Options: Inlining true, Optimization true, Expressions true,
Deforming true
   Timing: Generation 1.644 ms, Inlining 120.073 ms, Optimization
70.361 ms, Emission 28.476 ms, Total 220.554 ms
 Execution Time: 8438.307 ms
(18 rows)

Thanks a lot for your help

--
======================================
Dürr Software Entw.
Guggenberg 26, DE-82380 Peißenberg
fon: +49-8803-4899016 fax: +49-8803-4899017
info@fduerr.de

Attachments:

info.vcftext/vcard; charset=UTF-8; name=info.vcfDownload
#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Dürr Software (#1)
Re: PgSQL 15.3: Execution plan not using index as expected

On 8/9/23 01:14, Dürr Software wrote:

Dear list,

i have a strange problem when migrating a DB from version 9.3.4 to 15.3:
An index which seems perfect for the query and is used in 9.3.4 as
expected is not used in 15.3.

Did you run ANALYZE on the 15.3 database after the migration?

--
Adrian Klaver
adrian.klaver@aklaver.com

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Dürr Software (#1)
Re: PgSQL 15.3: Execution plan not using index as expected

On 8/11/23 03:11, Dürr Software wrote:

Please reply to list also
Ccing list

Dear Adrian,

thanks for the reply. Of course i ran ANALYZE on the 15.3 system, its in
the second part of my post, but here again, FYI:

That is EXPLAIN ANALYZE where it is an option to the command:

https://www.postgresql.org/docs/current/sql-explain.html

ANALYZE

Carry out the command and show actual run times and other
statistics. This parameter defaults to FALSE.

What I was talking about was the ANALYZE command:

https://www.postgresql.org/docs/current/sql-analyze.html

ANALYZE collects statistics about the contents of tables in the
database, and stores the results in the pg_statistic system catalog.
Subsequently, the query planner uses these statistics to help determine
the most efficient execution plans for queries.

test=# \d client_session
                                                 Tabelle »client_session«
   Spalte     |              Typ               | Sortierfolge | NULL
erlaubt? |                   Vorgabewert
---------------+--------------------------------+--------------+---------------+--------------------------------------------------
id            | bigint                         |              | not null
     | nextval('admin.client_session_id_seq'::regclass)
tstamp_start  | timestamp(3) without time zone |              | not null
     | now()
permit_id     | character varying(63)          |              | not null
     | "current_user"()
user_id       | character varying(63)          |              | not null
     | "session_user"()

Indexe:
    "client_session_pkey" PRIMARY KEY, btree (id)
    "client_session_user_id_idx" btree (user_id, tstamp_start DESC)

test=# explain analyze SELECT permit_id FROM client_session WHERE
user_id::character varying(63)=SESSION_USER::character varying(63) ORDER
BY tstamp_start DESC LIMIT 1;
                                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=2852336.36..2852336.48 rows=1 width=23) (actual
time=5994.540..6000.702 rows=1 loops=1)
  ->  Gather Merge  (cost=2852336.36..2852697.59 rows=3096 width=23)
(actual time=5946.422..5952.583 rows=1 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Sort  (cost=2851336.34..2851340.21 rows=1548 width=23)
(actual time=5934.963..5934.964 rows=1 loops=3)
              Sort Key: tstamp_start DESC
              Sort Method: quicksort  Memory: 25kB
              Worker 0:  Sort Method: quicksort  Memory: 25kB
              Worker 1:  Sort Method: quicksort  Memory: 25kB
              ->  Parallel Seq Scan on client_session
 (cost=0.00..2851328.60 rows=1548 width=23) (actual
time=3885.774..5934.915 rows=1 loops=3)
                    Filter: ((user_id)::text =
((SESSION_USER)::character varying(63))::text)
                    Rows Removed by Filter: 37163374
Planning Time: 0.167 ms
JIT:
  Functions: 13
  Options: Inlining true, Optimization true, Expressions true,
Deforming true
  Timing: Generation 0.940 ms, Inlining 119.027 ms, Optimization 79.333
ms, Emission 29.624 ms, Total 228.924 ms
Execution Time: 6001.014 ms
(18 Zeilen)

Funny thing: if i create an index on tstamp_start alone, it is used just
perfectly:

Indexe:
    "client_session_pkey" PRIMARY KEY, btree (id)
    "client_session_tstamp_start" btree (tstamp_start)
    "client_session_user_id_idx" btree (user_id, tstamp_start DESC)

test=# explain analyze SELECT permit_id FROM admin.client_session WHERE
user_id::character varying(63)=SESSION_USER::character varying(63) ORDER
BY tstamp_start DESC LIMIT 1;

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..1787.85 rows=1 width=23) (actual time=0.721..0.723
rows=1 loops=1)
   ->  Index Scan Backward using client_session_tstamp_start on
client_session  (cost=0.57..6639766.39 rows=3715 width=23) (actual
time=0.719..0.719 rows=1 loops=1)
         Filter: ((user_id)::text = ((SESSION_USER)::character
varying(63))::text)
 Planning Time: 0.227 ms
 Execution Time: 0.761 ms
(5 Zeilen)

======================================
Dürr Software Entw.
Guggenberg 26, DE-82380 Peißenberg
fon: +49-8803-4899016 fax: +49-8803-4899017
info@fduerr.de

Am 10.08.23 um 16:41 schrieb Adrian Klaver:

On 8/9/23 01:14, Dürr Software wrote:

Dear list,

i have a strange problem when migrating a DB from version 9.3.4 to 15.3:
An index which seems perfect for the query and is used in 9.3.4 as
expected is not used in 15.3.

Did you run ANALYZE on the 15.3 database after the migration?

--
Adrian Klaver
adrian.klaver@aklaver.com

#4rob stone
floriparob@tpg.com.au
In reply to: Dürr Software (#1)
Re: PgSQL 15.3: Execution plan not using index as expected

Hello,

---------------------------------------------------------------------
-------------------------
  PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc
(Debian
4.8.3-2) 4.8.3, 64-bit

---------------------------------------------------------------------
------------------------------------------------
  PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 Zeile)

9.3 plan
-> Index Scan using client_session_user_id_idx on client_session

Looks like a collation issue given the difference in compilers used.
In the 9.3 plan the index is used.
Maybe try a reindex of the table.

HTH,
Rob

#5Dürr Software
info@fduerr.de
In reply to: Adrian Klaver (#3)
Re: PgSQL 15.3: Execution plan not using index as expected

Thanks Adrian,

sorry for the misunderstanding.
I ran ANALYZE, it didn't change a thing (as expected).
Anyway, I pinned the problem down now: It's the use of CURRENT_USER (or
SESSION_USER etc.) in the WHERE condition.
If i replace it with 'postgres' (the result of CURRENT_USER) the planner
works as expected..
The old 9.x - version of PgSQL didn't have that problem.

Test case:

-- our test table with index on user_id
CREATE TABLE tt (
 user_id VARCHAR(63) NOT NULL DEFAULT SESSION_USER
);
CREATE INDEX tt_user_id_idx ON tt(user_id);

-- fill with test data
INSERT INTO tt(user_id) select 'U' || i from generate_series(1,100000) as i;
INSERT INTO tt(user_id) select SESSION_USER from generate_series(1,100);

-- query using CURRENT_USER as WHERE-condition - doesn't use index
EXPLAIN ANALYZE SELECT * FROM tt WHERE user_id::character
varying(63)=CURRENT_USER::character varying(63) LIMIT 1;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..21.65 rows=1 width=6) (actual time=18.143..18.143
rows=1 loops=1)
   ->  Seq Scan on tt  (cost=0.00..2446.00 rows=113 width=6) (actual
time=18.141..18.141 rows=1 loops=1)
         Filter: ((user_id)::text = ((CURRENT_USER)::character
varying(63))::text)
         Rows Removed by Filter: 100000
 Planning Time: 0.154 ms
 Execution Time: 18.163 ms
(6 Zeilen)

SELECT CURRENT_USER;
 current_user
--------------
 postgres
(1 Zeile)

-- query using result of CURRENT_USER as WHERE-condition - uses index
EXPLAIN ANALYZE SELECT * FROM tt WHERE user_id::character
varying(63)='postgres'::character varying(63) LIMIT 1;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..0.47 rows=1 width=6) (actual time=0.018..0.019
rows=1 loops=1)
   ->  Index Only Scan using tt_user_id_idx on tt (cost=0.42..6.39
rows=113 width=6) (actual time=0.017..0.018 rows=1 loops=1)
         Index Cond: (user_id = 'postgres'::text)
         Heap Fetches: 0
 Planning Time: 0.081 ms
 Execution Time: 0.026 ms
(6 Zeilen)

-- CURRENT_USER is not expensive..
EXPLAIN ANALYZE SELECT CURRENT_USER;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=64) (actual time=0.005..0.006
rows=1 loops=1)
 Planning Time: 0.031 ms
 Execution Time: 0.025 ms
(3 Zeilen)

I hope that this should clarify the problem.
Thanks and kind regards

======================================
Dürr Software Entw.
info@fduerr.de

Am 11.08.23 um 16:32 schrieb Adrian Klaver:

Show quoted text

On 8/11/23 03:11, Dürr Software wrote:

Please reply to list also
Ccing list

Dear Adrian,

thanks for the reply. Of course i ran ANALYZE on the 15.3 system, its
in the second part of my post, but here again, FYI:

That is EXPLAIN ANALYZE where it is an option to the command:

https://www.postgresql.org/docs/current/sql-explain.html

ANALYZE

    Carry out the command and show actual run times and other
statistics. This parameter defaults to FALSE.

What I was talking about was the ANALYZE command:

https://www.postgresql.org/docs/current/sql-analyze.html

ANALYZE collects statistics about the contents of tables in the
database, and stores the results in the pg_statistic system catalog.
Subsequently, the query planner uses these statistics to help
determine the most efficient execution plans for queries.

test=# \d client_session
                                                  Tabelle
»client_session«
    Spalte     |              Typ               | Sortierfolge | NULL
erlaubt? |                   Vorgabewert
---------------+--------------------------------+--------------+---------------+--------------------------------------------------

id            | bigint                         |              | not
null       | nextval('admin.client_session_id_seq'::regclass)
tstamp_start  | timestamp(3) without time zone |              | not
null       | now()
permit_id     | character varying(63)          |              | not
null       | "current_user"()
user_id       | character varying(63)          |              | not
null       | "session_user"()

Indexe:
     "client_session_pkey" PRIMARY KEY, btree (id)
     "client_session_user_id_idx" btree (user_id, tstamp_start DESC)

test=# explain analyze SELECT permit_id FROM client_session WHERE
user_id::character varying(63)=SESSION_USER::character varying(63)
ORDER BY tstamp_start DESC LIMIT 1;
                                                                    QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------

Limit  (cost=2852336.36..2852336.48 rows=1 width=23) (actual
time=5994.540..6000.702 rows=1 loops=1)
   ->  Gather Merge  (cost=2852336.36..2852697.59 rows=3096 width=23)
(actual time=5946.422..5952.583 rows=1 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=2851336.34..2851340.21 rows=1548 width=23)
(actual time=5934.963..5934.964 rows=1 loops=3)
               Sort Key: tstamp_start DESC
               Sort Method: quicksort  Memory: 25kB
               Worker 0:  Sort Method: quicksort  Memory: 25kB
               Worker 1:  Sort Method: quicksort  Memory: 25kB
               ->  Parallel Seq Scan on client_session
  (cost=0.00..2851328.60 rows=1548 width=23) (actual
time=3885.774..5934.915 rows=1 loops=3)
                     Filter: ((user_id)::text =
((SESSION_USER)::character varying(63))::text)
                     Rows Removed by Filter: 37163374
Planning Time: 0.167 ms
JIT:
   Functions: 13
   Options: Inlining true, Optimization true, Expressions true,
Deforming true
   Timing: Generation 0.940 ms, Inlining 119.027 ms, Optimization
79.333 ms, Emission 29.624 ms, Total 228.924 ms
Execution Time: 6001.014 ms
(18 Zeilen)

Funny thing: if i create an index on tstamp_start alone, it is used
just perfectly:

Indexe:
     "client_session_pkey" PRIMARY KEY, btree (id)
     "client_session_tstamp_start" btree (tstamp_start)
     "client_session_user_id_idx" btree (user_id, tstamp_start DESC)

test=# explain analyze SELECT permit_id FROM admin.client_session
WHERE user_id::character varying(63)=SESSION_USER::character
varying(63) ORDER BY tstamp_start DESC LIMIT 1;

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

  Limit  (cost=0.57..1787.85 rows=1 width=23) (actual
time=0.721..0.723 rows=1 loops=1)
    ->  Index Scan Backward using client_session_tstamp_start on
client_session (cost=0.57..6639766.39 rows=3715 width=23) (actual
time=0.719..0.719 rows=1 loops=1)
          Filter: ((user_id)::text = ((SESSION_USER)::character
varying(63))::text)
  Planning Time: 0.227 ms
  Execution Time: 0.761 ms
(5 Zeilen)

======================================
Dürr Software Entw.
info@fduerr.de

Am 10.08.23 um 16:41 schrieb Adrian Klaver:

On 8/9/23 01:14, Dürr Software wrote:

Dear list,

i have a strange problem when migrating a DB from version 9.3.4 to
15.3:
An index which seems perfect for the query and is used in 9.3.4 as
expected is not used in 15.3.

Did you run ANALYZE on the 15.3 database after the migration?

Attachments:

info.vcftext/vcard; charset=UTF-8; name=info.vcfDownload
#6Dürr Software
info@fduerr.de
In reply to: rob stone (#4)
Re: PgSQL 15.3: Execution plan not using index as expected

Thanks Rob,

no it's not a problem with the index. It's a problem with the use of
CURRENT_USER in the WHERE
I submitted a new post on this matter with a test case.

Kind regards

======================================
Dürr Software Entw.
Guggenberg 26, DE-82380 Peißenberg
fon: +49-8803-4899016 fax: +49-8803-4899017
info@fduerr.de

Am 12.08.23 um 04:25 schrieb rob stone:

Show quoted text

Hello,

---------------------------------------------------------------------
-------------------------
  PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc
(Debian
4.8.3-2) 4.8.3, 64-bit

---------------------------------------------------------------------
------------------------------------------------
  PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 Zeile)

9.3 plan
-> Index Scan using client_session_user_id_idx on client_session

Looks like a collation issue given the difference in compilers used.
In the 9.3 plan the index is used.
Maybe try a reindex of the table.

HTH,
Rob

Attachments:

info.vcftext/vcard; charset=UTF-8; name=info.vcfDownload