Question on execution plan and suitable index
Hi,
It's Postgres version 17. We have a critical UI query that runs for ~7 seconds; the requirement is to bring the response time under ~1 second. If I read the plan correctly, the section below consumes a significant amount of resources and should be addressed, i.e. the full scan of table "orders" and the nested loop with the event_audit_log table.
Below is the query and its complete plan:
https://gist.github.com/databasetech0073/f564ac23ee35d1f0413980fe4d00efa9
I am a bit new to indexing strategy in Postgres. My question is: what indexes should we create to address the above?
1) For table event_audit_log: should we create a composite index on (request_id, created_at, event_comment_text), or a covering index on just the two columns (request_id, created_at) with an INCLUDE clause for event_comment_text? How and when should covering indexes be used in Postgres? I'd like to understand this from the experts.
2) Similarly for table orders: should we create a covering index on (entity_id, due_date, order_type) with INCLUDE (firm_dspt_case_id), or just a composite index on (entity_id, due_date, order_type)?
3) Should the column used with a range operator (here created_at or due_date) be the leading column of the composite index, or is it fine to keep it as a non-leading column?
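In DDL terms, the two alternatives I am asking about in question 1 would look like this (index names are just illustrative):

```sql
-- Option A: plain composite index; all three columns are part of the key.
CREATE INDEX eal_req_created_comment_idx
    ON event_audit_log (request_id, created_at, event_comment_text);

-- Option B: covering index; event_comment_text is stored only in the leaf
-- pages (INCLUDE), so it can satisfy an index-only scan without making the
-- key wider or affecting the index sort order.
CREATE INDEX eal_req_created_incl_idx
    ON event_audit_log (request_id, created_at)
    INCLUDE (event_comment_text);
```

My understanding is that INCLUDE columns cannot be used for searching or ordering through the index, only for returning data in index-only scans, but please correct me if that is wrong.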
->  Nested Loop  (cost=50.06..2791551.71 rows=3148 width=19) (actual time=280.735..7065.313 rows=57943 loops=3)
      Buffers: shared hit=10014901
      ->  Hash Join  (cost=49.49..1033247.35 rows=36729 width=8) (actual time=196.407..3805.755 rows=278131 loops=3)
            Hash Cond: ((ord.entity_id)::numeric = e.entity_id)
            Buffers: shared hit=755352
            ->  Parallel Seq Scan on orders ord  (cost=0.00..1022872.54 rows=3672860 width=16) (actual time=139.883..3152.627 rows=2944671 loops=3)
                  Filter: ((due_date >= '2024-01-01'::date) AND (due_date <= '2024-04-01'::date) AND (order_type = ANY ('{TYPE_A,TYPE_B}'::text[])))
                  Rows Removed by Filter: 6572678
                  Buffers: shared hit=755208
Regards
Yudhi
On 2/15/26 11:04, yudhi s wrote:
For a start:
1) Supply the complete schema for the tables involved.
2) Also, what minor version are you using, i.e. the x in 17.x?
I also recommend reading:
https://wiki.postgresql.org/wiki/Slow_Query_Questions
--
Adrian Klaver
adrian.klaver@aklaver.com
I would definitely recommend focusing on this section:
Filter: ((due_date >= '2024-01-01'::date) AND (due_date <= '2024-04-01'::date) AND (order_type = ANY ('{TYPE_A,TYPE_B}'::text[])))
Rows Removed by Filter: 6572678
You can certainly try a composite index, or a partial index on order_type.
Thank you,
Nisarg
On Sun, Feb 15, 2026, 3:51 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
On Mon, 2026-02-16 at 00:34 +0530, yudhi s wrote:
You are selecting a lot of rows, so the query will never be really cheap.
But I agree that an index scan should be a win.
If the condition on "order_type" is always the same, a partial index is ideal:
CREATE INDEX ON orders (due_date) WHERE order_type IN ('TYPE_A', 'TYPE_B');
Otherwise, I'd create two indexes: one on "order_type" and one on "due_date".
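As a sketch, that two-index alternative in DDL (index names are illustrative; whether the planner combines the two indexes in a bitmap scan depends on the selectivity of each condition):

```sql
-- Two single-column indexes; the planner can use either one alone, or
-- combine both via a BitmapAnd when each condition is selective enough.
CREATE INDEX orders_order_type_idx ON orders (order_type);
CREATE INDEX orders_due_date_idx   ON orders (due_date);
```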
Yours,
Laurenz Albe
On Mon, Feb 16, 2026 at 2:29 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
Version is 17.7. Below are the table definitions, as pulled from the DBeaver tool:
https://gist.github.com/databasetech0073/f22d95de18dc3f1fa54af13e7fd2ce9e
The order_type will be TYPE_A or TYPE_B in most cases, and below is the distribution. So it looks like an index on this column will not help much; correct me if I'm wrong. I am also wondering why the already existing index on column "due_date" of table "orders" is not being used by the optimizer. Should we add the column "entity_id" to that index as well?
TYPE_A 25 million
TYPE_B 2 million
TYPE_C 700K
TYPE_D 200K
TYPE_E 6K
And yes, there are differences in the data types of the "entity_id" columns of tables "orders" and "entities". We need to fix that after analyzing the data.
Also, the highlighted nested loop above shows ~10M shared hits (about 76 GB if each hit is an 8 KB block). So does that mean that, apart from the full scan on the "orders" table, the main resource-consuming factor here is the scanning of "event_audit_log"? And what is the best way to improve that? Currently this table is scanned through a unique index on column "request_id".
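The buffer arithmetic above can be checked directly. Note that shared hits count logical page accesses, so repeated hits on the same 8 KB page are each counted:

```python
# Convert the plan's shared-hit count into bytes of logical reads.
# This is repeated buffer access, not the on-disk size of the data.
BLOCK_SIZE = 8 * 1024          # default PostgreSQL block size (8 KB)
shared_hits = 10_014_901       # "Buffers: shared hit=10014901" from the plan

total_bytes = shared_hits * BLOCK_SIZE
print(f"{total_bytes / 1024**3:.1f} GiB")  # prints "76.4 GiB"
```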
Regards
Yudhi
On Mon, 2026-02-16 at 14:43 +0530, yudhi s wrote:
On Mon, Feb 16, 2026 at 2:29 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
The order_type will be TYPE_A or TYPE_B in most cases, and below is the distribution. So it looks like an index on this column will not help much; correct me if I'm wrong.
TYPE_A 25 million
TYPE_B 2 million
TYPE_C 700K
TYPE_D 200K
TYPE_E 6K
No, you are right about that.
I am wondering why the already existing index on column "due_date" of table "orders" is not being used by the optimizer. Should we also add the column "entity_id" to the index?
Seeing that your execution plan is incomplete, it is hard to say anything about that.
The scans of "entities" are missing, as is the UNION.
And yes, there are differences in the data types of the "entity_id" columns of tables "orders" and "entities". We need to fix that after analyzing the data.
Also, the highlighted nested loop above shows ~10M shared hits (about 76 GB if each hit is an 8 KB block). So does that mean that, apart from the full scan on the "orders" table, the main resource-consuming factor here is the scanning of "event_audit_log"?
Correct.
Yours,
Laurenz Albe
On Mon, Feb 16, 2026 at 3:24 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
Hi,
I have updated the plan below. While replacing the actual bind values and object names with sample names, some lines were missed initially, it seems.
https://gist.github.com/databasetech0073/f564ac23ee35d1f0413980fe4d00efa9
On Mon, 2026-02-16 at 16:09 +0530, yudhi s wrote:
I have updated the plan below. While replacing the actual bind values and object names with sample names, some lines were missed initially, it seems.
https://gist.github.com/databasetech0073/f564ac23ee35d1f0413980fe4d00efa9
Thanks.
Does the filter on "due_date" eliminate many rows in "orders"? If yes,
and an index on that column would actually perform better (which you
could test with enable_seqscan = off), perhaps your "random_page_cost"
parameter is set too high.
Where you can certainly make a difference is the repeated scan on
"event_audit_log". An index on (request_id, event_comment_text, created_at)
should speed up that part.
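As DDL, that suggestion would be (index name illustrative):

```sql
-- Supports the repeated lookups on event_audit_log from the nested loop:
-- request_id (equality) leads, the remaining filter columns follow, so
-- each loop iteration can be answered from the index.
CREATE INDEX eal_req_comment_created_idx
    ON event_audit_log (request_id, event_comment_text, created_at);
```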
Yours,
Laurenz Albe
On Mon, Feb 16, 2026 at 5:22 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
Thank you so much; we will try this one.
Regarding the composite index on (request_id, event_comment_text, created_at) for table event_audit_log, is there any advice we should follow for the position of a date column (like "created_at" here) in the index column order, apart from its frequency of use in the query?
And to help the scan of the "orders" table, should we also add "entity_id" to the index along with "due_date", i.e. a composite index on (entity_id, due_date)?
Regards
Yudhi
On Mon, 2026-02-16 at 17:52 +0530, yudhi s wrote:
Regarding the composite index on (request_id, event_comment_text, created_at) for table event_audit_log, is there any advice we should follow for the position of a date column (like "created_at" here) in the index column order, apart from its frequency of use in the query?
The rule is to first have all columns that are compared with equality, then the
others, starting with the most selective one. Based on the condition, I guessed
that that wouldn't be your "date" column, but if it is, put it second.
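Applied to the filters seen in the plan, that rule would give, for example (a sketch; the actual selectivity has to be checked against your data):

```sql
-- Equality/IN column first (order_type), then the range column (due_date):
-- the planner descends the B-tree on order_type and scans the due_date
-- range within each matching key.
CREATE INDEX orders_type_due_idx ON orders (order_type, due_date);
```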
And to help the scan of the "orders" table, should we also add "entity_id" to the index along with "due_date", i.e. a composite index on (entity_id, due_date)?
I don't know; you'd have to test it on your test system.
The current execution plan has no use for such an index, but adding additional
columns could
1. lead to an efficient index-only scan
2. make PostgreSQL pick an altogether different, better plan
Yours,
Laurenz Albe