How to improve the performance of my SQL query?
Hi everyone,
I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.
explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = '009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = '009'
AND TBL_INF.RY_CD = '000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = '009'::bpchar) AND (RY_CD = '000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 ms
--------------------------------------------------------------------------------
The index is defined as follows.
CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS);
When I take the following sql statement, the index works fine and the query is fast.
select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'
The amount of data in the table is as follows.
TBL_SHA 38700325
TBL_INF 35546
Any suggestions for improving the performance of the query would be greatly appreciated.
Thanks in advance!
On 20/07/2023 09:09 CEST gzh <gzhcoder@126.com> wrote:
I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = '009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = '009'
AND TBL_INF.RY_CD = '000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = '009'::bpchar) AND (RY_CD = '000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 ms
--------------------------------------------------------------------------------The index is defined as follows.
CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS);
When I take the following sql statement, the index works fine and the query
is fast.select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'The amount of data in the table is as follows.
TBL_SHA 38700325
TBL_INF 35546Any suggestions for improving the performance of the query would be greatly
appreciated.
You can try EXISTS instead of IN to correlate the subquery and the outer query:
SELECT count(et_cd)
FROM tbl_sha
WHERE tbl_sha.ms_cd = '009'
AND tbl_sha.etrys = '000001'
AND EXISTS (
SELECT
FROM tbl_inf
WHERE tbl_inf.ms_cd = tbl_sha.ms_cd
AND tbl_inf.ry_cd = tbl_sha.etrys
)
--
Erik
On 2023-07-20 15:09:22 +0800, gzh wrote:
Hi everyone,
I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.
explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = '009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = '009'
AND TBL_INF.RY_CD = '000001'
)----- Execution Plan -----
[...]
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18)
(actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 32000325
[...]
--------------------------------------------------------------------------------
The index is defined as follows.
CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS);
When I take the following sql statement, the index works fine and the query is
fast.select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'
What's the plan for that query?
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Thank you very much for taking the time to reply to my question.
Sorry, I provided incorrect information.
The index also does not work in the following query statement.
select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'
QUERY PLAN
Limit (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.439..128668.250 rows=1 loops=1)
-> Finalize Aggregate (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.437..128668.246 rows=1 loops=1)
-> Gather (cost=2419643.25..2419643.46 rows=2 width=8) (actual time=128664.108..128668.233 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=2418643.25..2418643.26 rows=1 width=8) (actual time=128655.256..128655.258 rows=1 loops=3)
-> Parallel Seq Scan on TBL_SHA (cost=0.00..2415548.85 rows=1237762 width=9) (actual time=75357.455..128531.615 rows=1066667 loops=3)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 11833442
Planning Time: 0.118 ms
Execution Time: 128668.290 ms
The TBL_SHA table has another index, as shown below.
CREATE INDEX index_search_02 ON mdb.TBL_SHA USING btree (ET_CD, ETRYS)
CREATE INDEX index_search_03 ON mdb.TBL_SHA USING btree (MS_CD, ET_DAY, BK_CD, FR_CD, RM_CD)
When I take the following query statement, the result is returned quickly.
Why does index_search_01 always not work?
explain analyze
select TO_CHAR(MAX(TBL_SHA.ET_DAY),'YYYYMMDD') AS ET_DAY
from TBL_SHA
WHERE MS_CD = '008'
AND ET_CD = '000003'
QUERY PLAN
Limit (cost=4.11..4.13 rows=1 width=32) (actual time=0.043..0.044 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.56..4.11 rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1)
-> Index Scan Backward using index_search_01 on TBL_SHA (cost=0.56..10836962.40 rows=3054052 width=8) (actual time=0.032..0.032 rows=1 loops=1)
Index Cond: ((MS_CD = '008'::bpchar) AND (ent_day IS NOT NULL))
Filter: (ET_CD = '000003'::bpchar)
-> Result (cost=4.11..4.13 rows=1 width=32) (actual time=0.042..0.042 rows=1 loops=1)
Planning Time: 0.228 ms
Execution Time: 0.070 ms
At 2023-07-20 16:07:15, "Erik Wienhold" <ewie@ewie.name> wrote:
Show quoted text
On 20/07/2023 09:09 CEST gzh <gzhcoder@126.com> wrote:
I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = '009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = '009'
AND TBL_INF.RY_CD = '000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = '009'::bpchar) AND (RY_CD = '000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 ms
--------------------------------------------------------------------------------The index is defined as follows.
CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS);
When I take the following sql statement, the index works fine and the query
is fast.select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'The amount of data in the table is as follows.
TBL_SHA 38700325
TBL_INF 35546Any suggestions for improving the performance of the query would be greatly
appreciated.You can try EXISTS instead of IN to correlate the subquery and the outer query:
SELECT count(et_cd)
FROM tbl_sha
WHERE tbl_sha.ms_cd = '009'
AND tbl_sha.etrys = '000001'
AND EXISTS (
SELECT
FROM tbl_inf
WHERE tbl_inf.ms_cd = tbl_sha.ms_cd
AND tbl_inf.ry_cd = tbl_sha.etrys
)--
Erik
On Thu, 2023-07-20 at 15:09 +0800, gzh wrote:
I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = '009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = '009'
AND TBL_INF.RY_CD = '000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = '009'::bpchar) AND (RY_CD = '000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 ms
--------------------------------------------------------------------------------The index is defined as follows.
CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS);
Actual rows = 3200000, rows removed by filter is ten times as much.
It should use an index.
When I take the following sql statement, the index works fine and the query is fast.
select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'The amount of data in the table is as follows.
TBL_SHA 38700325
TBL_INF 35546
This looks very much like it is a problem with the data types.
I see that you are using "character", which you shouldn't do.
What I cannot see is if the columns are defined as "character" or whether you bind
the parameters as "character". Can you show us the table definition of "TBL_SHA"
and "TBL_INF"?
Yours,
Laurenz Albe
Thank you for your reply.
The information I provided is incorrect, please see my previous reply.
What I cannot see is if the columns are defined as "character" or whether you bind
the parameters as "character". Can you show us the table definition of "TBL_SHA"
and "TBL_INF"?
For information security reasons, I can't provide the table definition, these columns are defined as "character".
At 2023-07-20 19:58:59, "Laurenz Albe" <laurenz.albe@cybertec.at> wrote:
Show quoted text
On Thu, 2023-07-20 at 15:09 +0800, gzh wrote:
I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = '009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = '009'
AND TBL_INF.RY_CD = '000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = '009'::bpchar) AND (RY_CD = '000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 ms
--------------------------------------------------------------------------------The index is defined as follows.
CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS);
Actual rows = 3200000, rows removed by filter is ten times as much.
It should use an index.When I take the following sql statement, the index works fine and the query is fast.
select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'The amount of data in the table is as follows.
TBL_SHA 38700325
TBL_INF 35546This looks very much like it is a problem with the data types.
I see that you are using "character", which you shouldn't do.What I cannot see is if the columns are defined as "character" or whether you bind
the parameters as "character". Can you show us the table definition of "TBL_SHA"
and "TBL_INF"?Yours,
Laurenz Albe
On Thu, 2023-07-20 at 22:14 +0800, gzh wrote:
The information I provided is incorrect, please see my previous reply.
My question remains: I would like to see the table definitions.
Also, did you ANALYZE the tables?
Yours,
Laurenz Albe
On Thu, Jul 20, 2023 at 7:36 PM gzh <gzhcoder@126.com> wrote:
Thank you very much for taking the time to reply to my question.
Sorry, I provided incorrect information.
The index also does not work in the following query statement.
select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'
QUERY PLAN
Limit (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.439..128668.250 rows=1 loops=1)
-> Finalize Aggregate (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.437..128668.246 rows=1 loops=1)
-> Gather (cost=2419643.25..2419643.46 rows=2 width=8) (actual time=128664.108..128668.233 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=2418643.25..2418643.26 rows=1 width=8) (actual time=128655.256..128655.258 rows=1 loops=3)
-> Parallel Seq Scan on TBL_SHA (cost=0.00..2415548.85 rows=1237762 width=9) (actual time=75357.455..128531.615 rows=1066667 loops=3)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 11833442
Planning Time: 0.118 ms
Execution Time: 128668.290 ms
The TBL_SHA table has another index, as shown below.
CREATE INDEX index_search_02 ON mdb.TBL_SHA USING btree (ET_CD, ETRYS)
CREATE INDEX index_search_03 ON mdb.TBL_SHA USING btree (MS_CD, ET_DAY, BK_CD, FR_CD, RM_CD)
Rows Removed by Filter: 11833442
select (38700325 - 11833442) /38700325.0;
is 0.69 approx.
So I think it says around 69% of rows satisfy the query condition.
but I am not sure in the following 2 cases, whether the actual rows
are noisy or not. I can not find the doc explaining it.
Show quoted text
Partial Aggregate (actual time=128655.256..128655.258 rows=1 loops=3)
Finalize Aggregate (actual time=128667.437..128668.246 rows=1 loops=1)
The definitions of the columns used in SQL are as follows.
TBL_SHA
ms_cd character(6) NOT NULL -- PRIMARY KEY
et_cd character(8)
etrys character(8)
TBL_INF
ms_cd character(6) NOT NULL -- PRIMARY KEY
ry_cd character(8) NOT NULL -- PRIMARY KEY
I made some modifications to the data, and I realized that I should not change the length of the data.
The actual data and its corresponding execution plan are shown below.
explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = 'MLD009'::bpchar) AND (RY_CD = '00000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = 'MLD009'::bpchar) AND (ETRYS = '00000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 ms
At 2023-07-20 22:48:19, "Laurenz Albe" <laurenz.albe@cybertec.at> wrote:
Show quoted text
On Thu, 2023-07-20 at 22:14 +0800, gzh wrote:
The information I provided is incorrect, please see my previous reply.
My question remains: I would like to see the table definitions.
Also, did you ANALYZE the tables?
Yours,
Laurenz Albe
select (38700325 - 11833442) /38700325.0;
is 0.69 approx.
So I think it says around 69% of rows satisfy the query condition.Thank you for your reply. I have learned a lot from it.
At 2023-07-20 23:20:16, "jian he" <jian.universality@gmail.com> wrote:
Show quoted text
On Thu, Jul 20, 2023 at 7:36 PM gzh <gzhcoder@126.com> wrote:
Thank you very much for taking the time to reply to my question.
Sorry, I provided incorrect information.
The index also does not work in the following query statement.
select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'
QUERY PLAN
Limit (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.439..128668.250 rows=1 loops=1)
-> Finalize Aggregate (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.437..128668.246 rows=1 loops=1)
-> Gather (cost=2419643.25..2419643.46 rows=2 width=8) (actual time=128664.108..128668.233 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=2418643.25..2418643.26 rows=1 width=8) (actual time=128655.256..128655.258 rows=1 loops=3)
-> Parallel Seq Scan on TBL_SHA (cost=0.00..2415548.85 rows=1237762 width=9) (actual time=75357.455..128531.615 rows=1066667 loops=3)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 11833442
Planning Time: 0.118 ms
Execution Time: 128668.290 ms
The TBL_SHA table has another index, as shown below.
CREATE INDEX index_search_02 ON mdb.TBL_SHA USING btree (ET_CD, ETRYS)
CREATE INDEX index_search_03 ON mdb.TBL_SHA USING btree (MS_CD, ET_DAY, BK_CD, FR_CD, RM_CD)
Rows Removed by Filter: 11833442
select (38700325 - 11833442) /38700325.0;
is 0.69 approx.
So I think it says around 69% of rows satisfy the query condition.but I am not sure in the following 2 cases, whether the actual rows
are noisy or not. I can not find the doc explaining it.Partial Aggregate (actual time=128655.256..128655.258 rows=1 loops=3)
Finalize Aggregate (actual time=128667.437..128668.246 rows=1 loops=1)
On Fri, 2023-07-21 at 09:43 +0800, gzh wrote:
The definitions of the columns used in SQL are as follows.
TBL_SHA
ms_cd character(6) NOT NULL -- PRIMARY KEY
et_cd character(8)
etrys character(8)TBL_INF
ms_cd character(6) NOT NULL -- PRIMARY KEY
ry_cd character(8) NOT NULL -- PRIMARY KEYI made some modifications to the data, and I realized that I should not change the length of the data.
The actual data and its corresponding execution plan are shown below.explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = 'MLD009'::bpchar) AND (RY_CD = '00000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = 'MLD009'::bpchar) AND (ETRYS = '00000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 ms
Thanks. That should definitely use a b-tree index defined on (ms_cd, etrsys).
Did you change any parameters that have an impact on query planning?
You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
Yours,
Laurenz Albe
Did you change any parameters that have an impact on query planning?
You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
I added some parameters and re-executed the Execution Plan.
Except for the index not taking effect, I still don't know the reason why the index is not working.
Is it because there is too much data that meets the conditions?
EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
----- Execution Plan -----
Limit (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1)
Output: (count(tbl_sha.et_cd))
Buffers: shared hit=58948 read=2112758
I/O Timings: read=357249.120
-> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1)
Output: count(tbl_sha.et_cd)
Buffers: shared hit=58948 read=2112758
I/O Timings: read=357249.120
-> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1)
Output: tbl_sha.et_cd
Buffers: shared hit=58948 read=2112758
I/O Timings: read=357249.120
-> Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1)
Output: tbl_inf.ms_cd, tbl_inf.ry_cd
Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar))
Heap Fetches: 1
Buffers: shared hit=4
-> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1)
Output: tbl_sha.et_cd, tbl_sha.etrys
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=58944 read=2112758
I/O Timings: read=357249.120
-> Parallel Seq Scan on mtpdb.tbl_sha (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3)
Output: tbl_sha.et_cd, tbl_sha.etrys
Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar))
Rows Removed by Filter: 14062278
Buffers: shared hit=58944 read=2112758
I/O Timings: read=357249.120
Worker 0: actual time=1432.292..127762.181 rows=988036 loops=1
Buffers: shared hit=17875 read=706862
I/O Timings: read=119193.744
Worker 1: actual time=1425.878..127786.777 rows=992381 loops=1
Buffers: shared hit=19813 read=706359
I/O Timings: read=119386.899
Planning:
Buffers: shared hit=42
Planning Time: 1.024 ms
Execution Time: 128717.731 ms
At 2023-07-24 13:43:46, "Laurenz Albe" <laurenz.albe@cybertec.at> wrote:
Show quoted text
On Fri, 2023-07-21 at 09:43 +0800, gzh wrote:
The definitions of the columns used in SQL are as follows.
TBL_SHA
ms_cd character(6) NOT NULL -- PRIMARY KEY
et_cd character(8)
etrys character(8)TBL_INF
ms_cd character(6) NOT NULL -- PRIMARY KEY
ry_cd character(8) NOT NULL -- PRIMARY KEYI made some modifications to the data, and I realized that I should not change the length of the data.
The actual data and its corresponding execution plan are shown below.explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = 'MLD009'::bpchar) AND (RY_CD = '00000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = 'MLD009'::bpchar) AND (ETRYS = '00000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 msThanks. That should definitely use a b-tree index defined on (ms_cd, etrsys).
Did you change any parameters that have an impact on query planning?
You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).Yours,
Laurenz Albe
On Mon, Jul 24, 2023 at 5:54 PM gzh <gzhcoder@126.com> wrote:
Did you change any parameters that have an impact on query planning?
You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
I added some parameters and re-executed the Execution Plan.
Except for the index not taking effect, I still don't know the reason why the index is not working.
Is it because there is too much data that meets the conditions?
EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
----- Execution Plan -----
Limit (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1)
Output: (count(tbl_sha.et_cd))
Buffers: shared hit=58948 read=2112758
I/O Timings: read=357249.120
-> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1)
Output: count(tbl_sha.et_cd)
Buffers: shared hit=58948 read=2112758
I/O Timings: read=357249.120
-> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1)
Output: tbl_sha.et_cd
Buffers: shared hit=58948 read=2112758
I/O Timings: read=357249.120
-> Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1)
Output: tbl_inf.ms_cd, tbl_inf.ry_cd
Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar))
Heap Fetches: 1
Buffers: shared hit=4
-> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1)
Output: tbl_sha.et_cd, tbl_sha.etrys
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=58944 read=2112758
I/O Timings: read=357249.120
-> Parallel Seq Scan on mtpdb.tbl_sha (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3)
Output: tbl_sha.et_cd, tbl_sha.etrys
Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar))
Rows Removed by Filter: 14062278
Buffers: shared hit=58944 read=2112758
I/O Timings: read=357249.120
Worker 0: actual time=1432.292..127762.181 rows=988036 loops=1
Buffers: shared hit=17875 read=706862
I/O Timings: read=119193.744
Worker 1: actual time=1425.878..127786.777 rows=992381 loops=1
Buffers: shared hit=19813 read=706359
I/O Timings: read=119386.899
Planning:
Buffers: shared hit=42
Planning Time: 1.024 ms
Execution Time: 128717.731 ms
I think the whole query can just:
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
if subquery after IN clause part, no rows returned then the whole
query would return zero row.
if many duplicates rows returned, then there is no point of evaluate
something like {1 in (1,1,1,1,1,1,)}
On Mon, 2023-07-24 at 17:54 +0800, gzh wrote:
EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
I asked for EXPLAIN (ANALYZE, BUFFERS, SETTINGS) ...
Or do you have a version that is too old for SETTINGS?
One other idea: check if the index is INVALID (this will
be visible if you run "\d tablenane" in "psql").
Invalid indexes won't be used.
Yours,
Laurenz Albe
I asked for EXPLAIN (ANALYZE, BUFFERS, SETTINGS) ...
Or do you have a version that is too old for SETTINGS?
Sorry. Please refer to the following execution plan.
EXPLAIN (ANALYZE, BUFFERS, SETTINGS)
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
----- Execution Plan -----
Buffers: shared hit=13 read=2171693
I/O Timings: read=365863.877
-> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=131595.624..131599.529 rows=1 loops=1)
Buffers: shared hit=13 read=2171693
I/O Timings: read=365863.877
-> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.341..131256.445 rows=2513500 loops=1)
Buffers: shared hit=13 read=2171693
I/O Timings: read=365863.877
-> Index Only Scan using tbl_inf_pkc on tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=1.471..1.476 rows=1 loops=1)
Index Cond: ((ms_cd = 'MLD009'::bpchar) AND (ry_cd = '00000001'::bpchar))
Heap Fetches: 1
Buffers: shared hit=2 read=2
I/O Timings: read=1.412
-> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=0.866..130696.440 rows=2513500 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=11 read=2171691
I/O Timings: read=365862.464
-> Parallel Seq Scan on tbl_sha (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=0.215..130476.981 rows=837833 loops=3)
Filter: ((ms_cd = 'MLD009'::bpchar) AND (etrys = '00000001'::bpchar))
Rows Removed by Filter: 13728945
Buffers: shared hit=11 read=2171691
I/O Timings: read=365862.464
Settings: effective_cache_size = '1886088kB', jit = 'off', search_path = '"$user", mdb'
Planning:
Buffers: shared hit=167 read=7
I/O Timings: read=2.735
Planning Time: 3.733 ms
Execution Time: 131599.594 ms
At 2023-07-24 23:58:50, "Laurenz Albe" <laurenz.albe@cybertec.at> wrote:
Show quoted text
On Mon, 2023-07-24 at 17:54 +0800, gzh wrote:
EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
I asked for EXPLAIN (ANALYZE, BUFFERS, SETTINGS) ...
Or do you have a version that is too old for SETTINGS?One other idea: check if the index is INVALID (this will
be visible if you run "\d tablenane" in "psql").
Invalid indexes won't be used.Yours,
Laurenz Albe
Thank you for your reply.
I think the whole query can just:
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';
This is related to the business logic.
if many duplicates rows returned, then there is no point of evaluate
something like {1 in (1,1,1,1,1,1,)}
Because the primary key of the 'tbl_inf' table only consists of 'ms_cd' and 'ry_cd' columns, the subquery will not return duplicate rows.
At 2023-07-24 22:42:01, "jian he" <jian.universality@gmail.com> wrote:
Show quoted text
On Mon, Jul 24, 2023 at 5:54 PM gzh <gzhcoder@126.com> wrote:
Did you change any parameters that have an impact on query planning?
You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
I added some parameters and re-executed the Execution Plan.
Except for the index not taking effect, I still don't know the reason why the index is not working.
Is it because there is too much data that meets the conditions?
EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
----- Execution Plan -----
Limit (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1)
Output: (count(tbl_sha.et_cd))
Buffers: shared hit=58948 read=2112758
I/O Timings: read=357249.120
-> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1)
Output: count(tbl_sha.et_cd)
Buffers: shared hit=58948 read=2112758
I/O Timings: read=357249.120
-> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1)
Output: tbl_sha.et_cd
Buffers: shared hit=58948 read=2112758
I/O Timings: read=357249.120
-> Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1)
Output: tbl_inf.ms_cd, tbl_inf.ry_cd
Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar))
Heap Fetches: 1
Buffers: shared hit=4
-> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1)
Output: tbl_sha.et_cd, tbl_sha.etrys
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=58944 read=2112758
I/O Timings: read=357249.120
-> Parallel Seq Scan on mtpdb.tbl_sha (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3)
Output: tbl_sha.et_cd, tbl_sha.etrys
Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar))
Rows Removed by Filter: 14062278
Buffers: shared hit=58944 read=2112758
I/O Timings: read=357249.120
Worker 0: actual time=1432.292..127762.181 rows=988036 loops=1
Buffers: shared hit=17875 read=706862
I/O Timings: read=119193.744
Worker 1: actual time=1425.878..127786.777 rows=992381 loops=1
Buffers: shared hit=19813 read=706359
I/O Timings: read=119386.899
Planning:
Buffers: shared hit=42
Planning Time: 1.024 ms
Execution Time: 128717.731 ms
I think the whole query can just:
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)if subquery after IN clause part, no rows returned then the whole
query would return zero row.
if many duplicates rows returned, then there is no point of evaluate
something like {1 in (1,1,1,1,1,1,)}
On Tue, 2023-07-25 at 11:11 +0800, gzh wrote:
I asked for EXPLAIN (ANALYZE, BUFFERS, SETTINGS) ...
Or do you have a version that is too old for SETTINGS?Sorry. Please refer to the following execution plan.
[...]
Settings: effective_cache_size = '1886088kB', jit = 'off', search_path = '"$user", mdb'
[...]
No unusual settings. So that theory is dead.
One other idea: check if the index is INVALID (this will
be visible if you run "\d tablenane" in "psql").
Invalid indexes won't be used.
Did you check that?
After that, I am out of ideas.
Yours,
Laurenz Albe
On Thu, 20 Jul 2023 at 23:36, gzh <gzhcoder@126.com> wrote:
Thank you very much for taking the time to reply to my question.
Sorry, I provided incorrect information.
The index also does not work in the following query statement.
select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '000001'QUERY PLAN
Limit (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.439..128668.250 rows=1 loops=1)
-> Finalize Aggregate (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.437..128668.246 rows=1 loops=1)
-> Gather (cost=2419643.25..2419643.46 rows=2 width=8) (actual time=128664.108..128668.233 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=2418643.25..2418643.26 rows=1 width=8) (actual time=128655.256..128655.258 rows=1 loops=3)
-> Parallel Seq Scan on TBL_SHA (cost=0.00..2415548.85 rows=1237762 width=9) (actual time=75357.455..128531.615 rows=1066667 loops=3)
Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '000001'::bpchar))
Rows Removed by Filter: 11833442
Planning Time: 0.118 ms
Execution Time: 128668.290 msThe TBL_SHA table has another index, as shown below.
CREATE INDEX index_search_02 ON mdb.TBL_SHA USING btree (ET_CD, ETRYS)
CREATE INDEX index_search_03 ON mdb.TBL_SHA USING btree (MS_CD, ET_DAY, BK_CD, FR_CD, RM_CD)When I take the following query statement, the result is returned quickly.
Why does index_search_01 always not work?
The method to access the table is decided by the query planner based
on costs. The costs are driven off the row estimates which are driven
from table statistics. If the table statistics, for example say that
99% of rows have MS_CD = '009', then scanning an index on MS_CD is
unlikely to be a good idea as that would likely require random access
to the heap. It's likely better to perform a table scan and then just
filter out the 1% of rows that don't match.
Try executing the query after having done:
SET enable_seqscan TO off;
What plan does it use now?
Is that plan faster or slower than the seq scan plan?
David
On Fri, 21 Jul 2023 at 13:44, gzh <gzhcoder@126.com> wrote:
The definitions of the columns used in SQL are as follows.
TBL_SHA
ms_cd character(6) NOT NULL -- PRIMARY KEY
et_cd character(8)
etrys character(8)
explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = 'MLD009'::bpchar) AND (RY_CD = '00000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = 'MLD009'::bpchar) AND (ETRYS = '00000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 ms
You might want to check your description of the table definitions.
Going by the above EXPLAIN ANALYZE output, it very much does not look
like ms_cd is the primary key of TBL_SHA. If it is then it's very
weird that you have 3200000 rows for MS_CD = 'MLD009'. You have some
data corruption if that's the case. I suspect you've just not
accurately described the table definition, however.
David
Thank you very much for taking the time to reply to my question.
You might want to check your description of the table definitions.
Going by the above EXPLAIN ANALYZE output, it very much does not look
like ms_cd is the primary key of TBL_SHA. If it is then it's very
weird that you have 3200000 rows for MS_CD = 'MLD009'. You have some
data corruption if that's the case. I suspect you've just not
accurately described the table definition, however.
The primary key of the SHA table has six fields, and ms_cd is just one of them.
I'm sorry, I didn't make that clear.
Try executing the query after having done:
SET enable_seqscan TO off;
What plan does it use now?
Is that plan faster or slower than the seq scan plan?
There's improvement, but it's still quite slow.
QUERY PLAN (enable_seqscan=on)
Limit (cost=2693516.87..2693516.88 rows=1 width=8) (actual time=167089.822..167183.058 rows=1 loops=1)
-> Aggregate (cost=2693516.87..2693516.88 rows=1 width=8) (actual time=167089.820..167183.056 rows=1 loops=1)
-> Nested Loop (cost=1000.29..2688558.85 rows=1983209 width=9) (actual time=43544.753..166906.304 rows=2413500 loops=1)
-> Index Only Scan using tbl_inf_pkc on tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=1.034..1.038 rows=1 loops=1)
Index Cond: ((ms_cd = 'MLD009'::bpchar) AND (ry_cd = '00000001'::bpchar))
Heap Fetches: 1
-> Gather (cost=1000.00..2668718.45 rows=1983209 width=18) (actual time=43543.714..166447.333 rows=2413500 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on tbl_sha (cost=0.00..2469397.55 rows=826337 width=18) (actual time=43537.056..166225.162 rows=804500 loops=3)
Filter: ((ms_cd = 'MLD009'::bpchar) AND (etrys = '00000001'::bpchar))
Rows Removed by Filter: 15362328
Planning Time: 2.942 ms
Execution Time: 167183.133 ms
SET enable_seqscan TO off;
QUERY PLAN (enable_seqscan=off)
Limit (cost=2880973.06..2880973.07 rows=1 width=8) (actual time=22295.419..22320.102 rows=1 loops=1)
-> Aggregate (cost=2880973.06..2880973.07 rows=1 width=8) (actual time=22295.418..22320.100 rows=1 loops=1)
-> Nested Loop (cost=93112.74..2876169.16 rows=1921561 width=9) (actual time=265.880..22000.432 rows=2413500 loops=1)
-> Index Only Scan using tbl_inf_pkc on tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.013..0.020 rows=1 loops=1)
Index Cond: ((ms_cd = 'MLD009'::bpchar) AND (ry_cd = '00000001'::bpchar))
Heap Fetches: 1
-> Gather (cost=93112.45..2856945.24 rows=1921561 width=18) (actual time=265.864..21535.325 rows=2413500 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Bitmap Heap Scan on tbl_sha (cost=92112.45..2663789.14 rows=800650 width=18) (actual time=260.540..21442.169 rows=804500 loops=3)
Recheck Cond: (ms_cd = 'MLD009'::bpchar)
Rows Removed by Index Recheck: 49
Filter: (etrys = '00000001'::bpchar)
Rows Removed by Filter: 295500
Heap Blocks: exact=13788 lossy=10565
-> Bitmap Index Scan on index_search_04_mscd_cdate (cost=0.00..91632.06 rows=3402599 width=0) (actual time=249.718..249.718 rows=3300000 loops=1)
Index Cond: (ms_cd = 'MLD009'::bpchar)
Planning Time: 0.670 ms
Execution Time: 22320.153 ms
At 2023-07-25 21:04:16, "David Rowley" <dgrowleyml@gmail.com> wrote:
Show quoted text
On Fri, 21 Jul 2023 at 13:44, gzh <gzhcoder@126.com> wrote:
The definitions of the columns used in SQL are as follows.
TBL_SHA
ms_cd character(6) NOT NULL -- PRIMARY KEY
et_cd character(8)
etrys character(8)explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
(select TBL_INF.RY_CD
from TBL_INF
WHERE TBL_INF.MS_CD = 'MLD009'
AND TBL_INF.RY_CD = '00000001'
)
----- Execution Plan -----
Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
-> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
-> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
-> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
Index Cond: ((MS_CD = 'MLD009'::bpchar) AND (RY_CD = '00000001'::bpchar))
Heap Fetches: 1
-> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
Filter: ((MS_CD = 'MLD009'::bpchar) AND (ETRYS = '00000001'::bpchar))
Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 msYou might want to check your description of the table definitions.
Going by the above EXPLAIN ANALYZE output, it very much does not look
like ms_cd is the primary key of TBL_SHA. If it is then it's very
weird that you have 3200000 rows for MS_CD = 'MLD009'. You have some
data corruption if that's the case. I suspect you've just not
accurately described the table definition, however.David