Different execution plan between PostgreSQL 8.4 and 12.11

Started by gzhalmost 4 years ago24 messagesgeneral

Jump to latest

gzh

gzhcoder@126.com

almost 4 years ago

Hi,

I have had a Perl Website working for 7 years and have had no problems

until a few weeks ago I replaced my database server with a newer one.

Database server (old): PostgreSQL 8.4 32bit

Database server (new): PostgreSQL 12.11 64bit

I run following sql in PostgreSQL 8.4 and PostgreSQL 12.11, it returns different execution plan.

--PostgreSQL 8.4

---------------

old=# select count(1) from analyze_word_reports;

count

---------

9164136

(1 row)

old=# select indexdef from pg_indexes where tablename='analyze_word_reports';

indexdef

-------------------------------------------------------------------------------------------

CREATE INDEX analyze_word_reports_index_cseid ON analyze_word_reports USING btree (cseid)

CREATE UNIQUE INDEX analyze_word_reports_pkey ON analyze_word_reports USING btree (seq)

(2 rows)

old=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.00..0.63 rows=1 width=0) (actual time=0.893..0.893 rows=0 loops=1)

-> Index Scan using analyze_word_reports_index_cseid on analyze_word_reports (cost=0.00..18621.98 rows=29707 width=0) (actual time=0.892..0.892 rows=0 loops=1)

Index Cond: (cseid = 94)

Total runtime: 0.941 ms

(4 rows)

--PostgreSQL 12.11

---------------

new=# select count(1) from analyze_word_reports;

count

---------

20131947

(1 row)

new=# select indexdef from pg_indexes where tablename='analyze_word_reports';

indexdef

-------------------------------------------------------------------------------------------

CREATE INDEX analyze_word_reports_index_cseid ON analyze_word_reports USING btree (cseid)

CREATE UNIQUE INDEX analyze_word_reports_pkey ON analyze_word_reports USING btree (seq)

(2 rows)

new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.00..0.41 rows=1 width=4) (actual time=4908.459..4908.462 rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528610.15 rows=1278561 width=4) (actual time=4908.455..4908.456 rows=1 loops=1)

Filter: (cseid = 94)

Rows Removed by Filter: 15477750

Planning Time: 0.411 ms

Execution Time: 4908.498 ms

(6 行)

Although PostgreSQL 8.4 has more than 10 million rows of data less than PostgreSQL 12.11,

PostgreSQL 12.11 is too slow and analyze_word_reports_index_cseid does not seem to work.

I'm guessing that the limit keyword of PostgreSQL 12.11 causes the index not to work.

But I don't understand why PostgreSQL 8.4 is normal.

What is the reason for this and is there any easy way to maintain compatibility?

Regards,

gzh

Ron

ronljohnsonjr@gmail.com

almost 4 years ago

In reply to: gzh (#1)

Re: Different execution plan between PostgreSQL 8.4 and 12.11

Did you analyze and vacuum all of the tables in the new database?

On 10/9/22 04:11, gzh wrote:

Hi,

I have had a Perl Website working for 7 years and have had no problems

until a few weeks ago I replaced my database server with a newer one.

Database server (old): PostgreSQL 8.4 32bit

Database server (new): PostgreSQL 12.11 64bit

I run following sql in PostgreSQL 8.4 and PostgreSQL 12.11, it returns
different execution plan.

--PostgreSQL 8.4

---------------

old=# select count(1) from analyze_word_reports;

count

---------

9164136

(1 row)

old=# select indexdef from pg_indexes where tablename='analyze_word_reports';

indexdef

-------------------------------------------------------------------------------------------

CREATE INDEX analyze_word_reports_index_cseid ON analyze_word_reports
USING btree (cseid)

CREATE UNIQUE INDEX analyze_word_reports_pkey ON analyze_word_reports
USING btree (seq)

(2 rows)

old=# explain analyze select 2 from analyze_word_reports where (cseid =
94) limit 1;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.00..0.63 rows=1 width=0) (actual time=0.893..0.893 rows=0
loops=1)

-> Index Scan using analyze_word_reports_index_cseid on
analyze_word_reports (cost=0.00..18621.98 rows=29707 width=0) (actual
time=0.892..0.892 rows=0 loops=1)

Index Cond: (cseid = 94)

Total runtime: 0.941 ms

(4 rows)

--PostgreSQL 12.11

---------------

new=# select count(1) from analyze_word_reports;

count

---------

20131947

(1 row)

new=# select indexdef from pg_indexes where tablename='analyze_word_reports';

indexdef

-------------------------------------------------------------------------------------------

CREATE INDEX analyze_word_reports_index_cseid ON analyze_word_reports
USING btree (cseid)

CREATE UNIQUE INDEX analyze_word_reports_pkey ON analyze_word_reports
USING btree (seq)

(2 rows)

new=# explain analyze select 2 from analyze_word_reports where (cseid =
94) limit 1;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.00..0.41 rows=1 width=4) (actual time=4908.459..4908.462
rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528610.15 rows=1278561
width=4) (actual time=4908.455..4908.456 rows=1 loops=1)

Filter: (cseid = 94)

Rows Removed by Filter: 15477750

Planning Time: 0.411 ms

Execution Time: 4908.498 ms

(6 行)

Although PostgreSQL 8.4 has more than 10 million rows of data less than
PostgreSQL 12.11,

PostgreSQL 12.11 is too slow and analyze_word_reports_index_cseid does not
seem to work.

I'm guessing that the limit keyword of PostgreSQL 12.11 causes the index
not to work.

But I don't understand why PostgreSQL 8.4 is normal.

What is the reason for this and is there any easy way to maintain
compatibility?

Regards,

--

gzh

--
Angular momentum makes the world go 'round.

Tom Lane

tgl@sss.pgh.pa.us

almost 4 years ago

In reply to: gzh (#1)

Re: Different execution plan between PostgreSQL 8.4 and 12.11

gzh <gzhcoder@126.com> writes:

I run following sql in PostgreSQL 8.4 and PostgreSQL 12.11, it returns different execution plan.

Have you done an ANALYZE in the new database? The rowcount estimates
seem a lot different, leading me to guess that the newer installation
doesn't have good statistics yet.

regards, tom lane

Mladen Gogala

gogala.mladen@gmail.com

almost 4 years ago

In reply to: gzh (#1)

Re: Different execution plan between PostgreSQL 8.4 and 12.11

On 10/9/22 05:11, gzh wrote:

Hi,

I have had a Perl Website working for 7 years and have had no problems

until a few weeks ago I replaced my database server with a newer one.

Database server (old): PostgreSQL 8.4 32bit

Database server (new): PostgreSQL 12.11 64bit

I run following sql in PostgreSQL 8.4 and PostgreSQL 12.11, it returns
different execution plan.

--PostgreSQL 8.4

---------------

old=# select count(1) from analyze_word_reports;

count

---------

9164136

(1 row)

old=# select indexdef from pg_indexes where
tablename='analyze_word_reports';

indexdef

-------------------------------------------------------------------------------------------

CREATE INDEX analyze_word_reports_index_cseid ON analyze_word_reports
USING btree (cseid)

CREATE UNIQUE INDEX analyze_word_reports_pkey ON analyze_word_reports
USING btree (seq)

(2 rows)

old=# explain analyze select 2 from analyze_word_reports where (cseid
= 94) limit 1;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.00..0.63 rows=1 width=0) (actual time=0.893..0.893
rows=0 loops=1)

-> Index Scan using analyze_word_reports_index_cseid on
analyze_word_reports (cost=0.00..18621.98 rows=29707 width=0) (actual
time=0.892..0.892 rows=0 loops=1)

Index Cond: (cseid = 94)

Total runtime: 0.941 ms

(4 rows)

--PostgreSQL 12.11

---------------

new=# select count(1) from analyze_word_reports;

count

---------

20131947

(1 row)

new=# select indexdef from pg_indexes where
tablename='analyze_word_reports';

indexdef

-------------------------------------------------------------------------------------------

CREATE INDEX analyze_word_reports_index_cseid ON analyze_word_reports
USING btree (cseid)

CREATE UNIQUE INDEX analyze_word_reports_pkey ON analyze_word_reports
USING btree (seq)

(2 rows)

new=# explain analyze select 2 from analyze_word_reports where (cseid
= 94) limit 1;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.00..0.41 rows=1 width=4) (actual
time=4908.459..4908.462 rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528610.15
rows=1278561 width=4) (actual time=4908.455..4908.456 rows=1 loops=1)

Filter: (cseid = 94)

Rows Removed by Filter: 15477750

Planning Time: 0.411 ms

Execution Time: 4908.498 ms

(6 行)

Although PostgreSQL 8.4 has more than 10 million rows of data less
than PostgreSQL 12.11,

PostgreSQL 12.11 is too slow and analyze_word_reports_index_cseid does
not seem to work.

I'm guessing that the limit keyword of PostgreSQL 12.11 causes the
index not to work.

But I don't understand why PostgreSQL 8.4 is normal.

What is the reason for this and is there any easy way to maintain
compatibility?

Regards,

--

gzh

Well, as someone who has worked on more than one database, I can tell
you that new version always means new plans. Most of the time, the new
plans are better but sometimes they're not. Your problem is probably
caused by one or two SQL statements that have changed plans. I would
advise installing pg_hint_plan extension and fixing those few SQL
queries manually. PostgreSQL would probably benefit from something like
Oracle baselines, which are a good mechanism for carrying plans over to
the new version.

If you don't want to install the new extension, you can try by running
vacuum analyze on the database. Also, upgrading to PgSQL 12 doesn't make
much sense given the fact that PgSQL 15 will be released in a few weeks.
Can you upgrade to PgSQL 14?

Regards

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

gzh

gzhcoder@126.com

almost 4 years ago

In reply to: Tom Lane (#3)

Re:Re: Different execution plan between PostgreSQL 8.4 and 12.11

Hi Tom,

Thank you for your prompt response.
I've run analyze(not vacuum analyze), but it doesn't seem to work.
Is there any other way to optimize the database?

At 2022-10-10 00:02:09, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Show quoted text

gzh <gzhcoder@126.com> writes:

I run following sql in PostgreSQL 8.4 and PostgreSQL 12.11, it returns different execution plan.

Have you done an ANALYZE in the new database? The rowcount estimates
seem a lot different, leading me to guess that the newer installation
doesn't have good statistics yet.

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

almost 4 years ago

In reply to: gzh (#5)

Re: Different execution plan between PostgreSQL 8.4 and 12.11

gzh <gzhcoder@126.com> writes:

I've run analyze(not vacuum analyze), but it doesn't seem to work.

When you're asking for help, please don't give us vague statements
like "doesn't seem to work". Did the plan (including rowcount
estimates) change at all? To what? How far off is that rowcount
estimate, anyway --- that is, how many rows actually have cseid = 94?

If the estimate is far off, then increasing the table's statistics
target might help.

Another thing that would be worth checking is whether
"set enable_seqscan = off" prods it to choose the plan you want.
If not, then there's something else going on besides poor estimates.

regards, tom lane

gzh

gzhcoder@126.com

almost 4 years ago

In reply to: Tom Lane (#6)

Re:Re: Different execution plan between PostgreSQL 8.4 and 12.11

Hi, Tom
Thank you for your reply.

When you're asking for help, please don't give us vague statements

like "doesn't seem to work".

I understand.

Did the plan (including rowcount

estimates) change at all? To what? How far off is that rowcount

estimate, anyway --- that is, how many rows actually have cseid = 94?

Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;

enable_seqscan

----------------

(1 行)

new=# select count(*) from analyze_word_reports;

count

----------

21331980

(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);

count

---------

1287156

(1 行)

new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------

Limit (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.713..133.035 rows=1 loops=1)

-> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.712..133.033 rows=1 loops=1)

-> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual time=123.548..133.024 rows=3 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial Aggregate (cost=64183.85..64183.86 rows=1 width=8) (actual time=119.495..119.496 rows=1 loops=3)

-> Parallel Index Only Scan using analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290

9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)

Index Cond: (cseid = 94)

Heap Fetches: 1287156 Planning Time: 0.122 ms Execution Time: 133.069 ms

(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------

---

Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528550.75 rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=

Filter: (cseid = 94)

Rows Removed by Filter: 18320180 Planning Time: 0.086 ms Execution Time: 2156.985 ms

(6 行)

If the estimate is far off, then increasing the table's statistics

target might help.

Thank you for your advice.

Please tell me how to set the table's statistics up to improve performance.

new=# select oid from pg_class where relname = 'analyze_word_reports';

oid

-------

16429

(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where attrelid=16429 and attname='cseid';

attrelid | attname | attstattarget

----------+---------+---------------

16429 | cseid | -1

(1 行)

Another thing that would be worth checking is whether

"set enable_seqscan = off" prods it to choose the plan you want.

If not, then there's something else going on besides poor estimates.

"set enable_seqscan = off" works, and the performance is greatly improved, which is almost the same as PostgreSQL 8.4.

The enable_seqscan(PostgreSQL 8.4) is on, will this change have an unknown effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Show quoted text

gzh <gzhcoder@126.com> writes:

I've run analyze(not vacuum analyze), but it doesn't seem to work.

When you're asking for help, please don't give us vague statements
like "doesn't seem to work". Did the plan (including rowcount
estimates) change at all? To what? How far off is that rowcount
estimate, anyway --- that is, how many rows actually have cseid = 94?

If the estimate is far off, then increasing the table's statistics
target might help.

Another thing that would be worth checking is whether
"set enable_seqscan = off" prods it to choose the plan you want.
If not, then there's something else going on besides poor estimates.

regards, tom lane

Pavel Stehule

pavel.stehule@gmail.com

almost 4 years ago

In reply to: gzh (#7)

Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

út 11. 10. 2022 v 5:13 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Tom
Thank you for your reply.

When you're asking for help, please don't give us vague statements

like "doesn't seem to work".

I understand.

Did the plan (including rowcount

estimates) change at all? To what? How far off is that rowcount

estimate, anyway --- that is, how many rows actually have cseid = 94?

Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;

enable_seqscan

----------------

on

(1 行)

new=# select count(*) from analyze_word_reports;

count

----------

21331980

(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);

count

---------

1287156

(1 行)

new=# explain analyze select count(2) from analyze_word_reports where
(cseid = 94) limit 1;

QUERY PLAN

the LIMIT clause is in this case totally useless and messy, and maybe can
negative impacts optimizer

Regards

Pavel

Show quoted text

----------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------

Limit (cost=65184.06..65184.07 rows=1 width=8) (actual
time=123.713..133.035 rows=1 loops=1)

-> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8)
(actual time=123.712..133.033 rows=1 loops=1)

-> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual
time=123.548..133.024 rows=3 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial Aggregate (cost=64183.85..64183.86 rows=1
width=8) (actual time=119.495..119.496 rows=1 loops=3)

-> Parallel Index Only Scan using
analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290

9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)

Index Cond: (cseid = 94)

Heap Fetches: 1287156 Planning Time: 0.122 ms
Execution Time: 133.069 ms

(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid =
94) limit 1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------

---

Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966
rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528550.75
rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=

1)

Filter: (cseid = 94)

Rows Removed by Filter: 18320180 Planning Time: 0.086 ms
Execution Time: 2156.985 ms

(6 行)

If the estimate is far off, then increasing the table's statistics

target might help.

Thank you for your advice.

Please tell me how to set the table's statistics up to improve performance.

new=# select oid from pg_class where relname = 'analyze_word_reports';

oid

-------

16429

(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where
attrelid=16429 and attname='cseid';

attrelid | attname | attstattarget

----------+---------+---------------

16429 | cseid | -1

(1 行)

Another thing that would be worth checking is whether

"set enable_seqscan = off" prods it to choose the plan you want.

If not, then there's something else going on besides poor estimates.

"set enable_seqscan = off" works, and the performance is greatly improved,
which is almost the same as PostgreSQL 8.4.

The enable_seqscan(PostgreSQL 8.4) is on, will this change have an unknown
effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

gzh <gzhcoder@126.com> writes:

I've run analyze(not vacuum analyze), but it doesn't seem to work.

When you're asking for help, please don't give us vague statements
like "doesn't seem to work". Did the plan (including rowcount
estimates) change at all? To what? How far off is that rowcount
estimate, anyway --- that is, how many rows actually have cseid = 94?

If the estimate is far off, then increasing the table's statistics
target might help.

Another thing that would be worth checking is whether
"set enable_seqscan = off" prods it to choose the plan you want.
If not, then there's something else going on besides poor estimates.

regards, tom lane

gzh

gzhcoder@126.com

almost 4 years ago

In reply to: Pavel Stehule (#8)

Re:Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

Hi, Pavel

Thank you for your reply.

the LIMIT clause is in this case totally useless and messy, and maybe can

negative impacts optimizer

Yes. After removing the LIMIT clause, the performance is improved.

The execution plan shows that the index worked.

We've noticed it, but I don't want to fix the problem by modifying the SQL until I find the cause.

At 2022-10-11 11:32:48, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 5:13 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Tom
Thank you for your reply.

When you're asking for help, please don't give us vague statements

like "doesn't seem to work".

I understand.

Did the plan (including rowcount

estimates) change at all? To what? How far off is that rowcount

estimate, anyway --- that is, how many rows actually have cseid = 94?

Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;

enable_seqscan

----------------

(1 行)

new=# select count(*) from analyze_word_reports;

count

----------

21331980

(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);

count

---------

1287156

(1 行)

new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

the LIMIT clause is in this case totally useless and messy, and maybe can negative impacts optimizer

Regards

Pavel

----------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------

Limit (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.713..133.035 rows=1 loops=1)

-> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.712..133.033 rows=1 loops=1)

-> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual time=123.548..133.024 rows=3 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial Aggregate (cost=64183.85..64183.86 rows=1 width=8) (actual time=119.495..119.496 rows=1 loops=3)

-> Parallel Index Only Scan using analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290

9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)

Index Cond: (cseid = 94)

Heap Fetches: 1287156 Planning Time: 0.122 ms Execution Time: 133.069 ms

(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------

---

Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528550.75 rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=

Filter: (cseid = 94)

Rows Removed by Filter: 18320180 Planning Time: 0.086 ms Execution Time: 2156.985 ms

(6 行)

If the estimate is far off, then increasing the table's statistics

target might help.

Thank you for your advice.

Please tell me how to set the table's statistics up to improve performance.

new=# select oid from pg_class where relname = 'analyze_word_reports';

oid

-------

16429

(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where attrelid=16429 and attname='cseid';

attrelid | attname | attstattarget

----------+---------+---------------

16429 | cseid | -1

(1 行)

Another thing that would be worth checking is whether

"set enable_seqscan = off" prods it to choose the plan you want.

If not, then there's something else going on besides poor estimates.

"set enable_seqscan = off" works, and the performance is greatly improved, which is almost the same as PostgreSQL 8.4.

The enable_seqscan(PostgreSQL 8.4) is on, will this change have an unknown effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Show quoted text

gzh <gzhcoder@126.com> writes:

I've run analyze(not vacuum analyze), but it doesn't seem to work.

When you're asking for help, please don't give us vague statements
like "doesn't seem to work". Did the plan (including rowcount
estimates) change at all? To what? How far off is that rowcount
estimate, anyway --- that is, how many rows actually have cseid = 94?

If the estimate is far off, then increasing the table's statistics
target might help.

Another thing that would be worth checking is whether
"set enable_seqscan = off" prods it to choose the plan you want.
If not, then there's something else going on besides poor estimates.

regards, tom lane

#10

Pavel Stehule

pavel.stehule@gmail.com

almost 4 years ago

In reply to: gzh (#9)

Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

út 11. 10. 2022 v 6:05 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Pavel

Thank you for your reply.

the LIMIT clause is in this case totally useless and messy, and maybe can

negative impacts optimizer

Yes. After removing the LIMIT clause, the performance is improved.

The execution plan shows that the index worked.

We've noticed it, but I don't want to fix the problem by modifying the SQL
until I find the cause.

The LIMIT clause changes total cost. This is a very aggressive clause. And
although it is absolutely useless in this case, Postgres does not have any
logic for removing it. Postgres doesn't try to fix developer's mistakes.

Show quoted text

At 2022-10-11 11:32:48, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 5:13 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Tom
Thank you for your reply.

When you're asking for help, please don't give us vague statements

like "doesn't seem to work".

I understand.

Did the plan (including rowcount

estimates) change at all? To what? How far off is that rowcount

estimate, anyway --- that is, how many rows actually have cseid = 94?

Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;

enable_seqscan

----------------

on

(1 行)

new=# select count(*) from analyze_word_reports;

count

----------

21331980

(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);

count

---------

1287156

(1 行)

new=# explain analyze select count(2) from analyze_word_reports where
(cseid = 94) limit 1;

QUERY PLAN

the LIMIT clause is in this case totally useless and messy, and maybe can
negative impacts optimizer

Regards

Pavel

----------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------

Limit (cost=65184.06..65184.07 rows=1 width=8) (actual
time=123.713..133.035 rows=1 loops=1)

-> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8)
(actual time=123.712..133.033 rows=1 loops=1)

-> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual
time=123.548..133.024 rows=3 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial Aggregate (cost=64183.85..64183.86 rows=1
width=8) (actual time=119.495..119.496 rows=1 loops=3)

-> Parallel Index Only Scan using
analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290

9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)

Index Cond: (cseid = 94)

Heap Fetches: 1287156 Planning Time: 0.122
ms Execution Time: 133.069 ms

(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid =
94) limit 1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------

---

Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966
rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528550.75
rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=

1)

Filter: (cseid = 94)

Rows Removed by Filter: 18320180 Planning Time: 0.086 ms
Execution Time: 2156.985 ms

(6 行)

If the estimate is far off, then increasing the table's statistics

target might help.

Thank you for your advice.

Please tell me how to set the table's statistics up to improve
performance.

new=# select oid from pg_class where relname = 'analyze_word_reports';

oid

-------

16429

(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where
attrelid=16429 and attname='cseid';

attrelid | attname | attstattarget

----------+---------+---------------

16429 | cseid | -1

(1 行)

Another thing that would be worth checking is whether

"set enable_seqscan = off" prods it to choose the plan you want.

If not, then there's something else going on besides poor estimates.

"set enable_seqscan = off" works, and the performance is greatly
improved, which is almost the same as PostgreSQL 8.4.

The enable_seqscan(PostgreSQL 8.4) is on, will this change have an
unknown effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

gzh <gzhcoder@126.com> writes:

I've run analyze(not vacuum analyze), but it doesn't seem to work.

When you're asking for help, please don't give us vague statements
like "doesn't seem to work". Did the plan (including rowcount
estimates) change at all? To what? How far off is that rowcount
estimate, anyway --- that is, how many rows actually have cseid = 94?

If the estimate is far off, then increasing the table's statistics
target might help.

Another thing that would be worth checking is whether
"set enable_seqscan = off" prods it to choose the plan you want.
If not, then there's something else going on besides poor estimates.

regards, tom lane

#11

gzh

gzhcoder@126.com

almost 4 years ago

In reply to: Pavel Stehule (#10)

Re:Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

Hi, Pavel

The LIMIT clause changes total cost. This is a very aggressive clause. And

although it is absolutely useless in this case, Postgres does not have any

logic for removing it. Postgres doesn't try to fix developer's mistakes.

Sorry,I didn't understand what you mean.

Couldn't the LIMIT clause be used like the SQL statement below?

new=# explain analyze select 2 from analyze_word_reports where (cseid =

94) limit 1;

This SQL statement is no problem under PostgreSQL 8.4, the index works well.

At 2022-10-11 12:13:47, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 6:05 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Pavel

Thank you for your reply.

the LIMIT clause is in this case totally useless and messy, and maybe can

negative impacts optimizer

Yes. After removing the LIMIT clause, the performance is improved.

The execution plan shows that the index worked.

We've noticed it, but I don't want to fix the problem by modifying the SQL until I find the cause.

The LIMIT clause changes total cost. This is a very aggressive clause. And although it is absolutely useless in this case, Postgres does not have any logic for removing it. Postgres doesn't try to fix developer's mistakes.

At 2022-10-11 11:32:48, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 5:13 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Tom
Thank you for your reply.

When you're asking for help, please don't give us vague statements

like "doesn't seem to work".

I understand.

Did the plan (including rowcount

estimates) change at all? To what? How far off is that rowcount

estimate, anyway --- that is, how many rows actually have cseid = 94?

Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;

enable_seqscan

----------------

(1 行)

new=# select count(*) from analyze_word_reports;

count

----------

21331980

(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);

count

---------

1287156

(1 行)

new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

the LIMIT clause is in this case totally useless and messy, and maybe can negative impacts optimizer

Regards

Pavel

----------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------

Limit (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.713..133.035 rows=1 loops=1)

-> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.712..133.033 rows=1 loops=1)

-> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual time=123.548..133.024 rows=3 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial Aggregate (cost=64183.85..64183.86 rows=1 width=8) (actual time=119.495..119.496 rows=1 loops=3)

-> Parallel Index Only Scan using analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290

9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)

Index Cond: (cseid = 94)

Heap Fetches: 1287156 Planning Time: 0.122 ms Execution Time: 133.069 ms

(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------

---

Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528550.75 rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=

Filter: (cseid = 94)

Rows Removed by Filter: 18320180 Planning Time: 0.086 ms Execution Time: 2156.985 ms

(6 行)

If the estimate is far off, then increasing the table's statistics

target might help.

Thank you for your advice.

Please tell me how to set the table's statistics up to improve performance.

new=# select oid from pg_class where relname = 'analyze_word_reports';

oid

-------

16429

(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where attrelid=16429 and attname='cseid';

attrelid | attname | attstattarget

----------+---------+---------------

16429 | cseid | -1

(1 行)

Another thing that would be worth checking is whether

"set enable_seqscan = off" prods it to choose the plan you want.

If not, then there's something else going on besides poor estimates.

"set enable_seqscan = off" works, and the performance is greatly improved, which is almost the same as PostgreSQL 8.4.

The enable_seqscan(PostgreSQL 8.4) is on, will this change have an unknown effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Show quoted text

gzh <gzhcoder@126.com> writes:

I've run analyze(not vacuum analyze), but it doesn't seem to work.

When you're asking for help, please don't give us vague statements
like "doesn't seem to work". Did the plan (including rowcount
estimates) change at all? To what? How far off is that rowcount
estimate, anyway --- that is, how many rows actually have cseid = 94?

If the estimate is far off, then increasing the table's statistics
target might help.

Another thing that would be worth checking is whether
"set enable_seqscan = off" prods it to choose the plan you want.
If not, then there's something else going on besides poor estimates.

regards, tom lane

#12

Pavel Stehule

pavel.stehule@gmail.com

almost 4 years ago

In reply to: gzh (#11)

Re: Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

út 11. 10. 2022 v 7:08 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Pavel

The LIMIT clause changes total cost. This is a very aggressive clause.

And

although it is absolutely useless in this case, Postgres does not have

any

logic for removing it. Postgres doesn't try to fix developer's mistakes.

Sorry,I didn't understand what you mean.

Couldn't the LIMIT clause be used like the SQL statement below?

new=# explain analyze select 2 from analyze_word_reports where (cseid =

94) limit 1;

there was query

SELECT aggregate() FROM xx LIMIT 1

This SQL statement is no problem under PostgreSQL 8.4, the index works
well.

The optimizer is under nonstop change. And you can expect from any new
release

75% queries are without change, 20% queries are faster, and 5% queries are
slower

The optimization is based on statistics and estimations, and searching for
the optimal solution in space of all solutions. In any version there are
smaller or bigger changes of estimation methods, and between old 8.4 and 12
there are big changes in possibilities of how the query can be executed. So
there is a higher possibility to find some really fast queries, but there
is a higher possibility to find some local optimum or slow query too.
Usually the optimizer is smarter (what is the benefit), but more sensitive
too (what is the cost). You cannot expect the same result, when the data
and algorithm is changed in any version. Postgres doesn't garant the
immutability of execution plans.

The clause LIMIT with low LIMIT value can be problematic in more cases. The
model in Postgres expects data are uniformly stored in the table (heap),
but the reality can be different. The common trick in these cases is using
OFFSET 0 clause like

SELECT * FROM (SELECT * FROM foo LIMIT 1000 OFFSET 0) s LIMIT 10.

Show quoted text

At 2022-10-11 12:13:47, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 6:05 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Pavel

Thank you for your reply.

the LIMIT clause is in this case totally useless and messy, and maybe

can

negative impacts optimizer

Yes. After removing the LIMIT clause, the performance is improved.

The execution plan shows that the index worked.

We've noticed it, but I don't want to fix the problem by modifying the
SQL until I find the cause.

The LIMIT clause changes total cost. This is a very aggressive clause.
And although it is absolutely useless in this case, Postgres does not have
any logic for removing it. Postgres doesn't try to fix developer's mistakes.

At 2022-10-11 11:32:48, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 5:13 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Tom
Thank you for your reply.

When you're asking for help, please don't give us vague statements

like "doesn't seem to work".

I understand.

Did the plan (including rowcount

estimates) change at all? To what? How far off is that rowcount

estimate, anyway --- that is, how many rows actually have cseid = 94?

Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;

enable_seqscan

----------------

on

(1 行)

new=# select count(*) from analyze_word_reports;

count

----------

21331980

(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);

count

---------

1287156

(1 行)

new=# explain analyze select count(2) from analyze_word_reports where
(cseid = 94) limit 1;

QUERY PLAN

the LIMIT clause is in this case totally useless and messy, and maybe can
negative impacts optimizer

Regards

Pavel

----------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------

Limit (cost=65184.06..65184.07 rows=1 width=8) (actual
time=123.713..133.035 rows=1 loops=1)

-> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8)
(actual time=123.712..133.033 rows=1 loops=1)

-> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual
time=123.548..133.024 rows=3 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial Aggregate (cost=64183.85..64183.86 rows=1
width=8) (actual time=119.495..119.496 rows=1 loops=3)

-> Parallel Index Only Scan using
analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290

9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)

Index Cond: (cseid = 94)

Heap Fetches: 1287156 Planning Time: 0.122
ms Execution Time: 133.069 ms

(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid =
94) limit 1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------

---

Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966
rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528550.75
rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=

1)

Filter: (cseid = 94)

Rows Removed by Filter: 18320180 Planning Time: 0.086 ms
Execution Time: 2156.985 ms

(6 行)

If the estimate is far off, then increasing the table's statistics

target might help.

Thank you for your advice.

Please tell me how to set the table's statistics up to improve
performance.

new=# select oid from pg_class where relname = 'analyze_word_reports';

oid

-------

16429

(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where
attrelid=16429 and attname='cseid';

attrelid | attname | attstattarget

----------+---------+---------------

16429 | cseid | -1

(1 行)

Another thing that would be worth checking is whether

"set enable_seqscan = off" prods it to choose the plan you want.

If not, then there's something else going on besides poor estimates.

"set enable_seqscan = off" works, and the performance is greatly
improved, which is almost the same as PostgreSQL 8.4.

The enable_seqscan(PostgreSQL 8.4) is on, will this change have an
unknown effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

gzh <gzhcoder@126.com> writes:

I've run analyze(not vacuum analyze), but it doesn't seem to work.

When you're asking for help, please don't give us vague statements
like "doesn't seem to work". Did the plan (including rowcount
estimates) change at all? To what? How far off is that rowcount
estimate, anyway --- that is, how many rows actually have cseid = 94?

If the estimate is far off, then increasing the table's statistics
target might help.

Another thing that would be worth checking is whether
"set enable_seqscan = off" prods it to choose the plan you want.
If not, then there's something else going on besides poor estimates.

regards, tom lane

#13

gzh

gzhcoder@126.com

almost 4 years ago

In reply to: Pavel Stehule (#12)

Re:Re: Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

Thank you for providing the requested information.

The WebSite has been used for many years, and this upgrade is only a version upgrade of the PostgreSQL database.

My customer does not want to modify the SQL because it will increase the cost of the project(All SQL that contains a LIMIT clause needs to be analyzed and checked).

Is there no other way to solve the problem?

At 2022-10-11 13:24:12, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 7:08 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Pavel

The LIMIT clause changes total cost. This is a very aggressive clause. And

although it is absolutely useless in this case, Postgres does not have any

logic for removing it. Postgres doesn't try to fix developer's mistakes.

Sorry,I didn't understand what you mean.

Couldn't the LIMIT clause be used like the SQL statement below?

new=# explain analyze select 2 from analyze_word_reports where (cseid =

94) limit 1;

there was query

SELECT aggregate() FROM xx LIMIT 1

This SQL statement is no problem under PostgreSQL 8.4, the index works well.

The optimizer is under nonstop change. And you can expect from any new release

75% queries are without change, 20% queries are faster, and 5% queries are slower

The optimization is based on statistics and estimations, and searching for the optimal solution in space of all solutions. In any version there are smaller or bigger changes of estimation methods, and between old 8.4 and 12 there are big changes in possibilities of how the query can be executed. So there is a higher possibility to find some really fast queries, but there is a higher possibility to find some local optimum or slow query too. Usually the optimizer is smarter (what is the benefit), but more sensitive too (what is the cost). You cannot expect the same result, when the data and algorithm is changed in any version. Postgres doesn't garant the immutability of execution plans.

The clause LIMIT with low LIMIT value can be problematic in more cases. The model in Postgres expects data are uniformly stored in the table (heap), but the reality can be different. The common trick in these cases is using OFFSET 0 clause like

SELECT * FROM (SELECT * FROM foo LIMIT 1000 OFFSET 0) s LIMIT 10.

At 2022-10-11 12:13:47, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 6:05 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Pavel

Thank you for your reply.

the LIMIT clause is in this case totally useless and messy, and maybe can

negative impacts optimizer

Yes. After removing the LIMIT clause, the performance is improved.

The execution plan shows that the index worked.

We've noticed it, but I don't want to fix the problem by modifying the SQL until I find the cause.

The LIMIT clause changes total cost. This is a very aggressive clause. And although it is absolutely useless in this case, Postgres does not have any logic for removing it. Postgres doesn't try to fix developer's mistakes.

At 2022-10-11 11:32:48, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 5:13 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Tom
Thank you for your reply.

When you're asking for help, please don't give us vague statements

like "doesn't seem to work".

I understand.

Did the plan (including rowcount

estimates) change at all? To what? How far off is that rowcount

estimate, anyway --- that is, how many rows actually have cseid = 94?

Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;

enable_seqscan

----------------

(1 行)

new=# select count(*) from analyze_word_reports;

count

----------

21331980

(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);

count

---------

1287156

(1 行)

new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

the LIMIT clause is in this case totally useless and messy, and maybe can negative impacts optimizer

Regards

Pavel

----------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------

Limit (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.713..133.035 rows=1 loops=1)

-> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.712..133.033 rows=1 loops=1)

-> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual time=123.548..133.024 rows=3 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial Aggregate (cost=64183.85..64183.86 rows=1 width=8) (actual time=119.495..119.496 rows=1 loops=3)

-> Parallel Index Only Scan using analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290

9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)

Index Cond: (cseid = 94)

Heap Fetches: 1287156 Planning Time: 0.122 ms Execution Time: 133.069 ms

(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------

---

Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528550.75 rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=

Filter: (cseid = 94)

Rows Removed by Filter: 18320180 Planning Time: 0.086 ms Execution Time: 2156.985 ms

(6 行)

If the estimate is far off, then increasing the table's statistics

target might help.

Thank you for your advice.

Please tell me how to set the table's statistics up to improve performance.

new=# select oid from pg_class where relname = 'analyze_word_reports';

oid

-------

16429

(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where attrelid=16429 and attname='cseid';

attrelid | attname | attstattarget

----------+---------+---------------

16429 | cseid | -1

(1 行)

Another thing that would be worth checking is whether

"set enable_seqscan = off" prods it to choose the plan you want.

If not, then there's something else going on besides poor estimates.

"set enable_seqscan = off" works, and the performance is greatly improved, which is almost the same as PostgreSQL 8.4.

The enable_seqscan(PostgreSQL 8.4) is on, will this change have an unknown effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Show quoted text

gzh <gzhcoder@126.com> writes:

I've run analyze(not vacuum analyze), but it doesn't seem to work.

When you're asking for help, please don't give us vague statements
like "doesn't seem to work". Did the plan (including rowcount
estimates) change at all? To what? How far off is that rowcount
estimate, anyway --- that is, how many rows actually have cseid = 94?

If the estimate is far off, then increasing the table's statistics
target might help.

Another thing that would be worth checking is whether
"set enable_seqscan = off" prods it to choose the plan you want.
If not, then there's something else going on besides poor estimates.

regards, tom lane

#14

Pavel Stehule

pavel.stehule@gmail.com

almost 4 years ago

In reply to: gzh (#13)

Re: Re: Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

út 11. 10. 2022 v 10:01 odesílatel gzh <gzhcoder@126.com> napsal:

Thank you for providing the requested information.

The WebSite has been used for many years, and this upgrade is only a
version upgrade of the PostgreSQL database.
My customer does not want to modify the SQL because it will increase the
cost of the project(All SQL that contains a LIMIT clause needs to be
analyzed and checked).

Is there no other way to solve the problem?

I don't know about any alternative

Regards

Pavel

Show quoted text

At 2022-10-11 13:24:12, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 7:08 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Pavel

The LIMIT clause changes total cost. This is a very aggressive clause.

And

although it is absolutely useless in this case, Postgres does not have

any

logic for removing it. Postgres doesn't try to fix developer's mistakes.

Sorry,I didn't understand what you mean.

Couldn't the LIMIT clause be used like the SQL statement below?

new=# explain analyze select 2 from analyze_word_reports where (cseid =

94) limit 1;

there was query

SELECT aggregate() FROM xx LIMIT 1

This SQL statement is no problem under PostgreSQL 8.4, the index works
well.

The optimizer is under nonstop change. And you can expect from any new
release

75% queries are without change, 20% queries are faster, and 5% queries are
slower

The optimization is based on statistics and estimations, and searching for
the optimal solution in space of all solutions. In any version there are
smaller or bigger changes of estimation methods, and between old 8.4 and 12
there are big changes in possibilities of how the query can be executed. So
there is a higher possibility to find some really fast queries, but there
is a higher possibility to find some local optimum or slow query too.
Usually the optimizer is smarter (what is the benefit), but more sensitive
too (what is the cost). You cannot expect the same result, when the data
and algorithm is changed in any version. Postgres doesn't garant the
immutability of execution plans.

The clause LIMIT with low LIMIT value can be problematic in more cases.
The model in Postgres expects data are uniformly stored in the table
(heap), but the reality can be different. The common trick in these cases
is using OFFSET 0 clause like

SELECT * FROM (SELECT * FROM foo LIMIT 1000 OFFSET 0) s LIMIT 10.

At 2022-10-11 12:13:47, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 6:05 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Pavel

Thank you for your reply.

the LIMIT clause is in this case totally useless and messy, and maybe

can

negative impacts optimizer

Yes. After removing the LIMIT clause, the performance is improved.

The execution plan shows that the index worked.

We've noticed it, but I don't want to fix the problem by modifying the
SQL until I find the cause.

The LIMIT clause changes total cost. This is a very aggressive clause.
And although it is absolutely useless in this case, Postgres does not have
any logic for removing it. Postgres doesn't try to fix developer's mistakes.

At 2022-10-11 11:32:48, "Pavel Stehule" <pavel.stehule@gmail.com> wrote:

út 11. 10. 2022 v 5:13 odesílatel gzh <gzhcoder@126.com> napsal:

Hi, Tom
Thank you for your reply.

When you're asking for help, please don't give us vague statements

like "doesn't seem to work".

I understand.

Did the plan (including rowcount

estimates) change at all? To what? How far off is that rowcount

estimate, anyway --- that is, how many rows actually have cseid = 94?

Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;

enable_seqscan

----------------

on

(1 行)

new=# select count(*) from analyze_word_reports;

count

----------

21331980

(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);

count

---------

1287156

(1 行)

new=# explain analyze select count(2) from analyze_word_reports where
(cseid = 94) limit 1;

QUERY PLAN

the LIMIT clause is in this case totally useless and messy, and maybe
can negative impacts optimizer

Regards

Pavel

----------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------

Limit (cost=65184.06..65184.07 rows=1 width=8) (actual
time=123.713..133.035 rows=1 loops=1)

-> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8)
(actual time=123.712..133.033 rows=1 loops=1)

-> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual
time=123.548..133.024 rows=3 loops=1)

Workers Planned: 2

Workers Launched: 2

-> Partial Aggregate (cost=64183.85..64183.86 rows=1
width=8) (actual time=119.495..119.496 rows=1 loops=3)

-> Parallel Index Only Scan using
analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290

9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052
loops=3)

Index Cond: (cseid = 94)

Heap Fetches: 1287156 Planning Time: 0.122
ms Execution Time: 133.069 ms

(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid =
94) limit 1;

QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------

---

Limit (cost=0.00..0.43 rows=1 width=4) (actual
time=2156.964..2156.966 rows=1 loops=1)

-> Seq Scan on analyze_word_reports (cost=0.00..528550.75
rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=

1)

Filter: (cseid = 94)

Rows Removed by Filter: 18320180 Planning Time: 0.086 ms
Execution Time: 2156.985 ms

(6 行)

If the estimate is far off, then increasing the table's statistics

target might help.

Thank you for your advice.

Please tell me how to set the table's statistics up to improve
performance.

new=# select oid from pg_class where relname = 'analyze_word_reports';

oid

-------

16429

(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where
attrelid=16429 and attname='cseid';

attrelid | attname | attstattarget

----------+---------+---------------

16429 | cseid | -1

(1 行)

Another thing that would be worth checking is whether

"set enable_seqscan = off" prods it to choose the plan you want.

If not, then there's something else going on besides poor estimates.

"set enable_seqscan = off" works, and the performance is greatly
improved, which is almost the same as PostgreSQL 8.4.

The enable_seqscan(PostgreSQL 8.4) is on, will this change have an
unknown effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

gzh <gzhcoder@126.com> writes:

I've run analyze(not vacuum analyze), but it doesn't seem to work.

When you're asking for help, please don't give us vague statements
like "doesn't seem to work". Did the plan (including rowcount
estimates) change at all? To what? How far off is that rowcount
estimate, anyway --- that is, how many rows actually have cseid = 94?

If the estimate is far off, then increasing the table's statistics
target might help.

Another thing that would be worth checking is whether
"set enable_seqscan = off" prods it to choose the plan you want.
If not, then there's something else going on besides poor estimates.

regards, tom lane

#15

gzh

gzhcoder@126.com

almost 4 years ago

In reply to: Pavel Stehule (#14)

Re: Different execution plan between PostgreSQL 8.4 and 12.11

Thank you for all your assistance.

By communicating with my customer, we have adopted the following solution to fix the problem.

set enable_seqscan = off

At 2022-10-11 16:21:42, "Pavel Stehule" <pavel.stehule@gmail.com> wrote: