Query 4-5 times slower after ANALYZE

Started by Philippe Langabout 17 years ago9 messagesgeneral

philippe.lang@attiksystem.ch

about 17 years ago

Hi,

I'm using Postgresql 8.3.6 under Freebsd 7.1.

After a fresh restore of a customer dump (running version 8.2.7 at the
moment), a rather big query executes in about 30 seconds. As soon as I
run ANALYZE, it is instantly 4-5 times slower. I could check that
multiples times.

Here is the EXPLAIN ANALYZE before the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.t
xt

And here the the EXPLAIN ANALYZE after the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.tx
t

Any idea what could be turned on/off in order not to have this slowdown
after the ANALYZE?

Best regards,

---------------
Philippe Lang
Attik Syste,

Import Notes

Reference msg id not found: E6A0649F1FBFA3408A37F505400E7AC215CDA5@email.attiksystem.ch

Bill Moran

wmoran@potentialtech.com

about 17 years ago

In reply to: Philippe Lang (#1)

Re: Query 4-5 times slower after ANALYZE

In response to "Philippe Lang" <philippe.lang@attiksystem.ch>:

I'm using Postgresql 8.3.6 under Freebsd 7.1.

After a fresh restore of a customer dump (running version 8.2.7 at the
moment), a rather big query executes in about 30 seconds. As soon as I
run ANALYZE, it is instantly 4-5 times slower. I could check that
multiples times.

Here is the EXPLAIN ANALYZE before the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.txt

And here the the EXPLAIN ANALYZE after the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.txt

Any idea what could be turned on/off in order not to have this slowdown
after the ANALYZE?

I opened one of those links figuring I'd take a few minutes to see if I could
muster up some advice ... and just started laughing ... definitely not the
type of query that one can even understand in just a few minutes!

Anyway, the real reason I posted -- I doubt if anyone will be able to make
sense of a query plan that complex without the actual query, so you'll
probably want to post it as well.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 17 years ago

In reply to: Bill Moran (#2)

Re: Query 4-5 times slower after ANALYZE

In response to "Philippe Lang" <philippe.lang@attiksystem.ch>:

I'm using Postgresql 8.3.6 under Freebsd 7.1.

After a fresh restore of a customer dump (running version 8.2.7 at the
moment), a rather big query executes in about 30 seconds. As soon as I
run ANALYZE, it is instantly 4-5 times slower. I could check that
multiples times.

Here is the EXPLAIN ANALYZE before the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.txt

And here the the EXPLAIN ANALYZE after the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.txt

Any idea what could be turned on/off in order not to have this slowdown
after the ANALYZE?

I opened one of those links figuring I'd take a few minutes to see if I
could
muster up some advice ... and just started laughing ... definitely not the
type of query that one can even understand in just a few minutes!

Anyway, the real reason I posted -- I doubt if anyone will be able to make
sense of a query plan that complex without the actual query, so you'll
probably want to post it as well.

Yeah, I had the same problem ;-) Anyway the reason why this query is so
slow is scanning the "customers" table - before the analysis an index scan
is used (and it's really fast), while after the analyze a sequential scan
is used instead (it's scanned several times and it takes almost 30 seconds
every time).

The question is why is a sequential scan chosen instead of index scan -
the estimated row counts seem quite precise, so maybe there's something
wrong with the cost settings. Have you modified the _cost parameters? What
are the current values? See this

http://www.postgresql.org/docs/8.3/interactive/runtime-config-query.html

Try to increase the seq_page_cost a (you may do that for the session only)
and see if the performance improves. You may even disable the sequential
scan using enable_seqscan = off.

Another cause might be a low statistics target - try to increase it with

ALTER TABLE SET STATISTICS

but the funny thing is it's not necessarily the "customers" table ;-)

regards
Tomas

Import Notes

Resolved by subject fallback

Bruce Momjian

bruce@momjian.us

about 17 years ago

In reply to: Bill Moran (#2)

Re: Query 4-5 times slower after ANALYZE

Bill Moran <wmoran@potentialtech.com> writes:

I opened one of those links figuring I'd take a few minutes to see if I could
muster up some advice ... and just started laughing ... definitely not the
type of query that one can even understand in just a few minutes!

You might consider setting default_statistics_target to 100 and re-analyzing.
The estimates don't look too far off but like Bill I haven't analyzed it very
carefully.

One other things that stands out, using comparisons like
('now'::date - creation_date) >= <expression>

is going to make it hard to optimize. Better to use something like
creation_date <= now() - <expression>

Both because of the now() instead of 'now'::date and because the latter is a
comparison that can be indexed instead of an expression which could use an
index on creation_date.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning

Bruce Momjian

bruce@momjian.us

about 17 years ago

In reply to: Tomas Vondra (#3)

Re: Query 4-5 times slower after ANALYZE

Oh, the other thing you could try experimenting with are these two parameters.

Your query has *way* more tables than the default values for these so you
would have to raise them substantially. Given that the query was running in
30s you may find that this increases the planning time by more time than it
saves in the query -- the reason these limits exist at all..

geqo_threshold
join_collapse_limit
from_collapse_limit

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning

Sebastian Pawłowski

sebpaa@gmail.com

about 17 years ago

In reply to: Philippe Lang (#1)

Re: Query 4-5 times slower after ANALYZE

it's look better with explain.depesz.com :) you can easily find that
in "before" it uses index scan and in "after" it uses seq

before:

http://explain.depesz.com/s/RC

after:

http://explain.depesz.com/s/nm

try as mentioned before to change costs

Best regards,

Sebastian Pawłowski

Wiadomość napisana w dniu 2009-03-18, o godz. 10:55, przez Philippe
Lang:

Show quoted text

Hi,

I'm using Postgresql 8.3.6 under Freebsd 7.1.

After a fresh restore of a customer dump (running version 8.2.7 at the
moment), a rather big query executes in about 30 seconds. As soon as I
run ANALYZE, it is instantly 4-5 times slower. I could check that
multiples times.

Here is the EXPLAIN ANALYZE before the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.t
xt

And here the the EXPLAIN ANALYZE after the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.tx
t

Any idea what could be turned on/off in order not to have this
slowdown
after the ANALYZE?

Best regards,

---------------
Philippe Lang
Attik Syste,

-
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Philippe Lang

philippe.lang@attiksystem.ch

about 17 years ago

In reply to: Philippe Lang (#1)

Re: Query 4-5 times slower after ANALYZE

pgsql-general-owner@postgresql.org wrote:

In response to "Philippe Lang" <philippe.lang@attiksystem.ch>:

I'm using Postgresql 8.3.6 under Freebsd 7.1.

After a fresh restore of a customer dump (running version 8.2.7 at
the moment), a rather big query executes in about 30 seconds. As
soon as I run ANALYZE, it is instantly 4-5 times slower. I could
check that multiples times.

Here is the EXPLAIN ANALYZE before the ANALYZE:

http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before

.txt

And here the the EXPLAIN ANALYZE after the ANALYZE:

http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.

txt

Any idea what could be turned on/off in order not to have this
slowdown after the ANALYZE?

I opened one of those links figuring I'd take a few minutes to see if
I could muster up some advice ... and just started laughing ...
definitely not the type of query that one can even understand in just
a few minutes!

Anyway, the real reason I posted -- I doubt if anyone will be able to
make sense of a query plan that complex without the actual query, so
you'll probably want to post it as well.

:) What? I thought you would read that like Neo was reading the
Matrix... :)

Bill, Gregory, Tomas and Sebastian, thanks for your answers. I have
tried everything your mentioned:

- increase seq_page_cost
- increase geqo_threshold
- increase join_collapse_limit
- increase from_collapse_limit

But it did not help (except disabling completely sequential scans), and
for a reason I think I understand better now: part of the query looks
like:

----------------
SELECT

c.id AS customer_id,
c.name AS customer_name,
d.id AS document_id,
d.number AS document_number,
d.vref AS document_vref,
dt.name AS type,
d.creation_date AS value_date

FROM documents AS d

LEFT JOIN payment_terms AS pt
ON d.payment_term_id = pt.id

INNER JOIN reminder_levels AS rl
ON d.reminder_level_id = rl.id

INNER JOIN document_types AS dt
ON d.document_type_id = dt.id

INNER JOIN projects AS p
ON d.project_id = p.id

INNER JOIN customers AS c
ON p.customer_id = c.id

WHERE d.reminder = 1

AND solde_po(CURRENT_DATE, c.id) > 0

AND d.creation_date <= CURRENT_DATE
----------------

The heavy part here is the "solde_po" call (at the end), which takes up
most CPU time. That's why scanning the customers table takes up so much
time. I imagine a small change in the way this table is scanned can have
enormous effects in the overall execution time, like when an sequential
scan is preferred over an index scan. Does that sound correct?

A small question here: solde_po is an SQL function (not PLPGSQL). Is it
"inlined" in the parent query before the whole query execution plan is
calculated? Or are they treated completely separately?

Philippe

P.S. Thanks for the link to "explain.depesz.com"! Great tool!

Import Notes

Reference msg id not found: E6A0649F1FBFA3408A37F505400E7AC215CDA5@email.attiksystem.ch

Bill Moran

wmoran@potentialtech.com

about 17 years ago

In reply to: Philippe Lang (#7)

Re: Query 4-5 times slower after ANALYZE

In response to "Philippe Lang" <philippe.lang@attiksystem.ch>:

[snip]

Anyway, the real reason I posted -- I doubt if anyone will be able to
make sense of a query plan that complex without the actual query, so
you'll probably want to post it as well.

:) What? I thought you would read that like Neo was reading the
Matrix... :)

Yeah ... not quite like that, although it appears that there are some
on-list who do have that level of skill ...

[more snip ...]

AND solde_po(CURRENT_DATE, c.id) > 0

AND d.creation_date <= CURRENT_DATE
----------------

The heavy part here is the "solde_po" call (at the end), which takes up
most CPU time. That's why scanning the customers table takes up so much
time. I imagine a small change in the way this table is scanned can have
enormous effects in the overall execution time, like when an sequential
scan is preferred over an index scan. Does that sound correct?

Quite likely.

A small question here: solde_po is an SQL function (not PLPGSQL). Is it
"inlined" in the parent query before the whole query execution plan is
calculated? Or are they treated completely separately?

Is that function STABLE? If it's VOLATILE, can it be rewritten to be
STABLE? That might make a lot of difference if it's repeatedly called
with the same values.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Tom Lane

tgl@sss.pgh.pa.us

about 17 years ago

In reply to: Philippe Lang (#7)

Re: Query 4-5 times slower after ANALYZE

"Philippe Lang" <philippe.lang@attiksystem.ch> writes:

A small question here: solde_po is an SQL function (not PLPGSQL). Is it
"inlined" in the parent query before the whole query execution plan is
calculated?

You should be able to tell that by inspecting the filter conditions
in the ANALYZE output. Do you see solde_po(), or its innards?

(Or in other words: it depends, and you didn't give us sufficient
info to say.)

regards, tom lane