PostgreSQL performance enhancement when query planner fails to guess the right plan

Started by Constantin Teodorescu over 19 years ago, 7 messages

Hello all, I hope you remember me; some years ago I designed
PgAccess, the Tcl/Tk visual interface to PostgreSQL.

Though you haven't received any news from me, I have continued working
with PostgreSQL, being involved in very big projects in Romania.
Right now, the national identification system for cows, sheep, goats
and pigs in Romania runs on PostgreSQL, on a very big database.
Once again, I have to thank you all for continuing to maintain and
improve PostgreSQL.

My message to all of you relates to this big project (a
government-sustained project) and some performance issues.

A few words about the database: approx. 60 tables, 30 of them
containing 10 to 50 million records each; the whole database is
approx. 40 GB in size!

To get good performance, the database runs on a dual-Xeon IBM x235
server with 6 GB of RAM, with tables and indexes carefully distributed
across 6 different SCSI disks, in different tablespaces, in such a way
that reads and HDD head movements on different devices can be
parallelized when joining those big tables.

We have tuned every possible parameter in the config file, and we have
reorganized queries, analyzing EXPLAIN output to get the best results
for all big queries, and we have succeeded most of the time.
But we have encountered some problems. Due to constant updates and
inserts into the database, its size is growing continuously.
Of course we do the needed maintenance DAILY: vacuums, analyzes
and backups.
Due to permanent changes in database size and statistics, there are
queries that sometimes change their execution plan, badly choosing
another plan and taking 2-3 minutes instead of the usual 10 seconds
once the query plan has "switched". We have made every effort to
rewrite subselects and query text in order to "force" the use of
certain indexes, continuously watching the EXPLAIN results.
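
By "watching" I mean comparing the estimated plan against the real
execution; a minimal sketch (table and column names are illustrative):

    -- estimated plan only
    EXPLAIN
    SELECT count(*)
    FROM animal a
    JOIN animal_movement m ON m.animal_id = a.animal_id
    WHERE m.moved_on > '2006-07-01';

    -- estimated plan plus actual row counts and timings, which shows
    -- when the planner's estimates have drifted from reality
    EXPLAIN ANALYZE
    SELECT count(*)
    FROM animal a
    JOIN animal_movement m ON m.animal_id = a.animal_id
    WHERE m.moved_on > '2006-07-01';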

Yesterday we faced such a problem with a query that "switched" to a
very bad plan, almost bringing the whole system down. The only way we
managed to make it work again was with "SET enable_mergejoin TO off".
For the moment it works, but in our opinion this is NOT the best way
to guide the planner to a better query-plan variant.
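
At least the setting can be scoped so it does not leak into other
queries; a minimal sketch, assuming the problematic query can be
wrapped in a transaction (the query below is only a placeholder):

    BEGIN;
    -- SET LOCAL affects only this transaction and reverts
    -- automatically at COMMIT or ROLLBACK
    SET LOCAL enable_mergejoin = off;

    SELECT count(*)            -- placeholder for the problematic query
    FROM animal a
    JOIN animal_movement m ON m.animal_id = a.animal_id;
    COMMIT;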

Our suggestion would be to extend the EXPLAIN and SELECT commands like
this:

EXPLAIN VARIANTS SELECT ...... (and so on), which would display the
different query plans analyzed by the planner and their "estimated
time values", not just the "best guess".

Assuming that EXPLAIN VARIANTS shows 3 or 4 different query plans, the
database manager would be able to experiment, to test, and to decide
for himself what is "THE BEST PLAN FOR ME", instead of letting the
PostgreSQL planner do that. We would then be able to clearly specify
in the SELECT statement the "version" of the query plan to be used in
execution, as in the following example:

SELECT .... (very big and complex query) ... USING PLAN 3;

The syntax for specifying the desired plan could, of course, be
different.
I realize that it would probably be better if the query planner
guessed the right, optimal plan itself. I agree that this can be done
by tweaking parameters and costs, BUT THAT CAN TAKE A LOT OF TIME,
much more than a couple of tests on the real database. An experienced
database admin can identify the appropriate plan much more easily and
"force" the executor to select the one he wants.

In our opinion, this would be the simplest and most non-intrusive way
of "manually choosing" another query plan, rather than indirectly
setting various parameters ON or OFF, which could badly affect other
queries.
First of all, it's assumed that the query planner HAS ALREADY evaluated
different variants and decides to use one based upon the statistics
of the involved tables and the "costs" for various types of access.
Unfortunately, because those costs are very difficult to adjust to the
actual HDD performance and I/O transfer speeds, PostgreSQL sometimes
chooses a wrong plan.
If we had the power to choose and experiment with different plans
using "SELECT .... USING PLAN <that-one>", we could then select the
right one for our real-world data.

The "... USING PLAN 9" extension to the language is, I hope, delicate
and innocent enough :-) to be accepted, and it would give an extremely
powerful way of controlling execution performance.

I hope you know the old joke about someone who receives an email
message: "Hi, I'm the Albanian virus. Due to our poor technology, I
cannot do much, so please delete some of your files and pretend to be
scared." :-)
I admit that this approach might be called "the Albanian way of
choosing the best query plan" :-) but you must also admit that it does
not place a big burden on the developers, it does not change anything
in what has been done so far, and it allows developers and database
administrators to dive into the query-plan ocean and bring back the
best of the quickest of the fastest query plans. :-)

I hope this long message hasn't disturbed you too much ... I'm
waiting for your comments and suggestions.

Best regards,
Constantin Teodorescu
Braila, ROMANIA

#2 Lukas Smith
smith@pooteeweet.org
In reply to: Constantin Teodorescu (#1)
Re: PostgreSQL performance enhancement when query planner fails to guess the right plan

Constantin Teodorescu wrote:

EXPLAIN VARIANTS SELECT ...... (and so on), which would display the
different query plans analyzed by the planner and their "estimated
time values", not just the "best guess".

Assuming that EXPLAIN VARIANTS shows 3 or 4 different query plans, the
database manager would be able to experiment, to test, and to decide
for himself what is "THE BEST PLAN FOR ME", instead of letting the
PostgreSQL planner do that. We would then be able to clearly specify
in the SELECT statement the "version" of the query plan to be used in
execution, as in the following example:

SELECT .... (very big and complex query) ... USING PLAN 3;

I think something like what Sybase has would be better. A friend told
me that in Sybase you can dump a query plan into a meta structure that
you can hand-tweak (leaving out non-relevant aspects) and force the
use of that hand-tweaked query plan, with the missing pieces filled in
at run time. It might still be helpful to get a dump of the
alternative query plans via EXPLAIN to get started.

I guess generally the idea must be to improve the planner so that
things like this aren't even needed, but it seems like a very useful
fallback for the time between finding a planner limitation and getting
the improvement into production.

An alternative approach that is already possible today is to
hand-tweak the table statistics in order to generate a stable query
plan. But this seems like a very indirect way to get something you
explicitly know you want.
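
The supported way to nudge the statistics in stock PostgreSQL (short
of editing pg_statistic by hand) is the per-column statistics target;
a rough sketch, with hypothetical table and column names:

    -- sample a skewed join column more thoroughly, then re-analyze;
    -- a larger sample gives the planner better selectivity estimates
    ALTER TABLE animal_movement
        ALTER COLUMN animal_id SET STATISTICS 500;
    ANALYZE animal_movement;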

regards,
Lukas

#3 Csaba Nagy
nagy@ecircle-ag.com
In reply to: Constantin Teodorescu (#1)
Re: PostgreSQL performance enhancement when query planner fails to guess the right plan

Constantin,

What binding are you using? We use Java+JDBC here, and we were able
to get stable query plans by forcing server-side prepared statements
(using PGStatement#setPrepareThreshold with 1 as the threshold), where
the query is prepared without knowing the parameter values. This can
backfire too, but for our purposes it was the right thing (probably
sacrificing some performance, but getting a stable system). The plans
in this case are made to work with guessed mean values for the
estimates, which usually results in a stable plan, so once you get it
right it will stay that way.
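
The same experiment can be done at the plain SQL level, independent of
JDBC; a rough sketch (the table and query are hypothetical):

    -- the statement is planned once at PREPARE time, without knowing
    -- the parameter value, so every EXECUTE reuses that generic plan
    PREPARE recent_moves(date) AS
        SELECT count(*) FROM animal_movement WHERE moved_on > $1;

    EXECUTE recent_moves('2006-07-01');
    EXECUTE recent_moves('2006-08-01');  -- same plan as above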

Cheers,
Csaba.

#4 Constantin Teodorescu
brailateo@gmail.com
In reply to: Csaba Nagy (#3)
Re: PostgreSQL performance enhancement when query planner fails to guess the right plan

Csaba Nagy wrote:

Constantin,

What binding are you using? We use Java+JDBC here, and we were able
to get stable query plans by forcing server-side prepared statements
(using PGStatement#setPrepareThreshold with 1 as the threshold), where
the query is prepared without knowing the parameter values. This can
backfire too, but for our purposes it was the right thing (probably
sacrificing some performance, but getting a stable system). The plans
in this case are made to work with guessed mean values for the
estimates, which usually results in a stable plan, so once you get it
right it will stay that way.

We have tried PGStatement#setPrepareThreshold with 1 as the threshold,
but it's not a good solution.
Actually it's worse: considering that you have 5 different query
plans, you are selecting an approximately random one of them, not
taking the statistics into account.

The situation is simpler than it appears at first view.

Guessing the best plan, based on statistics and costs, IS NOT AN
EASY THING TO DO.
Tweaking costs and statistics CAN TAKE A VERY LONG TIME and needs
strong knowledge of database architecture, hardware performance and
many other things.
Not every average user of PostgreSQL can do that!

Experimenting with the first 3 or 4 query plans, in ascending order
of their estimated cost, IS SIMPLER; it can take less than an hour and
can be done by less experienced people.
Choosing the "proven" better query plan IS SIMPLER, and that means
"PERFORMANCE EVEN FOR THE AVERAGE USER".

We are talking about open source, free software and the freedom of
choice, aren't we? So why not give the user the freedom to choose a
different query plan that gives better performance?

Maybe I'm not interested in investigating WHY the query planner
chooses wrongly. Of course the developers will enhance it, but until
then, let's give the user the power to manually select the right query
plan.
The final result may be something like: "I heard that PostgreSQL has a
very handy tool that gives you better performance in queries. It gives
you the ability to make fine adjustments."
Sounds good, doesn't it?
:-)

Teo

#5 Csaba Nagy
nagy@ecircle-ag.com
In reply to: Constantin Teodorescu (#4)
Re: PostgreSQL performance enhancement when query planner fails to guess the right plan

On Tue, 2006-08-08 at 12:36, Constantin Teodorescu wrote:

We have tried PGStatement#setPrepareThreshold with 1 as the threshold,
but it's not a good solution.
Actually it's worse: considering that you have 5 different query
plans, you are selecting an approximately random one of them, not
taking the statistics into account.

Wrong: you'll select _the same_ plan, and that's what matters. If
it's not the plan you wanted, you have to rewrite the query and try
again, but once you've got the plan you wanted, you'll pretty much
always get the same plan. So you only need to test until you find the
right query to trigger the right plan... but of course this requires
that your queries are constructed to always be OK with that plan,
regardless of the parameter values. Usually this means a suboptimal
plan, but stable execution times.

If you need to give hints to the DB based on the parameter values and
choose different plans for different parameter values, then you are
basically doing the planner's job in your application, and I guess
sooner or later you'll make wrong choices too.

Some hinting mechanism would be good for cases where the developer
really does know better how the data is laid out (e.g. forcing the use
of a specific access method for one table in a complex join), but
forcing a complete plan is probably not good. Even then, hinting is
only a workaround for planner fixes which cannot make it into the
stable version...

On the daydreaming part, how about a 2-phase planner?

Modus operandi:

Phase 1: compile and cache a plan decision tree:
- collect all reasonable plans without taking into account the
parameter values;
- check the parameter bounds within which each plan is the fastest;
- compile a decision tree which, based on the parameter values,
chooses one plan or the other;
- cache this plan decision tree;
- there's no need to cache plans which will always lose to some other
plan no matter what parameter values you give (this limits the size of
the decision tree);

Phase 2: run the decision tree to choose the best cached plan for the
given parameter values.

You could use variables coming from the statistics system in the
decision tree so it doesn't have to be recalculated too often on
statistics changes.

With a system like this, you could build the decision tree for all
your frequently used queries at system startup and have fast planning
at runtime that is optimized for the parameter values (take the
decision tree from the cache, run it with the current parameters). Or
just store the whole thing in a system table... or tweak the decision
tree manually...

This doesn't actually address the plan-stability issue, but if manual
tweaking were allowed, it would...

Cheers,
Csaba.

#6 Jim C. Nasby
jnasby@pervasive.com
In reply to: Lukas Smith (#2)
Re: PostgreSQL performance enhancement when query planner fails to guess the right plan

On Mon, Aug 07, 2006 at 10:47:39PM +0200, Lukas Smith wrote:

Constantin Teodorescu wrote:

EXPLAIN VARIANTS SELECT ...... (and so on), which would display the
different query plans analyzed by the planner and their "estimated
time values", not just the "best guess".

Assuming that EXPLAIN VARIANTS shows 3 or 4 different query plans, the
database manager would be able to experiment, to test, and to decide
for himself what is "THE BEST PLAN FOR ME", instead of letting the
PostgreSQL planner do that. We would then be able to clearly specify
in the SELECT statement the "version" of the query plan to be used in
execution, as in the following example:

SELECT .... (very big and complex query) ... USING PLAN 3;

I think something like what Sybase has would be better. A friend told
me that in Sybase you can dump a query plan into a meta structure that
you can hand-tweak (leaving out non-relevant aspects) and force the
use of that hand-tweaked query plan, with the missing pieces filled in
at run time. It might still be helpful to get a dump of the
alternative query plans via EXPLAIN to get started.

Another option would be adding planner hints.

I guess generally the idea must be to improve the planner so that
things like this aren't even needed, but it seems like a very useful
fallback for the time between finding a planner limitation and getting
the improvement into production.

Yeah, this comes up at least twice a year, and every time the answer
is "We won't add planner hints because we should just fix the
planner". The fact is, all of the 'big 3' have planner hints, even
though they've literally invested millions (if not billions) of
dollars in their planners. If they can't get it right, I think it's
hubris for us to think we'll magically be able to, especially when
there are known deficiencies in our statistics system.
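
For anyone who hasn't seen them: in the commercial databases, hints
are embedded in the query text as comments. Oracle's look roughly like
this (table and index names are made up; this is NOT valid
PostgreSQL):

    -- force a full scan of one table and a specific index on another
    SELECT /*+ FULL(m) INDEX(a animal_pk) */ a.animal_id, m.moved_on
    FROM animal a, animal_movement m
    WHERE m.animal_id = a.animal_id;
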
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#7 Lukas Smith
smith@pooteeweet.org
In reply to: Csaba Nagy (#5)
Re: PostgreSQL performance enhancement when query planner fails to guess the right plan

Csaba Nagy wrote:

On Tue, 2006-08-08 at 12:36, Constantin Teodorescu wrote:

We have tried PGStatement#setPrepareThreshold with 1 as the threshold,
but it's not a good solution.
Actually it's worse: considering that you have 5 different query
plans, you are selecting an approximately random one of them, not
taking the statistics into account.

Wrong: you'll select _the same_ plan, and that's what matters. If
it's not the plan you wanted, you have to rewrite the query and try
again, but once you've got the plan you wanted, you'll pretty much
always get the same plan. So you only need to test until you find the
right query to trigger the right plan... but of course this requires
that your queries are constructed to always be OK with that plan,
regardless of the parameter values. Usually this means a suboptimal
plan, but stable execution times.

Well, it should usually be possible to find a query that gives a
stable query plan. However, in some cases a stable query plan means
varying performance, which is also not ideal. So you actually have to
find a query that gives you stable performance (which often means
finding a query that is a good compromise and that produces a stable
plan).

But if you have changing data, very different selectivity for
different values, etc., this can become very hard, maybe even
impossible. For these kinds of queries it might just be easier to put
in the effort to specify (parts of) the query plan explicitly,
especially as an interim solution until a new stable release comes
around that fixes the underlying planner issue (which usually takes at
least 6-12 months).

regards,
Lukas