[Bug][Ver 11]: Generic query plan selected is worse than custom query plan

Started by Malay Keshavover 3 years ago2 messagesbugs
Jump to latest
#1Malay Keshav
malay.keshav@databricks.com

Hi,

We are using Postgres 11.13 for our company's critical database. However,
recently after the addition of an index to a table, we found significant
degradation in a specific query's execution time.

We found that Postgres11 caches a generic execution plan for a
parameterized query on the 6th execution of the query based on some
heuristic comparison b/w the generic plan and the custom plan for that
query.

In our particular case, the Postgres engine decided to pick the generic
query plan and cache it for all further calls with that query. My
understanding was that the generic query plan would only be selected if it
had a better execution time than the custom query plan. Which in our case
is not true.

We were able to reproduce this deterministically using the same query
parameters to trigger the engine to pick the bad generic query plan on the
6th run (first 5 runs shows the engine used the efficient query plan). Why
does the engine pick the generic query plan when its execution time is
worse than the custom query plan? Is this a bug?

We have run vacuum analyze, created new tables from existing data, etc but
the problem still persisted. Funny thing is, this only happens in one of
the many deployed regions suggesting it has to do with the data
distribution of that region. We were also able to trick the postgres engine
into not caching the generic plan and to always go for the custom query
plan on each execution. We did this by formulating a query that on the 6th
execution would trigger the heuristic to pick the custom plan. However,
this is not a scalable or practical solution with the 100s of queries we
run against the database - finding a query that can trick the engine into
selecting the custom query plan.

What are our options other than upgrading to Postgres 12 which provides a
configuration to override and use a custom query plan on every execution?

All best,
Malay Keshav

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Malay Keshav (#1)
Re: [Bug][Ver 11]: Generic query plan selected is worse than custom query plan

Malay Keshav <malay.keshav@databricks.com> writes:

In our particular case, the Postgres engine decided to pick the generic
query plan and cache it for all further calls with that query. My
understanding was that the generic query plan would only be selected if it
had a better execution time than the custom query plan. Which in our case
is not true.

We were able to reproduce this deterministically using the same query
parameters to trigger the engine to pick the bad generic query plan on the
6th run (first 5 runs shows the engine used the efficient query plan). Why
does the engine pick the generic query plan when its execution time is
worse than the custom query plan? Is this a bug?

Since you've provided precisely zero detail, it's impossible to answer
in any detail, but this is unlikely to be anything we'd call a bug.
What's presumably happening is that the planner's estimate of the cost
of the generic plan comes out less than its estimate of the cost of
a custom plan, but that estimate is faulty. Sure, we'd love to improve
all such situations, but it's not an easy task --- and any changes in that
area would not be something we'd consider back-patching into long-stable
branches.

regards, tom lane