Planner hints in Postgresql
I am implementing Planner hints in Postgresql to force the optimizer to
select a particular plan for a query on request from sql input. I am having
trouble in modifying the planner code. I want to create a path node of hint
plan and make it the plan to be used by executor. How do I enforce this ?
Should I create a new Plan for this ..how to create a plan node which can
be then given directly to executor for a particular query?
On Mon, Mar 17, 2014 at 9:22 PM, Rajmohan C <csrajmohan@gmail.com> wrote:
I am implementing Planner hints in Postgresql to force the optimizer to
select a particular plan for a query on request from sql input. I am having
trouble in modifying the planner code. I want to create a path node of hint
plan and make it the plan to be used by executor. How do I enforce this ?
Should I create a new Plan for this ..how to create a plan node which can
be then given directly to executor for a particular query?
Planner hints have been discussed a lot before as well and AFAIK there is a
wiki page that says why we shouldnt implement them. Have you referred to
them?
Please share if you have any new points on the same.
Regards,
Atri
Atri Sharma wrote
On Mon, Mar 17, 2014 at 9:22 PM, Rajmohan C <
csrajmohan@
> wrote:
I am implementing Planner hints in Postgresql to force the optimizer to
select a particular plan for a query on request from sql input. I am
having
trouble in modifying the planner code. I want to create a path node of
hint
plan and make it the plan to be used by executor. How do I enforce this ?
Should I create a new Plan for this ..how to create a plan node which can
be then given directly to executor for a particular query?Planner hints have been discussed a lot before as well and AFAIK there is
a
wiki page that says why we shouldnt implement them. Have you referred to
them?Please share if you have any new points on the same.
Regards,
Atri
http://wiki.postgresql.org/wiki/Todo
(I got to it via the "FAQ" link on the homepage and the "Developer FAQ"
section there-in. You should make sure you've scanned that as well.)
Note the final section titled: "Features We Do Not Want"
Also, you need to consider what you are doing when you cross-post (a bad
thing generally) "-hackers" and "-novice". As there is, rightly IMO, no
"-novice-hackers" list you should have probably just hit up "-general".
Need to discuss the general "why" before any meaningful help on the "how" is
going to be considered by hackers.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/Planner-hints-in-Postgresql-tp5796347p5796353.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
David Johnston <polobo@yahoo.com> writes:
Need to discuss the general "why" before any meaningful help on the "how" is
going to be considered by hackers.
Possibly worth noting is that in past discussions, we've concluded that
the most sensible type of hint would not be "use this plan" at all, but
"here's what to assume about the selectivity of this WHERE clause".
That seems considerably less likely to break than any attempt to directly
specify plan details.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Mar 17, 2014 at 9:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Johnston <polobo@yahoo.com> writes:
Need to discuss the general "why" before any meaningful help on the
"how" is
going to be considered by hackers.
Possibly worth noting is that in past discussions, we've concluded that
the most sensible type of hint would not be "use this plan" at all, but
"here's what to assume about the selectivity of this WHERE clause".
That seems considerably less likely to break than any attempt to directly
specify plan details.
Isnt using a user given value for selectivity a pretty risky situation as
it can horribly screw up the plan selection?
Why not allow the user to specify an alternate plan and have the planner
assign a higher preference to it during plan evaluation? This shall allow
us to still have a fair evaluation of all possible plans as we do right now
and yet have a higher preference for the user given plan during evaluation?
Regards,
Atri
--
Regards,
Atri
*l'apprenant*
Atri Sharma <atri.jiit@gmail.com> writes:
On Mon, Mar 17, 2014 at 9:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Possibly worth noting is that in past discussions, we've concluded that
the most sensible type of hint would not be "use this plan" at all, but
"here's what to assume about the selectivity of this WHERE clause".
That seems considerably less likely to break than any attempt to directly
specify plan details.
Isnt using a user given value for selectivity a pretty risky situation as
it can horribly screw up the plan selection?
And forcing a plan to be used *isn't* that? Please re-read the older
threads, since you evidently have not.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
* Atri Sharma (atri.jiit@gmail.com) wrote:
Isnt using a user given value for selectivity a pretty risky situation as
it can horribly screw up the plan selection?Why not allow the user to specify an alternate plan and have the planner
Uh, you're worried about the user given us a garbage selectivity, but
they're going to get a full-blown plan perfect?
assign a higher preference to it during plan evaluation? This shall allow
us to still have a fair evaluation of all possible plans as we do right now
and yet have a higher preference for the user given plan during evaluation?
What exactly would such a "preference" look like? A cost modifier?
We'd almost certainly have to make that into a GUC or a value passed in
as part of the query, with a high likelihood of users figuring out how
to use it to say "use my plan forever and always"..
Thanks,
Stephen
Atri Sharma wrote
On Mon, Mar 17, 2014 at 9:45 PM, Tom Lane <
tgl@.pa
> wrote:
David Johnston <
polobo@
> writes:
Need to discuss the general "why" before any meaningful help on the
"how" is
going to be considered by hackers.
Possibly worth noting is that in past discussions, we've concluded that
the most sensible type of hint would not be "use this plan" at all, but
"here's what to assume about the selectivity of this WHERE clause".
That seems considerably less likely to break than any attempt to directly
specify plan details.Isnt using a user given value for selectivity a pretty risky situation as
it can horribly screw up the plan selection?Why not allow the user to specify an alternate plan and have the planner
assign a higher preference to it during plan evaluation? This shall allow
us to still have a fair evaluation of all possible plans as we do right
now
and yet have a higher preference for the user given plan during
evaluation?
The larger question to answer first is whether we want to implement
something that is deterministic...
How about just dropping the whole concept of "hinting" and provide a way for
someone to say "use this plan, or die trying." Maybe require it be used in
conjunction with named PREPAREd statements:
PREPARE s1 (USING /path/to/plan_def_on_server_or_something_similar) AS
SELECT ...;
Aside from whole-plan specification I can definitely see where join/where
specification could be useful if it can overcome the current limitation of
not being able to calculate inter-table estimations.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/Planner-hints-in-Postgresql-tp5796347p5796378.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Mar 17, 2014 at 10:58 PM, Stephen Frost <sfrost@snowman.net> wrote:
* Atri Sharma (atri.jiit@gmail.com) wrote:
Isnt using a user given value for selectivity a pretty risky situation as
it can horribly screw up the plan selection?Why not allow the user to specify an alternate plan and have the planner
Uh, you're worried about the user given us a garbage selectivity, but
they're going to get a full-blown plan perfect?
I never said that the user plan would be perfect. The entire point of
planner hints is based on the assumption that the user knows more about the
data than the planner does hence the user's ideas about the plan should be
given a preference. Garbage selectivity can screw up the cost estimation
of *all* our possible plans and we could end up preferring a sequential
scan over an index only scan for e.g. I am trying to think of ways that
give some preference to a user plan but do not interfere with the cost
estimation of our other potential plans.
What exactly would such a "preference" look like? A cost modifier?
We'd almost certainly have to make that into a GUC or a value passed in
as part of the query, with a high likelihood of users figuring out how
to use it to say "use my plan forever and always"..
A factor that we experimentally determine by which we decrease the cost of
the user specified plan so that it gets a higher preference in the plan
evaluation.
Of course, this is not a nice hack. Specifically after our discussion on
IRC the other day, I am against planner hints, but if we are just
discussing how it could be done, I could think of some ways which I listed.
Regards,
Atri
--
Regards,
Atri
*l'apprenant*
* Atri Sharma (atri.jiit@gmail.com) wrote:
Of course, this is not a nice hack. Specifically after our discussion on
IRC the other day, I am against planner hints, but if we are just
discussing how it could be done, I could think of some ways which I listed.
There's lots of ways to implement planner hints, but I fail to see the
point in discussing how to implement something we actively don't want.
Thanks,
Stephen
The larger question to answer first is whether we want to implement
something that is deterministic...
How about just dropping the whole concept of "hinting" and provide a way
for
someone to say "use this plan, or die trying." Maybe require it be used in
conjunction with named PREPAREd statements:
You mean taking away the entire concept of query planning and cost
estimation? Thats like replacing the optimizer with DBA decision and I am
not at all comfortable with that idea. That are only my thoughts though.
PREPARE s1 (USING /path/to/plan_def_on_server_or_something_similar) AS
SELECT ...;Aside from whole-plan specification I can definitely see where join/where
specification could be useful if it can overcome the current limitation of
not being able to calculate inter-table estimations.
Prepare plans use a generic plan for the execution. Replacing it with a
totally user defined plan does not seem to be clean.
The crux is that IMHO planner hints are a bad way of trying to circumvent
the need for cross-column statistics. We should do cross-column statistics
done and ignore planner hints completely.
Regards,
Atri
--
Regards,
Atri
*l'apprenant*
There's lots of ways to implement planner hints, but I fail to see the
point in discussing how to implement something we actively don't want.
+1. The original poster wanted a way to implement it as a personal project
or something ( I think he only replied to me, not the entire list).
Planner hints should be ignored :)
Regards,
Atri
--
Regards,
Atri
*l'apprenant*
On Mon, Mar 17, 2014 at 10:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Atri Sharma <atri.jiit@gmail.com> writes:
On Mon, Mar 17, 2014 at 9:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Possibly worth noting is that in past discussions, we've concluded that
the most sensible type of hint would not be "use this plan" at all, but
"here's what to assume about the selectivity of this WHERE clause".
That seems considerably less likely to break than any attempt todirectly
specify plan details.
Isnt using a user given value for selectivity a pretty risky situation as
it can horribly screw up the plan selection?And forcing a plan to be used *isn't* that? Please re-read the older
threads, since you evidently have not.
I never said that we force a plan to be used. I just said that we should
increase the preference for a user given plan and not interfere in the cost
estimation of the other potential plans and the evaluation of the final
selected plan.
Regards,
Atri
--
Regards,
Atri
*l'apprenant*
On Mon, Mar 17, 2014 at 11:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Johnston <polobo@yahoo.com> writes:
Need to discuss the general "why" before any meaningful help on the "how" is
going to be considered by hackers.Possibly worth noting is that in past discussions, we've concluded that
the most sensible type of hint would not be "use this plan" at all, but
"here's what to assume about the selectivity of this WHERE clause".
That seems considerably less likely to break than any attempt to directly
specify plan details.
Yeah -- the most common case I see is outlier culling where several
repeated low non-deterministic selectivity quals stack reducing the
row count estimate to 1. For example:
SELECT * FROM foo WHERE length(bar) <= 1000 AND length(bar) >= 2;
The user may have special knowledge that the above is very (or very
un-) selective that is difficult or not cost effective to gather in
the general case. IIRC in the archives (heh) there is a special
workaround using indexes and some discussion regarding how a
hypothetical feature involving user input selectivity estimates might
look. I don't think that discussion is complete: the syntax for user
input selectivity is an unsolved problem.
There's a big difference between saying to the planner, "Use plan X"
vs "Here's some information describing the data supporting choosing
plan X intelligently". The latter allows for better plans in the face
of varied/changing data, integrates with the planner in natural way,
and encourages users to understand how the planner works.
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
There's a big difference between saying to the planner, "Use plan X"
vs "Here's some information describing the data supporting choosing
plan X intelligently". The latter allows for better plans in the face
of varied/changing data, integrates with the planner in natural way,
and encourages users to understand how the planner works.
+1
I was thinking of varying the 'weight' of a user defined plan by an fixed
experimental factor to tell the planner to give higher/lower preference to
this plan, but after your idea above, I think Stephen's point of
introducing a GUC for the factor is the only way possible and I agree with
him on the point that eventually the user will figure out a way to force
usage of his plan using the GUC.
Regards,
Atri
--
Regards,
Atri
*l'apprenant*
* Merlin Moncure (mmoncure@gmail.com) wrote:
Yeah -- the most common case I see is outlier culling where several
repeated low non-deterministic selectivity quals stack reducing the
row count estimate to 1. For example:
SELECT * FROM foo WHERE length(bar) <= 1000 AND length(bar) >= 2;
This is exactly the issue that I've seen also- where we end up picking a
Nested Loop because we think only one row is going to be returned and
instead we end up getting a bunch and it takes forever.
There was also some speculation on trying to change plans mid-stream to
address a situation like that, once we realize what's happening. Not
sure that's really practical but it would be nice to find some solution.
Thanks,
Stephen
On Mon, Mar 17, 2014 at 12:57 PM, Atri Sharma <atri.jiit@gmail.com> wrote:
There's a big difference between saying to the planner, "Use plan X"
vs "Here's some information describing the data supporting choosing
plan X intelligently". The latter allows for better plans in the face
of varied/changing data, integrates with the planner in natural way,
and encourages users to understand how the planner works.+1
I was thinking of varying the 'weight' of a user defined plan by an fixed
experimental factor to tell the planner to give higher/lower preference to
this plan, but after your idea above, I think Stephen's point of introducing
a GUC for the factor is the only way possible and I agree with him on the
point that eventually the user will figure out a way to force usage of his
plan using the GUC.
GUC is not the answer beyond the "broad brush" mostly debugging level
features they already support. What do you do if your plan
simultaneously needs and does not need nestloops?
A query plan is a complicated thing that is the result of detail
analysis of the data. I bet there are less than 100 users on the
planet with the architectural knowledge of the planner to submit a
'plan'. What users do have is knowledge of the data that the database
can't effectively gather for some reason. Looking at my query above,
what it would need (assuming the planner could not be made to look
through length()) would be something like:
SELECT * FROM foo WHERE
length(bar) <= 1000 WITH SELECTIVITY 0.999
AND length(bar) >= 2 WITH SELECTIVITY 0.999;
Note, that's a trivial treatment of the syntax challenges. Ultimately
it'd probably look different and/or be hooked in a different way (say,
via the function call).
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Mar 17, 2014 at 11:50 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Mon, Mar 17, 2014 at 12:57 PM, Atri Sharma <atri.jiit@gmail.com> wrote:
There's a big difference between saying to the planner, "Use plan X"
vs "Here's some information describing the data supporting choosing
plan X intelligently". The latter allows for better plans in the face
of varied/changing data, integrates with the planner in natural way,
and encourages users to understand how the planner works.+1
I was thinking of varying the 'weight' of a user defined plan by an fixed
experimental factor to tell the planner to give higher/lower preferenceto
this plan, but after your idea above, I think Stephen's point of
introducing
a GUC for the factor is the only way possible and I agree with him on the
point that eventually the user will figure out a way to force usage ofhis
plan using the GUC.
GUC is not the answer beyond the "broad brush" mostly debugging level
features they already support. What do you do if your plan
simultaneously needs and does not need nestloops?A query plan is a complicated thing that is the result of detail
analysis of the data. I bet there are less than 100 users on the
planet with the architectural knowledge of the planner to submit a
'plan'. What users do have is knowledge of the data that the database
can't effectively gather for some reason. Looking at my query above,
what it would need (assuming the planner could not be made to look
through length()) would be something like:SELECT * FROM foo WHERE
length(bar) <= 1000 WITH SELECTIVITY 0.999
AND length(bar) >= 2 WITH SELECTIVITY 0.999;
Wont this have scaling issues and issues over time as the data in the
table changes?
Suppose I make a view with the above query. With time, as the data in the
table changes, the selectivity values wont be good for planning. This may
potentially lead to a lot of changes in the view definition and other
places where this query was used.
In general, I think I step back on my point that specifying the selectivity
is a bad idea.
Could this also work (for the time being) for cross-column statistics?
Regards,
Atri
--
Regards,
Atri
*l'apprenant*
2014-03-17 19:35 GMT+01:00 Atri Sharma <atri.jiit@gmail.com>:
On Mon, Mar 17, 2014 at 11:50 PM, Merlin Moncure <mmoncure@gmail.com>wrote:
On Mon, Mar 17, 2014 at 12:57 PM, Atri Sharma <atri.jiit@gmail.com>
wrote:There's a big difference between saying to the planner, "Use plan X"
vs "Here's some information describing the data supporting choosing
plan X intelligently". The latter allows for better plans in the face
of varied/changing data, integrates with the planner in natural way,
and encourages users to understand how the planner works.+1
I was thinking of varying the 'weight' of a user defined plan by an
fixed
experimental factor to tell the planner to give higher/lower preference
to
this plan, but after your idea above, I think Stephen's point of
introducing
a GUC for the factor is the only way possible and I agree with him on
the
point that eventually the user will figure out a way to force usage of
his
plan using the GUC.
GUC is not the answer beyond the "broad brush" mostly debugging level
features they already support. What do you do if your plan
simultaneously needs and does not need nestloops?A query plan is a complicated thing that is the result of detail
analysis of the data. I bet there are less than 100 users on the
planet with the architectural knowledge of the planner to submit a
'plan'. What users do have is knowledge of the data that the database
can't effectively gather for some reason. Looking at my query above,
what it would need (assuming the planner could not be made to look
through length()) would be something like:SELECT * FROM foo WHERE
length(bar) <= 1000 WITH SELECTIVITY 0.999
AND length(bar) >= 2 WITH SELECTIVITY 0.999;Wont this have scaling issues and issues over time as the data in the
table changes?Suppose I make a view with the above query. With time, as the data in the
table changes, the selectivity values wont be good for planning. This may
potentially lead to a lot of changes in the view definition and other
places where this query was used.In general, I think I step back on my point that specifying the
selectivity is a bad idea.Could this also work (for the time being) for cross-column statistics?
It is another issue.
I don't believe so SELECTIVITY can work well too. Slow queries are usually
related to some strange points in data. I am thinking so well concept
should be based on validity of estimations. Some plans are based on totally
wrong estimation, but should be fast due less sensitivity to bad
estimations. So well concept is penalization some risk plans - or use brute
force - like COLUMN store engine does. Their plan is usually simply and
tolerant to bad estimations.
Pavel
Show quoted text
Regards,
Atri
--
Regards,Atri
*l'apprenant*
On 3/17/14, 12:58 PM, Stephen Frost wrote:
* Merlin Moncure (mmoncure@gmail.com) wrote:
Yeah -- the most common case I see is outlier culling where several
repeated low non-deterministic selectivity quals stack reducing the
row count estimate to 1. For example:
SELECT * FROM foo WHERE length(bar) <= 1000 AND length(bar) >= 2;This is exactly the issue that I've seen also- where we end up picking a
Nested Loop because we think only one row is going to be returned and
instead we end up getting a bunch and it takes forever.
FWIW, I've also seen problems with merge and hash joins at work, but I don't have any concrete examples handy. :(
There was also some speculation on trying to change plans mid-stream to
address a situation like that, once we realize what's happening. Not
sure that's really practical but it would be nice to find some solution.
Just being able to detect that something has possibly gone wrong would be useful. We could log that to alert the DBA/user of a potential bad plan. We could even format this in such a fashion that it's suitable for emailing the community with; the query, the plan, the stats, etc. That might make it easier for us to fix the planner (although at this point it seems like we're hitting statistics gathering problems that we simply don't know how to solve).
There is another aspect of this though: plan stability. There are lots of cases where users couldn't care less about getting an optimal plan, but they care *greatly* about not getting a brain-dead plan.
--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers