disfavoring unparameterized nested loops
From time to time, someone tells me that they've configured
enable_nestloop=false in postgresql.conf, which is a pretty bad idea
since there are a significant number of cases where such plans are the
only reasonable way of executing some query. However, it's no great
secret that PostgreSQL's optimizer sometimes produces nested loops
that are very, very, very slow, generally because it has grossly
underestimated the cardinality of the inner side of the nested loop.
The frustration which users experience as a result of such events is
understandable.
I read https://15721.courses.cs.cmu.edu/spring2020/papers/22-costmodels/p204-leis.pdf
today and found out that the authors of that paper did something a bit
more nuanced which, in their experiments, was very successful. It
sounds like what they basically did was to disable unparameterized nested
loops. They argue that such nested loops figure to gain very little as
compared with a hash join, but that they might turn out to lose a lot
if the cardinality estimation is inaccurate, and they present
experimental results to back up those claims. One observation that the
paper makes along the way is that every system they tested is more
likely to underestimate the cardinality of joins than to overestimate
it, and that this tendency becomes more pronounced as the size of the
join planning problem increases. On reflection, I believe this matches
my experience, and it makes sense that it should be so, since it
occasionally happens that the join selectivity estimate is essentially
zero, and a bigger join problem is more likely to have at least one
such case. On the other hand, the join selectivity estimate can never
be +infinity. Hence, it's more important in general for a database
system to be resilient against underestimates than to be resilient
against overestimates. Being less willing to choose unparameterized
nested loops is one way to move in that direction.
How to do that is not very clear. One very simple thing we could do
would be to introduce enable_nestloop=parameterized or
enable_parameterized_nestloop=false. That is a pretty blunt instrument
but the authors of the paper seem to have done that with positive
results, so maybe it's actually not a dumb idea. Another approach
would be to introduce a large fuzz factor for such nested loops e.g.
keep them only if the cost estimate is better than the comparable hash
join plan by at least a factor of N (or if no such plan is being
generated for some reason). I'm not very confident that this would
actually do what we want, though. In the problematic cases, a nested
loop is likely to look extremely cheap, so just imagining that the
cost might be higher is not very protective. Perhaps a better approach
would be something like: if the estimated number of inner rows is less
than 100, then re-estimate the cost of this approach and of the best
available hash join on the assumption that there are 100 inner rows.
If the hash join still wins, keep it; if it loses under that
assumption, throw it out. I think it's likely that this approach would
eliminate a large number of highly risky nested loop plans, probably
even with s/100/10/g, without causing many other problems (except
perhaps increased planner CPU consumption ... but maybe that's not too
bad).
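To make that last part concrete, here is a rough standalone sketch (ordinary C, not planner code; the cost formulas and constants are invented stand-ins for the real cost model):

#include <stdbool.h>

#define PESSIMISTIC_INNER_ROWS 100.0    /* or 10, per s/100/10/g above */

/* crude stand-in: the inner side is rescanned once per outer row */
static double
nestloop_cost(double outer_rows, double inner_rows, double row_cost)
{
    return outer_rows * row_cost + outer_rows * inner_rows * row_cost;
}

/* crude stand-in: each side is read once, plus per-row hashing overhead */
static double
hashjoin_cost(double outer_rows, double inner_rows, double row_cost,
              double hash_overhead)
{
    return (outer_rows + inner_rows) * row_cost
        + (outer_rows + inner_rows) * hash_overhead;
}

/*
 * Keep the unparameterized nested loop only if it still beats the best
 * available hash join after bumping a small inner-row estimate up to the
 * pessimistic value.
 */
static bool
keep_unparameterized_nestloop(double outer_rows, double est_inner_rows,
                              double row_cost, double hash_overhead)
{
    double pessimistic_rows = est_inner_rows;

    if (pessimistic_rows < PESSIMISTIC_INNER_ROWS)
        pessimistic_rows = PESSIMISTIC_INNER_ROWS;

    return nestloop_cost(outer_rows, pessimistic_rows, row_cost)
        < hashjoin_cost(outer_rows, pessimistic_rows, row_cost, hash_overhead);
}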
Just to be clear, I do understand that there are cases where no Hash
Join is possible, but anything we do here could be made to apply only
when a hash join is in fact possible. We could also think about making
the point of comparison the best other plans of any sort rather than a
hash join specifically, which feels a little more principled but might
actually be worse. When a Nested Loop is a stupid idea, it's stupid
precisely because the inner side is big and we could've avoided
recomputing it over and over by using a Hash Join instead, not because
some Merge Join based plan turns out to be better. I mean, it is
possible that some Merge Join plan does turn out to be better, but
that's not rage-inducing in the same way. Nobody looks at a
complicated join plan that happened to use a Nested Loop and says
"obviously, this is inferior to a merge join," or if they do, they're
probably full of hot air. But people look at complicated join plans
that happen to use a Nested Loop and say "obviously, this is inferior
to a hash join" *all the time* and assuming the inner path is
unparameterized, they are very often correct.
Thoughts? I'd be particularly curious to hear about any cases anyone
knows about where an unparameterized nested loop and a hash join with
batches=1 are both possible and where the unparameterized nested loop
is *much* cheaper than the hash join.
Thanks,
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jun 15, 2021 at 10:09 AM Robert Haas <robertmhaas@gmail.com> wrote:
How to do that is not very clear. One very simple thing we could do
would be to introduce enable_nestloop=parameterized or
enable_parameterized_nestloop=false. That is a pretty blunt instrument
but the authors of the paper seem to have done that with positive
results, so maybe it's actually not a dumb idea.
I think that it's probably a good idea as-is.
Simple heuristics that are very frequently wrong when considered in a
naive way can work very well in practice. This seems to happen when
they capture some kind of extreme, naturally occurring cost/benefit
asymmetry -- especially one with fixed, well-understood costs and
unlimited benefits (this business with unparameterized nestloop joins
is about *avoiding* the inverse asymmetry, but that seems very
similar). My go-to example of such an asymmetry is the rightmost page
split heuristic of applying leaf fillfactor regardless of any of the
other specifics; we effectively assume that all indexes are on columns
with ever-increasing values. Which is obviously wrong.
We're choosing between two alternatives (unparameterized nested loop
vs hash join) that are really very similar when things go as expected,
but diverge sharply when there is a misestimation -- who wouldn't take
the "conservative" choice here?
I guess that there is a hesitation to introduce heuristics like
this because they don't fit into some larger framework that captures
risk -- it might be seen as an ugly special case. But isn't this
already actually kind of special, whether or not we officially think
so?
--
Peter Geoghegan
On Tue, Jun 15, 2021 at 2:04 PM Peter Geoghegan <pg@bowt.ie> wrote:
I guess that there is a hesitation to introduce heuristics like
this because they don't fit into some larger framework that captures
risk -- it might be seen as an ugly special case. But isn't this
already actually kind of special, whether or not we officially think
so?
Yes, I think it is. Reading the paper really helped me crystallize my
thoughts about this, because when I've studied the problems myself, I
came, as you postulate here, to the conclusion that there's a lot of
stuff the planner does where there is risk and uncertainty, and thus
that a general framework would be necessary to deal with it. But the
fact that an academic researcher called this problem out as the only
one worth treating specially makes me think that perhaps it deserves
special handling. In defense of that approach, note that this is a
case where we know both that the Nested Loop is risky and that Hash
Join is a similar alternative with probably similar cost. I am not
sure there are any other cases where we can say quite so generally
both that a certain thing is risky and what we could do instead.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jun 15, 2021 at 12:31 PM Robert Haas <robertmhaas@gmail.com> wrote:
Yes, I think it is. Reading the paper really helped me crystallize my
thoughts about this, because when I've studied the problems myself, I
came, as you postulate here, to the conclusion that there's a lot of
stuff the planner does where there is risk and uncertainty, and thus
that a general framework would be necessary to deal with it.
It is an example (perhaps the only example in the optimizer) of an
oasis of certainty in an ocean of uncertainty. As uncertain as
everything is, we seemingly can make strong robust statements about
the relative merits of each strategy *in general*, just in this
particular instance. It's just not reasonable to make such a reckless
choice, no matter what your general risk tolerance is.
Goetz Graefe is interviewed here, and goes into his philosophy on
robustness -- it seems really interesting to me:
https://sigmodrecord.org/publications/sigmodRecord/2009/pdfs/05_Profiles_Graefe.pdf
In defense of that approach, note that this is a
case where we know both that the Nested Loop is risky and that Hash
Join is a similar alternative with probably similar cost. I am not
sure there are any other cases where we can say quite so generally
both that a certain thing is risky and what we could do instead.
I tend to think of a hash join as like a nested loop join with an
inner index scan where you build the index yourself, dynamically. That
might be why I find it easy to make this mental leap. In theory you
could do this by giving the nestloop join runtime smarts -- make it
turn into a hash join adaptively. Like Graefe's G-Join design. That
way you could do this in a theoretically pure way.
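Just to illustrate what "turn into a hash join adaptively" could mean in
the simplest possible terms, here is a toy, self-contained sketch
(non-negative integer keys and an invented switch threshold; this is not
G-Join's actual machinery, only the general shape of the idea):

#include <stdio.h>
#include <stdlib.h>

#define SWITCH_THRESHOLD 2      /* outer rows we expected; past this, switch */
#define HASH_BUCKETS 64

typedef struct HashEntry
{
    int         key;
    struct HashEntry *next;
} HashEntry;

/*
 * Start as a nested loop; once the outer side has produced more rows than
 * expected, build a hash table on the inner side and finish as a hash join.
 */
static void
adaptive_join(const int *outer, int n_outer, const int *inner, int n_inner)
{
    HashEntry  *buckets[HASH_BUCKETS] = {0};
    int         built = 0;

    for (int i = 0; i < n_outer; i++)
    {
        if (i < SWITCH_THRESHOLD)
        {
            /* nested loop phase: rescan the whole inner side */
            for (int j = 0; j < n_inner; j++)
                if (outer[i] == inner[j])
                    printf("match %d (nestloop phase)\n", outer[i]);
        }
        else
        {
            if (!built)
            {
                /* the estimate was wrong; hash the inner side once */
                for (int j = 0; j < n_inner; j++)
                {
                    HashEntry  *e = malloc(sizeof(HashEntry));

                    e->key = inner[j];
                    e->next = buckets[e->key % HASH_BUCKETS];
                    buckets[e->key % HASH_BUCKETS] = e;
                }
                built = 1;
            }
            /* hash join phase: probe instead of rescanning */
            for (HashEntry *e = buckets[outer[i] % HASH_BUCKETS]; e; e = e->next)
                if (e->key == outer[i])
                    printf("match %d (hash phase)\n", outer[i]);
        }
    }
}

int
main(void)
{
    int         outer[] = {1, 2, 3, 4, 5};
    int         inner[] = {2, 4, 6};

    adaptive_join(outer, 5, inner, 3);
    return 0;
}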
I don't think that that's actually necessary just to deal with this
case -- it probably really is as simple as it seems. I point this out
because perhaps it's useful to have that theoretical anchoring.
--
Peter Geoghegan
On Wed, 16 Jun 2021 at 05:09, Robert Haas <robertmhaas@gmail.com> wrote:
How to do that is not very clear. One very simple thing we could do
would be to introduce enable_nestloop=parameterized or
enable_parameterized_nestloop=false. That is a pretty blunt instrument
but the authors of the paper seem to have done that with positive
results, so maybe it's actually not a dumb idea.
It's not great that people are having to use such blunt instruments to
get the planner to behave. It might not be a terrible idea to provide
them with something a bit sharper as you suggest. The GUC idea is
certainly something that could be done without too much effort.
There was some talk of doing that in [1].
Another approach
would be to introduce a large fuzz factor for such nested loops e.g.
keep them only if the cost estimate is better than the comparable hash
join plan by at least a factor of N (or if no such plan is being
generated for some reason).
In my experience, the most common reason that the planner chooses
non-parameterized nested loops wrongly is when there's row
underestimation that says there's just going to be 1 row returned by
some set of joins. The problem often comes when some subsequent join
is planned and the planner sees the given join rel only produces one
row. The cheapest join method we have to join 1 row is Nested Loop.
So the planner just sticks the 1-row join rel on the outer side
thinking the executor will only need to scan the inner side of the
join once. When the outer row count blows up, then we end up scanning
that inner side many more times. The problem is compounded when you
nest it a few joins deep.
Most of the time when I see that happen it's down to either the
selectivity of some correlated base-quals being multiplied down to a
number low enough that we clamp the estimate to be 1 row. The other
case is similar, but with join quals.
It seems to me that each time we multiply 2 selectivities together,
the risk of the resulting selectivity being off increases. The
risk is likely lower when we have some extended statistics which
allow us to do fewer selectivity multiplications.
For that 1-row case, doing an unparameterized nested loop is only the
cheapest join method by a tiny amount. It really wouldn't be much
more expensive to just put that single row into a hash table. If that
1 estimated row turns out to be 10 actual rows then it's likely not
too big a deal for the hash join code to accommodate the 9 additional
rows.
This makes me think that it's insanely risky for the planner to be
picking Nested Loop joins in this case. And that puts me down the path
of thinking that this problem should be solved by having the planner
take into account risk as well as costs when generating plans.
I don't really have any concrete ideas on that, but a basic idea that
I have been considering is that a Path has a risk_factor field that is
decided somewhere like clauselist_selectivity(). Maybe the risk can go
up by 1 each time we multiply an individual selectivity. (As of
master, estimate_num_groups() allows the estimation to pass back some
further information to the caller. I added that for Result Cache so I
could allow the caller to get visibility about when the estimate fell
back on DEFAULT_NUM_DISTINCT. clauselist_selectivity() maybe could get
similar treatment to allow the risk_factor or number of nstats_used to
be passed back). We'd then add a GUC, something like
planner_risk_aversion which could be set to anything from 0.0 to some
positive number. During add_path() we could do the cost comparison
like: path1.cost * path1.risk_factor * (1.0 + planner_risk_aversion)
< path2.cost * path2.risk_factor * (1.0 + planner_risk_aversion).
That way, if you set planner_risk_aversion to 0.0, then the planner
does as it does today, i.e. takes big risks.
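A standalone sketch of that comparison (plain C, invented names) could look
like the following. One wrinkle: applying the same
(1.0 + planner_risk_aversion) factor to both sides cancels out, so the
sketch folds the aversion into the risk term instead, which preserves the
property that 0.0 behaves like today's plain cost comparison:

#include <stdbool.h>

/* Hypothetical path summary: just a cost and an accumulated risk score. */
typedef struct PathSketch
{
    double      cost;
    double      risk_factor;    /* e.g. +1 per selectivity multiplication */
} PathSketch;

/* GUC-like knob: 0.0 = ignore risk entirely, larger = more risk averse. */
static double planner_risk_aversion = 0.0;

/*
 * Return true if path a should beat path b once risk is taken into account.
 * With planner_risk_aversion = 0.0 this degenerates to a plain cost
 * comparison, i.e. roughly what add_path() effectively does today.
 */
static bool
risk_adjusted_cheaper(const PathSketch *a, const PathSketch *b)
{
    double      a_cost = a->cost * (1.0 + planner_risk_aversion * a->risk_factor);
    double      b_cost = b->cost * (1.0 + planner_risk_aversion * b->risk_factor);

    return a_cost < b_cost;
}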
The problem I have with this idea is that I really don't know how to
properly calculate what the risk_factor should be set to. It seems
easy at first to set it to something that has the planner avoid these
silly 1-row estimate nested loop mistakes, but I think what we'd set
the risk_factor to would become much more important when more and more
Path types start using it. So if we did this and just guessed the
risk_factor, that might be fine when only 1 of the paths being
compared had a non-zero risk_factor, but as soon as both paths have
one set, unless they're set to something sensible, then we just end up
comparing garbage costs to garbage costs.
Another common mistake the planner makes is around WHERE a = <value>
ORDER BY b LIMIT n; where there are separate indexes on (a) and (b).
Scanning the (b) index is pretty risky. All the "a" values you need
might be right at the end of the index. It seems safer to scan the (a)
index as we'd likely have statistics that tell us how many rows exist
with <value>. We don't have any stats that tell us where in the (b)
index are all the rows with a = <value>.
I don't really think we should solve this by having nodeNestloop.c
fall back on hashing when the going gets tough. Overloading our nodes
that way is not a sustainable thing to do. I'd rather see the
executor throw the plan back at the planner along with some hints
about what was wrong with it. We could do that providing we've not
sent anything back to the client yet.
David
[1]: /messages/by-id/CAKJS1f8nsm-T0KMvGJz_bskUjQ=yGmGUUtUdAcFoEaZ_tuTXjA@mail.gmail.com
On Tue, Jun 15, 2021 at 5:00 PM David Rowley <dgrowleyml@gmail.com> wrote:
I don't really think we should solve this by having nodeNestloop.c
fall back on hashing when the going gets tough. Overloading our nodes
that way is not a sustainable thing to do. I'd rather see the
executor throw the plan back at the planner along with some hints
about what was wrong with it. We could do that providing we've not
sent anything back to the client yet.
It wasn't a serious suggestion -- it was just a way of framing the
issue at hand that I thought might be useful.
If we did have something like that (which FWIW I think makes sense but
is hard to do right in a general way) then it might be expected to
preemptively refuse to even start down the road of using an
unparameterized nestloop join very early, or even before execution
time. Such an adaptive join algorithm/node might be expected to have a
huge bias against this particular plan shape, that can be reduced to a
simple heuristic. But you can have the simple heuristic without
needing to build everything else.
Whether or not we throw the plan back at the planner or "really change
our minds at execution time" seems like a distinction without a
difference. Either way we're changing our minds about the plan based
on information that is fundamentally execution time information, not
plan time information. Have I missed something?
--
Peter Geoghegan
On Wed, 16 Jun 2021 at 12:11, Peter Geoghegan <pg@bowt.ie> wrote:
Whether or not we throw the plan back at the planner or "really change
our minds at execution time" seems like a distinction without a
difference.
What is "really change our minds at execution time"? Is that switch
to another plan without consulting the planner? If so what decides
what that new plan should be? The planner is meant to be the expert in
that department. The new information might cause the join order to
completely change. It might not be as simple as swapping a Nested Loop
for a Hash Join.
Either way we're changing our minds about the plan based
on information that is fundamentally execution time information, not
plan time information. Have I missed something?
I don't really see why you think the number of rows that a given join
might produce is execution information. It's exactly the sort of
information the planner needs to make a good plan. It's just that
today we get that information from statistics. Plenty of other DBMSs
make decisions from sampling. FWIW, we already have minimalist
sampling in get_actual_variable_range().
I'm just trying to highlight that I don't think overloading nodes is a
good path to go down. It's not a sustainable practice. It's a path
towards just having a single node that does everything. If your
suggestion was not serious then there's no point in discussing it
further.
David
On Tue, Jun 15, 2021 at 6:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 16 Jun 2021 at 12:11, Peter Geoghegan <pg@bowt.ie> wrote:
Whether or not we throw the plan back at the planner or "really change
our minds at execution time" seems like a distinction without a
difference.
What is "really change our minds at execution time"? Is that switch
to another plan without consulting the planner?
I don't know what it means. That was my point -- it all seems like
semantics to me.
The strong separation between plan time and execution time isn't
necessarily a good thing, at least as far as solving some of the
thorniest problems goes. It seems obvious to me that cardinality
estimation is the main problem, and that the most promising solutions
are all fundamentally about using execution time information to change
course. Some problems with planning just can't be solved at plan time
-- no model can ever be smart enough. Better to focus on making query
execution more robust, perhaps by totally changing the plan when it is
clearly wrong. But also by using more techniques that we've
traditionally thought of as execution time techniques (e.g. role
reversal in hash join). The distinction is blurry to me.
There are no doubt practical software engineering issues with this --
separation of concerns and whatnot. But it seems premature to go into
that now.
The new information might cause the join order to
completely change. It might not be as simple as swapping a Nested Loop
for a Hash Join.
I agree that it might not be that simple at all. I think that Robert
is saying that this is one case where it really does appear to be that
simple, and so we really can expect to benefit from a simple plan-time
heuristic that works within the confines of the current model. Why
wouldn't we just take that easy win, once the general idea has been
validated some more? Why let the perfect be the enemy of the good?
I have perhaps muddied the waters by wading into the more general
question of robust execution, the inherent uncertainty with
cardinality estimation, and so on. Robert really didn't seem to be
talking about that at all (though it is clearly related).
Either way we're changing our minds about the plan based
on information that is fundamentally execution time information, not
plan time information. Have I missed something?
I don't really see why you think the number of rows that a given join
might produce is execution information.
If we're 100% sure a join will produce at least n rows because we
executed it (with the express intention of actually doing real query
processing that returns rows to the client), and it already produced n
rows, then what else could it be called? Why isn't it that simple?
It's exactly the sort of
information the planner needs to make a good plan. It's just that
today we get that information from statistics. Plenty of other DBMSs
make decisions from sampling.
FWIW, we already have minimalist
sampling in get_actual_variable_range().
I know, but that doesn't seem all that related -- it almost seems like
the opposite idea. It isn't the executor balking when it notices that
the plan is visibly wrong during execution, in some important way.
It's more like the planner using the executor to get information about
an index that is well within the scope of what we think of as plan
time.
To some degree the distinction gets really blurred due to nodes like
hash join, where some important individual decisions are delayed until
execution time already. It's really unclear when precisely it stops
being that, and starts being more of a case of either partially or
wholly replanning. I don't know how to talk about it without it being
confusing.
I'm just trying to highlight that I don't think overloading nodes is a
good path to go down. It's not a sustainable practice. It's a path
towards just having a single node that does everything. If your
suggestion was not serious then there's no point in discussing it
further.
As I said, it was a way of framing one particular issue that Robert is
concerned about.
--
Peter Geoghegan
On Tue, Jun 15, 2021 at 5:00 PM David Rowley <dgrowleyml@gmail.com> wrote:
Most of the time when I see that happen it's down to either the
selectivity of some correlated base-quals being multiplied down to a
number low enough that we clamp the estimate to be 1 row. The other
case is similar, but with join quals.
It seems to me that each time we multiply 2 selectivities together,
the risk of the resulting selectivity being off increases. The
risk is likely lower when we have some extended statistics which
allow us to do fewer selectivity multiplications.
It seems important to distinguish between risk and uncertainty --
they're rather different things. The short version is that you can
model risk but you cannot model uncertainty. This seems like a problem
of uncertainty to me.
The example from the paper that Robert cited isn't interesting to me
because it hints at a way of managing the uncertainty, exactly. It's
interesting because it seems to emphasize the user's exposure to the
problem, which is what really matters. Even if it was extremely
unlikely that the user would have a problem, the downside of being
wrong is still absurdly high, and the upside of being right is low
(possibly even indistinguishable from zero). It's just not worth
thinking about. Besides, we all know that selectivity estimates are
very often quite wrong without the user ever noticing. It's amazing
that database optimizers work as well as they do, really.
--
Peter Geoghegan
The problem I have with this idea is that I really don't know how to
properly calculate what the risk_factor should be set to. It seems
easy at first to set it to something that has the planner avoid these
silly 1-row estimate nested loop mistakes, but I think what we'd set
the risk_factor to would become much more important when more and more
Path types start using it. So if we did this and just guessed the
risk_factor, that might be fine when only 1 of the paths being
compared had a non-zero risk_factor, but as soon as both paths have
one set, unless they're set to something sensible, then we just end up
comparing garbage costs to garbage costs.
Risk factor is the inverse of confidence in an estimate: the lower the
confidence, the higher the risk. If we associate a confidence with the
selectivity estimate, or compute a confidence interval for the estimate
instead of a single number, we can associate a risk factor with each
estimate. When we combine estimates to calculate new estimates, we
also combine their confidences/confidence intervals. If my memory
serves well, confidence intervals/confidences are calculated based on
the sample size and method used for estimation, so we should be able
to compute those during ANALYZE.
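As a toy illustration of carrying an interval alongside the point estimate
(standalone C; how ANALYZE would actually produce the intervals, and how to
keep the coverage probability honest as they combine, is of course the hard
part):

typedef struct SelEstimate
{
    double      sel;            /* point estimate of the selectivity */
    double      lo;             /* lower bound of its confidence interval */
    double      hi;             /* upper bound of its confidence interval */
} SelEstimate;

/*
 * Combine two estimates the way clauselist_selectivity() multiplies point
 * estimates for independent clauses, carrying the interval bounds along.
 * This naive interval product ignores correlation and does not preserve
 * the nominal confidence level; it only shows how the relative width,
 * i.e. the "risk", grows with every multiplication.
 */
static SelEstimate
combine_selectivity(SelEstimate a, SelEstimate b)
{
    SelEstimate result;

    result.sel = a.sel * b.sel;
    result.lo = a.lo * b.lo;
    result.hi = a.hi * b.hi;
    return result;
}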
I have not come across many papers which leverage this idea. Googling
"selectivity estimation confidence interval" does not yield many
papers, although I found [1] to be using a similar idea. So maybe
there's not much merit in this idea, though theoretically it sounds fine
to me.
[1]: https://pi3.informatik.uni-mannheim.de/~moer/Publications/vldb18_smpl_synop.pdf
--
Best Wishes,
Ashutosh Bapat
On Fri, Jun 18, 2021 at 6:20 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:
I have not come across many papers which leverage this idea. Googling
"selectivity estimation confidence interval" does not yield many
papers, although I found [1] to be using a similar idea. So maybe
there's not much merit in this idea, though theoretically it sounds fine
to me.
[1] https://pi3.informatik.uni-mannheim.de/~moer/Publications/vldb18_smpl_synop.pdf
Well, that paper's title shows it's a bit too far forward for us, since we
don't use samples during plan time (although that's a separate topic worth
considering). From the references, however, this one gives some
mathematical framing of the problem that led to the thread subject,
although I haven't read enough to see if we can get practical advice from
it:
Y. E. Ioannidis and S. Christodoulakis. On the propagation of errors in the
size of join results.
https://www.csd.uoc.gr/~hy460/pdf/p268-ioannidis.pdf
--
John Naylor
EDB: http://www.enterprisedb.com
On Wed, 16 Jun 2021 at 15:08, Peter Geoghegan <pg@bowt.ie> wrote:
It seems important to distinguish between risk and uncertainty --
they're rather different things. The short version is that you can
model risk but you cannot model uncertainty. This seems like a problem
of uncertainty to me.
You might be right there. "Uncertainty" or "Certainty" seems more
like a value that clauselist_selectivity() would be able to calculate
itself. It would just be up to the planner to determine what to do
with it.
One thing I thought about is that if the costing model were able
separate out a cost of additional (unexpected) rows then it would be
easier for add_path() to take into account how bad things might go if
we underestimate.
For example, an unparameterized Nested Loop that estimates the
outer Path to have 1 row will incur an entire additional inner cost if
there are 2 rows. With Hash Join the cost is just an additional
hashtable lookup, which is dead cheap. I don't know exactly how
add_path() would weigh all that up, but it seems to me that I wouldn't
take the risk unless I was 100% certain that the Nested Loop's outer
Path would only return 1 row exactly; if there was any chance at all
it could return more, I'd be picking some other join method.
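To put toy numbers on that asymmetry (standalone C, invented costs): each
unexpected outer row re-pays the whole inner scan for the nested loop, but
only one probe for the hash join.

#include <stdio.h>

int
main(void)
{
    double      inner_scan_cost = 10000.0;  /* cost of scanning the inner side once */
    double      hash_probe_cost = 0.01;     /* cost of one extra hashtable lookup */
    int         extra;

    for (extra = 1; extra <= 100; extra *= 10)
        printf("%3d unexpected outer row(s): nested loop pays %8.0f more, "
               "hash join pays %4.2f more\n",
               extra,
               extra * inner_scan_cost,
               extra * hash_probe_cost);
    return 0;
}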
David
On Tue, Jun 15, 2021 at 8:00 PM David Rowley <dgrowleyml@gmail.com> wrote:
In my experience, the most common reason that the planner chooses
non-parameterized nested loops wrongly is when there's row
underestimation that says there's just going to be 1 row returned by
some set of joins. The problem often comes when some subsequent join
is planned and the planner sees the given join rel only produces one
row. The cheapest join method we have to join 1 row is Nested Loop.
So the planner just sticks the 1-row join rel on the outer side
thinking the executor will only need to scan the inner side of the
join once. When the outer row count blows up, then we end up scanning
that inner side many more times. The problem is compounded when you
nest it a few joins deep.
Most of the time when I see that happen it's down to either the
selectivity of some correlated base-quals being multiplied down to a
number low enough that we clamp the estimate to be 1 row. The other
case is similar, but with join quals.
If an estimate is lower than 1, that should be a red flag that Something Is
Wrong. This is kind of a crazy idea, but what if we threw it back the other
way by computing 1 / est, and clamping that result to 2 <= res < 10 (or
100 or something)? The theory is, the more impossibly low it is, the more
wrong it is. I'm attracted to the idea of dealing with it as an estimation
problem and not needing to know about join types. Might have unintended
consequences, though.
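In code the idea might look something like this (plain C; raw_est is the
estimate before the usual clamp to 1, and the bounds are the ones guessed
at above):

/*
 * Turn an impossibly low raw estimate around: the lower it is, the more
 * wrong it probably is, so use 1/est instead, kept within sane bounds.
 */
static double
rescue_tiny_estimate(double raw_est)
{
    double      res;

    if (raw_est >= 1.0)
        return raw_est;         /* nothing suspicious, leave it alone */
    if (raw_est <= 0.0)
        return 100.0;           /* defensive: avoid dividing by zero */

    res = 1.0 / raw_est;
    if (res < 2.0)
        res = 2.0;
    if (res > 100.0)
        res = 100.0;
    return res;
}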
Long term, it would be great to calculate something about the distribution
of cardinality estimates, so we can model risk in the estimates.
--
John Naylor
EDB: http://www.enterprisedb.com
Most of the time when I see that happen it's down to either the
selectivity of some correlated base-quals being multiplied down to a
number low enough that we clamp the estimate to be 1 row. The other
case is similar, but with join quals.
If an estimate is lower than 1, that should be a red flag that Something Is
Wrong. This is kind of a crazy idea, but what if we threw it back the other
way by computing 1 / est, and clamping that result to 2 <= res < 10 (or
100 or something)? The theory is, the more impossibly low it is, the more
wrong it is. I'm attracted to the idea of dealing with it as an estimation
problem and not needing to know about join types. Might have unintended
consequences, though.
Long term, it would be great to calculate something about the distribution
of cardinality estimates, so we can model risk in the estimates.
Hi,
Laurenz suggested clamping to 2 in this thread in 2017:
/messages/by-id/1509611428.3268.5.camel@cybertec.at
Having been the victim of this problem in the past, I like the risk
based approach to this. If the cost of being wrong in the estimate is
high, use a merge join instead. In every case that I have encountered,
that heuristic would have given the correct performant plan.
Regards,
Ken
On Mon, Jun 21, 2021 at 6:41 AM David Rowley <dgrowleyml@gmail.com> wrote:
For example, an unparameterized Nested Loop that estimates the
outer Path to have 1 row will incur an entire additional inner cost if
there are 2 rows. With Hash Join the cost is just an additional
hashtable lookup, which is dead cheap. I don't know exactly how
add_path() would weigh all that up, but it seems to me that I wouldn't
take the risk unless I was 100% certain that the Nested Loop's outer
Path would only return 1 row exactly; if there was any chance at all
it could return more, I'd be picking some other join method.
It seems like everyone agrees that it would be good to do something
about this problem, but the question is whether it's best to do
something that tries to be general, or whether we should instead do
something about this specific case. I favor the latter approach. Risk
and uncertainty exist all over the place, but dealing with that in a
general way seems difficult, and maybe unnecessary. Addressing the
case of unparameterized nest loops specifically seems simpler, because
it's easier to reason about what the alternatives are. Your last
sentence here seems right on point to me.
Basically, what you argue for there is disabling unparameterized
nested loops entirely except when we can prove that the outer side
will never generate more than one row. But, that's almost never going
to be something that we can prove. If the outer side is coming from a
table or sub-join, it can turn out to be big. As far as I can see, the
only way that this doesn't happen is if it's something like a subquery
that aggregates everything down to one row, or has LIMIT 1, but those
are such special cases that I don't even think we should be worrying
about them.
So my counter-proposal is: how about if we split
match_unsorted_outer() into two functions, one of which generates
nested loops only based on parameterized paths and the other of which
generates nested loops based only on unparameterized paths, and then
rejigger add_paths_to_joinrel() so that we do the latter between the
steps that are currently number 5 and 6 and only if we haven't got any
other paths yet? If somebody later comes up with logic for proving
that the outer side can never have more than 1 row, we can let this be
run in those cases as well. In the meantime, if somebody does
something like a JOIN b ON a.x < b.x, we will still generate these
paths because there's no other approach, or similarly if it's a.x =
b.x but for some strange type that doesn't have a hash-joinable or
merge-joinable equality operator. But otherwise we omit those paths
entirely.
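In invented pseudo-C (none of these names or types are the real planner's;
the real change would live in add_paths_to_joinrel() and the split halves
of match_unsorted_outer()):

/* invented stand-in for a join relation being planned */
typedef struct JoinRelSketch
{
    int         num_paths;      /* paths already added for this joinrel */
} JoinRelSketch;

static void
add_parameterized_nestloop_paths(JoinRelSketch *rel)
{
    /* ... as today: nested loops whose inner side depends on the outer ... */
    (void) rel;
}

static void
add_unparameterized_nestloop_paths(JoinRelSketch *rel)
{
    /* ... nested loops that rescan the whole inner side per outer row ... */
    (void) rel;
}

static void
consider_nestloop_paths(JoinRelSketch *rel)
{
    /* parameterized nested loops stay exactly as they are today */
    add_parameterized_nestloop_paths(rel);

    /*
     * Unparameterized nested loops become a last resort: generate them only
     * when no other paths (hash join, merge join, parameterized nested loop)
     * could be generated, or, someday, when the planner can prove the row
     * counts make them safe.
     */
    if (rel->num_paths == 0)
        add_unparameterized_nestloop_paths(rel);
}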
--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Jun 21, 2021 at 7:45 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jun 21, 2021 at 6:41 AM David Rowley <dgrowleyml@gmail.com> wrote:
For example, an unparameterized Nested Loop that estimates the
outer Path to have 1 row will incur an entire additional inner cost if
there are 2 rows. With Hash Join the cost is just an additional
hashtable lookup, which is dead cheap. I don't know exactly how
add_path() would weigh all that up, but it seems to me that I wouldn't
take the risk unless I was 100% certain that the Nested Loop's outer
Path would only return 1 row exactly; if there was any chance at all
it could return more, I'd be picking some other join method.
It seems like everyone agrees that it would be good to do something
about this problem, but the question is whether it's best to do
something that tries to be general, or whether we should instead do
something about this specific case. I favor the latter approach.
I agree with your conclusion, but FWIW I am sympathetic to David's
view too. I certainly understand why he'd naturally want to define the
class of problems that are like this one, to understand what the
boundaries are.
The heuristic that has the optimizer flat out avoid an
unparameterized nested loop join is justified by the belief that
that's fundamentally reckless. Even though we all agree on that much,
I don't know when it stops being reckless and starts being "too risky
for me, but not fundamentally reckless". I think that that's worth
living with, but it isn't very satisfying.
Risk
and uncertainty exist all over the place, but dealing with that in a
general way seems difficult, and maybe unnecessary. Addressing the
case of unparameterized nest loops specifically seems simpler, because
it's easier to reason about what the alternatives are. Your last
sentence here seems right on point to me.
Right. Part of why this is a good idea is that the user is exposed to
so many individual risks and uncertainties. We cannot see any one risk
as existing in a vacuum. It is not the only risk the user will ever
take in the planner -- if it was then it might actually be okay to
allow unparameterized nested loop joins.
A bad unparameterized nested loop join plan has, in a sense, unknown
and unbounded cost/downside. But it is only very slightly faster than
a hash join, by a fixed well understood amount. With enough "trials"
and on a long enough timeline, it will inevitably blow up and cause
the application to grind to a halt. It seems like no amount of fixed,
bounded benefit from "fast unparameterized nested loop joins" could
possibly make up for that. The life of Postgres users would be a lot
better if bad plans were at least "survivable events".
--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
The heuristic that has the optimizer flat out avoid an
unparameterized nested loop join is justified by the belief that
that's fundamentally reckless. Even though we all agree on that much,
I don't know when it stops being reckless and starts being "too risky
for me, but not fundamentally reckless". I think that that's worth
living with, but it isn't very satisfying.
There are certainly cases where the optimizer can prove (in principle;
it doesn't do so today) that a plan node will produce at most one row.
They're hardly uncommon either: an equality comparison on a unique
key, or a subquery with a simple aggregate function, come to mind.
In such cases, not only is this choice not reckless, but it's provably
superior to a hash join. So in the end this gets back to the planning
risk factor that we keep circling around but nobody quite wants to
tackle.
I'd be a lot happier if this proposal were couched around some sort
of estimate of the risk of the outer side producing more than the
expected number of rows. The arguments so far seem like fairly lame
rationalizations for not putting forth the effort to do that.
regards, tom lane
On Mon, Jun 21, 2021 at 8:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
There are certainly cases where the optimizer can prove (in principle;
it doesn't do so today) that a plan node will produce at most one row.
They're hardly uncommon either: an equality comparison on a unique
key, or a subquery with a simple aggregate function, come to mind.
That sounds like it might be useful in general.
In such cases, not only is this choice not reckless, but it's provably
superior to a hash join. So in the end this gets back to the planning
risk factor that we keep circling around but nobody quite wants to
tackle.
Let's assume for the sake of argument that we really have to have that
additional infrastructure to move forward with the idea. (I'm not sure
if it's possible in principle to use infrastructure like that for some
of the cases that Robert has in mind, but for now I'll assume that it
is both possible and a practical necessity.)
Even when I make this working assumption I don't see what it changes
at a fundamental level. You've merely come up with a slightly more
specific definition of the class of plans that are "reckless". You've
only refined the original provisional definition of "reckless" to
exclude specific "clearly not reckless" cases (I think). But the
definition of "reckless" is no less squishy than what we started out
with.
I'd be a lot happier if this proposal were couched around some sort
of estimate of the risk of the outer side producing more than the
expected number of rows. The arguments so far seem like fairly lame
rationalizations for not putting forth the effort to do that.
I'm not so sure that it is. The point isn't the risk, even if it could
be calculated. The point is the downsides of being wrong are huge and
pretty much unbounded, whereas the benefits of being right are tiny
and bounded. It almost doesn't matter what the underlying
probabilities are.
To be clear I'm not arguing against modelling risk. I'm just not sure
that it's useful to think of this problem as truly a problem of risk.
--
Peter Geoghegan
On 6/21/21 5:55 PM, Tom Lane wrote:
Peter Geoghegan <pg@bowt.ie> writes:
The heuristic that has the optimizer flat out avoid an
unparameterized nested loop join is justified by the belief that
that's fundamentally reckless. Even though we all agree on that much,
I don't know when it stops being reckless and starts being "too risky
for me, but not fundamentally reckless". I think that that's worth
living with, but it isn't very satisfying.
There are certainly cases where the optimizer can prove (in principle;
it doesn't do so today) that a plan node will produce at most one row.
They're hardly uncommon either: an equality comparison on a unique
key, or a subquery with a simple aggregate function, come to mind.
In such cases, not only is this choice not reckless, but it's provably
superior to a hash join. So in the end this gets back to the planning
risk factor that we keep circling around but nobody quite wants to
tackle.
Agreed.
I'd be a lot happier if this proposal were couched around some sort
of estimate of the risk of the outer side producing more than the
expected number of rows. The arguments so far seem like fairly lame
rationalizations for not putting forth the effort to do that.
I agree having such a measure would be helpful, but do you have an idea
how it could be implemented?
I've been thinking about this a bit recently and searching for papers
talking about this, but it's not clear to me how to calculate the
risk (and propagate it through the plan) without making the whole cost
evaluation way more complicated / expensive :-(
The "traditional approach" to quantifying risk would be confidence
intervals, i.e. for each cardinality estimate "e" we'd determine a range
[a,b] so that P(a <= e <= b) = p. So we could pick "p" depending on how
"robust" the plan choice should be (say 0.9 for "risky" and 0.999 for
"safe" plans) and calculate a,b. Or maybe we could calculate where the
plan changes, and then we'd see if those "break points" are within the
confidence interval. If not, great - we can assume the plan is stable,
otherwise we'd need to consider the other plans too, somehow.
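For illustration, a standalone sketch of the "is the break point inside the
interval" check (plain C; the two cost functions and all constants are
invented, and with costs linear in the row count the winner can flip at
most once):

#include <stdbool.h>

typedef double (*CostFn) (double rows);

static double plan_a_cost(double rows) { return 10.0 + rows * 1000.0; }  /* nestloop-ish */
static double plan_b_cost(double rows) { return 1200.0 + rows * 2.0; }   /* hashjoin-ish */

/* Return true if the same plan wins at both ends of the interval [lo, hi]. */
static bool
plan_choice_is_stable(CostFn a, CostFn b, double lo, double hi)
{
    bool        a_wins_at_lo = a(lo) < b(lo);
    bool        a_wins_at_hi = a(hi) < b(hi);

    /* if the winner flips somewhere inside the interval, it's not stable */
    return a_wins_at_lo == a_wins_at_hi;
}

So something like plan_choice_is_stable(plan_a_cost, plan_b_cost, a, b)
would answer the "is the plan stable within the confidence interval"
question, at least for this simplified shape of cost function.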
But what I'm not sure about is:
1) Now we're dealing with three cardinality estimates (the original "e"
and the boundaries "a", "b"). So which one do we use to calculate cost
and pass to upper parts of the plan?
2) The outer relation may be a complex join, so we'd need to combine the
confidence intervals for the two input relations, somehow.
3) We'd need to know how to calculate the confidence intervals for most
plan nodes, which I'm not sure we know how to do. So it's not clear to
me how to do this, which seems rather problematic because we need to
propagate and combine those confidence intervals through the plan.
But maybe you have thought about some much simpler approach, addressing
this sufficiently well?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Peter Geoghegan <pg@bowt.ie> writes:
On Mon, Jun 21, 2021 at 8:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'd be a lot happier if this proposal were couched around some sort
of estimate of the risk of the outer side producing more than the
expected number of rows. The arguments so far seem like fairly lame
rationalizations for not putting forth the effort to do that.
I'm not so sure that it is. The point isn't the risk, even if it could
be calculated. The point is the downsides of being wrong are huge and
pretty much unbounded, whereas the benefits of being right are tiny
and bounded. It almost doesn't matter what the underlying
probabilities are.
You're throwing around a lot of pejorative adjectives there without
having bothered to quantify any of them. This sounds less like a sound
argument to me than like a witch trial.
Reflecting for a bit on the ancient principle that "the only good numbers
in computer science are 0, 1, and N", it seems to me that we could make
an effort to classify RelOptInfos as provably empty, provably at most one
row, and others. (This would be a property of relations, not individual
paths, so it needn't bloat struct Path.) We already understand about
provably-empty rels, so this is just generalizing that idea a little.
Once we know about that, at least for the really-common cases like unique
keys, I'd be okay with a hard rule that we don't consider unparameterized
nestloop joins with an outer side that falls in the "N" category.
Unless there's no alternative, of course.
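A sketch of that classification as a standalone enum (plain C; the flags
are invented placeholders for proofs the planner could in principle make):

#include <stdbool.h>

typedef enum RelRowClass
{
    REL_PROVABLY_EMPTY,         /* "0" */
    REL_AT_MOST_ONE_ROW,        /* "1" */
    REL_MANY_ROWS               /* "N": not provably 0 or 1 */
} RelRowClass;

typedef struct RelSketch
{
    bool        provably_empty;         /* e.g. a constant-false qual */
    bool        unique_key_equality;    /* e.g. equality on a unique key */
    bool        simple_aggregate;       /* e.g. a one-row aggregate subquery */
} RelSketch;

static RelRowClass
classify_rel(const RelSketch *rel)
{
    if (rel->provably_empty)
        return REL_PROVABLY_EMPTY;
    if (rel->unique_key_equality || rel->simple_aggregate)
        return REL_AT_MOST_ONE_ROW;

    /*
     * "N": under the proposed rule, don't consider an unparameterized
     * nestloop with this rel on the outer side unless there's no alternative.
     */
    return REL_MANY_ROWS;
}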
Another thought that occurs to me here is that maybe we could get rid of
the enable_material knob in favor of forcing (or at least encouraging)
materialization when the outer side isn't provably one row.
regards, tom lane
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
I've been thinking about this a bit recently and searching for papers
talking about this, and but it's not clear to me how to calculate the
risk (and propagate it through the plan) without making the whole cost
evaluation way more complicated / expensive :-(
Yeah, a truly complete approach using confidence intervals or the
like seems frighteningly complicated.
But maybe you have thought about some much simpler approach, addressing
this sufficiently well?
See my nearby response to Peter. The main case that's distressing me
is the possibility of forcing a hash join even when the outer side
is obviously only one row. If we could avoid that, at least for
large values of "obvious", I'd be far more comfortable.
regards, tom lane
On Mon, Jun 21, 2021 at 9:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm not so sure that it is. The point isn't the risk, even if it could
be calculated. The point is the downsides of being wrong are huge and
pretty much unbounded, whereas the benefits of being right are tiny
and bounded. It almost doesn't matter what the underlying
probabilities are.
You're throwing around a lot of pejorative adjectives there without
having bothered to quantify any of them. This sounds less like a sound
argument to me than like a witch trial.
I'm not sure what you mean by pejorative. Isn't what I said about
downsides and upsides pretty accurate?
Reflecting for a bit on the ancient principle that "the only good numbers
in computer science are 0, 1, and N", it seems to me that we could make
an effort to classify RelOptInfos as provably empty, provably at most one
row, and others. (This would be a property of relations, not individual
paths, so it needn't bloat struct Path.) We already understand about
provably-empty rels, so this is just generalizing that idea a little.
It sounds like you're concerned about properly distinguishing between:
1. Cases where the only non-reckless choice is a hash join over an
unparameterized nested loop join.
2. Cases that look like that at first, that don't really have that
quality on closer examination.
This seems like a reasonable concern.
Once we know about that, at least for the really-common cases like unique
keys, I'd be okay with a hard rule that we don't consider unparameterized
nestloop joins with an outer side that falls in the "N" category.
Unless there's no alternative, of course.
I thought that you were arguing against the premise itself. It's now
clear that you weren't, though.
I don't have an opinion for or against bringing the provably-empty
rels stuff into the picture. At least not right now.
--
Peter Geoghegan
On Mon, Jun 21, 2021 at 11:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
There are certainly cases where the optimizer can prove (in principle;
it doesn't do so today) that a plan node will produce at most one row.
They're hardly uncommon either: an equality comparison on a unique
key, or a subquery with a simple aggregate function, come to mind.
Hmm, maybe I need to see an example of the sort of plan shape that you
have in mind. To me it feels like a comparison on a unique key ought
to use a *parameterized* nested loop. And it's also not clear to me
why a nested loop is actually better in a case like this. If the
nested loop iterates more than once because there are more rows on the
outer side, then you don't want to have something on the inner side
that might be expensive, and either an aggregate or an unparameterized
search for a unique value is potentially quite expensive. Now if you
put a materialize node on top of the inner side, then you don't have
to worry about that problem, but how much are you saving at that point
vs. just doing a hash join?
I'd be a lot happier if this proposal were couched around some sort
of estimate of the risk of the outer side producing more than the
expected number of rows. The arguments so far seem like fairly lame
rationalizations for not putting forth the effort to do that.
I don't understand how to generate a risk assessment or what we ought
to do with it if we had one. I don't even understand what units we
would use. We measure costs using abstract cost units, but those
abstract cost units are intended to be a proxy for runtime. If it's
not the case that a plan that runs for longer has a higher cost, then
something's wrong with the costing model or the settings. In the case
of risk, the whole thing seems totally subjective. We're talking about
the risk that our estimate is bogus, but how do we estimate the risk
that we don't know how to estimate? Given quals (x + 0) = x, x = some
MCV, and x = some non-MCV, we can say that we're most likely to be
wrong about the first one and least likely to be wrong about the
second one, but how much more likely? I don't know how you can decide
that, even in principle. We can also say that an unparameterized
nested loop is more risky than some other plan because it could turn
out to be crazy expensive, but is that more or less risky than
scanning the index on b as a way to implement SELECT * FROM foo WHERE
a = 1 ORDER BY b LIMIT 1? How much more risky, and why?
And then, even supposing we had a risk metric for every path, what
exactly would we do with it? Suppose path A is cheaper than path B,
but also more risky. Which should we keep? We could keep both, but
that seems to be just kicking the can down the road. If plan B is
likely to blow up in our face, we should probably just get rid of it,
or not even generate it in the first place. Even if we do keep both,
at some point we're going to have to make a cost-vs-risk tradeoff, and
I don't see how to do that intelligently, because the point is
precisely that if the risk is high, the cost number might be totally
wrong. If we know that plan A is cheaper than plan B, we should choose
plan A. But if all we know is that plan A would be cheaper than plan B
if our estimate of the cost were correct, but also that it probably
isn't, then what we actually know is nothing. We have no principled
basis for deciding anything based on cost unless we're reasonably
confident that the cost estimate is pretty good. So AFAICT the only
principled strategy here is to throw away high risk paths as early as
we possibly can. What am I missing?
The other thing is - the risk of a particular path doesn't matter in
an absolute sense, only a relative one. In the particular case I'm on
about here, we *know* there's a less-risky alternative. We don't need
to quantify the risk to know which of the two options has more. In
many other cases, the risk is irreducible e.g. a default estimate
could be totally bogus, but switching paths is of no help in getting
rid of it.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Jun 21, 2021 at 1:11 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Jun 21, 2021 at 9:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm not so sure that it is. The point isn't the risk, even if it could
be calculated. The point is the downsides of being wrong are huge and
pretty much unbounded, whereas the benefits of being right are tiny
and bounded. It almost doesn't matter what the underlying
probabilities are.
You're throwing around a lot of pejorative adjectives there without
having bothered to quantify any of them. This sounds less like a sound
argument to me than like a witch trial.
I'm not sure what you mean by pejorative. Isn't what I said about
downsides and upsides pretty accurate?
Yeah, I don't see why Peter's characterization deserves to be labelled
as pejorative here. A hash join or merge join or parameterized nested
loop can turn out to be slower than some other algorithm, but all of
those approaches have some component that tends to make the asymptotic
cost less than the product of the sizes of the inputs. I don't think
that's true in absolutely every case; for example, if a merge join has
every row duplicated on both sides, it will have to scan every inner
tuple once per outer tuple, just like a nested loop, and the other
algorithms also are going to degrade toward O(NM) performance in the
face of many duplicates. Also, a hash join can be pretty close to that
if it needs a shazillion batches. But in normal cases, any algorithm
other than an unparameterized nested loop tends to read each input
tuple on each side ONCE, so the cost is more like the sum of the input
sizes than the product. And there's nothing pejorative in saying that
N + M can be less than N * M by an unbounded amount. That's just the
facts.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jun 21, 2021 at 11:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
There are certainly cases where the optimizer can prove (in principle;
it doesn't do so today) that a plan node will produce at most one row.
They're hardly uncommon either: an equality comparison on a unique
key, or a subquery with a simple aggregate function, come to mind.
Hmm, maybe I need to see an example of the sort of plan shape that you
have in mind. To me it feels like a comparison on a unique key ought
to use a *parameterized* nested loop.
The unique-key comparison would be involved in the outer scan in
the cases I'm thinking of. As an example,
select * from t1, t2 where t1.id = constant and t1.x op t2.y;
where I'm not assuming much about the properties of "op".
This could be amenable to a plan like
    NestLoop Join
        Join Filter: t1.x op t2.y
        -> Index Scan on t1_pkey
             Index Cond: t1.id = constant
        -> Seq Scan on t2
and if we can detect that the pkey indexscan produces just one row,
this is very possibly the best available plan. Nor do I think this
is an unusual situation that we can just ignore.
BTW, it strikes me that there might be an additional consideration
here: did parameterization actually help anything? That is, the
proposed rule wants to reject the above but allow
    NestLoop Join
        -> Index Scan on t1_pkey
             Index Cond: t1.id = constant
        -> Seq Scan on t2
             Filter: t1.x op t2.y
even though the latter isn't meaningfully better. It's possible
this won't arise because we don't consider parameterized paths
except where the parameter is used in an indexqual or the like,
but I'm not confident of that. See in particular reparameterize_path
and friends before you assert there's no such issue. So we might
need to distinguish essential from incidental parameterization,
or something like that.
regards, tom lane
I wrote:
... As an example,
select * from t1, t2 where t1.id = constant and t1.x op t2.y;
where I'm not assuming much about the properties of "op".
This could be amenable to a plan like
    NestLoop Join
        Join Filter: t1.x op t2.y
        -> Index Scan on t1_pkey
             Index Cond: t1.id = constant
        -> Seq Scan on t2
Also, to enlarge on that example: if "op" isn't hashable then the
original argument is moot. However, it'd still be useful to know
if the outer scan is sure to return no more than one row, as that
could inform the choice whether to plaster a Materialize node on
the inner scan.
regards, tom lane
On Mon, Jun 21, 2021 at 1:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hmm, maybe I need to see an example of the sort of plan shape that you
have in mind. To me it feels like a comparison on a unique key ought
to use a *parameterized* nested loop.
The unique-key comparison would be involved in the outer scan in
the cases I'm thinking of. As an example,
select * from t1, t2 where t1.id = constant and t1.x op t2.y;
where I'm not assuming much about the properties of "op".
This could be amenable to a plan like
    NestLoop Join
        Join Filter: t1.x op t2.y
        -> Index Scan on t1_pkey
             Index Cond: t1.id = constant
        -> Seq Scan on t2
and if we can detect that the pkey indexscan produces just one row,
this is very possibly the best available plan.
Hmm, yeah, I guess that's possible. How much do you think this loses
as compared with:
    Hash Join
        Hash Cond: t1.x op t2.y
        -> Seq Scan on t2
        -> Hash
             -> Index Scan on t1_pkey
(If the operator is not hashable then this plan is impractical, but in
such a case the question of preferring the hash join over the nested
loop doesn't arise anyway.)
BTW, it strikes me that there might be an additional consideration
here: did parameterization actually help anything? That is, the
proposed rule wants to reject the above but allow
    NestLoop Join
        -> Index Scan on t1_pkey
             Index Cond: t1.id = constant
        -> Seq Scan on t2
             Filter: t1.x op t2.y
even though the latter isn't meaningfully better. It's possible
this won't arise because we don't consider parameterized paths
except where the parameter is used in an indexqual or the like,
but I'm not confident of that. See in particular reparameterize_path
and friends before you assert there's no such issue. So we might
need to distinguish essential from incidental parameterization,
or something like that.
Hmm, perhaps. I think it won't happen in the normal cases, but I can't
completely rule out the possibility that there are corner cases where
it does.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jun 21, 2021 at 1:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    NestLoop Join
        Join Filter: t1.x op t2.y
        -> Index Scan on t1_pkey
             Index Cond: t1.id = constant
        -> Seq Scan on t2
Hmm, yeah, I guess that's possible. How much do you think this loses
as compared with:
Hash Join
Hash Cond: t1.x op t2.y
-> Seq Scan on t2
-> Hash
-> Index Scan on t1_pkey
Hard to say. The hash overhead might or might not pay for itself.
If the equality operator proper is expensive and we get to avoid
applying it at most t2 rows, then this might be an OK alternative;
otherwise not so much.
In any case, the former looks like plans that we generate now,
the second not. Do you really want to field a lot of questions
about why we suddenly changed to a not-clearly-superior plan?
regards, tom lane
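As a rough illustration of that trade-off (every constant below is invented, and this is nothing like the planner's real cost model): with a provably single-row outer side, the join-filter nestloop evaluates "op" once per t2 row, while the hash join trades most of those evaluations for hashing work.

/*
 * Back-of-envelope arithmetic only; all numbers are made up.
 */
#include <stdio.h>

int
main(void)
{
    double t2_rows     = 1e6;   /* rows on the probe (Seq Scan) side */
    double op_cost     = 1.0;   /* cost of one evaluation of "op" */
    double hash_cost   = 0.1;   /* cost of hashing one value / probing once */
    double selectivity = 0.01;  /* fraction of t2 rows landing in the one
                                 * occupied bucket */

    /* Nestloop with a join filter and a single outer row:
     * "op" runs once per t2 row. */
    double nestloop = t2_rows * op_cost;

    /* Hash join: hash the single t1 row, hash every t2 row to probe, and
     * evaluate "op" only for probes that hit the occupied bucket. */
    double hashjoin = hash_cost                      /* build side */
                    + t2_rows * hash_cost            /* probe hashing */
                    + t2_rows * selectivity * op_cost;

    printf("nestloop+filter: %.0f   hash join: %.0f\n", nestloop, hashjoin);
    /* With a cheap operator the hashing overhead eats most of the savings;
     * with an expensive operator the hash join starts to pay for itself. */
    return 0;
}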
On Mon, Jun 21, 2021 at 10:14 AM Robert Haas <robertmhaas@gmail.com> wrote:
Hmm, maybe I need to see an example of the sort of plan shape that you
have in mind. To me it feels like a comparison on a unique key ought
to use a *parameterized* nested loop. And it's also not clear to me
why a nested loop is actually better in a case like this. If the
nested loop iterates more than once because there are more rows on the
outer side, then you don't want to have something on the inner side
that might be expensive, and either an aggregate or an unparameterized
search for a unique value are potentially quite expensive. Now if you
put a materialize node on top of the inner side, then you don't have
to worry about that problem, but how much are you saving at that point
vs. just doing a hash join?
I suspected that that was true, but even that doesn't seem like the
really important thing. While it may be true that the simple heuristic
can't be quite as simple as we'd hoped at first, ISTM that this is
ultimately not much of a problem. The basic fact remains: some more or
less simple heuristic makes perfect sense, and should be adopted.
This conclusion is counterintuitive because it's addressing a very
complicated problem with a very simple solution. However, if we lived
in a world where things that sound too good to be true always turned
out to be false, we'd also live in a world where optimizers were
completely impractical and useless. Optimizers have that quality
already, whether or not we officially acknowledge it.
I don't understand how to generate a risk assessment or what we ought
to do with it if we had one. I don't even understand what units we
would use. We measure costs using abstract cost units, but those
abstract cost units are intended to be a proxy for runtime. If it's
not the case that a plan that runs for longer has a higher cost, then
something's wrong with the costing model or the settings. In the case
of risk, the whole thing seems totally subjective. We're talking about
the risk that our estimate is bogus, but how do we estimate the risk
that we don't know how to estimate?
Clearly we need a risk estimate for our risk estimate!
The other thing is - the risk of a particular path doesn't matter in
an absolute sense, only a relative one. In the particular case I'm on
about here, we *know* there's a less-risky alternative.
Exactly! This, a thousand times.
This reminds me of how people behave in the real world. In the real
world people deal with this without too much difficulty. Everything is
situational and based on immediate trade-offs, with plenty of
uncertainty at every step. For example, if you think that there is
even a tiny chance of a piece of fruit being poisonous, you don't eat
the piece of fruit -- better to wait until lunchtime. This is one of
the *easiest* decisions I can think of, despite the uncertainty.
(Except perhaps if you happen to be in danger of dying of starvation,
in which case it might be a very different story. And so on.)
--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
On Mon, Jun 21, 2021 at 10:14 AM Robert Haas <robertmhaas@gmail.com> wrote:
The other thing is - the risk of a particular path doesn't matter in
an absolute sense, only a relative one. In the particular case I'm on
about here, we *know* there's a less-risky alternative.
Exactly! This, a thousand times.
This is a striking oversimplification.
You're ignoring the fact that the plan shape we generate now is in fact
*optimal*, and easily proven to be so, in some very common cases. I don't
think the people whose use-cases get worse are going to be mollified by
the argument that you reduced their risk, when there is provably no risk.
Obviously the people whose use-cases are currently hitting the wrong end
of the risk will be happy with any change whatever, but those may not be
the same people.
I'm willing to take some flak if there's not an easy proof that the outer
scan is single-row, but I don't think we should just up and change it
for cases where there is.
regards, tom lane
On Mon, Jun 21, 2021 at 1:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
You're ignoring the fact that the plan shape we generate now is in fact
*optimal*, and easily proven to be so, in some very common cases.
As I've said I don't reject the idea that there is room for
disagreement on the specifics. For example perhaps it'll turn out that
only a restricted subset of the cases that Robert originally had in
mind will truly turn out to work as well as hoped. But that just seems
like a case of Robert refining a very preliminary proposal. I
absolutely expect there to be some need to iron out the wrinkles.
I don't
think the people whose use-cases get worse are going to be mollified by
the argument that you reduced their risk, when there is provably no risk.
By definition what we're doing here is throwing away slightly cheaper
plans when the potential downside is much higher than the potential
upside of choosing a reasonable alternative. I don't think that the
downside is particularly likely. In fact I believe that it's fairly
unlikely in general. This is an imperfect trade-off, at least in
theory. I fully own that.
I'm willing to take some flak if there's not an easy proof that the outer
scan is single-row, but I don't think we should just up and change it
for cases where there is.
Seems reasonable to me.
--
Peter Geoghegan
On Mon, Jun 21, 2021 at 12:52:39PM -0400, Tom Lane wrote:
You're throwing around a lot of pejorative adjectives there without
having bothered to quantify any of them. This sounds less like a sound
argument to me than like a witch trial.
Reflecting for a bit on the ancient principle that "the only good numbers
in computer science are 0, 1, and N", it seems to me that we could make
an effort to classify RelOptInfos as provably empty, provably at most one
row, and others. (This would be a property of relations, not individual
paths, so it needn't bloat struct Path.) We already understand about
provably-empty rels, so this is just generalizing that idea a little.
Once we know about that, at least for the really-common cases like unique
keys, I'd be okay with a hard rule that we don't consider unparameterized
nestloop joins with an outer side that falls in the "N" category.
Unless there's no alternative, of course.
Another thought that occurs to me here is that maybe we could get rid of
the enable_material knob in favor of forcing (or at least encouraging)
materialization when the outer side isn't provably one row.
There were a lot of interesting ideas in this thread and I want to
analyze some of them. First, there is the common assumption (not
stated) that over-estimating by 5% and underestimating by 5% cause the
same harm, which is clearly false. If I go to a restaurant and estimate
the bill to be 5% higher or 5% lower, assuming I have sufficient funds,
under or over estimating is probably fine. If I am driving and
under-estimate the traction of my tires, I am probably fine, but if I
over-estimate their traction by 5%, I might crash.
Closer to home, Postgres is more tolerant of memory usage
under-estimation than over-estimation:
https://momjian.us/main/blogs/pgblog/2018.html#December_7_2018
What I hear Robert saying is that unparameterized nested loops are much
more sensitive to misestimation than hash joins, and only slightly
faster than hash joins when they use accurate row counts, so we might
want to avoid them by default. Tom is saying that if we know the outer
side will have zero or one row, we should still do unparameterized
nested loops because those are not more sensitive to misestimation than
hash joins, and slightly faster.
If that is accurate, I think the big question is how common are cases
where the outer side can't be proven to have zero or one row and nested
loops are enough of a win to risk their greater sensitivity to
misestimation. If it is uncommon, seems we could just code the
optimizer to use hash joins in those cases without a user-visible knob,
beyond the knob that already turns off nested loop joins.
Peter's comment about having nodes in the executor that adjust to the
row counts it finds is interesting, and eventually might be necessary
once we are sure we have gone as far as we can without it.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
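The 0/1/N classification Tom proposes above, together with the hard rule against unparameterized nestloops whose outer side falls in the "N" class, might be sketched like this (hypothetical types, not actual RelOptInfo fields):

/*
 * Standalone sketch.  The classification is a property of a relation, not
 * of an individual path; nothing here is real planner code.
 */
#include <stdbool.h>

typedef enum RelRowClass
{
    REL_PROVABLY_EMPTY,    /* 0 rows, e.g. contradictory quals */
    REL_AT_MOST_ONE_ROW,   /* e.g. an equality condition on a unique key */
    REL_MANY_ROWS          /* the general "N" case */
} RelRowClass;

typedef struct JoinChoiceSketch
{
    RelRowClass outer_class;
    bool        inner_parameterized;
    bool        alternative_join_exists;  /* e.g. a hash join is possible */
} JoinChoiceSketch;

/* The hard rule: drop an unparameterized nestloop whose outer side falls in
 * the "N" class, unless there is genuinely no alternative. */
bool
reject_unparameterized_nestloop(const JoinChoiceSketch *j)
{
    if (j->inner_parameterized)
        return false;                     /* not the risky plan shape */
    if (j->outer_class != REL_MANY_ROWS)
        return false;                     /* provably 0 or 1 outer rows: keep it */
    return j->alternative_join_exists;    /* reject only if we have a fallback */
}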
On Mon, Jun 21, 2021 at 4:51 PM Bruce Momjian <bruce@momjian.us> wrote:
There were a lot of interesting ideas in this thread and I want to
analyze some of them. First, there is the common assumption (not
stated) that over-estimating by 5% and underestimating by 5% cause the
same harm, which is clearly false. If I go to a restaurant and estimate
the bill to be 5% higher or 5% lower, assuming I have sufficient funds,
under or over estimating is probably fine. If I am driving and
under-estimate the traction of my tires, I am probably fine, but if I
over-estimate their traction by 5%, I might crash.
My favorite analogy is the home insurance one:
It might make sense to buy home insurance because losing one's home
(say through fire) is a loss that usually just cannot be tolerated --
you are literally ruined. We can debate how likely it is to happen,
but in the end it's not so unlikely that it can't be ruled out. At the
same time I may be completely unwilling to buy insurance for personal
electronic devices. I can afford to replace all of them if I truly
have to. And the chances of all of them breaking or being stolen on
the same day is remote (unless my home burns down!). If I drop my cell
phone and crack the screen, I'll be annoyed, but it's certainly not
the end of the world.
This behavior will make perfect sense to most people. But it doesn't
scale up or down. I have quite a few electronic devices, but only a
single home, so technically I'm taking risks way more often than I am
playing it safe here. Am I risk tolerant when it comes to insurance?
Conservative?
I myself don't think that it is sensible to apply either term here.
It's easier to just look at the specifics. A home is a pretty
important thing to almost everybody; we can afford to treat it as a
special case.
If that is accurate, I think the big question is how common are cases
where the outer side can't be proven to have zero or one row and nested
loops are enough of a win to risk their greater sensitivity to
misestimation. If it is uncommon, seems we could just code the
optimizer to use hash joins in those cases without a user-visible knob,
beyond the knob that already turns off nested loop joins.
I think it's possible that Robert's proposal will lead to very
slightly slower plans in the vast majority of cases that are affected,
while still being a very good idea. Why should insurance be 100% free,
though? Maybe it can be in some cases where we get lucky, but why
should that be the starting point? It just has to be very cheap
relative to what we do today for us to come out ahead, certainly, but
that seems quite possible in at least this case.
--
Peter Geoghegan
On 6/22/21 2:25 AM, Peter Geoghegan wrote:
My favorite analogy is the home insurance one:
It might make sense to buy home insurance because losing one's home
(say through fire) is a loss that usually just cannot be tolerated --
you are literally ruined. ...
I think it's possible that Robert's proposal will lead to very
slightly slower plans in the vast majority of cases that are affected,
while still being a very good idea. Why should insurance be 100% free,
though? Maybe it can be in some cases where we get lucky, but why
should that be the starting point? It just has to be very cheap
relative to what we do today for us to come out ahead, certainly, but
that seems quite possible in at least this case.
Yeah, I like the insurance analogy - it gets to the crux of the problem,
because insurance is pretty much exactly about managing risk. But making
everything slower will be a hard sell, because the vast majority of
workloads already running on Postgres don't have this issue at all, so
for them it's not worth the expense. Following the insurance analogy,
selling tornado insurance in Europe is mostly pointless.
Insurance is also about personal preference / risk tolerance. Maybe I'm
fine with accepting risk that my house burns down, or whatever ...
And the lack of data also plays a role - the insurance company will ask
for higher rates when it does not have enough accurate data about the
phenomenon, or when there's a lot of unknowns. Maybe this would allow
some basic measure of uncertainty, based on the number and type of
restrictions, joins, etc. The more restrictions we have, the less
certain the estimates are. Some conditions are estimated less
accurately, and using default estimates makes it much less accurate.
So maybe some fairly rough measure of uncertainty might work, and the
user might specify how much risk they're willing to tolerate.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
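A very rough sketch of such an uncertainty measure (everything here is invented for illustration and nothing resembles existing planner data structures): count the estimates that went into a node, weight the ones that fell back to defaults more heavily, and compare the score with a user-chosen tolerance.

typedef struct UncertaintySketch
{
    int num_restrictions;       /* quals whose selectivity had to be estimated */
    int num_default_estimates;  /* of those, how many fell back to a default */
    int num_joins;              /* joins below this node: errors compound */
} UncertaintySketch;

double
uncertainty_score(const UncertaintySketch *u)
{
    /* Arbitrary weights, purely for illustration. */
    return 1.0 * u->num_restrictions
         + 5.0 * u->num_default_estimates
         + 2.0 * u->num_joins;
}

int
plan_is_too_uncertain(const UncertaintySketch *u, double risk_tolerance)
{
    return uncertainty_score(u) > risk_tolerance;
}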
But making everything slower will be a hard sell, because the vast majority of
workloads already running on Postgres don't have this issue at all, so
for them it's not worth the expense. Following the insurance analogy,
selling tornado insurance in Europe is mostly pointless.
Agree. I've been surprised about NOT hearing complaints from PostgreSQL
customers about a particular "bad" plan choice that was common in other
rdbms products where large, complex queries were the norm. The situation
occurs late in a plan with many joins where a hash join can be used and
where either side is estimated to fit in memory. On one side is a base table
with cardinality that we have statistics for, while the other side has an
estimated cardinality that is the result of many estimates each of which
has error that can compound, and that in some cases amounts to a wild guess.
(e.g. what is the selectivity of SUM(x) < 12 ?). If the planner's point estimate
of cardinality is such that both sides could fit in memory, then a bad plan can
easily be made. As Peter said, [ most ] humans have no trouble dealing with
these kinds of situations. They take the risk of being wrong into account.
So in our world, the useful numbers are 0, 1, measured N, and estimated N,
but we don't distinguish between measured N and estimated N.
But that doesn't mean that OLTP customers would be willing to accept
slightly suboptimal plans to mitigate a risk they don't experience.
Insurance is also about personal preference / risk tolerance. Maybe I'm
fine with accepting risk that my house burns down, or whatever ...
Right, and that's why the problem mentioned above is still out there
annoying customers who have complex plans. To them it looks like
an obviously bad plan choice.
Something that might help is to have the planner cost be a structure instead
of a number. Costs of plans that are deemed "risky" are accumulated
separately from plans that make no risky choices, and for a given set
of join items you keep the minimum cost plan of both types. It may happen that all
plans eventually make a risky choice, in which case you take the plan with the minimum
cost, but if there exists a plan with no risky choices, then the minimum cost
plan with no risky choices is chosen, with a GUC that enables a customer to ignore
risk when making this choice. This is not in the spirit of the hoped for simple heuristic,
and it would be heuristic in its classification of plans that are risky, but in the NLJ case
the cost of an unparameterized NLJ could be deemed risky if the cardinality of the inner
relation is not 0, 1, or measured N.
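A rough sketch of that structured-cost idea, with hypothetical types rather than planner code: track the cheapest risky plan and the cheapest risk-free plan separately, and prefer the risk-free one when it exists, unless the user has opted to ignore risk.

#include <stdbool.h>
#include <stddef.h>

typedef struct CostedPlan
{
    double cost;
    bool   risky;   /* e.g. an unparameterized NLJ whose inner cardinality
                     * is an estimated (not measured) N */
} CostedPlan;

typedef struct PlanPair
{
    const CostedPlan *best_safe;
    const CostedPlan *best_risky;
} PlanPair;

void
consider_plan(PlanPair *pair, const CostedPlan *p)
{
    const CostedPlan **slot = p->risky ? &pair->best_risky : &pair->best_safe;

    if (*slot == NULL || p->cost < (*slot)->cost)
        *slot = p;
}

const CostedPlan *
choose_plan(const PlanPair *pair, bool ignore_risk /* the proposed GUC */)
{
    if (pair->best_safe == NULL)
        return pair->best_risky;           /* every plan made a risky choice */
    if (pair->best_risky == NULL || !ignore_risk)
        return pair->best_safe;            /* a risk-free plan exists: take it */
    return (pair->best_risky->cost < pair->best_safe->cost)
        ? pair->best_risky : pair->best_safe;
}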
On Mon, Jun 21, 2021 at 4:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm willing to take some flak if there's not an easy proof that the outer
scan is single-row, but I don't think we should just up and change it
for cases where there is.
I think that's a reasonable request. I'm not sure that I believe it's
100% necessary, but it's certainly an improvement on a technical
level, and given that the proposed change could impact quite a lot of
plans, it's fair to want to see some effort being put into mitigating
the possible downsides. Now, I'm not sure when I might have time to
actually try to do the work, which kind of sucks, but that's how it
goes sometimes.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jun 22, 2021 at 2:53 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Yeah, I like the insurance analogy - it gets to the crux of the problem,
because insurance is pretty much exactly about managing risk.
The user's exposure to harm is what truly matters. I admit that that's
very hard to quantify, but we should at least try to do so.
We sometimes think about a plan that is 10x slower as if it's
infinitely slow, or might as well be. But it's usually not like that
-- it is generally meaningfully much better than the plan being 100x
slower, which is itself sometimes appreciably better than 1000x
slower. And besides, users often don't get anything like the optimal
plan, even on what they would consider to be a good day (which is most
days). So maybe 10x slower is actually the baseline good case already,
without anybody knowing it. Most individual queries are not executed
very often, even on the busiest databases. The extremes really do
matter a lot.
If a web app or OLTP query is ~10x slower than optimal then it might
be the practical equivalent of an outage that affects the query alone
(i.e. "infinitely slow") -- but probably not. I think that it is worth
paying more than nothing to avoid plans that are so far from optimal
that they might as well take forever to execute. This is not
meaningfully different from a database outage affecting one particular
query. It kind of is in a category of its own that surpasses "slow
plan", albeit one that is very difficult or impossible to define
formally.
There may be a huge amount of variation in risk tolerance among
basically reasonable people. For example, if somebody chooses to
engage in some kind of extreme sport, to me it seems understandable.
It's just not my cup of tea. Whereas if somebody chooses to never wear
a seatbelt while driving, then to me they're simply behaving
foolishly. They're not willing to incur the tiniest inconvenience in
order to get a huge reduction in potential harm -- including a very
real risk of approximately the worst thing that can happen to you.
Sure, refusing to wear a seatbelt can theoretically be classified as
just another point on the risk tolerance spectrum, but that seems
utterly contrived to me. Some things (maybe not that many) really are
like that, or can at least be assumed to work that way as a practical
matter.
But making
everything slower will be a hard sell, because the vast majority of
workloads already running on Postgres don't have this issue at all, so
for them it's not worth the expense.
I think that we're accepting too much risk here. But I bet it's also
true that we're not taking enough risk in other areas. That was really
my point with the insurance analogy -- we can afford to take lots of
individual risks as long as they don't increase our exposure to truly
disastrous outcomes -- by which I mean queries that might as well take
forever to execute as far as the user is concerned. (Easier said than
done, of course.)
A simple trade-off between fast and robust doesn't seem like a
universally helpful thing. Sometimes it's a very unhelpful way of
looking at the situation. If you make something more robust to extreme
bad outcomes, then you may have simultaneously made it *faster* (not
slower) for all practical purposes. This can happen when the increase
in robustness allows the user to tune the system aggressively, and
only take on new risks that they can truly live with (which wouldn't
have been possible without the increase in robustness). For example, I
imagine that the failsafe mechanism added to VACUUM will actually make
it possible to tune VACUUM much more aggressively -- it might actually
end up significantly improving performance for all practical purposes,
even though technically it has nothing to do with performance.
Having your indexes a little more bloated because the failsafe
kicked-in is a survivable event -- the DBA lives to fight another day,
and *learns* to tune vacuum/the app so it doesn't happen again and
again. An anti-wraparound failure is perhaps not a survivable event --
the DBA gets fired. This really does seem like a fundamental
difference to me.
Following the insurance analogy,
selling tornado insurance in Europe is mostly pointless.
Principled skepticism of this kind of thing is of course necessary and
welcome. It *could* be taken too far.
And the lack of data also plays a role - the insurance company will ask
for higher rates when it does not have enough accurate data about the
phenomenon, or when there's a lot of unknowns. Maybe this would allow
some basic measure of uncertainty, based on the number and type of
restrictions, joins, etc.
I don't think that you can really model uncertainty. But you can have
true certainty (or close to it) about a trade-off that makes the
system fundamentally more robust over time. You can largely be certain
about both the cost of the insurance, as well as how it ameliorates
the problem in at least some cases.
So maybe some fairly rough measure of uncertainty might work, and the
user might specify how much risk they're willing to tolerate.
I think that most or all of the interesting stuff is where you have
this extreme asymmetry -- places where it's much more likely to be
true that basically everybody wants that. Kind of like wearing
seatbelts -- things that we really can claim are a universal good
without too much controversy. There might be as few as one or two
things in the optimizer that this could be said of. But they matter.
--
Peter Geoghegan
I think that it is worth paying more than nothing to avoid plans that are
so far from optimal that they might as well take forever to execute
I recently came across this article from 2016 that expounded upon a bad
plan of the sort discussed in this thread:
https://heap.io/blog/when-to-avoid-jsonb-in-a-postgresql-schema
(The proximate cause in this case was PostgreSQL not collecting statistics
for fields in a JSONB column, estimating a rowcount of 1, and thus creating a
pathological slowdown.)
–Mike
On Tue, Jun 22, 2021 at 4:13 AM Robert Haas <robertmhaas@gmail.com> wrote:
I think that's a reasonable request. I'm not sure that I believe it's
100% necessary, but it's certainly an improvement on a technical
level, and given that the proposed change could impact quite a lot of
plans, it's fair to want to see some effort being put into mitigating
the possible downsides. Now, I'm not sure when I might have time to
actually try to do the work, which kind of sucks, but that's how it
goes sometimes.
Should I take it that you've dropped this project? I was rather hoping
that you'd get back to it, FWIW.
--
Peter Geoghegan
On Wed, Jan 12, 2022 at 7:20 PM Peter Geoghegan <pg@bowt.ie> wrote:
Should I take it that you've dropped this project? I was rather hoping
that you'd get back to it, FWIW.
Honestly, I'd forgotten all about it. In theory I'd like to do
something about this, but I just have too many other things going on.
--
Robert Haas
EDB: http://www.enterprisedb.com