Disfavoring unparameterized nested loops
From time to time, someone tells me that they've configured
enable_nestloop=false in postgresql.conf, which is a pretty bad idea
since there are a significant number of cases where such plans are the
only reasonable way of executing some query. However, it's no great
secret that PostgreSQL's optimizer sometimes produces nested loops
that are very, very, very slow, generally because it has grossly
underestimated the cardinality of the inner side of the nested loop.
The frustration which users experience as a result of such events is
understandable.
I read https://15721.courses.cs.cmu.edu/spring2020/papers/22-costmodels/p204-leis.pdf
today and found out that the authors of that paper did something a bit
more nuanced which, in their experiments, was very successful. It
sounds like what they basically did is disable unparameterized nested
loops. They argue that such nested loops figure to gain very little as
compared with a hash join, but that they might turn out to lose a lot
if the cardinality estimation is inaccurate, and they present
experimental results to back up those claims. One observation that the
paper makes along the way is that every system they tested is more
likely to underestimate the cardinality of joins than to overestimate
it, and that this tendency becomes more pronounced as the size of the
join planning problem increases. On reflection, I believe this matches
my experience, and it makes sense that it should be so, since it
occasionally happens that the join selectivity estimate is essentially
zero, and a bigger join problem is more likely to have at least one
such case. On the other hand, the join selectivity estimate can never
be +infinity. Hence, it's more important in general for a database
system to be resilient against underestimates than to be resilient
against overestimates. Being less willing to choose unparameterized
nested loops is one way to move in that direction.
How to do that is not very clear. One very simple thing we could do
would be to introduce enable_nestloop=parameterized or
enable_parameterized_nestloop=false. That is a pretty blunt instrument
but the authors of the paper seem to have done that with positive
results, so maybe it's actually not a dumb idea. Another approach
would be to introduce a large fuzz factor for such nested loops e.g.
keep them only if the cost estimate is better than the comparable hash
join plan by at least a factor of N (or if no such plan is being
generated for some reason). I'm not very confident that this would
actually do what we want, though. In the problematic cases, a nested
loop is likely to look extremely cheap, so just imagining that the
cost might be higher is not very protective. Perhaps a better approach
would be something like: if the estimated number of inner rows is less
than 100, then re-estimate the cost of the nested loop and of the best
available hash join on the assumption that there are 100 inner rows.
If the nested loop still wins under that assumption, keep it; if it
loses, throw it out. I think it's likely that this approach would
eliminate a large number of highly risky nested loop plans, probably
even with s/100/10/g, without causing many other problems (except
perhaps increased planner CPU consumption ... but maybe that's not too
bad).
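To sketch the shape of that idea -- and to be clear, every function name
below is invented for illustration, nothing like this exists in the
planner today:

#define RISKY_INNER_ROWS    100.0

/*
 * Hypothetical sketch only: recost_nestloop_with_inner_rows() and
 * recost_hashjoin_with_inner_rows() are made-up names, not existing
 * planner functions.
 */
static bool
unparam_nestloop_is_too_risky(Path *nestloop_path, Path *hashjoin_path,
                              double inner_rows)
{
    Cost    nl_cost;
    Cost    hj_cost;

    /* Nothing to compare against, or the estimate isn't suspiciously low. */
    if (hashjoin_path == NULL || inner_rows >= RISKY_INNER_ROWS)
        return false;

    /* Re-cost both joins as though the inner side produced 100 rows. */
    nl_cost = recost_nestloop_with_inner_rows(nestloop_path, RISKY_INNER_ROWS);
    hj_cost = recost_hashjoin_with_inner_rows(hashjoin_path, RISKY_INNER_ROWS);

    /* Keep the nested loop only if it still wins under that assumption. */
    return nl_cost > hj_cost;
}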
Just to be clear, I do understand that there are cases where no Hash
Join is possible, but anything we do here could be made to apply only
when a hash join is in fact possible. We could also think about making
the point of comparison the best other plans of any sort rather than a
hash join specifically, which feels a little more principled but might
actually be worse. When a Nested Loop is a stupid idea, it's stupid
precisely because the inner side is big and we could've avoided
recomputing it over and over by using a Hash Join instead, not because
some Merge Join based plan turns out to be better. I mean, it is
possible that some Merge Join plan does turn out to be better, but
that's not rage-inducing in the same way. Nobody looks at a
complicated join plan that happened to use a Nested Loop and says
"obviously, this is inferior to a merge join," or if they do, they're
probably full of hot air. But people look at complicated join plans
that happen to use a Nested Loop and say "obviously, this is inferior
to a hash join" *all the time* and assuming the inner path is
unparameterized, they are very often correct.
Thoughts? I'd be particularly curious to hear about any cases anyone
knows about where an unparameterized nested loop and a hash join with
batches=1 are both possible and where the unparameterized nested loop
is *much* cheaper than the hash join.
Thanks,
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jun 15, 2021 at 10:09 AM Robert Haas <robertmhaas@gmail.com> wrote:
How to do that is not very clear. One very simple thing we could do
would be to introduce enable_nestloop=parameterized or
enable_parameterized_nestloop=false. That is a pretty blunt instrument
but the authors of the paper seem to have done that with positive
results, so maybe it's actually not a dumb idea.
I think that it's probably a good idea as-is.
Simple heuristics that are very frequently wrong when considered in a
naive way can work very well in practice. This seems to happen when
they capture some kind of extreme, naturally occurring cost/benefit
asymmetry -- especially one with fixed well understood costs and
unlimited benefits (this business with unparameterized nestloop joins
is about *avoiding* the inverse asymmetry, but that seems very
similar). My go-to example of such an asymmetry is the rightmost page
split heuristic of applying leaf fillfactor regardless of any of the
other specifics; we effectively assume that all indexes are on columns
with ever-increasing values. Which is obviously wrong.
We're choosing between two alternatives (unparameterized nested loop
vs hash join) that are really very similar when things go as expected,
but diverge sharply when there is a misestimation -- who wouldn't take
the "conservative" choice here?
I guess that there is a hesitation to introduce heuristics like
this because it doesn't fit into some larger framework that captures
risk -- it might be seen as an ugly special case. But isn't this
already actually kind of special, whether or not we officially think
so?
--
Peter Geoghegan
On Tue, Jun 15, 2021 at 2:04 PM Peter Geoghegan <pg@bowt.ie> wrote:
I guess that there is a hesitation to introduce heuristics like
this because it doesn't fit into some larger framework that captures
risk -- it might be seen as an ugly special case. But isn't this
already actually kind of special, whether or not we officially think
so?
Yes, I think it is. Reading the paper really helped me crystallize my
thoughts about this, because when I've studied the problems myself, I
came, as you postulate here, to the conclusion that there's a lot of
stuff the planner does where there is risk and uncertainty, and thus
that a general framework would be necessary to deal with it. But the
fact that an academic researcher called this problem out as the only
one worth treating specially makes me think that perhaps it deserves
special handling. In defense of that approach, note that this is a
case where we know both that the Nested Loop is risky and that Hash
Join is a similar alternative with probably similar cost. I am not
sure there are any other cases where we can say quite so generally
both that a certain thing is risky and what we could do instead.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jun 15, 2021 at 12:31 PM Robert Haas <robertmhaas@gmail.com> wrote:
Yes, I think it is. Reading the paper really helped me crystallize my
thoughts about this, because when I've studied the problems myself, I
came, as you postulate here, to the conclusion that there's a lot of
stuff the planner does where there is risk and uncertainty, and thus
that a general framework would be necessary to deal with it.
It is an example (perhaps the only example in the optimizer) of an
oasis of certainty in an ocean of uncertainty. As uncertain as
everything is, we seemingly can make strong robust statements about
the relative merits of each strategy *in general*, just in this
particular instance. It's just not reasonable to make such a reckless
choice, no matter what your general risk tolerance is.
Goetz Graefe is interviewed here, and goes into his philosophy on
robustness -- it seems really interesting to me:
https://sigmodrecord.org/publications/sigmodRecord/2009/pdfs/05_Profiles_Graefe.pdf
In defense of that approach, note that this is a
case where we know both that the Nested Loop is risky and that Hash
Join is a similar alternative with probably similar cost. I am not
sure there are any other cases where we can say quite so generally
both that a certain thing is risky and what we could do instead.
I tend to think of a hash join as like a nested loop join with an
inner index scan where you build the index yourself, dynamically. That
might be why I find it easy to make this mental leap. In theory you
could do this by giving the nestloop join runtime smarts -- make it
turn into a hash join adaptively. Like Graefe's G-Join design. That
way you could do this in a theoretically pure way.
I don't think that that's actually necessary just to deal with this
case -- it probably really is as simple as it seems. I point this out
because perhaps it's useful to have that theoretical anchoring.
--
Peter Geoghegan
On Wed, 16 Jun 2021 at 05:09, Robert Haas <robertmhaas@gmail.com> wrote:
How to do that is not very clear. One very simple thing we could do
would be to introduce enable_nestloop=parameterized or
enable_parameterized_nestloop=false. That is a pretty blunt instrument
but the authors of the paper seem to have done that with positive
results, so maybe it's actually not a dumb idea.
It's not great that people are having to use such blunt instruments to
get the planner to behave. It might not be a terrible idea to provide
them with something a bit sharper as you suggest. The GUC idea is
certainly something that could be done without too much effort.
There was some talk of doing that in [1].
Another approach
would be to introduce a large fuzz factor for such nested loops e.g.
keep them only if the cost estimate is better than the comparable hash
join plan by at least a factor of N (or if no such plan is being
generated for some reason).
In my experience, the most common reason that the planner chooses
non-parameterized nested loops wrongly is when there's row
underestimation that says there's just going to be 1 row returned by
some set of joins. The problem often comes when some subsequent join
is planned and the planner sees the given join rel only produces one
row. The cheapest join method we have to join 1 row is Nested Loop.
So the planner just sticks the 1-row join rel on the outer side
thinking the executor will only need to scan the inner side of the
join once. When the outer row count blows up, then we end up scanning
that inner side many more times. The problem is compounded when you
nest it a few joins deep.
Most of the time when I see that happen it's down to either the
selectivity of some correlated base-quals being multiplied down to a
number low enough that we clamp the estimate to be 1 row. The other
case is similar, but with join quals.
It seems to me that each time we multiply 2 selectivities together,
the risk of the resulting selectivity being off increases. The
risk is likely lower when we have some extended statistics which
allow us to do fewer selectivity multiplications.
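With made-up numbers, just to illustrate how the clamping happens:

/* Two correlated quals, each 0.001 selective, estimated as independent. */
double  ntuples = 100000.0;
double  sel_a = 0.001;                      /* WHERE a = ... */
double  sel_b = 0.001;                      /* WHERE b = ..., but b is implied by a */
double  est = ntuples * sel_a * sel_b;      /* 0.1, clamped up to 1 row */
double  actual = ntuples * sel_a;           /* really 100 rows */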
For that 1-row case, doing an unparameterized nested loop is only the
cheapest join method by a tiny amount. It really wouldn't be much
more expensive to just put that single row into a hash table. If that
1 estimated row turns out to be 10 actual rows then it's likely not
too big a deal for the hash join code to accommodate the 9 additional
rows.
This makes me think that it's insanely risky for the planner to be
picking Nested Loop joins in this case. And that puts me down the path
of thinking that this problem should be solved by having the planner
take into account risk as well as costs when generating plans.
I don't really have any concrete ideas on that, but a basic idea that
I have been considering is that a Path has a risk_factor field that is
decided somewhere like clauselist_selectivity(). Maybe the risk can go
up by 1 each time we multiply an individual selectivity. (As of
master, estimate_num_groups() allows the estimation to pass back some
further information to the caller. I added that for Result Cache so I
could allow the caller to get visibility about when the estimate fell
back on DEFAULT_NUM_DISTINCT. clauselist_selectivity() maybe could get
similar treatment to allow the risk_factor or number of nstats_used to
be passed back). We'd then add a GUC, something like
planner_risk_aversion, which could be set to anything from 0.0 to some
positive number. During add_path() we could do the cost comparison
like: path1.cost * (1.0 + path1.risk_factor * planner_risk_aversion)
< path2.cost * (1.0 + path2.risk_factor * planner_risk_aversion).
That way, if you set planner_risk_aversion to 0.0, then the planner
does as it does today, i.e. takes big risks.
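As a rough sketch of that comparison (Path.risk_factor and the GUC are
of course hypothetical, and add_path() does far more than this):

static bool
path1_wins_considering_risk(Path *path1, Path *path2)
{
    double  cost1 = path1->total_cost *
                    (1.0 + path1->risk_factor * planner_risk_aversion);
    double  cost2 = path2->total_cost *
                    (1.0 + path2->risk_factor * planner_risk_aversion);

    /* With planner_risk_aversion = 0.0 this is today's plain cost test. */
    return cost1 < cost2;
}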
The problem I have with this idea is that I really don't know how to
properly calculate what the risk_factor should be set to. It seems
easy at first to set it to something that has the planner avoid these
silly 1-row estimate nested loop mistakes, but I think what we'd set
the risk_factor to would become much more important when more and more
Path types start using it. So if we did this and just guessed the
risk_factor, that might be fine when only 1 of the paths being
compared had a non-zero risk_factor, but as soon as both paths have
one set, unless they're set to something sensible, then we just end up
comparing garbage costs to garbage costs.
Another common mistake the planner makes is around WHERE a = <value>
ORDER BY b LIMIT n; where there are separate indexes on (a) and (b).
Scanning the (b) index is pretty risky. All the "a" values you need
might be right at the end of the index. It seems safer to scan the (a)
index as we'd likely have statistics that tell us how many rows exist
with <value>. We don't have any stats that tell us where in the (b)
index are all the rows with a = <value>.
I don't really think we should solve this by having nodeNestloop.c
fall back on hashing when the going gets tough. Overloading our nodes
that way is not a sustainable thing to do. I'd rather see the
executor throw the plan back at the planner along with some hints
about what was wrong with it. We could do that providing we've not
sent anything back to the client yet.
David
[1]: /messages/by-id/CAKJS1f8nsm-T0KMvGJz_bskUjQ=yGmGUUtUdAcFoEaZ_tuTXjA@mail.gmail.com
On Tue, Jun 15, 2021 at 5:00 PM David Rowley <dgrowleyml@gmail.com> wrote:
I don't really think we should solve this by having nodeNestloop.c
fall back on hashing when the going gets tough. Overloading our nodes
that way is not a sustainable thing to do. I'd rather see the
executor throw the plan back at the planner along with some hints
about what was wrong with it. We could do that providing we've not
sent anything back to the client yet.
It wasn't a serious suggestion -- it was just a way of framing the
issue at hand that I thought might be useful.
If we did have something like that (which FWIW I think makes sense but
is hard to do right in a general way) then it might be expected to
preemptively refuse to even start down the road of using an
unparameterized nestloop join very early, or even before execution
time. Such an adaptive join algorithm/node might be expected to have a
huge bias against this particular plan shape, that can be reduced to a
simple heuristic. But you can have the simple heuristic without
needing to build everything else.
Whether or not we throw the plan back at the planner or "really change
our minds at execution time" seems like a distinction without a
difference. Either way we're changing our minds about the plan based
on information that is fundamentally execution time information, not
plan time information. Have I missed something?
--
Peter Geoghegan
On Wed, 16 Jun 2021 at 12:11, Peter Geoghegan <pg@bowt.ie> wrote:
Whether or not we throw the plan back at the planner or "really change
our minds at execution time" seems like a distinction without a
difference.
What is "really change our minds at execution time"? Is that switch
to another plan without consulting the planner? If so what decides
what that new plan should be? The planner is meant to be the expert in
that department. The new information might cause the join order to
completely change. It might not be as simple as swapping a Nested Loop
for a Hash Join.
Either way we're changing our minds about the plan based
on information that is fundamentally execution time information, not
plan time information. Have I missed something?
I don't really see why you think the number of rows that a given join
might produce is execution information. It's exactly the sort of
information the planner needs to make a good plan. It's just that
today we get that information from statistics. Plenty of other DBMSs
make decisions from sampling. FWIW, we already have some minimalist
sampling in get_actual_variable_range().
I'm just trying to highlight that I don't think overloading nodes is a
good path to go down. It's not a sustainable practice. It's a path
towards just having a single node that does everything. If your
suggestion was not serious then there's no point in discussing it
further.
David
On Tue, Jun 15, 2021 at 6:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 16 Jun 2021 at 12:11, Peter Geoghegan <pg@bowt.ie> wrote:
Whether or not we throw the plan back at the planner or "really change
our minds at execution time" seems like a distinction without a
difference.
What is "really change our minds at execution time"? Is that switch
to another plan without consulting the planner?
I don't know what it means. That was my point -- it all seems like
semantics to me.
The strong separation between plan time and execution time isn't
necessarily a good thing, at least as far as solving some of the
thorniest problems goes. It seems obvious to me that cardinality
estimation is the main problem, and that the most promising solutions
are all fundamentally about using execution time information to change
course. Some problems with planning just can't be solved at plan time
-- no model can ever be smart enough. Better to focus on making query
execution more robust, perhaps by totally changing the plan when it is
clearly wrong. But also by using more techniques that we've
traditionally thought of as execution time techniques (e.g. role
reversal in hash join). The distinction is blurry to me.
There are no doubt practical software engineering issues with this --
separation of concerns and whatnot. But it seems premature to go into
that now.
The new information might cause the join order to
completely change. It might not be as simple as swapping a Nested Loop
for a Hash Join.
I agree that it might not be that simple at all. I think that Robert
is saying that this is one case where it really does appear to be that
simple, and so we really can expect to benefit from a simple plan-time
heuristic that works within the confines of the current model. Why
wouldn't we just take that easy win, once the general idea has been
validated some more? Why let the perfect be the enemy of the good?
I have perhaps muddied the waters by wading into the more general
question of robust execution, the inherent uncertainty with
cardinality estimation, and so on. Robert really didn't seem to be
talking about that at all (though it is clearly related).
Either way we're changing our minds about the plan based
on information that is fundamentally execution time information, not
plan time information. Have I missed something?
I don't really see why you think the number of rows that a given join
might produce is execution information.
If we're 100% sure a join will produce at least n rows because we
executed it (with the express intention of actually doing real query
processing that returns rows to the client), and it already produced n
rows, then what else could it be called? Why isn't it that simple?
It's exactly the sort of
information the planner needs to make a good plan. It's just that
today we get that information from statistics. Plenty of other DBMSs
make decisions from sampling.
FWIW, we already have some minimalist
sampling in get_actual_variable_range().
I know, but that doesn't seem all that related -- it almost seems like
the opposite idea. It isn't the executor balking when it notices that
the plan is visibly wrong during execution, in some important way.
It's more like the planner using the executor to get information about
an index that is well within the scope of what we think of as plan
time.
To some degree the distinction gets really blurred due to nodes like
hash join, where some important individual decisions are delayed until
execution time already. It's really unclear when precisely it stops
being that, and starts being more of a case of either partially or
wholly replanning. I don't know how to talk about it without it being
confusing.
I'm just trying to highlight that I don't think overloading nodes is a
good path to go down. It's not a sustainable practice. It's a path
towards just having a single node that does everything. If your
suggestion was not serious then there's no point in discussing it
further.
As I said, it was a way of framing one particular issue that Robert is
concerned about.
--
Peter Geoghegan
On Tue, Jun 15, 2021 at 5:00 PM David Rowley <dgrowleyml@gmail.com> wrote:
Most of the time when I see that happen it's down to either the
selectivity of some correlated base-quals being multiplied down to a
number low enough that we clamp the estimate to be 1 row. The other
case is similar, but with join quals.
It seems to me that each time we multiply 2 selectivities together,
the risk of the resulting selectivity being off increases. The
risk is likely lower when we have some extended statistics which
allow us to do fewer selectivity multiplications.
It seems important to distinguish between risk and uncertainty --
they're rather different things. The short version is that you can
model risk but you cannot model uncertainty. This seems like a problem
of uncertainty to me.
The example from the paper that Robert cited isn't interesting to me
because it hints at a way of managing the uncertainty, exactly. It's
interesting because it seems to emphasize the user's exposure to the
problem, which is what really matters. Even if it was extremely
unlikely that the user would have a problem, the downside of being
wrong is still absurdly high, and the upside of being right is low
(possibly even indistinguishable from zero). It's just not worth
thinking about. Besides, we all know that selectivity estimates are
very often quite wrong without the user ever noticing. It's amazing
that database optimizers work as well as they do, really.
--
Peter Geoghegan
The problem I have with this idea is that I really don't know how to
properly calculate what the risk_factor should be set to. It seems
easy at first to set it to something that has the planner avoid these
silly 1-row estimate nested loop mistakes, but I think what we'd set
the risk_factor to would become much more important when more and more
Path types start using it. So if we did this and just guessed the
risk_factor, that might be fine when only 1 of the paths being
compared had a non-zero risk_factor, but as soon as both paths have
one set, unless they're set to something sensible, then we just end up
comparing garbage costs to garbage costs.
Risk factor is the inverse of confidence in the estimate: lesser
confidence, higher risk. If we associate confidence with the
selectivity estimate, or compute a confidence interval for the estimate
instead of a single number, we can associate a risk factor with each
estimate. When we combine estimates to calculate new estimates, we
also combine their confidences/confidence intervals. If my memory
serves well, confidence intervals/confidences are calculated based on
the sample size and method used for estimation, so we should be able
to compute those during ANALYZE.
I have not come across many papers which leverage this idea. Googling
"selectivity estimation confidence interval", does not yield many
papers. Although I found [1]https://pi3.informatik.uni-mannheim.de/~moer/Publications/vldb18_smpl_synop.pdf -- Best Wishes, Ashutosh Bapat to be using a similar idea. So may be
there's not merit in this idea, thought theoretically it sounds fine
to me.
[1]: https://pi3.informatik.uni-mannheim.de/~moer/Publications/vldb18_smpl_synop.pdf -- Best Wishes, Ashutosh Bapat
--
Best Wishes,
Ashutosh Bapat
On Fri, Jun 18, 2021 at 6:20 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:
I have not come across many papers which leverage this idea. Googling
"selectivity estimation confidence interval", does not yield many
papers. Although I found [1] to be using a similar idea. So may be
there's not merit in this idea, thought theoretically it sounds fine
to me.[1]
https://pi3.informatik.uni-mannheim.de/~moer/Publications/vldb18_smpl_synop.pdf
Well, that paper's title shows it's a bit too far forward for us, since we
don't use samples during plan time (although that's a separate topic worth
considering). From the references, however, this one gives some
mathematical framing of the problem that led to the thread subject,
although I haven't read enough to see if we can get practical advice from
it:
Y. E. Ioannidis and S. Christodoulakis. On the propagation of errors in the
size of join results.
https://www.csd.uoc.gr/~hy460/pdf/p268-ioannidis.pdf
--
John Naylor
EDB: http://www.enterprisedb.com
On Wed, 16 Jun 2021 at 15:08, Peter Geoghegan <pg@bowt.ie> wrote:
It seems important to distinguish between risk and uncertainty --
they're rather different things. The short version is that you can
model risk but you cannot model uncertainty. This seems like a problem
of uncertainty to me.
You might be right there. "Uncertainty" or "Certainty" seems more
like a value that clauselist_selectivity() would be able to calculate
itself. It would just be up to the planner to determine what to do
with it.
One thing I thought about is that if the costing model were able to
separate out a cost of additional (unexpected) rows then it would be
easier for add_path() to take into account how bad things might go if
we underestimate.
For example, an unparameterized Nested Loop that estimates the
outer Path to have 1 row will incur an entire additional inner cost if
there are 2 rows. With Hash Join the cost is just an additional
hashtable lookup, which is dead cheap. I don't know exactly how
add_path() would weigh all that up, but it seems to me that I wouldn't
take the risk unless I was 100% certain that the Nested Loop's outer
Path would only return 1 row exactly, if there was any chance at all
it could return more, I'd be picking some other join method.
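With made-up numbers, the asymmetry looks something like:

/* Illustrative costs only. */
double  inner_scan_cost = 10000.0;  /* one full scan of the inner side */
double  hash_probe_cost = 0.01;     /* one extra hashtable lookup */

/* Outer side estimated at 1 row, actually 2: */
double  nestloop_overrun = 1 * inner_scan_cost;     /* 10000.0 */
double  hashjoin_overrun = 1 * hash_probe_cost;     /*     0.01 */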
David
On Tue, Jun 15, 2021 at 8:00 PM David Rowley <dgrowleyml@gmail.com> wrote:
In my experience, the most common reason that the planner chooses
non-parameterized nested loops wrongly is when there's row
underestimation that says there's just going to be 1 row returned by
some set of joins. The problem often comes when some subsequent join
is planned and the planner sees the given join rel only produces one
row. The cheapest join method we have to join 1 row is Nested Loop.
So the planner just sticks the 1-row join rel on the outer side
thinking the executor will only need to scan the inner side of the
join once. When the outer row count blows up, then we end up scanning
that inner side many more times. The problem is compounded when you
nest it a few joins deep.
Most of the time when I see that happen it's down to either the
selectivity of some correlated base-quals being multiplied down to a
number low enough that we clamp the estimate to be 1 row. The other
case is similar, but with join quals.
If an estimate is lower than 1, that should be a red flag that Something Is
Wrong. This is kind of a crazy idea, but what if we threw it back the other
way by computing 1 / est, and clamping that result to 2 <= res < 10 (or
100 or something)? The theory is, the more impossibly low it is, the more
wrong it is. I'm attracted to the idea of dealing with it as an estimation
problem and not needing to know about join types. Might have unintended
consequences, though.
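A sketch of what that might look like (not existing code, and the
function name is made up):

#include <math.h>

/*
 * If the raw (pre-clamp) estimate falls below one row, flip it around so
 * that the more impossibly low it is, the more rows we assume, within a
 * 2..100 range.
 */
static double
clamp_suspicious_row_estimate(double raw_est)
{
    if (raw_est >= 1.0)
        return raw_est;         /* nothing suspicious, leave it alone */

    return fmin(100.0, fmax(2.0, 1.0 / raw_est));
}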
Long term, it would be great to calculate something about the distribution
of cardinality estimates, so we can model risk in the estimates.
--
John Naylor
EDB: http://www.enterprisedb.com
Most of the time when I see that happen it's down to either the
selectivity of some correlated base-quals being multiplied down to a
number low enough that we clamp the estimate to be 1 row. The other
case is similar, but with join quals.
If an estimate is lower than 1, that should be a red flag that Something Is
Wrong. This is kind of a crazy idea, but what if we threw it back the other
way by computing 1 / est, and clamping that result to 2 <= res < 10 (or
100 or something)? The theory is, the more impossibly low it is, the more
wrong it is. I'm attracted to the idea of dealing with it as an estimation
problem and not needing to know about join types. Might have unintended
consequences, though.
Long term, it would be great to calculate something about the distribution
of cardinality estimates, so we can model risk in the estimates.
Hi,
Laurenz suggested clamping to 2 in this thread in 2017:
/messages/by-id/1509611428.3268.5.camel@cybertec.at
Having been the victim of this problem in the past, I like the risk
based approach to this. If the cost of being wrong in the estimate is
high, use a merge join instead. In every case that I have encountered,
that heuristic would have given the correct performant plan.
Regards,
Ken
On Mon, Jun 21, 2021 at 6:41 AM David Rowley <dgrowleyml@gmail.com> wrote:
For example, an unparameterized Nested Loop that estimates the
outer Path to have 1 row will incur an entire additional inner cost if
there are 2 rows. With Hash Join the cost is just an additional
hashtable lookup, which is dead cheap. I don't know exactly how
add_path() would weigh all that up, but it seems to me that I wouldn't
take the risk unless I was 100% certain that the Nested Loop's outer
Path would only return 1 row exactly, if there was any chance at all
it could return more, I'd be picking some other join method.
It seems like everyone agrees that it would be good to do something
about this problem, but the question is whether it's best to do
something that tries to be general, or whether we should instead do
something about this specific case. I favor the latter approach. Risk
and uncertainty exist all over the place, but dealing with that in a
general way seems difficult, and maybe unnecessary. Addressing the
case of unparameterized nest loops specifically seems simpler, because
it's easier to reason about what the alternatives are. Your last
sentence here seems right on point to me.
Basically, what you argue for there is disabling unparameterized
nested loops entirely except when we can prove that the inner side
will never generate more than one row. But, that's almost never going
to be something that we can prove. If the inner side is coming from a
table or sub-join, it can turn out to be big. As far as I can see, the
only way that this doesn't happen is if it's something like a subquery
that aggregates everything down to one row, or has LIMIT 1, but those
are such special cases that I don't even think we should be worrying
about them.
So my counter-proposal is: how about if we split
match_unsorted_outer() into two functions, one of which generates
nested loops only based on parameterized paths and the other of which
generates nested loops based only on unparameterized paths, and then
rejigger add_paths_to_joinrel() so that we do the latter between the
steps that are currently number 5 and 6 and only if we haven't got any
other paths yet? If somebody later comes up with logic for proving
that the inner side can never have more than 1 row, we can let this be
run in those cases as well. In the meantime, if somebody does
something like a JOIN b ON a.x < b.x, we will still generate these
paths because there's no other approach, or similarly if it's a.x =
b.x but for some strange type that doesn't have a hash-joinable or
merge-joinable equality operator. But otherwise we omit those paths
entirely.
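Very roughly, and with invented function names standing in for the
proposed split (the real add_paths_to_joinrel() sequence is more
involved than this):

/*
 * Hypothetical sketch only; the two *_parameterized/_unparameterized
 * functions would be carved out of today's match_unsorted_outer().
 */
match_unsorted_outer_parameterized(root, joinrel, outerrel, innerrel,
                                   jointype, extra);
hash_inner_and_outer(root, joinrel, outerrel, innerrel, jointype, extra);

/*
 * Consider unparameterized nested loops only if nothing above produced a
 * path, e.g. a.x < b.x, or an equality operator that is neither hashable
 * nor merge-joinable.
 */
if (joinrel->pathlist == NIL)
    match_unsorted_outer_unparameterized(root, joinrel, outerrel, innerrel,
                                         jointype, extra);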
--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Jun 21, 2021 at 7:45 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jun 21, 2021 at 6:41 AM David Rowley <dgrowleyml@gmail.com> wrote:
For example, an unparameterized Nested Loop that estimates the
outer Path to have 1 row will incur an entire additional inner cost if
there are 2 rows. With Hash Join the cost is just an additional
hashtable lookup, which is dead cheap. I don't know exactly how
add_path() would weigh all that up, but it seems to me that I wouldn't
take the risk unless I was 100% certain that the Nested Loop's outer
Path would only return 1 row exactly, if there was any chance at all
it could return more, I'd be picking some other join method.
It seems like everyone agrees that it would be good to do something
about this problem, but the question is whether it's best to do
something that tries to be general, or whether we should instead do
something about this specific case. I favor the latter approach.
I agree with your conclusion, but FWIW I am sympathetic to David's
view too. I certainly understand why he'd naturally want to define the
class of problems that are like this one, to understand what the
boundaries are.
The heuristic that has the optimizer flat out avoid an
unparameterized nested loop join is justified by the belief that
that's fundamentally reckless. Even though we all agree on that much,
I don't know when it stops being reckless and starts being "too risky
for me, but not fundamentally reckless". I think that that's worth
living with, but it isn't very satisfying.
Risk
and uncertainty exist all over the place, but dealing with that in a
general way seems difficult, and maybe unnecessary. Addressing the
case of unparameterized nest loops specifically seems simpler, because
it's easier to reason about what the alternatives are. Your last
sentence here seems right on point to me.
Right. Part of why this is a good idea is that the user is exposed to
so many individual risks and uncertainties. We cannot see any one risk
as existing in a vacuum. It is not the only risk the user will ever
take in the planner -- if it was then it might actually be okay to
allow unparameterized nested loop joins.
A bad unparameterized nested loop join plan has, in a sense, unknown
and unbounded cost/downside. But it is only very slightly faster than
a hash join, by a fixed well understood amount. With enough "trials"
and on a long enough timeline, it will inevitably blow up and cause
the application to grind to a halt. It seems like no amount of fixed,
bounded benefit from "fast unparameterized nested loop joins" could
possibly make up for that. The life of Postgres users would be a lot
better if bad plans were at least "survivable events".
--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
The heuristic that has the optimizer flat out avoid an
unparameterized nested loop join is justified by the belief that
that's fundamentally reckless. Even though we all agree on that much,
I don't know when it stops being reckless and starts being "too risky
for me, but not fundamentally reckless". I think that that's worth
living with, but it isn't very satisfying.
There are certainly cases where the optimizer can prove (in principle;
it doesn't do so today) that a plan node will produce at most one row.
They're hardly uncommon either: an equality comparison on a unique
key, or a subquery with a simple aggregate function, come to mind.
In such cases, not only is this choice not reckless, but it's provably
superior to a hash join. So in the end this gets back to the planning
risk factor that we keep circling around but nobody quite wants to
tackle.
I'd be a lot happier if this proposal were couched around some sort
of estimate of the risk of the outer side producing more than the
expected number of rows. The arguments so far seem like fairly lame
rationalizations for not putting forth the effort to do that.
regards, tom lane
On Mon, Jun 21, 2021 at 8:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
There are certainly cases where the optimizer can prove (in principle;
it doesn't do so today) that a plan node will produce at most one row.
They're hardly uncommon either: an equality comparison on a unique
key, or a subquery with a simple aggregate function, come to mind.
That sounds like it might be useful in general.
In such cases, not only is this choice not reckless, but it's provably
superior to a hash join. So in the end this gets back to the planning
risk factor that we keep circling around but nobody quite wants to
tackle.
Let's assume for the sake of argument that we really have to have that
additional infrastructure to move forward with the idea. (I'm not sure
if it's possible in principle to use infrastructure like that for some
of the cases that Robert has in mind, but for now I'll assume that it
is both possible and a practical necessity.)
Even when I make this working assumption I don't see what it changes
at a fundamental level. You've merely come up with a slightly more
specific definition of the class of plans that are "reckless". You've
only refined the original provisional definition of "reckless" to
exclude specific "clearly not reckless" cases (I think). But the
definition of "reckless" is no less squishy than what we started out
with.
I'd be a lot happier if this proposal were couched around some sort
of estimate of the risk of the outer side producing more than the
expected number of rows. The arguments so far seem like fairly lame
rationalizations for not putting forth the effort to do that.
I'm not so sure that it is. The point isn't the risk, even if it could
be calculated. The point is the downsides of being wrong are huge and
pretty much unbounded, whereas the benefits of being right are tiny
and bounded. It almost doesn't matter what the underlying
probabilities are.
To be clear I'm not arguing against modelling risk. I'm just not sure
that it's useful to think of this problem as truly a problem of risk.
--
Peter Geoghegan
On 6/21/21 5:55 PM, Tom Lane wrote:
Peter Geoghegan <pg@bowt.ie> writes:
The heuristic that has the optimizer flat out avoid an
unparameterized nested loop join is justified by the belief that
that's fundamentally reckless. Even though we all agree on that much,
I don't know when it stops being reckless and starts being "too risky
for me, but not fundamentally reckless". I think that that's worth
living with, but it isn't very satisfying.
There are certainly cases where the optimizer can prove (in principle;
it doesn't do so today) that a plan node will produce at most one row.
They're hardly uncommon either: an equality comparison on a unique
key, or a subquery with a simple aggregate function, come to mind.
In such cases, not only is this choice not reckless, but it's provably
superior to a hash join. So in the end this gets back to the planning
risk factor that we keep circling around but nobody quite wants to
tackle.
Agreed.
I'd be a lot happier if this proposal were couched around some sort
of estimate of the risk of the outer side producing more than the
expected number of rows. The arguments so far seem like fairly lame
rationalizations for not putting forth the effort to do that.
I agree having such a measure would be helpful, but do you have an idea
how it could be implemented?
I've been thinking about this a bit recently and searching for papers
talking about this, but it's not clear to me how to calculate the
risk (and propagate it through the plan) without making the whole cost
evaluation way more complicated / expensive :-(
The "traditional approach" to quantifying risk would be confidence
intervals, i.e. for each cardinality estimate "e" we'd determine a range
[a,b] so that P(a <= e <= b) = p. So we could pick "p" depending on how
"robust" the plan choice should be (say 0.9 for "risky" and 0.999 for
"safe" plans) and calculate a,b. Or maybe we could calculate where the
plan changes, and then we'd see if those "break points" are within the
confidence interval. If not, great - we can assume the plan is stable,
otherwise we'd need to consider the other plans too, somehow.
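For instance, the "break point" check could look something like this
(the struct and function are hypothetical, of course):

typedef struct CardinalityInterval
{
    double  est;    /* the point estimate "e" */
    double  lo;     /* "a": lower bound at the chosen confidence level */
    double  hi;     /* "b": upper bound at the chosen confidence level */
} CardinalityInterval;

/*
 * crossover_rows is the cardinality at which the cheapest plan choice
 * flips (say, nested loop stops beating hash join).  If that point lies
 * inside the confidence interval, the plan choice is not robust and the
 * alternative plans still need to be considered.
 */
static bool
plan_choice_is_robust(const CardinalityInterval *ci, double crossover_rows)
{
    return crossover_rows < ci->lo || crossover_rows > ci->hi;
}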
But what I'm not sure about is:
1) Now we're dealing with three cardinality estimates (the original "e"
and the boundaries "a", "b"). So which one do we use to calculate cost
and pass to upper parts of the plan?
2) The outer relation may be a complex join, so we'd need to combine the
confidence intervals for the two input relations, somehow.
3) We'd need to know how to calculate the confidence intervals for most
plan nodes, which I'm not sure we know how to do. So it's not clear to
me how to do this, which seems rather problematic because we need to
propagate and combine those confidence intervals through the plan.
But maybe you have thought about some much simpler approach, addressing
this sufficiently well?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Peter Geoghegan <pg@bowt.ie> writes:
On Mon, Jun 21, 2021 at 8:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'd be a lot happier if this proposal were couched around some sort
of estimate of the risk of the outer side producing more than the
expected number of rows. The arguments so far seem like fairly lame
rationalizations for not putting forth the effort to do that.
I'm not so sure that it is. The point isn't the risk, even if it could
be calculated. The point is the downsides of being wrong are huge and
pretty much unbounded, whereas the benefits of being right are tiny
and bounded. It almost doesn't matter what the underlying
probabilities are.
You're throwing around a lot of pejorative adjectives there without
having bothered to quantify any of them. This sounds less like a sound
argument to me than like a witch trial.
Reflecting for a bit on the ancient principle that "the only good numbers
in computer science are 0, 1, and N", it seems to me that we could make
an effort to classify RelOptInfos as provably empty, provably at most one
row, and others. (This would be a property of relations, not individual
paths, so it needn't bloat struct Path.) We already understand about
provably-empty rels, so this is just generalizing that idea a little.
Once we know about that, at least for the really-common cases like unique
keys, I'd be okay with a hard rule that we don't consider unparameterized
nestloop joins with an outer side that falls in the "N" category.
Unless there's no alternative, of course.
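In code form, the classification might be as simple as (hypothetical,
just to make the idea concrete):

/* A property of the RelOptInfo, not of individual Paths. */
typedef enum RelRowsClass
{
    REL_ROWS_PROVABLY_EMPTY,        /* 0: provably no rows */
    REL_ROWS_PROVABLY_AT_MOST_ONE,  /* 1: e.g. equality on a unique key */
    REL_ROWS_MANY                   /* N: everything else */
} RelRowsClass;

The hard rule would then be to skip unparameterized nestloop paths
whenever the outer rel is REL_ROWS_MANY and some other join path exists.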
Another thought that occurs to me here is that maybe we could get rid of
the enable_material knob in favor of forcing (or at least encouraging)
materialization when the outer side isn't provably one row.
regards, tom lane