Making the planner more tolerant of implicit/explicit casts

Started by Tom Laneover 13 years ago4 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

I looked into the complaint in bug #7598,
http://archives.postgresql.org/pgsql-bugs/2012-10/msg00090.php
The core of the problem is in an inner sub-select that's written
like

outercol IN (SELECT varcharcol FROM ... WHERE varcharcol = anothercol ...

The "=" operator is actually texteq, since varchar has no equality
operator of its own, which means there's a RelabelType in there.
Likewise, the comparison expression generated for the IN construct
involves a RelabelType on varcharcol.

Now, ruleutils.c prefers to make the cast explicit, so this prints as

outercol IN (SELECT varcharcol FROM ... WHERE varcharcol::text = anothercol ...

which is semantically the same thing ... but after dump and reload, the
RelabelType in the inner WHERE now has CoercionForm COERCE_EXPLICIT_CAST
instead of COERCE_IMPLICIT_CAST. And it turns out that that causes
process_equivalence to not match it up with the varcharcol instance in
the IN expression, so the planner fails to make as many equivalence
deductions as it should, resulting in an inferior plan.

Basically the thing to do about this is to ensure that we consistently
use CoercionForm COERCE_DONTCARE in expressions that are getting fed to
the EquivalenceClass machinery. That doesn't change the semantics of
the expression tree, but it ensures that equivalent expressions will
be seen as equal().

The minimum-risk way to do that would be for canonicalize_ec_expression to
copy the presented expression and then apply set_coercionform_dontcare
to it. However, the requirement to copy is a bit annoying. Also, we've
seen bugs of this general ilk multiple times before, so I'm not entirely
satisfied with just hacking EquivalenceClasses for it.

What I'm thinking about is modifying eval_const_expressions so that
one of its responsibilities is to force CoercionForm fields to
COERCE_DONTCARE in the output; which would take basically no additional
code, for instance the fix for RelabelType looks like

*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*************** eval_const_expressions_mutator(Node *nod
*** 2669,2675 ****
  					newrelabel->resulttype = relabel->resulttype;
  					newrelabel->resulttypmod = relabel->resulttypmod;
  					newrelabel->resultcollid = relabel->resultcollid;
! 					newrelabel->relabelformat = relabel->relabelformat;
  					newrelabel->location = relabel->location;
  					return (Node *) newrelabel;
  				}
--- 2669,2675 ----
  					newrelabel->resulttype = relabel->resulttype;
  					newrelabel->resulttypmod = relabel->resulttypmod;
  					newrelabel->resultcollid = relabel->resultcollid;
! 					newrelabel->relabelformat = COERCE_DONTCARE;
  					newrelabel->location = relabel->location;
  					return (Node *) newrelabel;
  				}

The net effect of this is that *all* CoercionForms seen in the planner
would be COERCE_DONTCARE. We could get rid of set_coercionform_dontcare
altogether, and also get rid of the ugly special-case comparison logic
in equalfuncs.c.

This is a more invasive fix, for sure, but it would permanently prevent
the whole class of bugs instead of just stomping the manifestation in
process_equivalence.

Thoughts?

regards, tom lane

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#1)
Re: Making the planner more tolerant of implicit/explicit casts

I wrote:

What I'm thinking about is modifying eval_const_expressions so that
one of its responsibilities is to force CoercionForm fields to
COERCE_DONTCARE in the output;

I fooled around with that approach for awhile and didn't like the
results, mainly because it caused EXPLAIN output to change in unpleasant
ways (ruleutils.c needs the CoercionForm info to format its output nicely).

However, on further reflection I realized that we could fix it just by
making equal() ignore CoercionForm fields altogether. I remember having
considered that and shied away from it back when I first invented the
COERCE_DONTCARE hack, on the grounds that it would put too much semantic
awareness into equal(). However, we've long since abandoned the idea
that equal() should insist on full bitwise equality of nodes --- it's
ignored location fields for some time without ill effects, and there are
a number of other special cases in there too. So as long as we're
willing to consider that equal() can mean just semantic equivalence of
two node trees, this can be fixed by removing code rather than adding
it, along the lines of the attached patch.

We could take the further step of removing the COERCE_DONTCARE enum
value altogether; the remaining uses are only to fill in CoercionForm
fields in nodes that the planner creates out of whole cloth, and now we
could make it fill in one of the more standard values instead. I didn't
do that in the attached because it makes the patch longer but no more
enlightening (and in any case I don't think that change would be good to
back-patch).

I'm reasonably convinced that this is a good fix for HEAD, but am of two
minds whether to back-patch it or not. The problem complained of in
bug #7598 may seem a bit narrow, but the real point is that whether you
write a cast explicitly or not shouldn't affect planning if the
semantics are the same. This might well be a significant though
previously unrecognized performance issue, particularly for people who
use varchar columns heavily.

Thoughts?

regards, tom lane

#3Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#2)
Re: Making the planner more tolerant of implicit/explicit casts

On Thu, Oct 11, 2012 at 5:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm reasonably convinced that this is a good fix for HEAD, but am of two
minds whether to back-patch it or not. The problem complained of in
bug #7598 may seem a bit narrow, but the real point is that whether you
write a cast explicitly or not shouldn't affect planning if the
semantics are the same. This might well be a significant though
previously unrecognized performance issue, particularly for people who
use varchar columns heavily.

I have had a few bad experiences with people getting *really* upset
about plan changes in minor releases, so I would be disinclined to
back-patch this, even if we're fairly sure that it will be an
improvement in most/all cases. It's just not worth the risk of
discovering otherwise.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#3)
Re: Making the planner more tolerant of implicit/explicit casts

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Oct 11, 2012 at 5:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm reasonably convinced that this is a good fix for HEAD, but am of two
minds whether to back-patch it or not. The problem complained of in
bug #7598 may seem a bit narrow, but the real point is that whether you
write a cast explicitly or not shouldn't affect planning if the
semantics are the same. This might well be a significant though
previously unrecognized performance issue, particularly for people who
use varchar columns heavily.

I have had a few bad experiences with people getting *really* upset
about plan changes in minor releases, so I would be disinclined to
back-patch this, even if we're fairly sure that it will be an
improvement in most/all cases. It's just not worth the risk of
discovering otherwise.

I stuck it into 9.2, but not further back.

regards, tom lane