FIX : teach expression walker about RestrictInfo

Started by Tomas Vondraover 10 years ago12 messages
#1Tomas Vondra
tomas.vondra@2ndquadrant.com
1 attachment(s)

Hi there,

the attached trivial patch adds handling of RestrictInfo nodes into
expression_tree_walker(). This is needed for example when calling
pull_varnos or (or other functions using the expression walker) in
clausesel.c, for example. An example of a query causing errors with
pull_varnos is

select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);

which gets translated into an expression tree like this:

BoolExpr [OR_EXPR]
BoolExpr [AND_EXPR]
RestrictInfo
OpExpr [Var >= Const]
RestrictInfo
OpExpr [Var <= Const]
BoolExpr [AND_EXPR]
RestrictInfo
OpExpr [Var >= Const]
RestrictInfo
OpExpr [Var <= Const]

and the nested RestrictInfo nodes make the walker fail because of
unrecognized node.

It's possible that expression walker is not supposed to know about
RestrictInfo, but I don't really see why would that be the case.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

expression-walker-fix.patchtext/x-patch; name=expression-walker-fix.patchDownload
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index d6f1f5b..843f06d 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -1933,6 +1933,8 @@ expression_tree_walker(Node *node,
 			return walker(((PlaceHolderInfo *) node)->ph_var, context);
 		case T_RangeTblFunction:
 			return walker(((RangeTblFunction *) node)->funcexpr, context);
+		case T_RestrictInfo:
+			return walker(((RestrictInfo *) node)->clause, context);
 		default:
 			elog(ERROR, "unrecognized node type: %d",
 				 (int) nodeTag(node));
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#1)
Re: FIX : teach expression walker about RestrictInfo

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

the attached trivial patch adds handling of RestrictInfo nodes into
expression_tree_walker().

RestrictInfo is not a general expression node and support for it has
been deliberately omitted from expression_tree_walker(). So I think
what you are proposing is a bad idea and probably a band-aid for some
other bad idea.

This is needed for example when calling
pull_varnos or (or other functions using the expression walker) in
clausesel.c, for example. An example of a query causing errors with
pull_varnos is

select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);

Really?

regression=# create table t (a int, b int);
CREATE TABLE
regression=# select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);
a | b
---+---
(0 rows)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: FIX : teach expression walker about RestrictInfo

Hi,

On 04/28/15 21:50, Tom Lane wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

the attached trivial patch adds handling of RestrictInfo nodes into
expression_tree_walker().

RestrictInfo is not a general expression node and support for it has
been deliberately omitted from expression_tree_walker(). So I think
what you are proposing is a bad idea and probably a band-aid for some
other bad idea.

This is needed for example when calling
pull_varnos or (or other functions using the expression walker) in
clausesel.c, for example. An example of a query causing errors with
pull_varnos is

select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);

Really?

regression=# create table t (a int, b int);
CREATE TABLE
regression=# select * from t where (a >= 10 and a <= 20) or (b >= 15 and b <= 20);
a | b
---+---
(0 rows)

That's not what I said, though. I said that calling pull_varnos() causes
the issue - apparently that does not happen in master, but I ran into
that when hacking on a patch.

For example try adding this

Relids tmp = pull_varnos(clause);
elog(WARNING, "count = %d", bms_num_members(tmp));

into the or_clause branch in clause_selectivity(), and then running the
query will give you this:

db=# select * from t where (a >= 10 and a <= 20) or (b >= 15);
ERROR: unrecognized node type: 524

But as I said - maybe calls to pull_varnos are not supposed to happen in
this part of the code, for some reason, and it really is a bug in my patch.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#3)
Re: FIX : teach expression walker about RestrictInfo

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

On 04/28/15 21:50, Tom Lane wrote:

RestrictInfo is not a general expression node and support for it has
been deliberately omitted from expression_tree_walker(). So I think
what you are proposing is a bad idea and probably a band-aid for some
other bad idea.

That's not what I said, though. I said that calling pull_varnos() causes
the issue - apparently that does not happen in master, but I ran into
that when hacking on a patch.

pull_varnos is not, and should not be, applied to a RestrictInfo; for one
thing, it'd be redundant with work that was already done when creating the
RestrictInfo (cf make_restrictinfo_internal). You've not shown the
context of your problem very fully, but in general most code in the planner
should be well aware of whether it is working with a RestrictInfo (or list
of same) or a bare expression. I don't believe that it's a very good idea
to obscure that distinction.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: FIX : teach expression walker about RestrictInfo

Hi,

On 04/29/15 05:55, Tom Lane wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

On 04/28/15 21:50, Tom Lane wrote:

RestrictInfo is not a general expression node and support for it has
been deliberately omitted from expression_tree_walker(). So I think
what you are proposing is a bad idea and probably a band-aid for some
other bad idea.

That's not what I said, though. I said that calling pull_varnos() causes
the issue - apparently that does not happen in master, but I ran into
that when hacking on a patch.

pull_varnos is not, and should not be, applied to a RestrictInfo; for one
thing, it'd be redundant with work that was already done when creating the
RestrictInfo (cf make_restrictinfo_internal). You've not shown the
context of your problem very fully, but in general most code in the planner
should be well aware of whether it is working with a RestrictInfo (or list
of same) or a bare expression. I don't believe that it's a very good idea
to obscure that distinction.

OK, let me explain the context a bit more. When working on the
multivariate statistics patch, I need to choose which stats to use for
estimating the clauses. I do that in clauselist_selectivity(), although
there might be other places where to do that.

Selecting the stats that best match the clauses is based on how well
they cover the vars in the clauses (and some other criteria, that is
mostly irrelevant here). So at some point I do call functions like
pull_varnos() and pull_varattnos() to get this information.

I do handle RestrictInfo nodes explicitly - for those nodes I can do get
the info from the node itself. But this does not work for the condition
I posted, because that contains nested RestrictInfo nodes. So I do call
pull_varnos() on a BoolExpr node, but in reality the node tree
representing the clauses

WHERE (a >= 10 AND a <= 20) OR (b >= 20 AND b <= 40)

looks like this:

BoolExpr [OR_EXPR]
BoolExpr [AND_EXPR]
RestrictInfo
OpExpr [Var >= Const]
RestrictInfo
OpExpr [Var <= Const]
BoolExpr [AND_EXPR]
RestrictInfo
OpExpr [Var >= Const]
RestrictInfo
OpExpr [Var <= Const]

so the pull_varnos() fails because it runs into RestrictInfo node while
walking the tree.

So I see three possibilities:

1) pull_varnos/pull_varattnos (and such functions, using the
expression walker internally) are not supposed to be called unless
you are sure there are no RestrictInfo nodes in the tree, but this
seems really awkward

2) expression walker is supposed to know about RestrictInfo nodes (as I
did in my patch), but you say this is not the case

3) pull_varnos_walker / pull_varattnos_walker are supposed to handle
RestrictInfo nodes explicitly (either by using the cached information
or by using RestrictInfo->clause in the next step)

regards

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#5)
Re: FIX : teach expression walker about RestrictInfo

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

On 04/29/15 05:55, Tom Lane wrote:

pull_varnos is not, and should not be, applied to a RestrictInfo; for one
thing, it'd be redundant with work that was already done when creating the
RestrictInfo (cf make_restrictinfo_internal). You've not shown the
context of your problem very fully, but in general most code in the planner
should be well aware of whether it is working with a RestrictInfo (or list
of same) or a bare expression. I don't believe that it's a very good idea
to obscure that distinction.

OK, let me explain the context a bit more. When working on the
multivariate statistics patch, I need to choose which stats to use for
estimating the clauses. I do that in clauselist_selectivity(), although
there might be other places where to do that.

Hm, well, clauselist_selectivity and clause_selectivity already contain
extensive special-casing for RestrictInfos, so I'm not sure that I see why
you're having difficulty working within that structure for this change.

But there are basic reasons why expression_tree_walker should not try to
deal with RestrictInfos; the most obvious one being that it's not clear
whether it should descend into both the basic and OR-clause subtrees.
Also, if a node has expression_tree_walker support then it should
logically have expression_tree_mutator support as well, but that's
got multiple issues. RestrictInfos are not supposed to be copied
once created; and the mutator couldn't detect whether their derived
fields are still valid.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: FIX : teach expression walker about RestrictInfo

On 04/29/15 18:26, Tom Lane wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

...

OK, let me explain the context a bit more. When working on the
multivariate statistics patch, I need to choose which stats to use for
estimating the clauses. I do that in clauselist_selectivity(), although
there might be other places where to do that.

Hm, well, clauselist_selectivity and clause_selectivity already contain
extensive special-casing for RestrictInfos, so I'm not sure that I see why
you're having difficulty working within that structure for this change.

So let's I'm in the clause_selectivity(), and have AND or OR clause to
estimate. I need to get varnos/varattnos for the clause(s). What should
I do? I can't call pull_varnos() or pull_varattnos() because there might
be nested RestrictInfos, as demonstrated by the query I posted.

But there are basic reasons why expression_tree_walker should not try
to deal with RestrictInfos; the most obvious one being that it's not
clear whether it should descend into both the basic and OR-clause
subtrees. Also, if a node has expression_tree_walker support then it
should logically have expression_tree_mutator support as well, but
that's got multiple issues. RestrictInfos are not supposed to be
copied once created; and the mutator couldn't detect whether their
derived fields are still valid.

OK, I do understand that. So what about pull_varnos_walker and
pull_varattnos_walker - what about teaching them about RestrictInfos?

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tomas Vondra (#7)
1 attachment(s)
Re: FIX : teach expression walker about RestrictInfo

On 04/29/15 18:33, Tomas Vondra wrote:

OK, I do understand that. So what about pull_varnos_walker and
pull_varattnos_walker - what about teaching them about RestrictInfos?

Attached is a patch fixing the issue by handling RestrictInfo in
pull_varnos_walker and pull_varattnos_walker.

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

fix-varno-varattno-restrict-info.patchtext/x-patch; name=fix-varno-varattno-restrict-info.patchDownload
>From 2dc79b914c759d31becd8ae670b37b79663a595f Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@pgaddict.com>
Date: Tue, 28 Apr 2015 19:56:33 +0200
Subject: [PATCH 1/6] teach pull_(varno|varattno)_walker about RestrictInfo

otherwise pull_varnos fails when processing OR clauses
---
 src/backend/optimizer/util/var.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index 773e7b2..221b031 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -197,6 +197,13 @@ pull_varnos_walker(Node *node, pull_varnos_context *context)
 		context->sublevels_up--;
 		return result;
 	}
+	if (IsA(node, RestrictInfo))
+	{
+		RestrictInfo *rinfo = (RestrictInfo*)node;
+		context->varnos = bms_add_members(context->varnos,
+										  rinfo->clause_relids);
+		return false;
+	}
 	return expression_tree_walker(node, pull_varnos_walker,
 								  (void *) context);
 }
@@ -245,6 +252,15 @@ pull_varattnos_walker(Node *node, pull_varattnos_context *context)
 		return false;
 	}
 
+	if (IsA(node, RestrictInfo))
+	{
+		RestrictInfo *rinfo = (RestrictInfo *)node;
+
+		return expression_tree_walker((Node*)rinfo->clause,
+									  pull_varattnos_walker,
+									  (void*) context);
+	}
+
 	/* Should not find an unplanned subquery */
 	Assert(!IsA(node, Query));
 
-- 
1.9.3

#9Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#7)
Re: FIX : teach expression walker about RestrictInfo

On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On 04/29/15 18:26, Tom Lane wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

OK, let me explain the context a bit more. When working on the
multivariate statistics patch, I need to choose which stats to use for
estimating the clauses. I do that in clauselist_selectivity(), although
there might be other places where to do that.

Hm, well, clauselist_selectivity and clause_selectivity already contain
extensive special-casing for RestrictInfos, so I'm not sure that I see why
you're having difficulty working within that structure for this change.

So let's I'm in the clause_selectivity(), and have AND or OR clause to
estimate. I need to get varnos/varattnos for the clause(s). What should I
do? I can't call pull_varnos() or pull_varattnos() because there might be
nested RestrictInfos, as demonstrated by the query I posted.

But there are basic reasons why expression_tree_walker should not try
to deal with RestrictInfos; the most obvious one being that it's not
clear whether it should descend into both the basic and OR-clause
subtrees. Also, if a node has expression_tree_walker support then it
should logically have expression_tree_mutator support as well, but
that's got multiple issues. RestrictInfos are not supposed to be
copied once created; and the mutator couldn't detect whether their
derived fields are still valid.

OK, I do understand that. So what about pull_varnos_walker and
pull_varattnos_walker - what about teaching them about RestrictInfos?

Tom:

This patch has become part 1 of many under the "multivariate
statistics vNNN" family of threads, but I guess it would be helpful if
you could opine on the reasonable-ness of this approach. I don't want
to commit anything over your objections, but this kind of tailed off
without a conclusion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#9)
Re: FIX : teach expression walker about RestrictInfo

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On 04/29/15 18:26, Tom Lane wrote:

But there are basic reasons why expression_tree_walker should not try
to deal with RestrictInfos; the most obvious one being that it's not
clear whether it should descend into both the basic and OR-clause
subtrees. Also, if a node has expression_tree_walker support then it
should logically have expression_tree_mutator support as well, but
that's got multiple issues. RestrictInfos are not supposed to be
copied once created; and the mutator couldn't detect whether their
derived fields are still valid.

OK, I do understand that. So what about pull_varnos_walker and
pull_varattnos_walker - what about teaching them about RestrictInfos?

This patch has become part 1 of many under the "multivariate
statistics vNNN" family of threads, but I guess it would be helpful if
you could opine on the reasonable-ness of this approach. I don't want
to commit anything over your objections, but this kind of tailed off
without a conclusion.

I'm up to my ears in psql at the moment, but hope to get to the
multivariate stats patch later, maybe next week. In the meantime,
I remain of the opinion that RestrictInfo is not an expression node
and wanting expression_tree_walker to handle it is wrong. I'm
suspicious about pull_varnos; it might be all right given that we
can assume the same Vars are in both subtrees, but I still don't
really see why that one function has suddenly grown this need when
others have not.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#10)
Re: FIX : teach expression walker about RestrictInfo

On Fri, Mar 18, 2016 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On 04/29/15 18:26, Tom Lane wrote:

But there are basic reasons why expression_tree_walker should not try
to deal with RestrictInfos; the most obvious one being that it's not
clear whether it should descend into both the basic and OR-clause
subtrees. Also, if a node has expression_tree_walker support then it
should logically have expression_tree_mutator support as well, but
that's got multiple issues. RestrictInfos are not supposed to be
copied once created; and the mutator couldn't detect whether their
derived fields are still valid.

OK, I do understand that. So what about pull_varnos_walker and
pull_varattnos_walker - what about teaching them about RestrictInfos?

This patch has become part 1 of many under the "multivariate
statistics vNNN" family of threads, but I guess it would be helpful if
you could opine on the reasonable-ness of this approach. I don't want
to commit anything over your objections, but this kind of tailed off
without a conclusion.

I'm up to my ears in psql at the moment, but hope to get to the
multivariate stats patch later, maybe next week.

Oh, cool.

In the meantime,
I remain of the opinion that RestrictInfo is not an expression node
and wanting expression_tree_walker to handle it is wrong. I'm
suspicious about pull_varnos; it might be all right given that we
can assume the same Vars are in both subtrees, but I still don't
really see why that one function has suddenly grown this need when
others have not.

I haven't studied the patch series in enough detail to have an
educated opinion on that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Robert Haas (#11)
Re: FIX : teach expression walker about RestrictInfo

On 03/18/2016 08:53 PM, Robert Haas wrote:

On Fri, Mar 18, 2016 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Apr 29, 2015 at 12:33 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

On 04/29/15 18:26, Tom Lane wrote:

But there are basic reasons why expression_tree_walker should not try
to deal with RestrictInfos; the most obvious one being that it's not
clear whether it should descend into both the basic and OR-clause
subtrees. Also, if a node has expression_tree_walker support then it
should logically have expression_tree_mutator support as well, but
that's got multiple issues. RestrictInfos are not supposed to be
copied once created; and the mutator couldn't detect whether their
derived fields are still valid.

OK, I do understand that. So what about pull_varnos_walker and
pull_varattnos_walker - what about teaching them about RestrictInfos?

This patch has become part 1 of many under the "multivariate
statistics vNNN" family of threads, but I guess it would be helpful if
you could opine on the reasonable-ness of this approach. I don't want
to commit anything over your objections, but this kind of tailed off
without a conclusion.

I'm up to my ears in psql at the moment, but hope to get to the
multivariate stats patch later, maybe next week.

Oh, cool.

In the meantime,
I remain of the opinion that RestrictInfo is not an expression node
and wanting expression_tree_walker to handle it is wrong. I'm
suspicious about pull_varnos; it might be all right given that we
can assume the same Vars are in both subtrees, but I still don't
really see why that one function has suddenly grown this need when
others have not.

I haven't studied the patch series in enough detail to have an
educated opinion on that.

FWIW I'm not convinced this part of the patch (matching the clauses to
the available stats, including the pull_varnos changes) is perfectly
correct either, and it's quite possible it will need reworking. But it
needs a pair of fresh eyes, I guess.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers