review: Non-recursive processing of AND/OR lists

Started by Pavel Stehuleover 12 years ago26 messages

pavel.stehule@gmail.com

over 12 years ago

Hello

* motivation: remove recursive procession of AND/OR list (hangs with
10062 and more subexpressions)

* patch is short, clean and respect postgresql source code requirements
* patch was applied cleanly without warnings
* all regression tests was passed
* I successfully evaluated expression with 100000 subexpressions
* there is no significant slowdown

possible improvements

a = (A_Expr*) list_nth(pending, 0);

a = (A_Expr*) linitial(pending);

not well comment

should be -- "If the right branch is also an SAME condition, append it to the"

+			/*
+			 * If the right branch is also an AND condition, append it to the
+			 * pending list, to be processed later. This allows us to walk even
+			 * bushy trees, not just left-deep trees.
+			 */
+			if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_kind)
+			{
+				pending = lappend(pending, a->rexpr);
+			}
+			else
+			{
+				expr = transformExprRecurse(pstate, a->rexpr);
+				expr = coerce_to_boolean(pstate, expr, root_kind == AEXPR_AND ?
"AND" : "OR");
+				exprs = lcons(expr, exprs);
+			}

I don't see any other issues, so after fixing comments this patch is
ready for commit

Regards

Pavel Stehule

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Pavel Stehule (#1)

Re: review: Non-recursive processing of AND/OR lists

Thanks for the review Pavel.

On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule <pavel.stehule@gmail.com>wrote:

Hello

related to

https://commitfest.postgresql.org/action/patch_view?id=1130

/messages/by-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com

* motivation: remove recursive procession of AND/OR list (hangs with
10062 and more subexpressions)

* patch is short, clean and respect postgresql source code requirements
* patch was applied cleanly without warnings
* all regression tests was passed
* I successfully evaluated expression with 100000 subexpressions
* there is no significant slowdown

possible improvements

a = (A_Expr*) list_nth(pending, 0);

a = (A_Expr*) linitial(pending);

not well comment

should be -- "If the right branch is also an SAME condition, append it to
the"
+                       /*
+                        * If the right branch is also an AND condition,
append it to the
+                        * pending list, to be processed later. This
allows us to walk even
+                        * bushy trees, not just left-deep trees.
+                        */
+                       if (IsA(a->rexpr, A_Expr) &&
((A_Expr*)a->rexpr)->kind == root_kind)
+                       {
+                               pending = lappend(pending, a->rexpr);
+                       }
+                       else
+                       {
+                               expr = transformExprRecurse(pstate,
a->rexpr);
+                               expr = coerce_to_boolean(pstate, expr,
root_kind == AEXPR_AND ?
"AND" : "OR");
+                               exprs = lcons(expr, exprs);
+                       }
I don't see any other issues, so after fixing comments this patch is
ready for commit

Regards

Pavel Stehule

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Pavel Stehule (#1)

1 attachment(s)

Re: review: Non-recursive processing of AND/OR lists

On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule <pavel.stehule@gmail.com>wrote:

related to

https://commitfest.postgresql.org/action/patch_view?id=1130

/messages/by-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com

* motivation: remove recursive procession of AND/OR list (hangs with
10062 and more subexpressions)

* patch is short, clean and respect postgresql source code requirements
* patch was applied cleanly without warnings
* all regression tests was passed
* I successfully evaluated expression with 100000 subexpressions
* there is no significant slowdown

possible improvements

a = (A_Expr*) list_nth(pending, 0);

a = (A_Expr*) linitial(pending);

I made that change, hesitantly. The comments above definition of linitial()
macro describe the confusion that API causes. I wanted to avoid that
confusion for new code, so I used the newer API which makes the intention
quite clear. But looking at that code closely, list_nth() causes at least 2
function calls, and that's pretty heavy compared to the linitiali() macro.

not well comment

should be -- "If the right branch is also an SAME condition, append it to
the"

I moved that comment above the outer bock, so that the intention of the
whole do-while code block is described in one place.

I don't see any other issues, so after fixing comments this patch is

ready for commit

Thanks for the review Pavel.

Attached is the updated patch, v4. It has the above edits, and a few code
improvements, like not repeating the (root_kind == AEPR_AND ? .. : ..)
ternary expression.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Attachments:

non_recursive_and_or_transformation_v4.patchapplication/octet-stream; name=non_recursive_and_or_transformation_v4.patchDownload

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 327557e..61a1cd9 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -40,8 +40,7 @@ bool		Transform_null_equals = false;
 static Node *transformExprRecurse(ParseState *pstate, Node *expr);
 static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
 static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
-static Node *transformAExprAnd(ParseState *pstate, A_Expr *a);
-static Node *transformAExprOr(ParseState *pstate, A_Expr *a);
+static Node *transformAExprAndOr(ParseState *pstate, A_Expr *a);
 static Node *transformAExprNot(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
@@ -223,10 +222,10 @@ transformExprRecurse(ParseState *pstate, Node *expr)
 						result = transformAExprOp(pstate, a);
 						break;
 					case AEXPR_AND:
-						result = transformAExprAnd(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_OR:
-						result = transformAExprOr(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_NOT:
 						result = transformAExprNot(pstate, a);
@@ -920,27 +916,72 @@ transformAExprOp(ParseState *pstate, A_Expr *a)
-static Node *
-transformAExprAnd(ParseState *pstate, A_Expr *a)
-{
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
-
-	lexpr = coerce_to_boolean(pstate, lexpr, "AND");
-	rexpr = coerce_to_boolean(pstate, rexpr, "AND");
-
-	return (Node *) makeBoolExpr(AND_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
-}
-
-static Node *
-transformAExprOr(ParseState *pstate, A_Expr *a)
-{
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
-
-	lexpr = coerce_to_boolean(pstate, lexpr, "OR");
-	rexpr = coerce_to_boolean(pstate, rexpr, "OR");
-
-	return (Node *) makeBoolExpr(OR_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
-}
+/*
+ * Transform the AND/OR trees non-recursively.
+ *
+ * The parser turns a list of consecutive AND expressions into a left-deep tree.
+ *
+ * a AND b AND c
+ *
+ *      AND
+ *     /  \
+ *   AND   c
+ *  /  \
+ * a    b
+ *
+ * For very long lists, it gets deep enough that processing it recursively causes
+ * check_stack_depth() to raise error and abort the query. Hence, it is necessary
+ * that we process these trees iteratively.
+ */
+static Node *
+transformAExprAndOr(ParseState *pstate, A_Expr *a)
+{
+	List		   *exprs = NIL;
+	List		   *pending = NIL;
+	Node		   *expr;
+	A_Expr		   *root = a;
+	A_Expr_Kind		root_kind = a->kind;
+	char		   *root_char = (root_kind == AEXPR_AND ? "AND" : "OR");
+	BoolExprType	root_bool_expr = (root_kind == AEXPR_AND ? AND_EXPR : OR_EXPR);
+
+	pending = lappend(pending, a);
+
+	while (list_length(pending) > 0)
+	{
+		Node *tmp;
+		a = (A_Expr*) linitial(pending);
+
+		pending = list_delete_first(pending);
+
+		/*
+		 * Follow the left links to walk the left-deep tree, and process all the
+		 * right branches.xi
+		 *
+		 * If a right branch is also the same kind of tree as the root, then
+		 * append it to the 'pending' list. The pending list is also processed
+		 * in this function call iteratively rather than recursing.  This allows
+		 * us to process even bushy trees, not just left-deep trees.
+		 */
+		tmp = (Node*)a;
+		do {
+			a = (A_Expr*) tmp;
+
+			if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_kind)
+			{
+				pending = lappend(pending, a->rexpr);
+			}
+			else
+			{
+				expr = transformExprRecurse(pstate, a->rexpr);
+				expr = coerce_to_boolean(pstate, expr, root_char);
+				exprs = lcons(expr, exprs);
+			}
+
+			tmp = a->lexpr;
+		} while (IsA(tmp, A_Expr) && ((A_Expr*)tmp)->kind == root_kind);
+
+		/* Now process the last left expression */
+		expr = transformExprRecurse(pstate, a->lexpr);
+		expr = coerce_to_boolean(pstate, expr, root_char);
+		exprs = lcons(expr, exprs);
+	}
+
+	return (Node *) makeBoolExpr(root_bool_expr, exprs, root->location);
+}
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 57ae842..42f0ece 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2092,7 +2092,7 @@ SELECT viewname, definition FROM pg_views WHERE schemaname <> 'information_schem
                                  |     int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail                                                                                                                                                     +
                                  |    FROM shoe rsh,                                                                                                                                                                                              +
                                  |     shoelace rsl                                                                                                                                                                                               +
-                                 |   WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
+                                 |   WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
  shoelace                        |  SELECT s.sl_name,                                                                                                                                                                                             +
                                  |     s.sl_avail,                                                                                                                                                                                                +
                                  |     s.sl_color,                                                                                                                                                                                                +

Pavel Stehule

pavel.stehule@gmail.com

over 12 years ago

In reply to: Gurjeet Singh (#3)

Re: review: Non-recursive processing of AND/OR lists

Hello

just one small notices

I dislike a name "root_bool_expr", because, there is not a expression,
but expression type. Can you use "root_bool_expr_type" instead? It is
little bit longer, but more correct. Same not best name is
"root_char", maybe "root_bool_op_name"

or root_expr_type and root_op_name ???

Have no other comments

Regards

Pavel

2013/6/30 Gurjeet Singh <gurjeet@singh.im>:

On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

related to

https://commitfest.postgresql.org/action/patch_view?id=1130

/messages/by-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com

* motivation: remove recursive procession of AND/OR list (hangs with
10062 and more subexpressions)

* patch is short, clean and respect postgresql source code requirements
* patch was applied cleanly without warnings
* all regression tests was passed
* I successfully evaluated expression with 100000 subexpressions
* there is no significant slowdown

possible improvements

a = (A_Expr*) list_nth(pending, 0);

a = (A_Expr*) linitial(pending);

I made that change, hesitantly. The comments above definition of linitial()
macro describe the confusion that API causes. I wanted to avoid that
confusion for new code, so I used the newer API which makes the intention
quite clear. But looking at that code closely, list_nth() causes at least 2
function calls, and that's pretty heavy compared to the linitiali() macro.

not well comment

should be -- "If the right branch is also an SAME condition, append it to
the"

I moved that comment above the outer bock, so that the intention of the
whole do-while code block is described in one place.

I don't see any other issues, so after fixing comments this patch is
ready for commit

Thanks for the review Pavel.

Attached is the updated patch, v4. It has the above edits, and a few code
improvements, like not repeating the (root_kind == AEPR_AND ? .. : ..)
ternary expression.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Pavel Stehule (#4)

Re: review: Non-recursive processing of AND/OR lists

On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com>wrote:

Hello

just one small notices

I dislike a name "root_bool_expr", because, there is not a expression,
but expression type. Can you use "root_bool_expr_type" instead? It is
little bit longer, but more correct. Same not best name is
"root_char", maybe "root_bool_op_name"

or root_expr_type and root_op_name ???

How about naming those 3 variables as follows:

root_expr_kind
root_expr_name
root_bool_expr_type

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Pavel Stehule

pavel.stehule@gmail.com

over 12 years ago

In reply to: Gurjeet Singh (#5)

Re: review: Non-recursive processing of AND/OR lists

2013/6/30 Gurjeet Singh <gurjeet@singh.im>:

On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Hello

just one small notices

I dislike a name "root_bool_expr", because, there is not a expression,
but expression type. Can you use "root_bool_expr_type" instead? It is
little bit longer, but more correct. Same not best name is
"root_char", maybe "root_bool_op_name"

or root_expr_type and root_op_name ???

How about naming those 3 variables as follows:

root_expr_kind
root_expr_name
root_bool_expr_type

Pavel

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Pavel Stehule (#6)

Re: review: Non-recursive processing of AND/OR lists

On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com>wrote:

2013/6/30 Gurjeet Singh <gurjeet@singh.im>:

On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com

wrote:

How about naming those 3 variables as follows:

root_expr_kind
root_expr_name
root_bool_expr_type

+1

Thanks. Attached is the patch with that change. I'll update the commitfest
entry with a link to this email.

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Gurjeet Singh (#7)

1 attachment(s)

Fwd: review: Non-recursive processing of AND/OR lists

On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com>wrote:

2013/6/30 Gurjeet Singh <gurjeet@singh.im>:

On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com

wrote:

How about naming those 3 variables as follows:

root_expr_kind
root_expr_name
root_bool_expr_type

+1

Thanks. Attached is the patch with that change. I'll update the commitfest
entry with a link to this email.

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Attachments:

non_recursive_and_or_transformation_v5.patchapplication/octet-stream; name=non_recursive_and_or_transformation_v5.patchDownload

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 327557e..61a1cd9 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -40,8 +40,7 @@ bool		Transform_null_equals = false;
 static Node *transformExprRecurse(ParseState *pstate, Node *expr);
 static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
 static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
-static Node *transformAExprAnd(ParseState *pstate, A_Expr *a);
-static Node *transformAExprOr(ParseState *pstate, A_Expr *a);
+static Node *transformAExprAndOr(ParseState *pstate, A_Expr *a);
 static Node *transformAExprNot(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
@@ -223,10 +222,10 @@ transformExprRecurse(ParseState *pstate, Node *expr)
 						result = transformAExprOp(pstate, a);
 						break;
 					case AEXPR_AND:
-						result = transformAExprAnd(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_OR:
-						result = transformAExprOr(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_NOT:
 						result = transformAExprNot(pstate, a);
@@ -920,27 +916,72 @@ transformAExprOp(ParseState *pstate, A_Expr *a)
-static Node *
-transformAExprAnd(ParseState *pstate, A_Expr *a)
-{
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
-
-	lexpr = coerce_to_boolean(pstate, lexpr, "AND");
-	rexpr = coerce_to_boolean(pstate, rexpr, "AND");
-
-	return (Node *) makeBoolExpr(AND_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
-}
-
-static Node *
-transformAExprOr(ParseState *pstate, A_Expr *a)
-{
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
-
-	lexpr = coerce_to_boolean(pstate, lexpr, "OR");
-	rexpr = coerce_to_boolean(pstate, rexpr, "OR");
-
-	return (Node *) makeBoolExpr(OR_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
-}
+/*
+ * Transform the AND/OR trees non-recursively.
+ *
+ * The parser turns a list of consecutive AND expressions into a left-deep tree.
+ *
+ * a AND b AND c
+ *
+ *      AND
+ *     /  \
+ *   AND   c
+ *  /  \
+ * a    b
+ *
+ * For very long lists, it gets deep enough that processing it recursively causes
+ * check_stack_depth() to raise error and abort the query. Hence, it is necessary
+ * that we process these trees iteratively.
+ */
+static Node *
+transformAExprAndOr(ParseState *pstate, A_Expr *a)
+{
+	List		   *exprs = NIL;
+	List		   *pending = NIL;
+	Node		   *expr;
+	A_Expr		   *root = a;
+	A_Expr_Kind		root_expr_kind = a->kind;
+	char		   *root_expr_name = (root_expr_kind == AEXPR_AND ? "AND" : "OR");
+	BoolExprType	root_bool_expr_type = (root_expr_kind == AEXPR_AND ? AND_EXPR : OR_EXPR);
+
+	pending = lappend(pending, a);
+
+	while (list_length(pending) > 0)
+	{
+		Node *tmp;
+		a = (A_Expr*) linitial(pending);
+
+		pending = list_delete_first(pending);
+
+		/*
+		 * Follow the left links to walk the left-deep tree, and process all the
+		 * right branches.
+		 *
+		 * If a right branch is also the same kind of tree as the root, then
+		 * append it to the 'pending' list. The pending list is also processed
+		 * in this function call iteratively rather than recursively.  This
+		 * allows us to process even bushy trees, not just left-deep trees.
+		 */
+		tmp = (Node*)a;
+		do {
+			a = (A_Expr*) tmp;
+
+			if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_expr_kind)
+			{
+				pending = lappend(pending, a->rexpr);
+			}
+			else
+			{
+				expr = transformExprRecurse(pstate, a->rexpr);
+				expr = coerce_to_boolean(pstate, expr, root_expr_name);
+				exprs = lcons(expr, exprs);
+			}
+
+			tmp = a->lexpr;
+		} while (IsA(tmp, A_Expr) && ((A_Expr*)tmp)->kind == root_expr_kind);
+
+		/* Now process the last left expression */
+		expr = transformExprRecurse(pstate, a->lexpr);
+		expr = coerce_to_boolean(pstate, expr, root_expr_name);
+		exprs = lcons(expr, exprs);
+	}
+
+	return (Node *) makeBoolExpr(root_bool_expr_type, exprs, root->location);
+}
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 57ae842..42f0ece 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2092,7 +2092,7 @@ SELECT viewname, definition FROM pg_views WHERE schemaname <> 'information_schem
                                  |     int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail                                                                                                                                                     +
                                  |    FROM shoe rsh,                                                                                                                                                                                              +
                                  |     shoelace rsl                                                                                                                                                                                               +
-                                 |   WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
+                                 |   WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
  shoelace                        |  SELECT s.sl_name,                                                                                                                                                                                             +
                                  |     s.sl_avail,                                                                                                                                                                                                +
                                  |     s.sl_color,                                                                                                                                                                                                +

Pavel Stehule

pavel.stehule@gmail.com

over 12 years ago

In reply to: Gurjeet Singh (#7)

Re: review: Non-recursive processing of AND/OR lists

2013/6/30 Gurjeet Singh <gurjeet@singh.im>:

On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

2013/6/30 Gurjeet Singh <gurjeet@singh.im>:

On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule
<pavel.stehule@gmail.com>
wrote:

How about naming those 3 variables as follows:

root_expr_kind
root_expr_name
root_bool_expr_type

+1

Thanks. Attached is the patch with that change. I'll update the commitfest
entry with a link to this email.

I chechecked it - patched without warnings, all tests passed

It is ready for commit

Regards

Pavel

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Pavel Stehule (#9)

Re: review: Non-recursive processing of AND/OR lists

On Sun, Jun 30, 2013 at 1:08 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2013/6/30 Gurjeet Singh <gurjeet@singh.im>:

On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

2013/6/30 Gurjeet Singh <gurjeet@singh.im>:

On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule
<pavel.stehule@gmail.com>
wrote:

How about naming those 3 variables as follows:

root_expr_kind
root_expr_name
root_bool_expr_type

+1

Thanks. Attached is the patch with that change. I'll update the commitfest
entry with a link to this email.

ok

I chechecked it - patched without warnings, all tests passed

It is ready for commit

I think it's a waste of code to try to handle bushy trees. A list is
not a particularly efficient representation of the pending list; this
will probably be slower than recusing in the common case. I'd suggest
keeping the logic to handle left-deep trees, which I find rather
elegant, but ditching the pending list.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Josh Berkus

josh@agliodbs.com

over 12 years ago

In reply to: Pavel Stehule (#1)

Re: review: Non-recursive processing of AND/OR lists

I think it's a waste of code to try to handle bushy trees. A list is
not a particularly efficient representation of the pending list; this
will probably be slower than recusing in the common case. I'd suggest
keeping the logic to handle left-deep trees, which I find rather
elegant, but ditching the pending list.

Is there going to be further discussion of this patch, or do I return it?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Reply to msg id not found: WM2d8ebdb4581e5ebd3a0f307bc26e839cd7831623f1ebe3a81c0ea8e7af3d3c0fcc262a774f779b0b1b360aa4e0e6866e@asav-3.01.com

#12

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Josh Berkus (#11)

Re: review: Non-recursive processing of AND/OR lists

On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus <josh@agliodbs.com> wrote:

I think it's a waste of code to try to handle bushy trees. A list is
not a particularly efficient representation of the pending list; this
will probably be slower than recusing in the common case. I'd suggest
keeping the logic to handle left-deep trees, which I find rather
elegant, but ditching the pending list.

Is there going to be further discussion of this patch, or do I return it?

Considering it's not been updated, nor my comments responded to, in
almost two weeks, I think we return it at this point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Robert Haas (#12)

1 attachment(s)

Re: review: Non-recursive processing of AND/OR lists

On Sun, Jul 14, 2013 at 8:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus <josh@agliodbs.com> wrote:

I think it's a waste of code to try to handle bushy trees. A list is
not a particularly efficient representation of the pending list; this
will probably be slower than recusing in the common case. I'd suggest
keeping the logic to handle left-deep trees, which I find rather
elegant, but ditching the pending list.

Somehow I find it hard to believe that recursing would be more efficient
than processing the items right there. The recursion is not direct either;
transformExprRecurse() is going to call this function again, but after a
few more switch-case comparisons.

Agreed that there's overhead in allocating list items, but is it more
overhead than pushing functions on the call stack? Not sure, so I leave it
to others who understand such things better than I do.

If by common-case you mean a list of just one logical AND/OR operator, then
I agree that creating and destroying a list may incur overhead that is
relatively very expensive. To that end, I have altered the patch, attached,
to not build a pending list until we encounter a node with root_expr_kind
in a right branch.

We're getting bushy-tree processing with very little extra code, but if you
deem it not worthwhile or adding complexity, please feel free to rip it out.

Is there going to be further discussion of this patch, or do I return it?

Considering it's not been updated, nor my comments responded to, in
almost two weeks, I think we return it at this point.

Sorry, I didn't notice that this patch was put back in 'Waiting on Author'
state.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Attachments:

non_recursive_and_or_transformation_v6.patchapplication/octet-stream; name=non_recursive_and_or_transformation_v6.patchDownload

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 06f6512..5cbf577 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -40,8 +40,7 @@ bool		Transform_null_equals = false;
 static Node *transformExprRecurse(ParseState *pstate, Node *expr);
 static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
 static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
-static Node *transformAExprAnd(ParseState *pstate, A_Expr *a);
-static Node *transformAExprOr(ParseState *pstate, A_Expr *a);
+static Node *transformAExprAndOr(ParseState *pstate, A_Expr *a);
 static Node *transformAExprNot(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
@@ -223,10 +222,10 @@ transformExprRecurse(ParseState *pstate, Node *expr)
 						result = transformAExprOp(pstate, a);
 						break;
 					case AEXPR_AND:
-						result = transformAExprAnd(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_OR:
-						result = transformAExprOr(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_NOT:
 						result = transformAExprNot(pstate, a);
@@ -917,32 +916,85 @@ transformAExprOp(ParseState *pstate, A_Expr *a)
 	return result;
 }
 
+/*
+ * Transform the AND/OR trees non-recursively.
+ *
+ * The parser turns a list of consecutive AND expressions into a left-deep tree.
+ *
+ * a AND b AND c
+ *
+ *      AND
+ *     /  \
+ *   AND   c
+ *  /  \
+ * a    b
+ *
+ * For very long lists, it gets deep enough that processing it recursively causes
+ * check_stack_depth() to raise error and abort the query. Hence, it is necessary
+ * that we process these trees iteratively.
+ */
 static Node *
-transformAExprAnd(ParseState *pstate, A_Expr *a)
+transformAExprAndOr(ParseState *pstate, A_Expr *a)
 {
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
+	List		   *exprs = NIL;
+	List		   *pending = NIL;
+	Node		   *expr;
+	A_Expr		   *root = a;
+	A_Expr_Kind		root_expr_kind = a->kind;
+	char		   *root_expr_name = (root_expr_kind == AEXPR_AND ? "AND" : "OR");
+	BoolExprType	root_bool_expr_type = (root_expr_kind == AEXPR_AND ? AND_EXPR : OR_EXPR);
 
-	lexpr = coerce_to_boolean(pstate, lexpr, "AND");
-	rexpr = coerce_to_boolean(pstate, rexpr, "AND");
+	do {
+		Node *tmp;
 
-	return (Node *) makeBoolExpr(AND_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
-}
+		/*
+		 * Follow the left links to walk the left-deep tree, and process all the
+		 * right branches.
+		 *
+		 * If a right branch is also the same kind of tree as the root, then
+		 * append it to the 'pending' list. The pending list is also processed
+		 * in this function call iteratively rather than recursively.  This
+		 * allows us to process even bushy trees, not just left-deep trees.
+		 */
+		tmp = (Node*)a;
+		do {
+			a = (A_Expr*) tmp;
 
-static Node *
-transformAExprOr(ParseState *pstate, A_Expr *a)
-{
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
+			if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_expr_kind)
+			{
+				pending = lappend(pending, a->rexpr);
+			}
+			else
+			{
+				expr = transformExprRecurse(pstate, a->rexpr);
+				expr = coerce_to_boolean(pstate, expr, root_expr_name);
+				exprs = lcons(expr, exprs);
+			}
 
-	lexpr = coerce_to_boolean(pstate, lexpr, "OR");
-	rexpr = coerce_to_boolean(pstate, rexpr, "OR");
+			tmp = a->lexpr;
+		} while (IsA(tmp, A_Expr) && ((A_Expr*)tmp)->kind == root_expr_kind);
 
-	return (Node *) makeBoolExpr(OR_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
+		/* Now process the last left expression */
+		expr = transformExprRecurse(pstate, a->lexpr);
+		expr = coerce_to_boolean(pstate, expr, root_expr_name);
+		exprs = lcons(expr, exprs);
+
+        /*
+		 * Now that we're done processing the edge of the left-deep tree, pop
+		 * the first element from the front of the pending list and process any
+		 * interesting nodes we found earlier.
+		 */
+		if (list_length(pending) > 0)
+		{
+			a = (A_Expr*) linitial(pending);
+			pending = list_delete_first(pending);
+		}
+		else
+			a = NULL;
+
+	} while(a != NULL);
+
+	return (Node *) makeBoolExpr(root_bool_expr_type, exprs, root->location);
 }
 
 static Node *
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 8f24c51..41bb098 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2095,7 +2095,7 @@ SELECT viewname, definition FROM pg_views WHERE schemaname <> 'information_schem
                                  |     int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail                                                                                                                                                     +
                                  |    FROM shoe rsh,                                                                                                                                                                                              +
                                  |     shoelace rsl                                                                                                                                                                                               +
-                                 |   WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
+                                 |   WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
  shoelace                        |  SELECT s.sl_name,                                                                                                                                                                                             +
                                  |     s.sl_avail,                                                                                                                                                                                                +
                                  |     s.sl_color,                                                                                                                                                                                                +

#14

Pavel Stehule

pavel.stehule@gmail.com

over 12 years ago

In reply to: Gurjeet Singh (#13)

Re: review: Non-recursive processing of AND/OR lists

Hello

2013/7/15 Gurjeet Singh <gurjeet@singh.im>:

On Sun, Jul 14, 2013 at 8:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus <josh@agliodbs.com> wrote:

I think it's a waste of code to try to handle bushy trees. A list is
not a particularly efficient representation of the pending list; this
will probably be slower than recusing in the common case. I'd suggest
keeping the logic to handle left-deep trees, which I find rather
elegant, but ditching the pending list.

Somehow I find it hard to believe that recursing would be more efficient
than processing the items right there. The recursion is not direct either;
transformExprRecurse() is going to call this function again, but after a few
more switch-case comparisons.

Agreed that there's overhead in allocating list items, but is it more
overhead than pushing functions on the call stack? Not sure, so I leave it
to others who understand such things better than I do.

If by common-case you mean a list of just one logical AND/OR operator, then
I agree that creating and destroying a list may incur overhead that is
relatively very expensive. To that end, I have altered the patch, attached,
to not build a pending list until we encounter a node with root_expr_kind in
a right branch.

We're getting bushy-tree processing with very little extra code, but if you
deem it not worthwhile or adding complexity, please feel free to rip it out.

Is there going to be further discussion of this patch, or do I return
it?

Considering it's not been updated, nor my comments responded to, in
almost two weeks, I think we return it at this point.

Sorry, I didn't notice that this patch was put back in 'Waiting on Author'
state.

I did a some performance tests of v5 and v6 version and there v5 is
little bit faster than v6, and v6 has significantly higher stddev

but I am not sure, if my test is correct - I tested a speed of EXPLAIN
statement - result was forwarded to /dev/null

Result of this test is probably related to tested pattern of
expressions - in this case "expr or expr or expr or expr or expr ... "

10 000 exprs (ms)

v | avg | stddev
---+---------+--------
5 | 1839.14 | 13.68
6 | 1871.77 | 48.02

==v5 profile==
209064 43.5354 postgres equal
207849 43.2824 postgres process_equivalence
37453 7.7992 postgres datumIsEqual
3178 0.6618 postgres SearchCatCache
2350 0.4894 postgres AllocSetAlloc

==v6 profile==
193251 45.3998 postgres process_equivalence
178183 41.8599 postgres equal
30430 7.1488 postgres datumIsEqual
2819 0.6623 postgres SearchCatCache
1951 0.4583 postgres AllocSetAlloc

I found so 9.4 planner is about 1% slower (for test that sent by
Gurjeet), that than 9.2 planner, but it is not related to this patch

v6 is clean and all regression tests was passed

Regards

Pavel

Best regards,

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Pavel Stehule (#14)

Re: review: Non-recursive processing of AND/OR lists

On Tue, Jul 16, 2013 at 4:04 PM, Pavel Stehule <pavel.stehule@gmail.com>wrote:

I did a some performance tests of v5 and v6 version and there v5 is
little bit faster than v6, and v6 has significantly higher stddev

Thanks Pavel.

The difference in average seems negligible, but stddev is interesting
because v6 does less work than v5 in common cases and in the test that I
had shared.

The current commitfest (2013-06) is marked as 'In Progress', so is it okay
to just mark the patch as 'Ready for Committer' or should I move it to the
next commitfest (2013-09).

What's the procedure of moving a patch to the next commitfest? Do I make a
fresh submission there with a link to current submission, or is the move
doable somehow in the application itself.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

#16

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Gurjeet Singh (#15)

Re: review: Non-recursive processing of AND/OR lists

On Wed, Jul 17, 2013 at 8:21 AM, Gurjeet Singh <gurjeet@singh.im> wrote:

What's the procedure of moving a patch to the next commitfest?

Never mind, I see an email from Josh B. regarding this on my corporate
account.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

#17

Josh Berkus

josh@agliodbs.com

over 12 years ago

In reply to: Pavel Stehule (#1)

Re: review: Non-recursive processing of AND/OR lists

On 07/17/2013 05:21 AM, Gurjeet Singh wrote:

On Tue, Jul 16, 2013 at 4:04 PM, Pavel Stehule <pavel.stehule@gmail.com>wrote:

I did a some performance tests of v5 and v6 version and there v5 is
little bit faster than v6, and v6 has significantly higher stddev

Thanks Pavel.

The difference in average seems negligible, but stddev is interesting
because v6 does less work than v5 in common cases and in the test that I
had shared.

The current commitfest (2013-06) is marked as 'In Progress', so is it okay
to just mark the patch as 'Ready for Committer' or should I move it to the
next commitfest (2013-09).

If this is actually "ready for committer", I'll mark it as such.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Reply to msg id not found: WM6504935b21bc86ac72a14375b73a34b0e1ac5cb01ad0fa3f155e7ac3aac092eacb1dae67f0b791cd39dfd85505870d32@asav-1.01.com

#18

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Gurjeet Singh (#13)

Re: review: Non-recursive processing of AND/OR lists

On Mon, Jul 15, 2013 at 12:45 AM, Gurjeet Singh <gurjeet@singh.im> wrote:

Agreed that there's overhead in allocating list items, but is it more
overhead than pushing functions on the call stack? Not sure, so I leave it
to others who understand such things better than I do.

If you think that a palloc can ever be cheaper that pushing a frame on
the callstack, you're wrong. palloc is not some kind of an atomic
primitive. It's implemented by the AllocSetAlloc function, and you're
going to have to push that function on the call stack, too, in order
to run it.

My main point here is that if the user writes a = 1 and b = 1 and c =
1 and d = 1, they're not going to end up with a bushy tree. They're
going to end up with a tree that's only deep in one direction (left, I
guess) and that's the case we might want to consider optimizing. To
obtain a bushy tree, they're going to have to write a = 1 and (b = 1
and c = 1) and d = 1, or something like that, and I don't see why we
should stress out about that case. It will be rare in practice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Robert Haas (#18)

Re: review: Non-recursive processing of AND/OR lists

On Wed, Jul 17, 2013 at 1:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jul 15, 2013 at 12:45 AM, Gurjeet Singh <gurjeet@singh.im> wrote:

Agreed that there's overhead in allocating list items, but is it more
overhead than pushing functions on the call stack? Not sure, so I leave

it

to others who understand such things better than I do.

If you think that a palloc can ever be cheaper that pushing a frame on
the callstack, you're wrong. palloc is not some kind of an atomic
primitive. It's implemented by the AllocSetAlloc function, and you're
going to have to push that function on the call stack, too, in order
to run it.

Agreed. I take my objection back. Even if AllocSetAlloc() reuses memory
that was pfree'd earlier, it'll still be at least as expensive as recursing.

My main point here is that if the user writes a = 1 and b = 1 and c =
1 and d = 1, they're not going to end up with a bushy tree. They're
going to end up with a tree that's only deep in one direction (left, I
guess) and that's the case we might want to consider optimizing. To
obtain a bushy tree, they're going to have to write a = 1 and (b = 1
and c = 1) and d = 1, or something like that, and I don't see why we
should stress out about that case. It will be rare in practice.

In v6 of the patch, I have deferred the 'pending' list initialization to
until we actually hit a candidate right-branch. So in the common case the
pending list will never be populated, and if we find a bushy or right-deep
tree (for some reason an ORM/tool may choose to build AND/OR lists that may
end being right-deep when in Postgres), then the pending list will be used
to process them iteratively.

Does that alleviate your concern about 'pending' list management causing an
overhead.

Agreed that bushy/right-deep trees are a remote corner case, but we are
addressing a remote corner case in the first place (insanely long AND
lists) and why not handle another remote corner case right now if it
doesn't cause an overhead for common case.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

#20

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Gurjeet Singh (#19)

Re: review: Non-recursive processing of AND/OR lists

On Wed, Jul 17, 2013 at 2:03 PM, Gurjeet Singh <gurjeet@singh.im> wrote:

In v6 of the patch, I have deferred the 'pending' list initialization to
until we actually hit a candidate right-branch. So in the common case the
pending list will never be populated, and if we find a bushy or right-deep
tree (for some reason an ORM/tool may choose to build AND/OR lists that may
end being right-deep when in Postgres), then the pending list will be used
to process them iteratively.

Does that alleviate your concern about 'pending' list management causing an
overhead.

Agreed that bushy/right-deep trees are a remote corner case, but we are
addressing a remote corner case in the first place (insanely long AND lists)
and why not handle another remote corner case right now if it doesn't cause
an overhead for common case.

Because simpler code is less likely to have bugs and is easier to
maintain. It's worth noting that the change you're proposing is in
fact user-visible, as demonstrated by the fact that you had to update
the regression test output:

-                                 |   WHERE (((rsl.sl_color =
rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND
(rsl.sl_len_cm <= rsh.slmaxlen_cm));
+                                 |   WHERE ((rsl.sl_color =
rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm
<= rsh.slmaxlen_cm));

Now, I think that change is actually an improvement, because here's
what that WHERE clause looked like when it was entered:

WHERE rsl.sl_color = rsh.slcolor
AND rsl.sl_len_cm >= rsh.slminlen_cm
AND rsl.sl_len_cm <= rsh.slmaxlen_cm;

But flattening a = 1 AND (b = 1 AND c = 1 AND d = 1) AND e = 1 to a
flat list doesn't have any of the same advantages.

At the end of the day, this is a judgement call, and I'm giving you
mine. If somebody else wants to weigh in, that's fine. I can't think
of anything that would actually be outright broken under your proposed
approach, but my personal feeling is that it's better to only add the
amount of code that we know is needed to solve the problem actually
observed in practice, and no more.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Robert Haas (#20)

Re: review: Non-recursive processing of AND/OR lists

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jul 17, 2013 at 2:03 PM, Gurjeet Singh <gurjeet@singh.im> wrote:

Agreed that bushy/right-deep trees are a remote corner case, but we are
addressing a remote corner case in the first place (insanely long AND lists)
and why not handle another remote corner case right now if it doesn't cause
an overhead for common case.

Because simpler code is less likely to have bugs and is easier to
maintain.

I agree with that point, but one should also remember Polya's Inventor's
Paradox: the more general problem may be easier to solve. That is, if
done right, code that fully flattens an AND tree might actually be
simpler than code that does just a subset of the transformation. The
current patch fails to meet this expectation, but maybe you just haven't
thought about it the right way.

My concerns about this patch have little to do with that, though, and
much more to do with the likelihood that it breaks some other piece of
code that is expecting AND/OR to be strictly binary operators, which
is what they've always been in parsetrees that haven't reached the
planner. It doesn't appear to me that you've done any research on that
point whatsoever --- you have not even updated the comment for BoolExpr
(in primnodes.h) that this patch falsifies.

It's worth noting that the change you're proposing is in
fact user-visible, as demonstrated by the fact that you had to update
the regression test output:

The point to worry about here is whether rule dump and reload is still
safe. In particular, the logic in ruleutils.c for deciding whether it's
safe to omit parentheses has only really been thought about/tested for
the binary AND/OR case. Although that code can dump N-way AND/OR
because it's also used to print post-planner expression trees in EXPLAIN,
that case has never been held to the standard of "is the parser
guaranteed to interpret this expression the same as before?". Perhaps
it's fine, but has anyone looked at that issue?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22

Gurjeet Singh

gurjeet@singh.im

over 12 years ago

In reply to: Tom Lane (#21)

Re: review: Non-recursive processing of AND/OR lists

On Thu, Jul 18, 2013 at 10:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Because simpler code is less likely to have bugs and is easier to
maintain.

I agree with that point, but one should also remember Polya's Inventor's
Paradox: the more general problem may be easier to solve. That is, if
done right, code that fully flattens an AND tree might actually be
simpler than code that does just a subset of the transformation. The
current patch fails to meet this expectation,

The current patch does completely flatten any type of tree (left/right-deep
or bushy) without recursing, and right-deep and bushy tree processing is
what Robert is recommending to defer to recursive processing. Maybe I
haven't considered a case where it doesn't flatten the tree; do you have an
example in mind.

but maybe you just haven't
thought about it the right way.

My concerns about this patch have little to do with that, though, and
much more to do with the likelihood that it breaks some other piece of
code that is expecting AND/OR to be strictly binary operators, which
is what they've always been in parsetrees that haven't reached the
planner. It doesn't appear to me that you've done any research on that
point whatsoever

No, I haven't, and I might not be able to research it for a few more weeks.

you have not even updated the comment for BoolExpr
(in primnodes.h) that this patch falsifies.

I will fix that.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

#23

Gurjeet Singh

gurjeet@singh.im

over 11 years ago

In reply to: Gurjeet Singh (#22)

1 attachment(s)

Re: review: Non-recursive processing of AND/OR lists

On Thu, Jul 18, 2013 at 1:54 PM, Gurjeet Singh <gurjeet@singh.im> wrote:

On Thu, Jul 18, 2013 at 10:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Because simpler code is less likely to have bugs and is easier to
maintain.

I agree with that point, but one should also remember Polya's Inventor's
Paradox: the more general problem may be easier to solve. That is, if
done right, code that fully flattens an AND tree might actually be
simpler than code that does just a subset of the transformation. The
current patch fails to meet this expectation,

The current patch does completely flatten any type of tree
(left/right-deep or bushy) without recursing, and right-deep and bushy tree
processing is what Robert is recommending to defer to recursive processing.
Maybe I haven't considered a case where it doesn't flatten the tree; do you
have an example in mind.

but maybe you just haven't
thought about it the right way.

I tried to eliminate the 'pending' list, but I don't see a way around it.
We need temporary storage somewhere to store the branches encountered on
the right; in recursion case the call stack was serving that purpose.

My concerns about this patch have little to do with that, though, and
much more to do with the likelihood that it breaks some other piece of
code that is expecting AND/OR to be strictly binary operators, which
is what they've always been in parsetrees that haven't reached the
planner. It doesn't appear to me that you've done any research on that
point whatsoever

No, I haven't, and I might not be able to research it for a few more weeks.

There are about 30 files (including optimizer and executor) that match
case-insensitive search for BoolExpr, and I scanned those for the usage of
the member 'args'. All the instances where BoolExpr.args is being accessed,
it's being treated as a null-terminated list. There's one exception that I
could find, and it was in context of NOT expression: not_clause() in
clauses.c.

you have not even updated the comment for BoolExpr
(in primnodes.h) that this patch falsifies.

I will fix that.

I think this line in that comment already covers the fact that in some
"special" cases a BoolExpr may have more than 2 arguments.

"There are also a few special cases where more arguments can appear before
optimization."

I have updated the comment nevertheless, and removed another comment in
parse_expr.c that claimed to be the only place where a BoolExpr with more
than 2 args is generated.

I have isolated the code for right-deep and bushy tree processing via the
macro PROCESS_BUSHY_TREES. Also, I have shortened some variable names while
retaining their meaning.

Please find the updated patch attached (based on master).

Best regards,
--
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com <http://www.enterprisedb.com>

Attachments:

non_recursive_and_or_transformation_v7.patchtext/x-diff; charset=US-ASCII; name=non_recursive_and_or_transformation_v7.patchDownload

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 81c9338..eb35d70 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -41,8 +41,7 @@ bool		Transform_null_equals = false;
 static Node *transformExprRecurse(ParseState *pstate, Node *expr);
 static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
 static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
-static Node *transformAExprAnd(ParseState *pstate, A_Expr *a);
-static Node *transformAExprOr(ParseState *pstate, A_Expr *a);
+static Node *transformAExprAndOr(ParseState *pstate, A_Expr *a);
 static Node *transformAExprNot(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
@@ -224,10 +223,10 @@ transformExprRecurse(ParseState *pstate, Node *expr)
 						result = transformAExprOp(pstate, a);
 						break;
 					case AEXPR_AND:
-						result = transformAExprAnd(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_OR:
-						result = transformAExprOr(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_NOT:
 						result = transformAExprNot(pstate, a);
@@ -918,32 +917,102 @@ transformAExprOp(ParseState *pstate, A_Expr *a)
 	return result;
 }
 
+/*
+ * Transform the AND/OR trees non-recursively.
+ *
+ * The parser turns a list of consecutive AND expressions into a left-deep tree.
+ *
+ * a AND b AND c
+ *
+ *      AND
+ *     /  \
+ *   AND   c
+ *  /  \
+ * a    b
+ *
+ * For very long lists, it gets deep enough that processing it recursively causes
+ * check_stack_depth() to raise error and abort the query. Hence, it is necessary
+ * that we process these trees iteratively.
+ */
 static Node *
-transformAExprAnd(ParseState *pstate, A_Expr *a)
+transformAExprAndOr(ParseState *pstate, A_Expr *a)
 {
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
+#define PROCESS_BUSHY_TREES 1
+	List		   *exprs = NIL;
+#if PROCESS_BUSHY_TREES
+	List		   *pending = NIL;
+#endif
+	Node		   *expr;
+	A_Expr		   *root = a;
+	A_Expr_Kind		root_kind = a->kind;
+	char		   *root_name = (root_kind == AEXPR_AND ? "AND" : "OR");
+	BoolExprType	root_expr_type = (root_kind == AEXPR_AND ? AND_EXPR : OR_EXPR);
+
+	do {
+		Node *tmp;
 
-	lexpr = coerce_to_boolean(pstate, lexpr, "AND");
-	rexpr = coerce_to_boolean(pstate, rexpr, "AND");
+		/*
+		 * Follow the left links to walk the left-deep tree, and process all the
+		 * right branches. We traverse down the left branch as long as we find
+		 * an unbroken chain of node types that match the root node. Then we
+		 * stop and process the rest of the tree in normal recursive manner.
+#if PROCESS_BUSHY_TREES
+		 *
+		 * If a right branch is also the same kind of tree as the root, then
+		 * append it to the 'pending' list. The pending list is also processed
+		 * in this function call iteratively rather than recursively.  This
+		 * allows us to process bushy and even right-deep trees, not just
+		 * left-deep trees.
+#endif
+		 */
+		tmp = (Node*)a;
+		do {
+			a = (A_Expr*) tmp;
 
-	return (Node *) makeBoolExpr(AND_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
-}
+#if PROCESS_BUSHY_TREES
+			if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_kind)
+			{
+				pending = lappend(pending, a->rexpr);
+			}
+			else
+#endif
+			{
+				expr = transformExprRecurse(pstate, a->rexpr);
+				expr = coerce_to_boolean(pstate, expr, root_name);
+				/*
+				 * Use lcons instead of lappend to retain the order of
+				 * processing of right branches close to the way it was in the
+				 * case of recursive processing.
+				 */
+				exprs = lcons(expr, exprs);
+			}
 
-static Node *
-transformAExprOr(ParseState *pstate, A_Expr *a)
-{
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
+			tmp = a->lexpr;
+		} while (IsA(tmp, A_Expr) && ((A_Expr*)tmp)->kind == root_kind);
 
-	lexpr = coerce_to_boolean(pstate, lexpr, "OR");
-	rexpr = coerce_to_boolean(pstate, rexpr, "OR");
+		/* Now process the node in left branch that broke our chain. */
+		expr = transformExprRecurse(pstate, a->lexpr);
+		expr = coerce_to_boolean(pstate, expr, root_name);
+		exprs = lcons(expr, exprs);
 
-	return (Node *) makeBoolExpr(OR_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
+#if PROCESS_BUSHY_TREES
+		/*
+		 * Now that we're done processing the edge of the left-deep tree, pop
+		 * the first element from the front of the pending list and process any
+		 * interesting nodes we found earlier.
+		 */
+		if (list_length(pending) > 0)
+		{
+			a = (A_Expr*) linitial(pending);
+			pending = list_delete_first(pending);
+		}
+		else
+#endif
+			a = NULL;
+
+	} while(a != NULL);
+
+	return (Node *) makeBoolExpr(root_expr_type, exprs, root->location);
 }
 
 static Node *
@@ -2428,10 +2497,6 @@ make_row_comparison_op(ParseState *pstate, List *opname,
 	/*
 	 * For = and <> cases, we just combine the pairwise operators with AND or
 	 * OR respectively.
-	 *
-	 * Note: this is presently the only place where the parser generates
-	 * BoolExpr with more than two arguments.  Should be OK since the rest of
-	 * the system thinks BoolExpr is N-argument anyway.
 	 */
 	if (rctype == ROWCOMPARE_EQ)
 		return (Node *) makeBoolExpr(AND_EXPR, opexprs, location);
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 9cce60b..5b5c22b 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -458,12 +458,9 @@ typedef struct ScalarArrayOpExpr
  * BoolExpr - expression node for the basic Boolean operators AND, OR, NOT
  *
  * Notice the arguments are given as a List.  For NOT, of course the list
- * must always have exactly one element.  For AND and OR, the executor can
- * handle any number of arguments.	The parser generally treats AND and OR
- * as binary and so it typically only produces two-element lists, but the
- * optimizer will flatten trees of AND and OR nodes to produce longer lists
- * when possible.  There are also a few special cases where more arguments
- * can appear before optimization.
+ * must always have exactly one element.  For AND and OR, there are two or
+ * more arguments in the list. The optimizer and the executor are capable
+ * of processing these flat lists.
  */
 typedef enum BoolExprType
 {
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6c51d0d..c3188a7 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2117,7 +2117,7 @@ shoe_ready| SELECT rsh.shoename,
     int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail
    FROM shoe rsh,
     shoelace rsl
-  WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
+  WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
 shoelace| SELECT s.sl_name,
     s.sl_avail,
     s.sl_color,

#24

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Gurjeet Singh (#23)

1 attachment(s)

Re: review: Non-recursive processing of AND/OR lists

Gurjeet Singh <gurjeet@singh.im> writes:

I tried to eliminate the 'pending' list, but I don't see a way around it.
We need temporary storage somewhere to store the branches encountered on
the right; in recursion case the call stack was serving that purpose.

I still think we should fix this in the grammar, rather than introducing
complicated logic to try to get rid of the recursion later. For example,
as attached.

The existing A_Expr representation of raw AND/OR nodes isn't conducive to
this, but it's not that hard to change it. The attached patch chooses to
use BoolExpr as both the raw and transformed representation of AND/OR/NOT;
we could alternatively invent some new raw-parsetree node type, but I
don't see any advantage in that.

I continue to think that more thought is needed about downstream
processing. For instance, at least the comment at the head of prepqual.c
is wrong now, and it's worth wondering whether the planner still needs to
worry about AND/OR flattening at all. (It probably does, to deal with
view-flattening cases for example; but it's worth considering whether
anything could be saved if we stopped doing that.)

regards, tom lane

Attachments:

flatten-and-or-in-grammar-1.patchtext/x-diff; charset=us-ascii; name=flatten-and-or-in-grammar-1.patchDownload

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 1e48a7f..95f5dd2 100644
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
*************** raw_expression_tree_walker(Node *node,
*** 3047,3052 ****
--- 3047,3060 ----
  				/* operator name is deemed uninteresting */
  			}
  			break;
+ 		case T_BoolExpr:
+ 			{
+ 				BoolExpr   *expr = (BoolExpr *) node;
+ 
+ 				if (walker(expr->args, context))
+ 					return true;
+ 			}
+ 			break;
  		case T_ColumnRef:
  			/* we assume the fields contain nothing interesting */
  			break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 10e8139..cd4bce1 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outAExpr(StringInfo str, const A_Expr *
*** 2437,2451 ****
  			appendStringInfoChar(str, ' ');
  			WRITE_NODE_FIELD(name);
  			break;
- 		case AEXPR_AND:
- 			appendStringInfoString(str, " AND");
- 			break;
- 		case AEXPR_OR:
- 			appendStringInfoString(str, " OR");
- 			break;
- 		case AEXPR_NOT:
- 			appendStringInfoString(str, " NOT");
- 			break;
  		case AEXPR_OP_ANY:
  			appendStringInfoChar(str, ' ');
  			WRITE_NODE_FIELD(name);
--- 2437,2442 ----
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7b9895d..dd04b1a 100644
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
*************** static void insertSelectOptions(SelectSt
*** 151,156 ****
--- 151,159 ----
  static Node *makeSetOp(SetOperation op, bool all, Node *larg, Node *rarg);
  static Node *doNegate(Node *n, int location);
  static void doNegateFloat(Value *v);
+ static Node *makeAndExpr(Node *lexpr, Node *rexpr, int location);
+ static Node *makeOrExpr(Node *lexpr, Node *rexpr, int location);
+ static Node *makeNotExpr(Node *expr, int location);
  static Node *makeAArrayExpr(List *elements, int location);
  static Node *makeXmlExpr(XmlExprOp op, char *name, List *named_args,
  						 List *args, int location);
*************** a_expr:		c_expr									{ $$ = $1; }
*** 10849,10859 ****
  				{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); }
  
  			| a_expr AND a_expr
! 				{ $$ = (Node *) makeA_Expr(AEXPR_AND, NIL, $1, $3, @2); }
  			| a_expr OR a_expr
! 				{ $$ = (Node *) makeA_Expr(AEXPR_OR, NIL, $1, $3, @2); }
  			| NOT a_expr
! 				{ $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL, $2, @1); }
  
  			| a_expr LIKE a_expr
  				{ $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, $3, @2); }
--- 10852,10862 ----
  				{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); }
  
  			| a_expr AND a_expr
! 				{ $$ = makeAndExpr($1, $3, @2); }
  			| a_expr OR a_expr
! 				{ $$ = makeOrExpr($1, $3, @2); }
  			| NOT a_expr
! 				{ $$ = makeNotExpr($2, @1); }
  
  			| a_expr LIKE a_expr
  				{ $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, $3, @2); }
*************** a_expr:		c_expr									{ $$ = $1; }
*** 11022,11032 ****
  				}
  			| a_expr IS NOT DISTINCT FROM a_expr		%prec IS
  				{
! 					$$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL,
! 									(Node *) makeSimpleA_Expr(AEXPR_DISTINCT,
! 															  "=", $1, $6, @2),
! 											 @2);
! 
  				}
  			| a_expr IS OF '(' type_list ')'			%prec IS
  				{
--- 11025,11033 ----
  				}
  			| a_expr IS NOT DISTINCT FROM a_expr		%prec IS
  				{
! 					$$ = makeNotExpr((Node *) makeSimpleA_Expr(AEXPR_DISTINCT,
! 															   "=", $1, $6, @2),
! 									 @2);
  				}
  			| a_expr IS OF '(' type_list ')'			%prec IS
  				{
*************** a_expr:		c_expr									{ $$ = $1; }
*** 11044,11086 ****
  			 */
  			| a_expr BETWEEN opt_asymmetric b_expr AND b_expr		%prec BETWEEN
  				{
! 					$$ = (Node *) makeA_Expr(AEXPR_AND, NIL,
  						(Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2),
  						(Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2),
! 											 @2);
  				}
  			| a_expr NOT BETWEEN opt_asymmetric b_expr AND b_expr	%prec BETWEEN
  				{
! 					$$ = (Node *) makeA_Expr(AEXPR_OR, NIL,
  						(Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2),
  						(Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2),
! 											 @2);
  				}
  			| a_expr BETWEEN SYMMETRIC b_expr AND b_expr			%prec BETWEEN
  				{
! 					$$ = (Node *) makeA_Expr(AEXPR_OR, NIL,
! 						(Node *) makeA_Expr(AEXPR_AND, NIL,
  							(Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2),
  							(Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2),
! 											@2),
! 						(Node *) makeA_Expr(AEXPR_AND, NIL,
  							(Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $6, @2),
  							(Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $4, @2),
! 											@2),
! 											 @2);
  				}
  			| a_expr NOT BETWEEN SYMMETRIC b_expr AND b_expr		%prec BETWEEN
  				{
! 					$$ = (Node *) makeA_Expr(AEXPR_AND, NIL,
! 						(Node *) makeA_Expr(AEXPR_OR, NIL,
  							(Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2),
  							(Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2),
! 											@2),
! 						(Node *) makeA_Expr(AEXPR_OR, NIL,
  							(Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $7, @2),
  							(Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $5, @2),
! 											@2),
! 											 @2);
  				}
  			| a_expr IN_P in_expr
  				{
--- 11045,11087 ----
  			 */
  			| a_expr BETWEEN opt_asymmetric b_expr AND b_expr		%prec BETWEEN
  				{
! 					$$ = makeAndExpr(
  						(Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2),
  						(Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2),
! 									 @2);
  				}
  			| a_expr NOT BETWEEN opt_asymmetric b_expr AND b_expr	%prec BETWEEN
  				{
! 					$$ = makeOrExpr(
  						(Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2),
  						(Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2),
! 									@2);
  				}
  			| a_expr BETWEEN SYMMETRIC b_expr AND b_expr			%prec BETWEEN
  				{
! 					$$ = makeOrExpr(
! 						  makeAndExpr(
  							(Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2),
  							(Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2),
! 									  @2),
! 						  makeAndExpr(
  							(Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $6, @2),
  							(Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $4, @2),
! 									  @2),
! 									@2);
  				}
  			| a_expr NOT BETWEEN SYMMETRIC b_expr AND b_expr		%prec BETWEEN
  				{
! 					$$ = makeAndExpr(
! 						   makeOrExpr(
  							(Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2),
  							(Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2),
! 									  @2),
! 						   makeOrExpr(
  							(Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $7, @2),
  							(Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $5, @2),
! 									  @2),
! 									 @2);
  				}
  			| a_expr IN_P in_expr
  				{
*************** a_expr:		c_expr									{ $$ = $1; }
*** 11114,11120 ****
  						n->operName = list_make1(makeString("="));
  						n->location = @3;
  						/* Stick a NOT on top */
! 						$$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL, (Node *) n, @2);
  					}
  					else
  					{
--- 11115,11121 ----
  						n->operName = list_make1(makeString("="));
  						n->location = @3;
  						/* Stick a NOT on top */
! 						$$ = makeNotExpr((Node *) n, @2);
  					}
  					else
  					{
*************** a_expr:		c_expr									{ $$ = $1; }
*** 11162,11171 ****
  				}
  			| a_expr IS NOT DOCUMENT_P				%prec IS
  				{
! 					$$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL,
! 											 makeXmlExpr(IS_DOCUMENT, NULL, NIL,
! 														 list_make1($1), @2),
! 											 @2);
  				}
  		;
  
--- 11163,11171 ----
  				}
  			| a_expr IS NOT DOCUMENT_P				%prec IS
  				{
! 					$$ = makeNotExpr(makeXmlExpr(IS_DOCUMENT, NULL, NIL,
! 												 list_make1($1), @2),
! 									 @2);
  				}
  		;
  
*************** b_expr:		c_expr
*** 11216,11223 ****
  				}
  			| b_expr IS NOT DISTINCT FROM b_expr	%prec IS
  				{
! 					$$ = (Node *) makeA_Expr(AEXPR_NOT, NIL,
! 						NULL, (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $6, @2), @2);
  				}
  			| b_expr IS OF '(' type_list ')'		%prec IS
  				{
--- 11216,11224 ----
  				}
  			| b_expr IS NOT DISTINCT FROM b_expr	%prec IS
  				{
! 					$$ = makeNotExpr((Node *) makeSimpleA_Expr(AEXPR_DISTINCT,
! 															   "=", $1, $6, @2),
! 									 @2);
  				}
  			| b_expr IS OF '(' type_list ')'		%prec IS
  				{
*************** b_expr:		c_expr
*** 11234,11243 ****
  				}
  			| b_expr IS NOT DOCUMENT_P				%prec IS
  				{
! 					$$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL,
! 											 makeXmlExpr(IS_DOCUMENT, NULL, NIL,
! 														 list_make1($1), @2),
! 											 @2);
  				}
  		;
  
--- 11235,11243 ----
  				}
  			| b_expr IS NOT DOCUMENT_P				%prec IS
  				{
! 					$$ = makeNotExpr(makeXmlExpr(IS_DOCUMENT, NULL, NIL,
! 												 list_make1($1), @2),
! 									 @2);
  				}
  		;
  
*************** doNegateFloat(Value *v)
*** 13693,13698 ****
--- 13693,13738 ----
  }
  
  static Node *
+ makeAndExpr(Node *lexpr, Node *rexpr, int location)
+ {
+ 	/* Flatten "a AND b AND c ..." to a single BoolExpr on sight */
+ 	if (IsA(lexpr, BoolExpr))
+ 	{
+ 		BoolExpr *blexpr = (BoolExpr *) lexpr;
+ 
+ 		if (blexpr->boolop == AND_EXPR)
+ 		{
+ 			blexpr->args = lappend(blexpr->args, rexpr);
+ 			return (Node *) blexpr;
+ 		}
+ 	}
+ 	return (Node *) makeBoolExpr(AND_EXPR, list_make2(lexpr, rexpr), location);
+ }
+ 
+ static Node *
+ makeOrExpr(Node *lexpr, Node *rexpr, int location)
+ {
+ 	/* Flatten "a OR b OR c ..." to a single BoolExpr on sight */
+ 	if (IsA(lexpr, BoolExpr))
+ 	{
+ 		BoolExpr *blexpr = (BoolExpr *) lexpr;
+ 
+ 		if (blexpr->boolop == OR_EXPR)
+ 		{
+ 			blexpr->args = lappend(blexpr->args, rexpr);
+ 			return (Node *) blexpr;
+ 		}
+ 	}
+ 	return (Node *) makeBoolExpr(OR_EXPR, list_make2(lexpr, rexpr), location);
+ }
+ 
+ static Node *
+ makeNotExpr(Node *expr, int location)
+ {
+ 	return (Node *) makeBoolExpr(NOT_EXPR, list_make1(expr), location);
+ }
+ 
+ static Node *
  makeAArrayExpr(List *elements, int location)
  {
  	A_ArrayExpr *n = makeNode(A_ArrayExpr);
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index aa704bb..a71f611 100644
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
*************** transformJoinUsingClause(ParseState *pst
*** 332,338 ****
  						 RangeTblEntry *leftRTE, RangeTblEntry *rightRTE,
  						 List *leftVars, List *rightVars)
  {
! 	Node	   *result = NULL;
  	ListCell   *lvars,
  			   *rvars;
  
--- 332,339 ----
  						 RangeTblEntry *leftRTE, RangeTblEntry *rightRTE,
  						 List *leftVars, List *rightVars)
  {
! 	Node	   *result;
! 	List	   *andargs = NIL;
  	ListCell   *lvars,
  			   *rvars;
  
*************** transformJoinUsingClause(ParseState *pst
*** 358,375 ****
  							 copyObject(lvar), copyObject(rvar),
  							 -1);
  
! 		/* And combine into an AND clause, if multiple join columns */
! 		if (result == NULL)
! 			result = (Node *) e;
! 		else
! 		{
! 			A_Expr	   *a;
! 
! 			a = makeA_Expr(AEXPR_AND, NIL, result, (Node *) e, -1);
! 			result = (Node *) a;
! 		}
  	}
  
  	/*
  	 * Since the references are already Vars, and are certainly from the input
  	 * relations, we don't have to go through the same pushups that
--- 359,374 ----
  							 copyObject(lvar), copyObject(rvar),
  							 -1);
  
! 		/* Prepare to combine into an AND clause, if multiple join columns */
! 		andargs = lappend(andargs, e);
  	}
  
+ 	/* Only need an AND if there's more than one join column */
+ 	if (list_length(andargs) == 1)
+ 		result = (Node *) linitial(andargs);
+ 	else
+ 		result = (Node *) makeBoolExpr(AND_EXPR, andargs, -1);
+ 
  	/*
  	 * Since the references are already Vars, and are certainly from the input
  	 * relations, we don't have to go through the same pushups that
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 81c9338..cb76133 100644
*** a/src/backend/parser/parse_expr.c
--- b/src/backend/parser/parse_expr.c
*************** bool		Transform_null_equals = false;
*** 41,55 ****
  static Node *transformExprRecurse(ParseState *pstate, Node *expr);
  static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
  static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
- static Node *transformAExprAnd(ParseState *pstate, A_Expr *a);
- static Node *transformAExprOr(ParseState *pstate, A_Expr *a);
- static Node *transformAExprNot(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
  static Node *transformAExprDistinct(ParseState *pstate, A_Expr *a);
  static Node *transformAExprNullIf(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOf(ParseState *pstate, A_Expr *a);
  static Node *transformAExprIn(ParseState *pstate, A_Expr *a);
  static Node *transformFuncCall(ParseState *pstate, FuncCall *fn);
  static Node *transformCaseExpr(ParseState *pstate, CaseExpr *c);
  static Node *transformSubLink(ParseState *pstate, SubLink *sublink);
--- 41,53 ----
  static Node *transformExprRecurse(ParseState *pstate, Node *expr);
  static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
  static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
  static Node *transformAExprDistinct(ParseState *pstate, A_Expr *a);
  static Node *transformAExprNullIf(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOf(ParseState *pstate, A_Expr *a);
  static Node *transformAExprIn(ParseState *pstate, A_Expr *a);
+ static Node *transformBoolExpr(ParseState *pstate, BoolExpr *a);
  static Node *transformFuncCall(ParseState *pstate, FuncCall *fn);
  static Node *transformCaseExpr(ParseState *pstate, CaseExpr *c);
  static Node *transformSubLink(ParseState *pstate, SubLink *sublink);
*************** transformExprRecurse(ParseState *pstate,
*** 223,237 ****
  					case AEXPR_OP:
  						result = transformAExprOp(pstate, a);
  						break;
- 					case AEXPR_AND:
- 						result = transformAExprAnd(pstate, a);
- 						break;
- 					case AEXPR_OR:
- 						result = transformAExprOr(pstate, a);
- 						break;
- 					case AEXPR_NOT:
- 						result = transformAExprNot(pstate, a);
- 						break;
  					case AEXPR_OP_ANY:
  						result = transformAExprOpAny(pstate, a);
  						break;
--- 221,226 ----
*************** transformExprRecurse(ParseState *pstate,
*** 258,263 ****
--- 247,256 ----
  				break;
  			}
  
+ 		case T_BoolExpr:
+ 			result = transformBoolExpr(pstate, (BoolExpr *) expr);
+ 			break;
+ 
  		case T_FuncCall:
  			result = transformFuncCall(pstate, (FuncCall *) expr);
  			break;
*************** transformExprRecurse(ParseState *pstate,
*** 337,343 ****
  		case T_DistinctExpr:
  		case T_NullIfExpr:
  		case T_ScalarArrayOpExpr:
- 		case T_BoolExpr:
  		case T_FieldSelect:
  		case T_FieldStore:
  		case T_RelabelType:
--- 330,335 ----
*************** transformAExprOp(ParseState *pstate, A_E
*** 919,964 ****
  }
  
  static Node *
- transformAExprAnd(ParseState *pstate, A_Expr *a)
- {
- 	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
- 	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
- 
- 	lexpr = coerce_to_boolean(pstate, lexpr, "AND");
- 	rexpr = coerce_to_boolean(pstate, rexpr, "AND");
- 
- 	return (Node *) makeBoolExpr(AND_EXPR,
- 								 list_make2(lexpr, rexpr),
- 								 a->location);
- }
- 
- static Node *
- transformAExprOr(ParseState *pstate, A_Expr *a)
- {
- 	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
- 	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
- 
- 	lexpr = coerce_to_boolean(pstate, lexpr, "OR");
- 	rexpr = coerce_to_boolean(pstate, rexpr, "OR");
- 
- 	return (Node *) makeBoolExpr(OR_EXPR,
- 								 list_make2(lexpr, rexpr),
- 								 a->location);
- }
- 
- static Node *
- transformAExprNot(ParseState *pstate, A_Expr *a)
- {
- 	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
- 
- 	rexpr = coerce_to_boolean(pstate, rexpr, "NOT");
- 
- 	return (Node *) makeBoolExpr(NOT_EXPR,
- 								 list_make1(rexpr),
- 								 a->location);
- }
- 
- static Node *
  transformAExprOpAny(ParseState *pstate, A_Expr *a)
  {
  	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
--- 911,916 ----
*************** transformAExprIn(ParseState *pstate, A_E
*** 1238,1243 ****
--- 1190,1231 ----
  }
  
  static Node *
+ transformBoolExpr(ParseState *pstate, BoolExpr *a)
+ {
+ 	List	   *args = NIL;
+ 	const char *opname;
+ 	ListCell   *lc;
+ 
+ 	switch (a->boolop)
+ 	{
+ 		case AND_EXPR:
+ 			opname = "AND";
+ 			break;
+ 		case OR_EXPR:
+ 			opname = "OR";
+ 			break;
+ 		case NOT_EXPR:
+ 			opname = "NOT";
+ 			break;
+ 		default:
+ 			elog(ERROR, "unrecognized boolop: %d", (int) a->boolop);
+ 			opname = NULL;		/* keep compiler quiet */
+ 			break;
+ 	}
+ 
+ 	foreach(lc, a->args)
+ 	{
+ 		Node	   *arg = (Node *) lfirst(lc);
+ 
+ 		arg = transformExprRecurse(pstate, arg);
+ 		arg = coerce_to_boolean(pstate, arg, opname);
+ 		args = lappend(args, arg);
+ 	}
+ 
+ 	return (Node *) makeBoolExpr(a->boolop, args, a->location);
+ }
+ 
+ static Node *
  transformFuncCall(ParseState *pstate, FuncCall *fn)
  {
  	List	   *targs;
*************** make_row_comparison_op(ParseState *pstat
*** 2428,2437 ****
  	/*
  	 * For = and <> cases, we just combine the pairwise operators with AND or
  	 * OR respectively.
- 	 *
- 	 * Note: this is presently the only place where the parser generates
- 	 * BoolExpr with more than two arguments.  Should be OK since the rest of
- 	 * the system thinks BoolExpr is N-argument anyway.
  	 */
  	if (rctype == ROWCOMPARE_EQ)
  		return (Node *) makeBoolExpr(AND_EXPR, opexprs, location);
--- 2416,2421 ----
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 18d4991..59916eb 100644
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
*************** typedef struct ParamRef
*** 225,233 ****
  typedef enum A_Expr_Kind
  {
  	AEXPR_OP,					/* normal operator */
- 	AEXPR_AND,					/* booleans - name field is unused */
- 	AEXPR_OR,
- 	AEXPR_NOT,
  	AEXPR_OP_ANY,				/* scalar op ANY (array) */
  	AEXPR_OP_ALL,				/* scalar op ALL (array) */
  	AEXPR_DISTINCT,				/* IS DISTINCT FROM - name must be "=" */
--- 225,230 ----
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 9cce60b..ed73109 100644
*** a/src/include/nodes/primnodes.h
--- b/src/include/nodes/primnodes.h
*************** typedef struct ScalarArrayOpExpr
*** 458,469 ****
   * BoolExpr - expression node for the basic Boolean operators AND, OR, NOT
   *
   * Notice the arguments are given as a List.  For NOT, of course the list
!  * must always have exactly one element.  For AND and OR, the executor can
!  * handle any number of arguments.	The parser generally treats AND and OR
!  * as binary and so it typically only produces two-element lists, but the
!  * optimizer will flatten trees of AND and OR nodes to produce longer lists
!  * when possible.  There are also a few special cases where more arguments
!  * can appear before optimization.
   */
  typedef enum BoolExprType
  {
--- 458,465 ----
   * BoolExpr - expression node for the basic Boolean operators AND, OR, NOT
   *
   * Notice the arguments are given as a List.  For NOT, of course the list
!  * must always have exactly one element.  For AND and OR, there can be two
!  * or more arguments.
   */
  typedef enum BoolExprType
  {
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6c51d0d..c3188a7 100644
*** a/src/test/regress/expected/rules.out
--- b/src/test/regress/expected/rules.out
*************** shoe_ready| SELECT rsh.shoename,
*** 2117,2123 ****
      int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail
     FROM shoe rsh,
      shoelace rsl
!   WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
  shoelace| SELECT s.sl_name,
      s.sl_avail,
      s.sl_color,
--- 2117,2123 ----
      int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail
     FROM shoe rsh,
      shoelace rsl
!   WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
  shoelace| SELECT s.sl_name,
      s.sl_avail,
      s.sl_color,

#25

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Tom Lane (#24)

Re: review: Non-recursive processing of AND/OR lists

I wrote:

Gurjeet Singh <gurjeet@singh.im> writes:

I tried to eliminate the 'pending' list, but I don't see a way around it.
We need temporary storage somewhere to store the branches encountered on
the right; in recursion case the call stack was serving that purpose.

I still think we should fix this in the grammar, rather than introducing
complicated logic to try to get rid of the recursion later. For example,
as attached.

I went looking for (and found) some additional obsoleted comments, and
convinced myself that ruleutils.c is okay as-is, and pushed this.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#26

Gurjeet Singh

gurjeet@singh.im

over 11 years ago

In reply to: Tom Lane (#25)

Re: review: Non-recursive processing of AND/OR lists

Thanks!

On Mon, Jun 16, 2014 at 3:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

Gurjeet Singh <gurjeet@singh.im> writes:

I tried to eliminate the 'pending' list, but I don't see a way around it.
We need temporary storage somewhere to store the branches encountered on
the right; in recursion case the call stack was serving that purpose.

I still think we should fix this in the grammar, rather than introducing
complicated logic to try to get rid of the recursion later. For example,
as attached.

I went looking for (and found) some additional obsoleted comments, and
convinced myself that ruleutils.c is okay as-is, and pushed this.

regards, tom lane

--
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers