constraint exclusion and nulls in IN (..) clause

Started by Amit Langotealmost 8 years ago24 messages

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

1 attachment(s)

Hi.

When addressing a review comment on the fast partition pruning thread [1]/messages/by-id/64241fee-09af-fe4b-a0ab-7cd739965041@lab.ntt.co.jp,
I noticed that specifying null in the IN-list will cause constraint
exclusion to wrongly fail to refute a table's check predicate.

create table foo (a int check (a = 1));
insert into foo values (null), (1);

-- ExecEvalScalarArrayOp() won't return true for any record in foo
select * from foo where a in (null, 2);
a
---
(0 rows)

-- The null in the IN-list prevents constraint exclusion to exclude foo
-- from being scanned in the first place
explain (costs off) select * from foo where a in (null, 2);
QUERY PLAN
---------------------------------------------
Seq Scan on foo
Filter: (a = ANY ('{NULL,2}'::integer[]))
(2 rows)

I propose a patch that teaches predtest.c to disregard any null values in
a SAOP (i.e., the IN (..) expression) when performing constraint exclusion
using that SAOP, because they cause predicate_refuted_by_recurse()'s logic
to fail to conclude the refutation. There is a rule that all items of an
OR clause (SAOP is treated as one) must refute the predicate. But the
OpExpr constructed with null as its constant argument won't refute
anything and thus will cause the whole OR clause to fail to refute the
predicate.

-- With the patch
explain (costs off) select * from foo where a in (null, 2);
QUERY PLAN
--------------------------
Result
One-Time Filter: false
(2 rows)

explain (costs off) select * from foo where a in (null, 2, 1);
QUERY PLAN
-----------------------------------------------
Seq Scan on foo
Filter: (a = ANY ('{NULL,2,1}'::integer[]))
(2 rows)

Thoughts?

Thanks,
Amit

[1]: /messages/by-id/64241fee-09af-fe4b-a0ab-7cd739965041@lab.ntt.co.jp
/messages/by-id/64241fee-09af-fe4b-a0ab-7cd739965041@lab.ntt.co.jp

Attachments:

v1-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patchtext/plain; charset=UTF-8; name=v1-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patchDownload

From 4e3408541da4b67c37ee5dc733c807c0708e1ba7 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 1 Feb 2018 11:32:23 +0900
Subject: [PATCH v1] Disregard nulls in SAOP rightarg array/list during CE

---
 src/backend/optimizer/util/predtest.c | 6 ++++++
 src/test/regress/expected/inherit.out | 6 +-----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index 8106010329..0928306c62 100644
--- a/src/backend/optimizer/util/predtest.c
+++ b/src/backend/optimizer/util/predtest.c
@@ -968,6 +968,9 @@ arrayconst_next_fn(PredIterInfo info)
 
 	if (state->next_elem >= state->num_elems)
 		return NULL;
+	/* skip nulls */
+	while (state->elem_nulls[state->next_elem])
+		state->next_elem++;
 	state->constexpr.constvalue = state->elem_values[state->next_elem];
 	state->constexpr.constisnull = state->elem_nulls[state->next_elem];
 	state->next_elem++;
@@ -1028,6 +1031,9 @@ arrayexpr_next_fn(PredIterInfo info)
 
 	if (state->next == NULL)
 		return NULL;
+	/* skip nulls */
+	while (IsA(state->next, Const) && ((Const *) state->next)->constisnull)
+		state->next = lnext(state->next);
 	lsecond(state->opexpr.args) = lfirst(state->next);
 	state->next = lnext(state->next);
 	return (Node *) &(state->opexpr);
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index a79f891da7..d56e5d25d9 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1715,11 +1715,7 @@ explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd'
  Append
    ->  Seq Scan on part_ab_cd
          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-   ->  Seq Scan on part_ef_gh
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-   ->  Seq Scan on part_null_xy
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-(7 rows)
+(3 rows)
 
 explain (costs off) select * from list_parted where a = 'ab';
                 QUERY PLAN                
-- 
2.11.0

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

almost 8 years ago

In reply to: Amit Langote (#1)

Re: constraint exclusion and nulls in IN (..) clause

On Thu, Feb 1, 2018 at 12:26 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Hi.

When addressing a review comment on the fast partition pruning thread [1],
I noticed that specifying null in the IN-list will cause constraint
exclusion to wrongly fail to refute a table's check predicate.

create table foo (a int check (a = 1));
insert into foo values (null), (1);

-- ExecEvalScalarArrayOp() won't return true for any record in foo
select * from foo where a in (null, 2);
a
---
(0 rows)

AFAIU, that's true only when = operator is strict. For a non-strict
operator which treats two NULL values as equivalent it would return
row with NULL value.

-- The null in the IN-list prevents constraint exclusion to exclude foo
-- from being scanned in the first place
explain (costs off) select * from foo where a in (null, 2);
QUERY PLAN
---------------------------------------------
Seq Scan on foo
Filter: (a = ANY ('{NULL,2}'::integer[]))
(2 rows)

I propose a patch that teaches predtest.c to disregard any null values in
a SAOP (i.e., the IN (..) expression) when performing constraint exclusion
using that SAOP, because they cause predicate_refuted_by_recurse()'s logic
to fail to conclude the refutation. There is a rule that all items of an
OR clause (SAOP is treated as one) must refute the predicate. But the
OpExpr constructed with null as its constant argument won't refute
anything and thus will cause the whole OR clause to fail to refute the
predicate.

A cursory look through constraint exclusion code esp.
operator_predicate_proof() doesn't show any special handling for
strict/non-strict operators. Probably that's why that function refuses
to refute/imply anything when it encounters NULLs.
1593 * We have two identical subexpressions, and two other
subexpressions that
1594 * are not identical but are both Consts; and we have commuted the
1595 * operators if necessary so that the Consts are on the
right. We'll need
1596 * to compare the Consts' values. If either is NULL, fail.
1597 */
1598 if (pred_const->constisnull)
1599 return false;
1600 if (clause_const->constisnull)
1601 return false;

-- With the patch
explain (costs off) select * from foo where a in (null, 2);
QUERY PLAN
--------------------------
Result
One-Time Filter: false
(2 rows)

explain (costs off) select * from foo where a in (null, 2, 1);
QUERY PLAN
-----------------------------------------------
Seq Scan on foo
Filter: (a = ANY ('{NULL,2,1}'::integer[]))
(2 rows)

Thoughts?

AFAIU, this doesn't look to be the right way to fix the problem; it
assumes that the operators are strict. Sorry, if I have misunderstood
the patch and your thoughts behind it. May be constraint exclusion
code should be taught to treat strict/non-strict operators separately.
I am not sure.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Ashutosh Bapat (#2)

1 attachment(s)

Re: constraint exclusion and nulls in IN (..) clause

Thanks for the comments.

On 2018/02/01 16:40, Ashutosh Bapat wrote:

On Thu, Feb 1, 2018 at 12:26 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Hi.

When addressing a review comment on the fast partition pruning thread [1],
I noticed that specifying null in the IN-list will cause constraint
exclusion to wrongly fail to refute a table's check predicate.

create table foo (a int check (a = 1));
insert into foo values (null), (1);

-- ExecEvalScalarArrayOp() won't return true for any record in foo
select * from foo where a in (null, 2);
a
---
(0 rows)

AFAIU, that's true only when = operator is strict. For a non-strict
operator which treats two NULL values as equivalent it would return
row with NULL value.

That's true.

-- The null in the IN-list prevents constraint exclusion to exclude foo
-- from being scanned in the first place
explain (costs off) select * from foo where a in (null, 2);
QUERY PLAN
---------------------------------------------
Seq Scan on foo
Filter: (a = ANY ('{NULL,2}'::integer[]))
(2 rows)

I propose a patch that teaches predtest.c to disregard any null values in
a SAOP (i.e., the IN (..) expression) when performing constraint exclusion
using that SAOP, because they cause predicate_refuted_by_recurse()'s logic
to fail to conclude the refutation. There is a rule that all items of an
OR clause (SAOP is treated as one) must refute the predicate. But the
OpExpr constructed with null as its constant argument won't refute
anything and thus will cause the whole OR clause to fail to refute the
predicate.

A cursory look through constraint exclusion code esp.
operator_predicate_proof() doesn't show any special handling for
strict/non-strict operators. Probably that's why that function refuses
to refute/imply anything when it encounters NULLs.
1593 * We have two identical subexpressions, and two other
subexpressions that
1594 * are not identical but are both Consts; and we have commuted the
1595 * operators if necessary so that the Consts are on the
right. We'll need
1596 * to compare the Consts' values. If either is NULL, fail.
1597 */
1598 if (pred_const->constisnull)
1599 return false;
1600 if (clause_const->constisnull)
1601 return false;

There does actually exist strictness check in
predicate_refuted_by_simple_clause(), but comes into play only if the
predicate is a IS NULL clause. In our case, both predicate and clause
expressions would be operator clauses for which the strictness doesn't
seem to be considered.

-- With the patch
explain (costs off) select * from foo where a in (null, 2);
QUERY PLAN
--------------------------
Result
One-Time Filter: false
(2 rows)

explain (costs off) select * from foo where a in (null, 2, 1);
QUERY PLAN
-----------------------------------------------
Seq Scan on foo
Filter: (a = ANY ('{NULL,2,1}'::integer[]))
(2 rows)

Thoughts?

AFAIU, this doesn't look to be the right way to fix the problem; it
assumes that the operators are strict. Sorry, if I have misunderstood
the patch and your thoughts behind it. May be constraint exclusion
code should be taught to treat strict/non-strict operators separately.
I am not sure.

Yeah, the patch in its current form is wrong, because it will give wrong
answers if the operator being used in a SAOP is non-strict. I modified
the patch to consider operator strictness before doing anything with nulls.

Thanks,
Amit

Attachments:

v2-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patchtext/plain; charset=UTF-8; name=v2-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patchDownload

From b68afd6323852b7a503e634aef3699b922b9d567 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 1 Feb 2018 11:32:23 +0900
Subject: [PATCH v2] Disregard nulls in SAOP rightarg array/list during CE

---
 src/backend/optimizer/util/predtest.c | 28 ++++++++++++++++++++++++++++
 src/test/regress/expected/inherit.out |  6 +-----
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index 8106010329..5efd7f688b 100644
--- a/src/backend/optimizer/util/predtest.c
+++ b/src/backend/optimizer/util/predtest.c
@@ -905,6 +905,7 @@ boolexpr_startup_fn(Node *clause, PredIterInfo info)
 typedef struct
 {
 	OpExpr		opexpr;
+	bool		opisstrict;		/* Is the operator strict? */
 	Const		constexpr;
 	int			next_elem;
 	int			num_elems;
@@ -957,6 +958,12 @@ arrayconst_startup_fn(Node *clause, PredIterInfo info)
 	state->constexpr.constbyval = elmbyval;
 	lsecond(state->opexpr.args) = &state->constexpr;
 
+	/*
+	 * Record if the operator is strict to perform appropriate action on
+	 * encountering null values in the element array.
+	 */
+	state->opisstrict = op_strict(saop->opno);
+
 	/* Initialize iteration state */
 	state->next_elem = 0;
 }
@@ -968,6 +975,12 @@ arrayconst_next_fn(PredIterInfo info)
 
 	if (state->next_elem >= state->num_elems)
 		return NULL;
+	/* skip nulls if ok to do so */
+	if (state->opisstrict)
+	{
+		while (state->elem_nulls[state->next_elem])
+			state->next_elem++;
+	}
 	state->constexpr.constvalue = state->elem_values[state->next_elem];
 	state->constexpr.constisnull = state->elem_nulls[state->next_elem];
 	state->next_elem++;
@@ -992,6 +1005,7 @@ arrayconst_cleanup_fn(PredIterInfo info)
 typedef struct
 {
 	OpExpr		opexpr;
+	bool		opisstrict;		/* Is the operator strict? */
 	ListCell   *next;
 } ArrayExprIterState;
 
@@ -1016,6 +1030,12 @@ arrayexpr_startup_fn(Node *clause, PredIterInfo info)
 	state->opexpr.inputcollid = saop->inputcollid;
 	state->opexpr.args = list_copy(saop->args);
 
+	/*
+	 * Record if the operator is strict to perform appropriate action on
+	 * encountering null values in the element array.
+	 */
+	state->opisstrict = op_strict(saop->opno);
+
 	/* Initialize iteration variable to first member of ArrayExpr */
 	arrayexpr = (ArrayExpr *) lsecond(saop->args);
 	state->next = list_head(arrayexpr->elements);
@@ -1028,6 +1048,14 @@ arrayexpr_next_fn(PredIterInfo info)
 
 	if (state->next == NULL)
 		return NULL;
+	/* skip nulls if ok to do so */
+	if (state->opisstrict)
+	{
+		Node *node = (Node *) lfirst(state->next);
+
+		while (IsA(node, Const) && ((Const *) node)->constisnull)
+			state->next = lnext(state->next);
+	}
 	lsecond(state->opexpr.args) = lfirst(state->next);
 	state->next = lnext(state->next);
 	return (Node *) &(state->opexpr);
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index a79f891da7..d56e5d25d9 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1715,11 +1715,7 @@ explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd'
  Append
    ->  Seq Scan on part_ab_cd
          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-   ->  Seq Scan on part_ef_gh
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-   ->  Seq Scan on part_null_xy
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-(7 rows)
+(3 rows)
 
 explain (costs off) select * from list_parted where a = 'ab';
                 QUERY PLAN                
-- 
2.11.0

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

almost 8 years ago

In reply to: Amit Langote (#3)

Re: constraint exclusion and nulls in IN (..) clause

On Thu, Feb 1, 2018 at 2:23 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Yeah, the patch in its current form is wrong, because it will give wrong
answers if the operator being used in a SAOP is non-strict. I modified
the patch to consider operator strictness before doing anything with nulls.

That's fine, but I am not sure whether this fits constraint
exclusion's charter. Please add this patch to the next commitfest so
that we can have a better opinion.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Ashutosh Bapat (#4)

Re: constraint exclusion and nulls in IN (..) clause

On 2018/02/05 13:20, Ashutosh Bapat wrote:

On Thu, Feb 1, 2018 at 2:23 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Yeah, the patch in its current form is wrong, because it will give wrong
answers if the operator being used in a SAOP is non-strict. I modified
the patch to consider operator strictness before doing anything with nulls.

That's fine, but I am not sure whether this fits constraint
exclusion's charter. Please add this patch to the next commitfest so
that we can have a better opinion.

OK, done.

Thanks,
Amit

Robert Haas

robertmhaas@gmail.com

almost 8 years ago

In reply to: Ashutosh Bapat (#4)

Re: constraint exclusion and nulls in IN (..) clause

On Sun, Feb 4, 2018 at 11:20 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

On Thu, Feb 1, 2018 at 2:23 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Yeah, the patch in its current form is wrong, because it will give wrong
answers if the operator being used in a SAOP is non-strict. I modified
the patch to consider operator strictness before doing anything with nulls.

That's fine, but I am not sure whether this fits constraint
exclusion's charter. Please add this patch to the next commitfest so
that we can have a better opinion.

It seems like a perfectly reasonable extension of the existing
functionality to me. (I have, at present, no opinion on the patch
itself.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Emre Hasegeli

emre@hasegeli.com

almost 8 years ago

In reply to: Amit Langote (#3)

Re: constraint exclusion and nulls in IN (..) clause

Yeah, the patch in its current form is wrong, because it will give wrong
answers if the operator being used in a SAOP is non-strict. I modified
the patch to consider operator strictness before doing anything with nulls.

I tried to review this patch without any familiarity to the code.

arrayconst_next_fn():

+   /* skip nulls if ok to do so */
+   if (state->opisstrict)
+   {
+       while (state->elem_nulls[state->next_elem])
+           state->next_elem++;
+   }

Shouldn't we check if we consumed all elements (state->next_elem >=
state->num_elems) inside the while loop?

arrayexpr_next_fn():

+   /* skip nulls if ok to do so */
+   if (state->opisstrict)
+   {
+       Node *node = (Node *) lfirst(state->next);
+
+       while (IsA(node, Const) && ((Const *) node)->constisnull)
+           state->next = lnext(state->next);
+   }

I cannot find a way to test this change. Can you imagine a query to
exercise it on the regression tests?

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Emre Hasegeli (#7)

1 attachment(s)

Re: constraint exclusion and nulls in IN (..) clause

Hi.

On 2018/03/04 22:12, Emre Hasegeli wrote:

Yeah, the patch in its current form is wrong, because it will give wrong
answers if the operator being used in a SAOP is non-strict. I modified
the patch to consider operator strictness before doing anything with nulls.

I tried to review this patch without any familiarity to the code.

Thanks for the review.

arrayconst_next_fn():
+   /* skip nulls if ok to do so */
+   if (state->opisstrict)
+   {
+       while (state->elem_nulls[state->next_elem])
+           state->next_elem++;
+   }
Shouldn't we check if we consumed all elements (state->next_elem >=
state->num_elems) inside the while loop?

You're right. Fixed.

arrayexpr_next_fn():
+   /* skip nulls if ok to do so */
+   if (state->opisstrict)
+   {
+       Node *node = (Node *) lfirst(state->next);
+
+       while (IsA(node, Const) && ((Const *) node)->constisnull)
+           state->next = lnext(state->next);
+   }
I cannot find a way to test this change. Can you imagine a query to
exercise it on the regression tests?

So far, I hadn't either. I figured one out and added it to inherit.sql.
Basically, I needed to write the query such that an IN () expression
doesn't get const-simplified to a Const containing array, but rather
remains an ArrayExpr. So, arrayexpr_*() functions in predtest.c are now
exercised.

Attached updated patch.

Thanks,
Amit

Attachments:

v3-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patchtext/plain; charset=UTF-8; name=v3-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patchDownload

From d0f26929eccba2e5671e18db0466ea73af424430 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 1 Feb 2018 11:32:23 +0900
Subject: [PATCH v3] Disregard nulls in SAOP rightarg array/list during CE

---
 src/backend/optimizer/util/predtest.c | 33 +++++++++++++++++++++++++++++++++
 src/test/regress/expected/inherit.out | 34 +++++++++++++++++++++++++++++-----
 src/test/regress/sql/inherit.sql      |  1 +
 3 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index 8106010329..a20a3070bf 100644
--- a/src/backend/optimizer/util/predtest.c
+++ b/src/backend/optimizer/util/predtest.c
@@ -905,6 +905,7 @@ boolexpr_startup_fn(Node *clause, PredIterInfo info)
 typedef struct
 {
 	OpExpr		opexpr;
+	bool		opisstrict;		/* Is the operator strict? */
 	Const		constexpr;
 	int			next_elem;
 	int			num_elems;
@@ -957,6 +958,12 @@ arrayconst_startup_fn(Node *clause, PredIterInfo info)
 	state->constexpr.constbyval = elmbyval;
 	lsecond(state->opexpr.args) = &state->constexpr;
 
+	/*
+	 * Record if the operator is strict to perform appropriate action on
+	 * encountering null values in the element array.
+	 */
+	state->opisstrict = op_strict(saop->opno);
+
 	/* Initialize iteration state */
 	state->next_elem = 0;
 }
@@ -968,6 +975,13 @@ arrayconst_next_fn(PredIterInfo info)
 
 	if (state->next_elem >= state->num_elems)
 		return NULL;
+	/* skip nulls if ok to do so */
+	if (state->opisstrict)
+	{
+		while (state->next_elem < state->num_elems &&
+			   state->elem_nulls[state->next_elem])
+			state->next_elem++;
+	}
 	state->constexpr.constvalue = state->elem_values[state->next_elem];
 	state->constexpr.constisnull = state->elem_nulls[state->next_elem];
 	state->next_elem++;
@@ -992,6 +1006,7 @@ arrayconst_cleanup_fn(PredIterInfo info)
 typedef struct
 {
 	OpExpr		opexpr;
+	bool		opisstrict;		/* Is the operator strict? */
 	ListCell   *next;
 } ArrayExprIterState;
 
@@ -1016,6 +1031,12 @@ arrayexpr_startup_fn(Node *clause, PredIterInfo info)
 	state->opexpr.inputcollid = saop->inputcollid;
 	state->opexpr.args = list_copy(saop->args);
 
+	/*
+	 * Record if the operator is strict to perform appropriate action on
+	 * encountering null values in the element array.
+	 */
+	state->opisstrict = op_strict(saop->opno);
+
 	/* Initialize iteration variable to first member of ArrayExpr */
 	arrayexpr = (ArrayExpr *) lsecond(saop->args);
 	state->next = list_head(arrayexpr->elements);
@@ -1028,6 +1049,18 @@ arrayexpr_next_fn(PredIterInfo info)
 
 	if (state->next == NULL)
 		return NULL;
+	/* skip nulls if ok to do so */
+	if (state->opisstrict && state->next)
+	{
+		Node   *node = (Node *) lfirst(state->next);
+
+		while (node && IsA(node, Const) && ((Const *) node)->constisnull)
+		{
+			state->next = lnext(state->next);
+			if (state->next)
+				node = (Node *) lfirst(state->next);
+		}
+	}
 	lsecond(state->opexpr.args) = lfirst(state->next);
 	state->next = lnext(state->next);
 	return (Node *) &(state->opexpr);
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index a79f891da7..065ea86d72 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1715,11 +1715,7 @@ explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd'
  Append
    ->  Seq Scan on part_ab_cd
          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-   ->  Seq Scan on part_ef_gh
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-   ->  Seq Scan on part_null_xy
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-(7 rows)
+(3 rows)
 
 explain (costs off) select * from list_parted where a = 'ab';
                 QUERY PLAN                
@@ -1797,6 +1793,34 @@ explain (costs off) select * from range_list_parted where a between 3 and 23 and
          Filter: ((a >= 3) AND (a <= 23) AND (b = 'ab'::bpchar))
 (7 rows)
 
+explain (costs off) select * from generate_series(11, 12) v(a) where exists (select count(*) from range_list_parted where a = 1 or a in (null, v.a));
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Function Scan on generate_series v
+   Filter: (SubPlan 1)
+   SubPlan 1
+     ->  Aggregate
+           ->  Append
+                 ->  Seq Scan on part_1_10_ab
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_1_10_cd
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_10_20_ab
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_10_20_cd
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_21_30_ab
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_21_30_cd
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_40_inf_ab
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_40_inf_cd
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_40_inf_null
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+(23 rows)
+
 /* Should select no rows because range partition key cannot be null */
 explain (costs off) select * from range_list_parted where a is null;
         QUERY PLAN        
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
index 2e42ae115d..c31cd6dab4 100644
--- a/src/test/regress/sql/inherit.sql
+++ b/src/test/regress/sql/inherit.sql
@@ -651,6 +651,7 @@ explain (costs off) select * from range_list_parted;
 explain (costs off) select * from range_list_parted where a = 5;
 explain (costs off) select * from range_list_parted where b = 'ab';
 explain (costs off) select * from range_list_parted where a between 3 and 23 and b in ('ab');
+explain (costs off) select * from generate_series(11, 12) v(a) where exists (select count(*) from range_list_parted where a = 1 or a in (null, v.a));
 
 /* Should select no rows because range partition key cannot be null */
 explain (costs off) select * from range_list_parted where a is null;
-- 
2.11.0

Emre Hasegeli

emre@hasegeli.com

almost 8 years ago

In reply to: Amit Langote (#8)

Re: constraint exclusion and nulls in IN (..) clause

Shouldn't we check if we consumed all elements (state->next_elem >=
state->num_elems) inside the while loop?

You're right. Fixed.

I don't think the fix is correct. arrayconst_next_fn() can still
execute state->next_elem++ without checking if we consumed all
elements. I couldn't manage to crash it though.

I am also not sure if it is proper to skip some items inside the
"next_fn", but I don't know the code to suggest a better alternative.

+   /* skip nulls if ok to do so */
+   if (state->opisstrict && state->next)

Do we need && state->next in here? It is checked for (state->next ==
NULL) 2 lines above.

+   {
+       Node   *node = (Node *) lfirst(state->next);
+
+       while (node && IsA(node, Const) && ((Const *) node)->constisnull)

Should we spell the first check like node != NULL as the rest of the code does.

+       {
+           state->next = lnext(state->next);
+           if (state->next)
+               node = (Node *) lfirst(state->next);
+       }
+   }

I think this function is also not correct as it can call
lnext(state->next) without checking.

So far, I hadn't either. I figured one out and added it to inherit.sql.
Basically, I needed to write the query such that an IN () expression
doesn't get const-simplified to a Const containing array, but rather
remains an ArrayExpr. So, arrayexpr_*() functions in predtest.c are now
exercised.

Nice. I noticed that this part of the code was not covered at all by
the tests before.

#10

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Emre Hasegeli (#9)

1 attachment(s)

Re: constraint exclusion and nulls in IN (..) clause

Hi.

Thanks for reviewing again.

On 2018/03/05 23:04, Emre Hasegeli wrote:

Shouldn't we check if we consumed all elements (state->next_elem >=
state->num_elems) inside the while loop?

You're right. Fixed.

I don't think the fix is correct. arrayconst_next_fn() can still
execute state->next_elem++ without checking if we consumed all
elements. I couldn't manage to crash it though.

I couldn't get it to crash either, but it's wrong anyway. What happens is
the calling code would perform constraint exclusion with a garbage clause
(that is one containing a garbage value picked from elem_values when
next_elem has exceeded num_elems) and that may alter the result of
partition pruning. Verified that with the following example.

create table p (a int) partition by list (a);
create table p1 partition of p for values in (1);
create table p2 partition of p for values in (2);

-- before fixing
explain (costs off) select * from p where a in (2, null);
QUERY PLAN
---------------------------------------------------
Append
-> Seq Scan on p1
Filter: (a = ANY ('{2,NULL}'::integer[]))
-> Seq Scan on p2
Filter: (a = ANY ('{2,NULL}'::integer[]))
(5 rows)

So, it looks as if the proposed patch has no effect at all.

-- after fixing
explain (costs off) select * from p where a in (2, null);
QUERY PLAN
----------------------------------------------------------
Append
-> Seq Scan on p2
Filter: (a = ANY ('{2,NULL}'::integer[]))

Was a bit surprised that the calling code didn't crash while handling a
garbage clause.

I am also not sure if it is proper to skip some items inside the
"next_fn", but I don't know the code to suggest a better alternative.

Patch teaches it to ignore nulls when it's known that the operator being
used is strict. It is harmless and has the benefit that constraint
exclusion gives an answer that is consistent with what actually running
such a qual against a table's rows would do.

Consider this example.

create table foo (a int check (a = 1));

insert into foo values (1), (null);

select * from foo where a in (2, null);
a
---
(0 rows)

set constraint_exclusion to on;

-- you'd think constraint exclusion would avoid the scan
explain select * from foo where a in (2, null);
QUERY PLAN
-----------------------------------------------------
Seq Scan on foo (cost=0.00..41.88 rows=13 width=4)
Filter: (a = ANY ('{2,NULL}'::integer[]))
(2 rows)

But it didn't, because the null in that list caused constraint exclusion
to mistakenly decide that the clause as a whole doesn't refute the
constraint (check (a = 1)).

-- with the patch
explain select * from foo where a in (2, null);
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=4)
One-Time Filter: false
(2 rows)

After ignoring null, only (a = 2) is left to consider and that does refute
(a = 1), so pruning works as desired.

BTW, if you compose the qual using an OR directly, it works as desired
even with HEAD:

explain select * from foo where a = 2 or a = null;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=4)
One-Time Filter: false
(2 rows)

That's because a = null is const-simplified in an earlier planning phase
to false, so (a = 2 or false) reduces to just a = 2, which does refute
foo's constraint (a = 1). I think the same thing should happen when
constraint exclusion internally turns an IN (..) clause into a set of OR
clauses. Skipping nulls in these next_fn functions is, in a way, same as
const-simplifying them away.

+   /* skip nulls if ok to do so */
+   if (state->opisstrict && state->next)
Do we need && state->next in here? It is checked for (state->next ==
NULL) 2 lines above.

Yeah, it wasn't needed.

+   {
+       Node   *node = (Node *) lfirst(state->next);
+
+       while (node && IsA(node, Const) && ((Const *) node)->constisnull)
Should we spell the first check like node != NULL as the rest of the code does.

OK.

+       {
+           state->next = lnext(state->next);
+           if (state->next)
+               node = (Node *) lfirst(state->next);
+       }
+   }
I think this function is also not correct as it can call
lnext(state->next) without checking.

Yeah. Rearranged the code to fix that.

Attached updated patch. Thanks again.

Regards,
Amit

Attachments:

v4-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patchtext/plain; charset=UTF-8; name=v4-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patchDownload

From 1c16473da04dfa2d41fdcf5de6983b65aabd5ff4 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 1 Feb 2018 11:32:23 +0900
Subject: [PATCH v4] Disregard nulls in SAOP rightarg array/list during CE

---
 src/backend/optimizer/util/predtest.c | 45 +++++++++++++++++++++++++++++++++++
 src/test/regress/expected/inherit.out | 34 ++++++++++++++++++++++----
 src/test/regress/sql/inherit.sql      |  1 +
 3 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index 8106010329..3e8171622e 100644
--- a/src/backend/optimizer/util/predtest.c
+++ b/src/backend/optimizer/util/predtest.c
@@ -905,6 +905,7 @@ boolexpr_startup_fn(Node *clause, PredIterInfo info)
 typedef struct
 {
 	OpExpr		opexpr;
+	bool		opisstrict;		/* Is the operator strict? */
 	Const		constexpr;
 	int			next_elem;
 	int			num_elems;
@@ -957,6 +958,12 @@ arrayconst_startup_fn(Node *clause, PredIterInfo info)
 	state->constexpr.constbyval = elmbyval;
 	lsecond(state->opexpr.args) = &state->constexpr;
 
+	/*
+	 * Record if the operator is strict to perform appropriate action on
+	 * encountering null values in the element array.
+	 */
+	state->opisstrict = op_strict(saop->opno);
+
 	/* Initialize iteration state */
 	state->next_elem = 0;
 }
@@ -966,8 +973,21 @@ arrayconst_next_fn(PredIterInfo info)
 {
 	ArrayConstIterState *state = (ArrayConstIterState *) info->state;
 
+	/*
+	 * Skip nulls if the operator is strict.  Skipping nulls like this has
+	 * same effect as const-simplifying a clause containing null argument to
+	 * false.
+	 */
+	if (state->opisstrict)
+	{
+		while (state->next_elem < state->num_elems &&
+			   state->elem_nulls[state->next_elem])
+			state->next_elem++;
+	}
+
 	if (state->next_elem >= state->num_elems)
 		return NULL;
+
 	state->constexpr.constvalue = state->elem_values[state->next_elem];
 	state->constexpr.constisnull = state->elem_nulls[state->next_elem];
 	state->next_elem++;
@@ -992,6 +1012,7 @@ arrayconst_cleanup_fn(PredIterInfo info)
 typedef struct
 {
 	OpExpr		opexpr;
+	bool		opisstrict;		/* Is the operator strict? */
 	ListCell   *next;
 } ArrayExprIterState;
 
@@ -1016,6 +1037,12 @@ arrayexpr_startup_fn(Node *clause, PredIterInfo info)
 	state->opexpr.inputcollid = saop->inputcollid;
 	state->opexpr.args = list_copy(saop->args);
 
+	/*
+	 * Record if the operator is strict to perform appropriate action on
+	 * encountering null values in the element array.
+	 */
+	state->opisstrict = op_strict(saop->opno);
+
 	/* Initialize iteration variable to first member of ArrayExpr */
 	arrayexpr = (ArrayExpr *) lsecond(saop->args);
 	state->next = list_head(arrayexpr->elements);
@@ -1026,8 +1053,26 @@ arrayexpr_next_fn(PredIterInfo info)
 {
 	ArrayExprIterState *state = (ArrayExprIterState *) info->state;
 
+	/*
+	 * Skip nulls if the operator is strict.  Skipping nulls like this has
+	 * same effect as const-simplifying a clause containing null argument to
+	 * false.
+	 */
+	if (state->opisstrict)
+	{
+		Node   *node = (state->next != NULL) ? lfirst(state->next) : NULL;
+
+		while (node != NULL && IsA(node, Const) &&
+			   ((Const *) node)->constisnull)
+		{
+			state->next = (state->next != NULL) ? lnext(state->next) : NULL;
+			node = (state->next != NULL) ? lfirst(state->next) : NULL;
+		}
+	}
+
 	if (state->next == NULL)
 		return NULL;
+
 	lsecond(state->opexpr.args) = lfirst(state->next);
 	state->next = lnext(state->next);
 	return (Node *) &(state->opexpr);
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index a79f891da7..065ea86d72 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1715,11 +1715,7 @@ explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd'
  Append
    ->  Seq Scan on part_ab_cd
          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-   ->  Seq Scan on part_ef_gh
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-   ->  Seq Scan on part_null_xy
-         Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
-(7 rows)
+(3 rows)
 
 explain (costs off) select * from list_parted where a = 'ab';
                 QUERY PLAN                
@@ -1797,6 +1793,34 @@ explain (costs off) select * from range_list_parted where a between 3 and 23 and
          Filter: ((a >= 3) AND (a <= 23) AND (b = 'ab'::bpchar))
 (7 rows)
 
+explain (costs off) select * from generate_series(11, 12) v(a) where exists (select count(*) from range_list_parted where a = 1 or a in (null, v.a));
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Function Scan on generate_series v
+   Filter: (SubPlan 1)
+   SubPlan 1
+     ->  Aggregate
+           ->  Append
+                 ->  Seq Scan on part_1_10_ab
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_1_10_cd
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_10_20_ab
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_10_20_cd
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_21_30_ab
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_21_30_cd
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_40_inf_ab
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_40_inf_cd
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+                 ->  Seq Scan on part_40_inf_null
+                       Filter: ((a = 1) OR (a = ANY (ARRAY[NULL::integer, v.a])))
+(23 rows)
+
 /* Should select no rows because range partition key cannot be null */
 explain (costs off) select * from range_list_parted where a is null;
         QUERY PLAN        
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
index 2e42ae115d..c31cd6dab4 100644
--- a/src/test/regress/sql/inherit.sql
+++ b/src/test/regress/sql/inherit.sql
@@ -651,6 +651,7 @@ explain (costs off) select * from range_list_parted;
 explain (costs off) select * from range_list_parted where a = 5;
 explain (costs off) select * from range_list_parted where b = 'ab';
 explain (costs off) select * from range_list_parted where a between 3 and 23 and b in ('ab');
+explain (costs off) select * from generate_series(11, 12) v(a) where exists (select count(*) from range_list_parted where a = 1 or a in (null, v.a));
 
 /* Should select no rows because range partition key cannot be null */
 explain (costs off) select * from range_list_parted where a is null;
-- 
2.11.0

#11

Emre Hasegeli

emre@hasegeli.com

almost 8 years ago

In reply to: Amit Langote (#10)

Re: constraint exclusion and nulls in IN (..) clause

Patch teaches it to ignore nulls when it's known that the operator being
used is strict. It is harmless and has the benefit that constraint
exclusion gives an answer that is consistent with what actually running
such a qual against a table's rows would do.

Yes, I understood that. I just meant that I don't know if it is the
best way to skip NULLs inside "next_fn". Maybe the caller of the
"next_fn"s should skip them. Anyway, the committer can judge this
better.

Yeah. Rearranged the code to fix that.

This version looks correct to me.

+           state->next = (state->next != NULL) ? lnext(state->next) : NULL;
+           node = (state->next != NULL) ? lfirst(state->next) : NULL;

I think it is unnecessary to check for (state->next != NULL) two
times. We can put those in a single if.

#12

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Emre Hasegeli (#11)

Re: constraint exclusion and nulls in IN (..) clause

On 2018/03/06 18:46, Emre Hasegeli wrote:

Patch teaches it to ignore nulls when it's known that the operator being
used is strict. It is harmless and has the benefit that constraint
exclusion gives an answer that is consistent with what actually running
such a qual against a table's rows would do.

Yes, I understood that. I just meant that I don't know if it is the
best way to skip NULLs inside "next_fn". Maybe the caller of the
"next_fn"s should skip them. Anyway, the committer can judge this
better.

I think putting that inside those next_fn functions tends to centralize
the logic and would run only only for the intended cases.

Yeah. Rearranged the code to fix that.

This version looks correct to me.
+           state->next = (state->next != NULL) ? lnext(state->next) : NULL;
+           node = (state->next != NULL) ? lfirst(state->next) : NULL;
I think it is unnecessary to check for (state->next != NULL) two
times. We can put those in a single if.

Hmm, state->next refers to two different pointer values on line 1 and line
2. It may end up being set to NULL on line 1. Am I missing something?

Thanks,
Amit

#13

Emre Hasegeli

emre@hasegeli.com

almost 8 years ago

In reply to: Amit Langote (#12)

Re: constraint exclusion and nulls in IN (..) clause

Hmm, state->next refers to two different pointer values on line 1 and line
2. It may end up being set to NULL on line 1. Am I missing something?

True, lnext(state->next) can set it to NULL. I confused by the below
code on the same function doing the steps in reverse order. With this
cleared, I have nothing else to say, so I am setting this to ready for
committer.

#14

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Emre Hasegeli (#13)

Re: constraint exclusion and nulls in IN (..) clause

On 2018/03/06 19:16, Emre Hasegeli wrote:

Hmm, state->next refers to two different pointer values on line 1 and line
2. It may end up being set to NULL on line 1. Am I missing something?

True, lnext(state->next) can set it to NULL. I confused by the below
code on the same function doing the steps in reverse order. With this
cleared, I have nothing else to say, so I am setting this to ready for
committer.

Thanks. :)

Regards,
Amit

#15

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Amit Langote (#10)

Re: constraint exclusion and nulls in IN (..) clause

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

[ v4-0001-Disregard-nulls-in-SAOP-rightarg-array-list-durin.patch ]

This patch seems pretty wrong to me. The proposed proof rule is wrong
for !useOr expressions (that is, "scalar op ALL (array)"); you can't
just ignore null items in that case. It's also wrong when we need strong
refutation. I was about to point to the comment for predicate_refuted_by,

* An important fine point is that truth of the clauses must imply that
* the predicate returns FALSE, not that it does not return TRUE. This
* is normally used to try to refute CHECK constraints, and the only
* thing we can assume about a CHECK constraint is that it didn't return
* FALSE --- a NULL result isn't a violation per the SQL spec. (Someday

and reject the patch as unfixably wrong, when I noticed that in
fact somebody had added support for weak refutation to this code.
Now I'm just desperately unhappy about the lack of comment-updating
work in that commit (b08df9cab, looks like). It's not acceptable
to falsify comments and not update them. I have a nasty feeling that
there are outright bugs in the logic, too.

I'm inclined to think that you've attacked the problem at the wrong place
anyway. The notion that one arm of an OR reducing to NULL might not
prevent proving what we want is by no means specific to ScalarArrayOp.
I think it'd make more sense to see about incorporating that idea in
predicate_implied_by_simple_clause/predicate_refuted_by_simple_clause.
It might be necessary to change the boolean results in the recursive
logic to three-way, not sure.

I'm setting this patch back to Waiting On Author pending fixes
for the above issues. In the meantime I see a separate work item
here to do code review for b08df9cab.

regards, tom lane

#16

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Tom Lane (#15)

1 attachment(s)

Re: constraint exclusion and nulls in IN (..) clause

I wrote:

I think it'd make more sense to see about incorporating that idea in
predicate_implied_by_simple_clause/predicate_refuted_by_simple_clause.

After further thought, it seems like the place to deal with this is
really operator_predicate_proof(), as in the attached draft patch
against HEAD. This passes the smell test for me, in the sense that
it's an arguably correct and general extension of the proof rules,
but it could use more testing.

TBH, the change in the existing regression test case in inherit.sql
makes me itch. We've got

create table list_parted (
a varchar
) partition by list (a);
...
create table part_null_xy partition of list_parted for values in (null, 'xy');
...
explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd');

Now, the fact that "null" is included in this query's IN clause is a
complete no-op, because the IN is using a strict equality operator.
So it's nice that the planner can see that and realize that there's
no point in scanning part_null_xy ... but this means that the syntax
that's been chosen for list partitioning is seriously misleading.
"in (null, 'xy')" in the CREATE TABLE command has nothing to do with
the semantics of that identical clause in any other context, and indeed
it seems chosen in a way to confuse even (or especially?) seasoned
experts.

I suppose it's too late to do anything about that now, but it sure
seems like NULL should've been handled some other way.

regards, tom lane

Attachments:

handle-null-constants-in-predtest-1.patchtext/x-diff; charset=us-ascii; name=handle-null-constants-in-predtest-1.patchDownload

diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index fb963d0..446207d 100644
*** a/src/backend/optimizer/util/predtest.c
--- b/src/backend/optimizer/util/predtest.c
*************** static Node *extract_not_arg(Node *claus
*** 100,106 ****
  static Node *extract_strong_not_arg(Node *clause);
  static bool clause_is_strict_for(Node *clause, Node *subexpr);
  static bool operator_predicate_proof(Expr *predicate, Node *clause,
! 						 bool refute_it);
  static bool operator_same_subexprs_proof(Oid pred_op, Oid clause_op,
  							 bool refute_it);
  static bool operator_same_subexprs_lookup(Oid pred_op, Oid clause_op,
--- 100,106 ----
  static Node *extract_strong_not_arg(Node *clause);
  static bool clause_is_strict_for(Node *clause, Node *subexpr);
  static bool operator_predicate_proof(Expr *predicate, Node *clause,
! 						 bool refute_it, bool weak);
  static bool operator_same_subexprs_proof(Oid pred_op, Oid clause_op,
  							 bool refute_it);
  static bool operator_same_subexprs_lookup(Oid pred_op, Oid clause_op,
*************** predicate_implied_by_simple_clause(Expr 
*** 1137,1143 ****
  	}
  
  	/* Else try operator-related knowledge */
! 	return operator_predicate_proof(predicate, clause, false);
  }
  
  /*----------
--- 1137,1143 ----
  	}
  
  	/* Else try operator-related knowledge */
! 	return operator_predicate_proof(predicate, clause, false, weak);
  }
  
  /*----------
*************** predicate_refuted_by_simple_clause(Expr 
*** 1232,1238 ****
  	}
  
  	/* Else try operator-related knowledge */
! 	return operator_predicate_proof(predicate, clause, true);
  }
  
  
--- 1232,1238 ----
  	}
  
  	/* Else try operator-related knowledge */
! 	return operator_predicate_proof(predicate, clause, true, weak);
  }
  
  
*************** static const StrategyNumber BT_refute_ta
*** 1498,1506 ****
   * values, since then the operators aren't being given identical inputs.  But
   * we only support that for btree operators, for which we can assume that all
   * non-null inputs result in non-null outputs, so that it doesn't matter which
!  * two non-null constants we consider.  Currently the code below just reports
!  * "proof failed" if either constant is NULL, but in some cases we could be
!  * smarter (and that likely would require checking strong vs. weak proofs).
   *
   * We can make proofs involving several expression forms (here "foo" and "bar"
   * represent subexpressions that are identical according to equal()):
--- 1498,1505 ----
   * values, since then the operators aren't being given identical inputs.  But
   * we only support that for btree operators, for which we can assume that all
   * non-null inputs result in non-null outputs, so that it doesn't matter which
!  * two non-null constants we consider.  If either constant is NULL, we have
!  * to think harder, but sometimes the proof still works, as explained below.
   *
   * We can make proofs involving several expression forms (here "foo" and "bar"
   * represent subexpressions that are identical according to equal()):
*************** static const StrategyNumber BT_refute_ta
*** 1528,1534 ****
   * and we dare not make deductions with those.
   */
  static bool
! operator_predicate_proof(Expr *predicate, Node *clause, bool refute_it)
  {
  	OpExpr	   *pred_opexpr,
  			   *clause_opexpr;
--- 1527,1534 ----
   * and we dare not make deductions with those.
   */
  static bool
! operator_predicate_proof(Expr *predicate, Node *clause,
! 						 bool refute_it, bool weak)
  {
  	OpExpr	   *pred_opexpr,
  			   *clause_opexpr;
*************** operator_predicate_proof(Expr *predicate
*** 1675,1691 ****
  	 * We have two identical subexpressions, and two other subexpressions that
  	 * are not identical but are both Consts; and we have commuted the
  	 * operators if necessary so that the Consts are on the right.  We'll need
! 	 * to compare the Consts' values.  If either is NULL, fail.
! 	 *
! 	 * Future work: in some cases the desired proof might hold even with NULL
! 	 * constants.  But beware that we've not yet identified the operators as
! 	 * btree ops, so for instance it'd be quite unsafe to assume they are
! 	 * strict without checking.
  	 */
- 	if (pred_const->constisnull)
- 		return false;
  	if (clause_const->constisnull)
  		return false;
  
  	/*
  	 * Lookup the constant-comparison operator using the system catalogs and
--- 1675,1720 ----
  	 * We have two identical subexpressions, and two other subexpressions that
  	 * are not identical but are both Consts; and we have commuted the
  	 * operators if necessary so that the Consts are on the right.  We'll need
! 	 * to compare the Consts' values.  If either is NULL, we can't do that, so
! 	 * usually the proof fails ... but in some cases we can claim success.
  	 */
  	if (clause_const->constisnull)
+ 	{
+ 		/* If clause_op isn't strict, we can't prove anything */
+ 		if (!op_strict(clause_op))
+ 			return false;
+ 
+ 		/*
+ 		 * At this point we know that the clause returns NULL.  For proof
+ 		 * types that assume truth of the clause, this means the proof is
+ 		 * vacuously true (a/k/a "false implies anything").  That's all proof
+ 		 * types except weak implication.
+ 		 */
+ 		if (!(weak && !refute_it))
+ 			return true;
+ 
+ 		/*
+ 		 * For weak implication, it's still possible for the proof to succeed,
+ 		 * if the predicate can also be proven NULL.  In that case we've got
+ 		 * NULL => NULL which is valid for this proof type.
+ 		 */
+ 		if (pred_const->constisnull && op_strict(pred_op))
+ 			return true;
+ 		/* Else the proof fails */
+ 		return false;
+ 	}
+ 	if (pred_const->constisnull)
+ 	{
+ 		/*
+ 		 * If the pred_op is strict, we know the predicate yields NULL, which
+ 		 * means the proof succeeds for either weak implication or weak
+ 		 * refutation.
+ 		 */
+ 		if (weak && op_strict(pred_op))
+ 			return true;
+ 		/* Else the proof fails */
  		return false;
+ 	}
  
  	/*
  	 * Lookup the constant-comparison operator using the system catalogs and
diff --git a/src/test/modules/test_predtest/expected/test_predtest.out b/src/test/modules/test_predtest/expected/test_predtest.out
index 9dd8aec..5574e03 100644
*** a/src/test/modules/test_predtest/expected/test_predtest.out
--- b/src/test/modules/test_predtest/expected/test_predtest.out
*************** w_i_holds         | f
*** 781,793 ****
  s_r_holds         | f
  w_r_holds         | f
  
- -- XXX ideally, we could prove this case too, for strong implication
  select * from test_predtest($$
  select x <= 5, x in (1,3,5,null)
  from integers
  $$);
  -[ RECORD 1 ]-----+--
! strong_implied_by | f
  weak_implied_by   | f
  strong_refuted_by | f
  weak_refuted_by   | f
--- 781,792 ----
  s_r_holds         | f
  w_r_holds         | f
  
  select * from test_predtest($$
  select x <= 5, x in (1,3,5,null)
  from integers
  $$);
  -[ RECORD 1 ]-----+--
! strong_implied_by | t
  weak_implied_by   | f
  strong_refuted_by | f
  weak_refuted_by   | f
diff --git a/src/test/modules/test_predtest/sql/test_predtest.sql b/src/test/modules/test_predtest/sql/test_predtest.sql
index 298b8bf..2734735 100644
*** a/src/test/modules/test_predtest/sql/test_predtest.sql
--- b/src/test/modules/test_predtest/sql/test_predtest.sql
*************** select x <= 5, x in (1,3,5,7)
*** 306,312 ****
  from integers
  $$);
  
- -- XXX ideally, we could prove this case too, for strong implication
  select * from test_predtest($$
  select x <= 5, x in (1,3,5,null)
  from integers
--- 306,311 ----
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index a79f891..d56e5d2 100644
*** a/src/test/regress/expected/inherit.out
--- b/src/test/regress/expected/inherit.out
*************** explain (costs off) select * from list_p
*** 1715,1725 ****
   Append
     ->  Seq Scan on part_ab_cd
           Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
!    ->  Seq Scan on part_ef_gh
!          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
!    ->  Seq Scan on part_null_xy
!          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
! (7 rows)
  
  explain (costs off) select * from list_parted where a = 'ab';
                  QUERY PLAN                
--- 1715,1721 ----
   Append
     ->  Seq Scan on part_ab_cd
           Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
! (3 rows)
  
  explain (costs off) select * from list_parted where a = 'ab';
                  QUERY PLAN

#17

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Tom Lane (#16)

Re: constraint exclusion and nulls in IN (..) clause

On 2018/03/10 13:40, Tom Lane wrote:

I wrote:

I think it'd make more sense to see about incorporating that idea in
predicate_implied_by_simple_clause/predicate_refuted_by_simple_clause.

After further thought, it seems like the place to deal with this is
really operator_predicate_proof(), as in the attached draft patch
against HEAD. This passes the smell test for me, in the sense that
it's an arguably correct and general extension of the proof rules,
but it could use more testing.

Thanks for the patch. I agree it handles the case I presented my patch
for in a more principled manner. So, I've marked the CF entry for my
patch as Rejected.

Looking at the patch, I guess it'd be right to think that the case for
which the following block of code will be executed only arises for
OpExpr's generated by predtest.c (that is, when iterating an SAOP):

     if (clause_const->constisnull)
+    {
+        /* If clause_op isn't strict, we can't prove anything */
+        if (!op_strict(clause_op))
+            return false;
+
<snip>

That's because any other expr = null would've been reduced to
constant-false at some earlier point.

Then, in the same block I see that there is:

+        /*
+         * For weak implication, it's still possible for the proof to
succeed,
+         * if the predicate can also be proven NULL.  In that case we've got
+         * NULL => NULL which is valid for this proof type.
+         */
+        if (pred_const->constisnull && op_strict(pred_op))
+            return true;

which, afaics, will never be able to return true, because
pred_const->constisnull will never be true here. That's because the
following code that will run before the above code will determine the
result of operator_predicate_proof() in that case:

if (equal(pred_leftop, clause_leftop))
{
if (equal(pred_rightop, clause_rightop))
{
/* We have x op1 y and x op2 y */
return operator_same_subexprs_proof(pred_op, clause_op,
refute_it);

I wonder if the result would be valid in that case, because the proof
cache should only be looked up for non-null sub-expressions, no?

TBH, the change in the existing regression test case in inherit.sql
makes me itch. We've got

create table list_parted (
a varchar
) partition by list (a);
...
create table part_null_xy partition of list_parted for values in (null, 'xy');
...
explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd');

Now, the fact that "null" is included in this query's IN clause is a
complete no-op, because the IN is using a strict equality operator.
So it's nice that the planner can see that and realize that there's
no point in scanning part_null_xy ... but this means that the syntax
that's been chosen for list partitioning is seriously misleading.
"in (null, 'xy')" in the CREATE TABLE command has nothing to do with
the semantics of that identical clause in any other context, and indeed
it seems chosen in a way to confuse even (or especially?) seasoned
experts.

Hmm, yeah, it is confusing. Actually, what it really internally becomes
is the following expression:

(a IS NULL) OR (a = 'xy')

or if the CREATE TABLE had "in (null, 'xy', 'yz'), the expression will become:

(a IS NULL) OR (a = ANY (ARRAY['xy'::text, 'yz'::text]))

which, as you say, is not the same thing as how the original expression is
interpreted elsewhere.

I suppose it's too late to do anything about that now, but it sure
seems like NULL should've been handled some other way.

Maybe, write something in the documentation about that?

Thanks,
Amit

#18

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Amit Langote (#17)

Re: constraint exclusion and nulls in IN (..) clause

On 2018/03/14 17:16, Amit Langote wrote:

On 2018/03/10 13:40, Tom Lane wrote:

I wrote:

I think it'd make more sense to see about incorporating that idea in
predicate_implied_by_simple_clause/predicate_refuted_by_simple_clause.

After further thought, it seems like the place to deal with this is
really operator_predicate_proof(), as in the attached draft patch
against HEAD. This passes the smell test for me, in the sense that
it's an arguably correct and general extension of the proof rules,
but it could use more testing.

Thanks for the patch. I agree it handles the case I presented my patch
for in a more principled manner. So, I've marked the CF entry for my
patch as Rejected.

Oops, sorry I hadn't actually seen the CF entry before hitting send on
this email. Seeing that Tom intends to attach his patch with this CF
entry, I will leave the entry alone for now.

Thanks,
Amit

#19

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Tom Lane (#16)

Re: constraint exclusion and nulls in IN (..) clause

I wrote:

After further thought, it seems like the place to deal with this is
really operator_predicate_proof(), as in the attached draft patch
against HEAD. This passes the smell test for me, in the sense that
it's an arguably correct and general extension of the proof rules,
but it could use more testing.

Was anyone planning to do more work or testing on this? Or should
I just push it so we can close the CF entry?

regards, tom lane

#20

Emre Hasegeli

emre@hasegeli.com

almost 8 years ago

In reply to: Tom Lane (#16)

Re: constraint exclusion and nulls in IN (..) clause

After further thought, it seems like the place to deal with this is
really operator_predicate_proof(), as in the attached draft patch
against HEAD. This passes the smell test for me, in the sense that
it's an arguably correct and general extension of the proof rules,
but it could use more testing.

I am not sure if we are covering the case when clause_const and
pred_const are both NULL. In this case, we should be able to return
true only by checking op_strict(pred_op) or maybe even without
checking that. Am I mistaken?

#21

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Emre Hasegeli (#20)

Re: constraint exclusion and nulls in IN (..) clause

Emre Hasegeli <emre@hasegeli.com> writes:

I am not sure if we are covering the case when clause_const and
pred_const are both NULL. In this case, we should be able to return
true only by checking op_strict(pred_op) or maybe even without
checking that. Am I mistaken?

Yeah, that's there. We need both operators to be strict, I think;
otherwise we can't really assume anything about what they'd return
for NULL inputs. But if they are, we have NULL => NULL which is
valid for all proof cases.

regards, tom lane

#22

Emre Hasegeli

emre@hasegeli.com

almost 8 years ago

In reply to: Tom Lane (#21)

Re: constraint exclusion and nulls in IN (..) clause

Yeah, that's there. We need both operators to be strict, I think;
otherwise we can't really assume anything about what they'd return
for NULL inputs. But if they are, we have NULL => NULL which is
valid for all proof cases.

I understand. I don’t see any problems in this case.

#23

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

almost 8 years ago

In reply to: Tom Lane (#21)

Re: constraint exclusion and nulls in IN (..) clause

On 2018/03/21 23:00, Tom Lane wrote:

Emre Hasegeli <emre@hasegeli.com> writes:

I am not sure if we are covering the case when clause_const and
pred_const are both NULL. In this case, we should be able to return
true only by checking op_strict(pred_op) or maybe even without
checking that. Am I mistaken?

Yeah, that's there. We need both operators to be strict, I think;
otherwise we can't really assume anything about what they'd return
for NULL inputs. But if they are, we have NULL => NULL which is
valid for all proof cases.

Thank you for closing the CF entry.

I too wasn't sure if the patch's modifications to
operator_predicate_proof() led to correct handling for the case where both
clause_const and pred_const are both NULL consts. ISTM that the result in
that case becomes what operator_same_subexprs_proof() would return for the
pred_op (possibly commuted) and clause_op pair. But maybe we don't end up
in that case much.

Regards,
Amit

#24

Tom Lane

tgl@sss.pgh.pa.us

almost 8 years ago

In reply to: Amit Langote (#23)

Re: constraint exclusion and nulls in IN (..) clause

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

I too wasn't sure if the patch's modifications to
operator_predicate_proof() led to correct handling for the case where both
clause_const and pred_const are both NULL consts. ISTM that the result in
that case becomes what operator_same_subexprs_proof() would return for the
pred_op (possibly commuted) and clause_op pair. But maybe we don't end up
in that case much.

Probably not. It's hard to get to this type of case at all, because in
most cases, if we have a null constant as a subexpression, previous
const-simplification would have replaced the whole expression with null.
I think in practice it's only interesting when the upper levels of
predtest.c have constructed an operator expression out of parts of the
given clauses, which is exactly what happens for cases like IN-lists.

regards, tom lane