partition pruning doesn't work with IS NULL clause in multikey range partition case

Started by Ashutosh Bapatover 7 years ago16 messages
#1Ashutosh Bapat
ashutosh.bapat@enterprisedb.com

Hi,
Consider following test case.
create table prt (a int, b int, c int) partition by range(a, b);
create table prt_p1 partition of prt for values (0, 0) to (100, 100);
create table prt_p1 partition of prt for values from (0, 0) to (100, 100);
create table prt_p2 partition of prt for values from (100, 100) to (200, 200);
create table prt_def partition of prt default;

In a range partitioned table, a row with any partition key NULL goes
to the default partition if it exists.
insert into prt values (null, 1);
insert into prt values (1, null);
insert into prt values (null, null);
select tableoid::regclass, * from prt;
tableoid | a | b | c
----------+---+---+---
prt_def | | 1 |
prt_def | 1 | |
prt_def | | |
(3 rows)

There's a comment in get_partition_for_tuple(), which says so.
/*
* No range includes NULL, so this will be accepted by the
* default partition if there is one, and otherwise rejected.
*/

But when there is IS NULL clause on any of the partition keys with
some condition on other partition key, all the partitions scanned. I
expected pruning to prune all the partitions except the default one.

explain verbose select * from prt where a is null and b = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.00..106.52 rows=3 width=12)
-> Seq Scan on public.prt_p1 (cost=0.00..35.50 rows=1 width=12)
Output: prt_p1.a, prt_p1.b, prt_p1.c
Filter: ((prt_p1.a IS NULL) AND (prt_p1.b = 100))
-> Seq Scan on public.prt_p2 (cost=0.00..35.50 rows=1 width=12)
Output: prt_p2.a, prt_p2.b, prt_p2.c
Filter: ((prt_p2.a IS NULL) AND (prt_p2.b = 100))
-> Seq Scan on public.prt_def (cost=0.00..35.50 rows=1 width=12)
Output: prt_def.a, prt_def.b, prt_def.c
Filter: ((prt_def.a IS NULL) AND (prt_def.b = 100))
(10 rows)

I thought that the following code in get_matching_range_bounds()
/*
* If there are no datums to compare keys with, or if we got an IS NULL
* clause just return the default partition, if it exists.
*/
if (boundinfo->ndatums == 0 || !bms_is_empty(nullkeys))
{
result->scan_default = partition_bound_has_default(boundinfo);
return result;
}

would do the trick but through the debugger I saw that nullkeys is
NULL for this query.

I didn't investigate further to see why nullkeys is NULL, but it looks
like that's the problem and we are missing an optimization.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#2Dilip Kumar
dilipbalaut@gmail.com
In reply to: Ashutosh Bapat (#1)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On Wed, Jul 11, 2018 at 3:06 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

Hi,
Consider following test case.
create table prt (a int, b int, c int) partition by range(a, b);
create table prt_p1 partition of prt for values (0, 0) to (100, 100);
create table prt_p1 partition of prt for values from (0, 0) to (100, 100);
create table prt_p2 partition of prt for values from (100, 100) to (200, 200);
create table prt_def partition of prt default;

In a range partitioned table, a row with any partition key NULL goes
to the default partition if it exists.
insert into prt values (null, 1);
insert into prt values (1, null);
insert into prt values (null, null);
select tableoid::regclass, * from prt;
tableoid | a | b | c
----------+---+---+---
prt_def | | 1 |
prt_def | 1 | |
prt_def | | |
(3 rows)

There's a comment in get_partition_for_tuple(), which says so.
/*
* No range includes NULL, so this will be accepted by the
* default partition if there is one, and otherwise rejected.
*/

But when there is IS NULL clause on any of the partition keys with
some condition on other partition key, all the partitions scanned. I
expected pruning to prune all the partitions except the default one.

explain verbose select * from prt where a is null and b = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.00..106.52 rows=3 width=12)
-> Seq Scan on public.prt_p1 (cost=0.00..35.50 rows=1 width=12)
Output: prt_p1.a, prt_p1.b, prt_p1.c
Filter: ((prt_p1.a IS NULL) AND (prt_p1.b = 100))
-> Seq Scan on public.prt_p2 (cost=0.00..35.50 rows=1 width=12)
Output: prt_p2.a, prt_p2.b, prt_p2.c
Filter: ((prt_p2.a IS NULL) AND (prt_p2.b = 100))
-> Seq Scan on public.prt_def (cost=0.00..35.50 rows=1 width=12)
Output: prt_def.a, prt_def.b, prt_def.c
Filter: ((prt_def.a IS NULL) AND (prt_def.b = 100))
(10 rows)

I thought that the following code in get_matching_range_bounds()
/*
* If there are no datums to compare keys with, or if we got an IS NULL
* clause just return the default partition, if it exists.
*/
if (boundinfo->ndatums == 0 || !bms_is_empty(nullkeys))
{
result->scan_default = partition_bound_has_default(boundinfo);
return result;
}

would do the trick but through the debugger I saw that nullkeys is
NULL for this query.

I didn't investigate further to see why nullkeys is NULL, but it looks
like that's the problem and we are missing an optimization.

I think the problem is that the gen_partprune_steps_internal expect
that all the keys should match to IS NULL clause, only in such case
the "partition pruning step" will store the nullkeys.

After a small change, it is able to prune the partitions.

--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -857,7 +857,7 @@
gen_partprune_steps_internal(GeneratePruningStepsContext *context,
         * If generate_opsteps is set to false it means no OpExprs were directly
         * present in the input list.
         */
-       if (!generate_opsteps)
+       if (nullkeys || !generate_opsteps)
        {
                /*
                 * Generate one prune step for the information derived
from IS NULL,
@@ -865,8 +865,7 @@
gen_partprune_steps_internal(GeneratePruningStepsContext *context,
                 * clauses for all partition keys.
                 */
                if (!bms_is_empty(nullkeys) &&
-                       (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
-                        bms_num_members(nullkeys) == part_scheme->partnatts))
+                       (part_scheme->strategy != PARTITION_STRATEGY_HASH))
                {
                        PartitionPruneStep *step;

postgres=# explain verbose select * from prt where a is null and b = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.00..35.51 rows=1 width=12)
-> Seq Scan on public.prt_def (cost=0.00..35.50 rows=1 width=12)
Output: prt_def.a, prt_def.b, prt_def.c
Filter: ((prt_def.a IS NULL) AND (prt_def.b = 100))
(4 rows)

Above fix is just to show the root cause of the issue, I haven't
investigated that what should be the exact fix for this issue.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#3Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#2)
1 attachment(s)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On Wed, Jul 11, 2018 at 4:20 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Jul 11, 2018 at 3:06 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

Hi,
Consider following test case.
create table prt (a int, b int, c int) partition by range(a, b);
create table prt_p1 partition of prt for values (0, 0) to (100, 100);
create table prt_p1 partition of prt for values from (0, 0) to (100, 100);
create table prt_p2 partition of prt for values from (100, 100) to (200, 200);
create table prt_def partition of prt default;

--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -857,7 +857,7 @@
gen_partprune_steps_internal(GeneratePruningStepsContext *context,
* If generate_opsteps is set to false it means no OpExprs were directly
* present in the input list.
*/
-       if (!generate_opsteps)
+       if (nullkeys || !generate_opsteps)
{
/*
* Generate one prune step for the information derived
from IS NULL,
@@ -865,8 +865,7 @@
gen_partprune_steps_internal(GeneratePruningStepsContext *context,
* clauses for all partition keys.
*/
if (!bms_is_empty(nullkeys) &&
-                       (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
-                        bms_num_members(nullkeys) == part_scheme->partnatts))
+                       (part_scheme->strategy != PARTITION_STRATEGY_HASH))
{
PartitionPruneStep *step;

postgres=# explain verbose select * from prt where a is null and b = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.00..35.51 rows=1 width=12)
-> Seq Scan on public.prt_def (cost=0.00..35.50 rows=1 width=12)
Output: prt_def.a, prt_def.b, prt_def.c
Filter: ((prt_def.a IS NULL) AND (prt_def.b = 100))
(4 rows)

Above fix is just to show the root cause of the issue, I haven't
investigated that what should be the exact fix for this issue.

I think the actual fix should be as attached.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

generate_prunestep_for_isnull.patchapplication/octet-stream; name=generate_prunestep_for_isnull.patchDownload
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index cdc61a8..5de28cf 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -854,27 +854,22 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
 	}
 
 	/*
-	 * If generate_opsteps is set to false it means no OpExprs were directly
-	 * present in the input list.
+	 * Generate one prune step for the information derived from IS NULL,
+	 * if any.  To prune hash partitions, we must have found IS NULL
+	 * clauses for all partition keys.
 	 */
-	if (!generate_opsteps)
+	if (!bms_is_empty(nullkeys) &&
+		(part_scheme->strategy != PARTITION_STRATEGY_HASH ||
+		 bms_num_members(nullkeys) == part_scheme->partnatts))
 	{
-		/*
-		 * Generate one prune step for the information derived from IS NULL,
-		 * if any.  To prune hash partitions, we must have found IS NULL
-		 * clauses for all partition keys.
-		 */
-		if (!bms_is_empty(nullkeys) &&
-			(part_scheme->strategy != PARTITION_STRATEGY_HASH ||
-			 bms_num_members(nullkeys) == part_scheme->partnatts))
-		{
-			PartitionPruneStep *step;
-
-			step = gen_prune_step_op(context, InvalidStrategy,
-									 false, NIL, NIL, nullkeys);
-			result = lappend(result, step);
-		}
+		PartitionPruneStep *step;
 
+		step = gen_prune_step_op(context, InvalidStrategy,
+								 false, NIL, NIL, nullkeys);
+		result = lappend(result, step);
+	}
+	else if (!generate_opsteps)
+	{
 		/*
 		 * Note that for IS NOT NULL clauses, simply having step suffices;
 		 * there is no need to propagate the exact details of which keys are
#4amul sul
sulamul@gmail.com
In reply to: Dilip Kumar (#3)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On Wed, Jul 11, 2018 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Jul 11, 2018 at 4:20 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Jul 11, 2018 at 3:06 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

Hi,
Consider following test case.
create table prt (a int, b int, c int) partition by range(a, b);
create table prt_p1 partition of prt for values (0, 0) to (100, 100);
create table prt_p1 partition of prt for values from (0, 0) to (100, 100);
create table prt_p2 partition of prt for values from (100, 100) to (200, 200);
create table prt_def partition of prt default;

--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -857,7 +857,7 @@
gen_partprune_steps_internal(GeneratePruningStepsContext *context,
* If generate_opsteps is set to false it means no OpExprs were directly
* present in the input list.
*/
-       if (!generate_opsteps)
+       if (nullkeys || !generate_opsteps)
{
/*
* Generate one prune step for the information derived
from IS NULL,
@@ -865,8 +865,7 @@
gen_partprune_steps_internal(GeneratePruningStepsContext *context,
* clauses for all partition keys.
*/
if (!bms_is_empty(nullkeys) &&
-                       (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
-                        bms_num_members(nullkeys) == part_scheme->partnatts))
+                       (part_scheme->strategy != PARTITION_STRATEGY_HASH))
{
PartitionPruneStep *step;

postgres=# explain verbose select * from prt where a is null and b = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.00..35.51 rows=1 width=12)
-> Seq Scan on public.prt_def (cost=0.00..35.50 rows=1 width=12)
Output: prt_def.a, prt_def.b, prt_def.c
Filter: ((prt_def.a IS NULL) AND (prt_def.b = 100))
(4 rows)

Above fix is just to show the root cause of the issue, I haven't
investigated that what should be the exact fix for this issue.

I think the actual fix should be as attached.

I am not sure that I have understand the following comments
11 + * Generate one prune step for the information derived from IS NULL,
12 + * if any. To prune hash partitions, we must have found IS NULL
13 + * clauses for all partition keys.
14 */

I am not sure that I have understood this -- no such restriction
required to prune the hash partitions, if I am not missing anything.

Regards,
Amul

#5Dilip Kumar
dilipbalaut@gmail.com
In reply to: amul sul (#4)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On Wed, Jul 11, 2018 at 5:36 PM, amul sul <sulamul@gmail.com> wrote:

On Wed, Jul 11, 2018 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I am not sure that I have understand the following comments
11 + * Generate one prune step for the information derived from IS NULL,
12 + * if any. To prune hash partitions, we must have found IS NULL
13 + * clauses for all partition keys.
14 */

I am not sure that I have understood this -- no such restriction
required to prune the hash partitions, if I am not missing anything.

Maybe it's not very clear but this is the original comments I have
retained. Just moved it out of the (!generate_opsteps) condition.

Just the explain this comment consider below example,

create table hp (a int, b text) partition by hash (a int, b text);
create table hp0 partition of hp for values with (modulus 4, remainder 0);
create table hp3 partition of hp for values with (modulus 4, remainder 3);
create table hp1 partition of hp for values with (modulus 4, remainder 1);
create table hp2 partition of hp for values with (modulus 4, remainder 2);

postgres=# insert into hp values (1, null);
INSERT 0 1
postgres=# insert into hp values (2, null);
INSERT 0 1
postgres=# select tableoid::regclass, * from hp;
tableoid | a | b
----------+---+---
hp1 | 1 |
hp2 | 2 |
(2 rows)

Now, if we query based on "b is null" then we can not decide which
partition should be pruned whereas in case
of other schemes, it will go to default partition so we can prune all
other partitions.

explain (costs off) select * from hp where b is null;

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#6Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Dilip Kumar (#5)
1 attachment(s)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

Thanks Ashutosh for reporting and Dilip for the analysis and the patch.

On 2018/07/11 21:39, Dilip Kumar wrote:

On Wed, Jul 11, 2018 at 5:36 PM, amul sul <sulamul@gmail.com> wrote:

On Wed, Jul 11, 2018 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I am not sure that I have understand the following comments
11 + * Generate one prune step for the information derived from IS NULL,
12 + * if any. To prune hash partitions, we must have found IS NULL
13 + * clauses for all partition keys.
14 */

I am not sure that I have understood this -- no such restriction
required to prune the hash partitions, if I am not missing anything.

Maybe it's not very clear but this is the original comments I have
retained. Just moved it out of the (!generate_opsteps) condition.

Just the explain this comment consider below example,

create table hp (a int, b text) partition by hash (a int, b text);
create table hp0 partition of hp for values with (modulus 4, remainder 0);
create table hp3 partition of hp for values with (modulus 4, remainder 3);
create table hp1 partition of hp for values with (modulus 4, remainder 1);
create table hp2 partition of hp for values with (modulus 4, remainder 2);

postgres=# insert into hp values (1, null);
INSERT 0 1
postgres=# insert into hp values (2, null);
INSERT 0 1
postgres=# select tableoid::regclass, * from hp;
tableoid | a | b
----------+---+---
hp1 | 1 |
hp2 | 2 |
(2 rows)

Now, if we query based on "b is null" then we can not decide which
partition should be pruned whereas in case
of other schemes, it will go to default partition so we can prune all
other partitions.

That's right. By generating a pruning step with only nullkeys set, we are
effectively discarding OpExprs that may have been found for some partition
keys. That's fine for list/range partitioning, because nulls can only be
found in a designated partition, so it's okay to prune all other
partitions and for that it's enough to generate the pruning step like
that. For hash partitioning, nulls could be contained in any partition so
it's not okay to discard OpExpr's like that. We can generate pruning
steps with combination of null and non-null keys in the hash partitioning
case if there are any OpExprs.

I think your fix is correct. I slightly modified it along with updating
nearby comments and added regression tests.

Thanks,
Amit

Attachments:

generate_prunestep_for_isnull-v2.patchtext/plain; charset=UTF-8; name=generate_prunestep_for_isnull-v2.patchDownload
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index cdc61a8997..d9612bf65b 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -854,44 +854,42 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
 	}
 
 	/*
-	 * If generate_opsteps is set to false it means no OpExprs were directly
-	 * present in the input list.
+	 * For list and range partitioning, null partition keys can only be found
+	 * in one designated partition, so if there are IS NULL clauses containing
+	 * partition keys we should generate a pruning step that gets rid of all
+	 * partitions but the special null-accepting partitions.  For range
+	 * partitioning, that means we will end up disregarding OpExpr's that may
+	 * have been found for some other keys, but that's fine, because it is not
+	 * possible to prune range partitions with a combination of null and
+	 * non-null keys.
+	 *
+	 * For hash partitioning however, it is possible to combine null and non-
+	 * null keys in a pruning step, so do this only if *all* partition keys
+	 * are involved in IS NULL clauses.
 	 */
-	if (!generate_opsteps)
+	if (!bms_is_empty(nullkeys) &&
+		(part_scheme->strategy != PARTITION_STRATEGY_HASH ||
+		 bms_num_members(nullkeys) == part_scheme->partnatts))
+	{
+		PartitionPruneStep *step;
+
+		step = gen_prune_step_op(context, InvalidStrategy,
+								 false, NIL, NIL, nullkeys);
+		result = lappend(result, step);
+	}
+	else if (!generate_opsteps && !bms_is_empty(notnullkeys))
 	{
 		/*
-		 * Generate one prune step for the information derived from IS NULL,
-		 * if any.  To prune hash partitions, we must have found IS NULL
-		 * clauses for all partition keys.
+		 * There are no OpExpr's, but there are IS NOT NULL clauses, which
+		 * can be used to eliminate the null-partition-key-only partition.
 		 */
-		if (!bms_is_empty(nullkeys) &&
-			(part_scheme->strategy != PARTITION_STRATEGY_HASH ||
-			 bms_num_members(nullkeys) == part_scheme->partnatts))
-		{
-			PartitionPruneStep *step;
+		PartitionPruneStep *step;
 
-			step = gen_prune_step_op(context, InvalidStrategy,
-									 false, NIL, NIL, nullkeys);
-			result = lappend(result, step);
-		}
-
-		/*
-		 * Note that for IS NOT NULL clauses, simply having step suffices;
-		 * there is no need to propagate the exact details of which keys are
-		 * required to be NOT NULL.  Hash partitioning expects to see actual
-		 * values to perform any pruning.
-		 */
-		if (!bms_is_empty(notnullkeys) &&
-			part_scheme->strategy != PARTITION_STRATEGY_HASH)
-		{
-			PartitionPruneStep *step;
-
-			step = gen_prune_step_op(context, InvalidStrategy,
-									 false, NIL, NIL, NULL);
-			result = lappend(result, step);
-		}
+		step = gen_prune_step_op(context, InvalidStrategy,
+								 false, NIL, NIL, NULL);
+		result = lappend(result, step);
 	}
-	else
+	else if (generate_opsteps)
 	{
 		PartitionPruneStep *step;
 
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index d15f1d37f1..ffee6a0b74 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3125,3 +3125,89 @@ explain (costs off) select * from pp_temp_parent where a = 2;
 (3 rows)
 
 drop table pp_temp_parent;
+-- multi-column range partitioning and pruning with IS NULL clauses on some
+-- columns
+create table pp_multirange (a int, b int) partition by range (a, b);
+create table pp_multirange1 partition of pp_multirange for values from (1, 1) to (100, 100);
+create table pp_multirange2 partition of pp_multirange for values from (100, 100) to (200, 200);
+-- no default partition, so all of the following should result in both
+-- partitions being pruned
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b is null;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
+explain (costs off) select * from pp_multirange where a is null;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
+explain (costs off) select * from pp_multirange where b is null;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
+-- create default partition so that all of the above queries now scan only
+-- the default partition
+create table pp_multirange_def partition of pp_multirange default;
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+                QUERY PLAN                 
+-------------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: ((b IS NULL) AND (a = 1))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b is null;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: ((a IS NULL) AND (b IS NULL))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+                QUERY PLAN                 
+-------------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: ((a IS NULL) AND (b = 1))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null;
+             QUERY PLAN              
+-------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: (a IS NULL)
+(3 rows)
+
+explain (costs off) select * from pp_multirange where b is null;
+             QUERY PLAN              
+-------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: (b IS NULL)
+(3 rows)
+
+drop table pp_multirange;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index b8e823d562..1cc5cf31f4 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -823,3 +823,27 @@ create temp table pp_temp_part_def partition of pp_temp_parent default;
 explain (costs off) select * from pp_temp_parent where true;
 explain (costs off) select * from pp_temp_parent where a = 2;
 drop table pp_temp_parent;
+
+-- multi-column range partitioning and pruning with IS NULL clauses on some
+-- columns
+create table pp_multirange (a int, b int) partition by range (a, b);
+create table pp_multirange1 partition of pp_multirange for values from (1, 1) to (100, 100);
+create table pp_multirange2 partition of pp_multirange for values from (100, 100) to (200, 200);
+
+-- no default partition, so all of the following should result in both
+-- partitions being pruned
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+explain (costs off) select * from pp_multirange where a is null and b is null;
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+explain (costs off) select * from pp_multirange where a is null;
+explain (costs off) select * from pp_multirange where b is null;
+
+-- create default partition so that all of the above queries now scan only
+-- the default partition
+create table pp_multirange_def partition of pp_multirange default;
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+explain (costs off) select * from pp_multirange where a is null and b is null;
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+explain (costs off) select * from pp_multirange where a is null;
+explain (costs off) select * from pp_multirange where b is null;
+drop table pp_multirange;
#7Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#6)
1 attachment(s)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On 2018/07/12 14:32, Amit Langote wrote:

Thanks Ashutosh for reporting and Dilip for the analysis and the patch.

On 2018/07/11 21:39, Dilip Kumar wrote:

On Wed, Jul 11, 2018 at 5:36 PM, amul sul <sulamul@gmail.com> wrote:

On Wed, Jul 11, 2018 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I am not sure that I have understand the following comments
11 + * Generate one prune step for the information derived from IS NULL,
12 + * if any. To prune hash partitions, we must have found IS NULL
13 + * clauses for all partition keys.
14 */

I am not sure that I have understood this -- no such restriction
required to prune the hash partitions, if I am not missing anything.

Maybe it's not very clear but this is the original comments I have
retained. Just moved it out of the (!generate_opsteps) condition.

Just the explain this comment consider below example,

create table hp (a int, b text) partition by hash (a int, b text);
create table hp0 partition of hp for values with (modulus 4, remainder 0);
create table hp3 partition of hp for values with (modulus 4, remainder 3);
create table hp1 partition of hp for values with (modulus 4, remainder 1);
create table hp2 partition of hp for values with (modulus 4, remainder 2);

postgres=# insert into hp values (1, null);
INSERT 0 1
postgres=# insert into hp values (2, null);
INSERT 0 1
postgres=# select tableoid::regclass, * from hp;
tableoid | a | b
----------+---+---
hp1 | 1 |
hp2 | 2 |
(2 rows)

Now, if we query based on "b is null" then we can not decide which
partition should be pruned whereas in case
of other schemes, it will go to default partition so we can prune all
other partitions.

That's right. By generating a pruning step with only nullkeys set, we are
effectively discarding OpExprs that may have been found for some partition
keys. That's fine for list/range partitioning, because nulls can only be
found in a designated partition, so it's okay to prune all other
partitions and for that it's enough to generate the pruning step like
that. For hash partitioning, nulls could be contained in any partition so
it's not okay to discard OpExpr's like that. We can generate pruning
steps with combination of null and non-null keys in the hash partitioning
case if there are any OpExprs.

I think your fix is correct. I slightly modified it along with updating
nearby comments and added regression tests.

I updated regression tests to reduce lines. There is no point in
repeating tests like v2 patch did.

Thanks,
Amit

Attachments:

generate_prunestep_for_isnull-v3.patchtext/plain; charset=UTF-8; name=generate_prunestep_for_isnull-v3.patchDownload
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index cdc61a8997..d9612bf65b 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -854,44 +854,42 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
 	}
 
 	/*
-	 * If generate_opsteps is set to false it means no OpExprs were directly
-	 * present in the input list.
+	 * For list and range partitioning, null partition keys can only be found
+	 * in one designated partition, so if there are IS NULL clauses containing
+	 * partition keys we should generate a pruning step that gets rid of all
+	 * partitions but the special null-accepting partitions.  For range
+	 * partitioning, that means we will end up disregarding OpExpr's that may
+	 * have been found for some other keys, but that's fine, because it is not
+	 * possible to prune range partitions with a combination of null and
+	 * non-null keys.
+	 *
+	 * For hash partitioning however, it is possible to combine null and non-
+	 * null keys in a pruning step, so do this only if *all* partition keys
+	 * are involved in IS NULL clauses.
 	 */
-	if (!generate_opsteps)
+	if (!bms_is_empty(nullkeys) &&
+		(part_scheme->strategy != PARTITION_STRATEGY_HASH ||
+		 bms_num_members(nullkeys) == part_scheme->partnatts))
+	{
+		PartitionPruneStep *step;
+
+		step = gen_prune_step_op(context, InvalidStrategy,
+								 false, NIL, NIL, nullkeys);
+		result = lappend(result, step);
+	}
+	else if (!generate_opsteps && !bms_is_empty(notnullkeys))
 	{
 		/*
-		 * Generate one prune step for the information derived from IS NULL,
-		 * if any.  To prune hash partitions, we must have found IS NULL
-		 * clauses for all partition keys.
+		 * There are no OpExpr's, but there are IS NOT NULL clauses, which
+		 * can be used to eliminate the null-partition-key-only partition.
 		 */
-		if (!bms_is_empty(nullkeys) &&
-			(part_scheme->strategy != PARTITION_STRATEGY_HASH ||
-			 bms_num_members(nullkeys) == part_scheme->partnatts))
-		{
-			PartitionPruneStep *step;
+		PartitionPruneStep *step;
 
-			step = gen_prune_step_op(context, InvalidStrategy,
-									 false, NIL, NIL, nullkeys);
-			result = lappend(result, step);
-		}
-
-		/*
-		 * Note that for IS NOT NULL clauses, simply having step suffices;
-		 * there is no need to propagate the exact details of which keys are
-		 * required to be NOT NULL.  Hash partitioning expects to see actual
-		 * values to perform any pruning.
-		 */
-		if (!bms_is_empty(notnullkeys) &&
-			part_scheme->strategy != PARTITION_STRATEGY_HASH)
-		{
-			PartitionPruneStep *step;
-
-			step = gen_prune_step_op(context, InvalidStrategy,
-									 false, NIL, NIL, NULL);
-			result = lappend(result, step);
-		}
+		step = gen_prune_step_op(context, InvalidStrategy,
+								 false, NIL, NIL, NULL);
+		result = lappend(result, step);
 	}
-	else
+	else if (generate_opsteps)
 	{
 		PartitionPruneStep *step;
 
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index d15f1d37f1..a3b21130e3 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3125,3 +3125,51 @@ explain (costs off) select * from pp_temp_parent where a = 2;
 (3 rows)
 
 drop table pp_temp_parent;
+-- multi-column range partitioning and pruning with IS NULL clauses on some
+-- columns
+create table pp_multirange (a int, b int) partition by range (a, b);
+create table pp_multirange1 partition of pp_multirange for values from (1, 1) to (100, 100);
+create table pp_multirange2 partition of pp_multirange for values from (100, 100) to (200, 200);
+create table pp_multirange_def partition of pp_multirange default;
+-- all partitions but the default one should be pruned
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+                QUERY PLAN                 
+-------------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: ((b IS NULL) AND (a = 1))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b is null;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: ((a IS NULL) AND (b IS NULL))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+                QUERY PLAN                 
+-------------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: ((a IS NULL) AND (b = 1))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null;
+             QUERY PLAN              
+-------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: (a IS NULL)
+(3 rows)
+
+explain (costs off) select * from pp_multirange where b is null;
+             QUERY PLAN              
+-------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: (b IS NULL)
+(3 rows)
+
+drop table pp_multirange;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index b8e823d562..cf56fdac40 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -823,3 +823,19 @@ create temp table pp_temp_part_def partition of pp_temp_parent default;
 explain (costs off) select * from pp_temp_parent where true;
 explain (costs off) select * from pp_temp_parent where a = 2;
 drop table pp_temp_parent;
+
+-- multi-column range partitioning and pruning with IS NULL clauses on some
+-- columns
+create table pp_multirange (a int, b int) partition by range (a, b);
+create table pp_multirange1 partition of pp_multirange for values from (1, 1) to (100, 100);
+create table pp_multirange2 partition of pp_multirange for values from (100, 100) to (200, 200);
+create table pp_multirange_def partition of pp_multirange default;
+
+-- all partitions but the default one should be pruned
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+explain (costs off) select * from pp_multirange where a is null and b is null;
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+explain (costs off) select * from pp_multirange where a is null;
+explain (costs off) select * from pp_multirange where b is null;
+
+drop table pp_multirange;
#8Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Langote (#7)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On Thu, Jul 12, 2018 at 11:10 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

On 2018/07/12 14:32, Amit Langote wrote:

Thanks Ashutosh for reporting and Dilip for the analysis and the patch.

I think your fix is correct. I slightly modified it along with updating
nearby comments and added regression tests.

Thanks, Your changes look fine to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#9Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#7)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On Thu, Jul 12, 2018 at 11:10 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

I think your fix is correct. I slightly modified it along with updating
nearby comments and added regression tests.

I updated regression tests to reduce lines. There is no point in
repeating tests like v2 patch did.

+     *
+     * For hash partitioning however, it is possible to combine null and non-
+     * null keys in a pruning step, so do this only if *all* partition keys
+     * are involved in IS NULL clauses.

I don't think this is true. When equality conditions and IS NULL clauses cover
all partition keys of a hash partitioned table and do not have contradictory
clauses, we should be able to find the partition which will remain unpruned. I
see that we already have this supported in get_matching_hash_bounds()
/*
* For hash partitioning we can only perform pruning based on equality
* clauses to the partition key or IS NULL clauses. We also can only
* prune if we got values for all keys.
*/
if (nvalues + bms_num_members(nullkeys) == partnatts)
{

      */
-    if (!generate_opsteps)
+    if (!bms_is_empty(nullkeys) &&
+        (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
+         bms_num_members(nullkeys) == part_scheme->partnatts))

So, it looks like we don't need bms_num_members(nullkeys) ==
part_scheme->partnatts there.

Also, I think, we don't know how some new partition strategy will treat NULL
values so above condition looks wrong to me. Instead it should explicitly check
the strategies for which we know that the NULL values go to a single partition.

         /*
-         * Note that for IS NOT NULL clauses, simply having step suffices;
-         * there is no need to propagate the exact details of which keys are
-         * required to be NOT NULL.  Hash partitioning expects to see actual
-         * values to perform any pruning.
+         * There are no OpExpr's, but there are IS NOT NULL clauses, which
+         * can be used to eliminate the null-partition-key-only partition.

I don't understand this. When there are IS NOT NULL clauses for all the
partition keys, it's only then that we could eliminate the partition containing
NULL values, not otherwise.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#10Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Ashutosh Bapat (#9)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

I think we should add this to open items list so that it gets tracked.

On Thu, Jul 12, 2018 at 6:31 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

On Thu, Jul 12, 2018 at 11:10 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

I think your fix is correct. I slightly modified it along with updating
nearby comments and added regression tests.

I updated regression tests to reduce lines. There is no point in
repeating tests like v2 patch did.

+     *
+     * For hash partitioning however, it is possible to combine null and non-
+     * null keys in a pruning step, so do this only if *all* partition keys
+     * are involved in IS NULL clauses.

I don't think this is true. When equality conditions and IS NULL clauses cover
all partition keys of a hash partitioned table and do not have contradictory
clauses, we should be able to find the partition which will remain unpruned. I
see that we already have this supported in get_matching_hash_bounds()
/*
* For hash partitioning we can only perform pruning based on equality
* clauses to the partition key or IS NULL clauses. We also can only
* prune if we got values for all keys.
*/
if (nvalues + bms_num_members(nullkeys) == partnatts)
{

*/
-    if (!generate_opsteps)
+    if (!bms_is_empty(nullkeys) &&
+        (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
+         bms_num_members(nullkeys) == part_scheme->partnatts))

So, it looks like we don't need bms_num_members(nullkeys) ==
part_scheme->partnatts there.

Also, I think, we don't know how some new partition strategy will treat NULL
values so above condition looks wrong to me. Instead it should explicitly check
the strategies for which we know that the NULL values go to a single partition.

/*
-         * Note that for IS NOT NULL clauses, simply having step suffices;
-         * there is no need to propagate the exact details of which keys are
-         * required to be NOT NULL.  Hash partitioning expects to see actual
-         * values to perform any pruning.
+         * There are no OpExpr's, but there are IS NOT NULL clauses, which
+         * can be used to eliminate the null-partition-key-only partition.

I don't understand this. When there are IS NOT NULL clauses for all the
partition keys, it's only then that we could eliminate the partition containing
NULL values, not otherwise.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#11Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#9)
1 attachment(s)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

Thanks for the review.

On 2018/07/12 22:01, Ashutosh Bapat wrote:

On Thu, Jul 12, 2018 at 11:10 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

I think your fix is correct. I slightly modified it along with updating
nearby comments and added regression tests.

I updated regression tests to reduce lines. There is no point in
repeating tests like v2 patch did.

+     *
+     * For hash partitioning however, it is possible to combine null and non-
+     * null keys in a pruning step, so do this only if *all* partition keys
+     * are involved in IS NULL clauses.

I don't think this is true. When equality conditions and IS NULL clauses cover
all partition keys of a hash partitioned table and do not have contradictory
clauses, we should be able to find the partition which will remain unpruned.

I was trying to say the same thing, but maybe the comment doesn't like it.
How about this:

+     * For hash partitioning, if we found IS NULL clauses for some keys and
+     * OpExpr's for others, gen_prune_steps_from_opexps() will generate the
+     * necessary pruning steps if all partition keys are taken care of.
+     * If we found only IS NULL clauses, then we can generate the pruning
+     * step here but only if all partition keys are covered.

I
see that we already have this supported in get_matching_hash_bounds()
/*
* For hash partitioning we can only perform pruning based on equality
* clauses to the partition key or IS NULL clauses. We also can only
* prune if we got values for all keys.
*/
if (nvalues + bms_num_members(nullkeys) == partnatts)
{

*/
-    if (!generate_opsteps)
+    if (!bms_is_empty(nullkeys) &&
+        (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
+         bms_num_members(nullkeys) == part_scheme->partnatts))

So, it looks like we don't need bms_num_members(nullkeys) ==
part_scheme->partnatts there.

Yes, we can perform pruning in all three cases for hash partitioning:

* all keys are covered by OpExprs providing non-null keys

* some keys are covered by IS NULL clauses, others by OpExprs (all keys
covered)

* all keys are covered by IS NULL clauses

... as long as we generate a pruning step at all. The issue here was that
we skipped generating the pruning step due to poorly coded condition in
gen_partprune_steps_internal in some cases.

Also, I think, we don't know how some new partition strategy will treat NULL
values so above condition looks wrong to me. Instead it should explicitly check
the strategies for which we know that the NULL values go to a single partition.

How about if we explicitly spell out the strategies like this:

+    if (!bms_is_empty(nullkeys) &&
+        (part_scheme->strategy == PARTITION_STRATEGY_LIST ||
+         part_scheme->strategy == PARTITION_STRATEGY_RANGE ||
+         (part_scheme->strategy == PARTITION_STRATEGY_HASH &&
+          bms_num_members(nullkeys) == part_scheme->partnatts)))

The proposed comment also describes why the condition looks like that.

/*
-         * Note that for IS NOT NULL clauses, simply having step suffices;
-         * there is no need to propagate the exact details of which keys are
-         * required to be NOT NULL.  Hash partitioning expects to see actual
-         * values to perform any pruning.
+         * There are no OpExpr's, but there are IS NOT NULL clauses, which
+         * can be used to eliminate the null-partition-key-only partition.

I don't understand this. When there are IS NOT NULL clauses for all the
partition keys, it's only then that we could eliminate the partition containing
NULL values, not otherwise.

Actually, there is only one case where the pruning step generated by that
block of code is useful -- to prune a list partition that accepts only
nulls. List partitioning only allows one partition, key so this works,
but let's say only accidentally. I modified the condition as follows:

+    else if (!generate_opsteps && !bms_is_empty(notnullkeys) &&
+             bms_num_members(notnullkeys) == part_scheme->partnatts)

Attached updated patch.

Thanks,
Amit

Attachments:

generate_prunestep_for_isnull-v4.patchtext/plain; charset=UTF-8; name=generate_prunestep_for_isnull-v4.patchDownload
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index cdc61a8997..a89cec0759 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -854,44 +854,47 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
 	}
 
 	/*
-	 * If generate_opsteps is set to false it means no OpExprs were directly
-	 * present in the input list.
+	 * For list and range partitioning, null partition keys can only be found
+	 * in one designated partition, so if there are IS NULL clauses containing
+	 * partition keys we should generate a pruning step that gets rid of all
+	 * partitions but the special null-accepting partitions.  For range
+	 * partitioning, that means we will end up disregarding OpExpr's that may
+	 * have been found for some other keys, but that's fine, because it is not
+	 * possible to prune range partitions with a combination of null and
+	 * non-null keys.
+	 *
+	 * For hash partitioning, if we found IS NULL clauses for some keys and
+	 * OpExpr's for others, gen_prune_steps_from_opexps() will generate the
+	 * necessary pruning steps if all partition keys are taken care of.
+	 * If we found only IS NULL clauses, then we can generate the pruning
+	 * step here but only if all partition keys are covered.
 	 */
-	if (!generate_opsteps)
+	if (!bms_is_empty(nullkeys) &&
+		(part_scheme->strategy == PARTITION_STRATEGY_LIST ||
+		 part_scheme->strategy == PARTITION_STRATEGY_RANGE ||
+		 (part_scheme->strategy == PARTITION_STRATEGY_HASH &&
+		  bms_num_members(nullkeys) == part_scheme->partnatts)))
+	{
+		PartitionPruneStep *step;
+
+		step = gen_prune_step_op(context, InvalidStrategy,
+								 false, NIL, NIL, nullkeys);
+		result = lappend(result, step);
+	}
+	else if (!generate_opsteps && !bms_is_empty(notnullkeys) &&
+			 bms_num_members(notnullkeys) == part_scheme->partnatts)
 	{
 		/*
-		 * Generate one prune step for the information derived from IS NULL,
-		 * if any.  To prune hash partitions, we must have found IS NULL
-		 * clauses for all partition keys.
+		 * There are no OpExpr's, but there are IS NOT NULL clauses, which
+		 * can be used to eliminate the null-partition-key-only partition.
 		 */
-		if (!bms_is_empty(nullkeys) &&
-			(part_scheme->strategy != PARTITION_STRATEGY_HASH ||
-			 bms_num_members(nullkeys) == part_scheme->partnatts))
-		{
-			PartitionPruneStep *step;
+		PartitionPruneStep *step;
 
-			step = gen_prune_step_op(context, InvalidStrategy,
-									 false, NIL, NIL, nullkeys);
-			result = lappend(result, step);
-		}
-
-		/*
-		 * Note that for IS NOT NULL clauses, simply having step suffices;
-		 * there is no need to propagate the exact details of which keys are
-		 * required to be NOT NULL.  Hash partitioning expects to see actual
-		 * values to perform any pruning.
-		 */
-		if (!bms_is_empty(notnullkeys) &&
-			part_scheme->strategy != PARTITION_STRATEGY_HASH)
-		{
-			PartitionPruneStep *step;
-
-			step = gen_prune_step_op(context, InvalidStrategy,
-									 false, NIL, NIL, NULL);
-			result = lappend(result, step);
-		}
+		step = gen_prune_step_op(context, InvalidStrategy,
+								 false, NIL, NIL, NULL);
+		result = lappend(result, step);
 	}
-	else
+	else if (generate_opsteps)
 	{
 		PartitionPruneStep *step;
 
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index d15f1d37f1..a3b21130e3 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3125,3 +3125,51 @@ explain (costs off) select * from pp_temp_parent where a = 2;
 (3 rows)
 
 drop table pp_temp_parent;
+-- multi-column range partitioning and pruning with IS NULL clauses on some
+-- columns
+create table pp_multirange (a int, b int) partition by range (a, b);
+create table pp_multirange1 partition of pp_multirange for values from (1, 1) to (100, 100);
+create table pp_multirange2 partition of pp_multirange for values from (100, 100) to (200, 200);
+create table pp_multirange_def partition of pp_multirange default;
+-- all partitions but the default one should be pruned
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+                QUERY PLAN                 
+-------------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: ((b IS NULL) AND (a = 1))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b is null;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: ((a IS NULL) AND (b IS NULL))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+                QUERY PLAN                 
+-------------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: ((a IS NULL) AND (b = 1))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null;
+             QUERY PLAN              
+-------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: (a IS NULL)
+(3 rows)
+
+explain (costs off) select * from pp_multirange where b is null;
+             QUERY PLAN              
+-------------------------------------
+ Append
+   ->  Seq Scan on pp_multirange_def
+         Filter: (b IS NULL)
+(3 rows)
+
+drop table pp_multirange;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index b8e823d562..cf56fdac40 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -823,3 +823,19 @@ create temp table pp_temp_part_def partition of pp_temp_parent default;
 explain (costs off) select * from pp_temp_parent where true;
 explain (costs off) select * from pp_temp_parent where a = 2;
 drop table pp_temp_parent;
+
+-- multi-column range partitioning and pruning with IS NULL clauses on some
+-- columns
+create table pp_multirange (a int, b int) partition by range (a, b);
+create table pp_multirange1 partition of pp_multirange for values from (1, 1) to (100, 100);
+create table pp_multirange2 partition of pp_multirange for values from (100, 100) to (200, 200);
+create table pp_multirange_def partition of pp_multirange default;
+
+-- all partitions but the default one should be pruned
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+explain (costs off) select * from pp_multirange where a is null and b is null;
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+explain (costs off) select * from pp_multirange where a is null;
+explain (costs off) select * from pp_multirange where b is null;
+
+drop table pp_multirange;
#12Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#11)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On Fri, Jul 13, 2018 at 1:15 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

I don't think this is true. When equality conditions and IS NULL clauses cover
all partition keys of a hash partitioned table and do not have contradictory
clauses, we should be able to find the partition which will remain unpruned.

I was trying to say the same thing, but maybe the comment doesn't like it.
How about this:

+     * For hash partitioning, if we found IS NULL clauses for some keys and
+     * OpExpr's for others, gen_prune_steps_from_opexps() will generate the
+     * necessary pruning steps if all partition keys are taken care of.
+     * If we found only IS NULL clauses, then we can generate the pruning
+     * step here but only if all partition keys are covered.

It's still confusing a bit. I think "taken care of" doesn't convey the
same meaning as "covered". I don't think the second sentence is
necessary, it talks about one special case of the first sentence. But
this is better than the prior version.

I
see that we already have this supported in get_matching_hash_bounds()
/*
* For hash partitioning we can only perform pruning based on equality
* clauses to the partition key or IS NULL clauses. We also can only
* prune if we got values for all keys.
*/
if (nvalues + bms_num_members(nullkeys) == partnatts)
{

*/
-    if (!generate_opsteps)
+    if (!bms_is_empty(nullkeys) &&
+        (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
+         bms_num_members(nullkeys) == part_scheme->partnatts))

So, it looks like we don't need bms_num_members(nullkeys) ==
part_scheme->partnatts there.

Yes, we can perform pruning in all three cases for hash partitioning:

* all keys are covered by OpExprs providing non-null keys

* some keys are covered by IS NULL clauses, others by OpExprs (all keys
covered)

* all keys are covered by IS NULL clauses

... as long as we generate a pruning step at all. The issue here was that
we skipped generating the pruning step due to poorly coded condition in
gen_partprune_steps_internal in some cases.

Also, I think, we don't know how some new partition strategy will treat NULL
values so above condition looks wrong to me. Instead it should explicitly check
the strategies for which we know that the NULL values go to a single partition.

How about if we explicitly spell out the strategies like this:

+    if (!bms_is_empty(nullkeys) &&
+        (part_scheme->strategy == PARTITION_STRATEGY_LIST ||
+         part_scheme->strategy == PARTITION_STRATEGY_RANGE ||
+         (part_scheme->strategy == PARTITION_STRATEGY_HASH &&
+          bms_num_members(nullkeys) == part_scheme->partnatts)))

I still do not understand why do we need (part_scheme->strategy ==
PARTITION_STRATEGY_HASH && bms_num_members(nullkeys) ==
part_scheme->partnatts) there. I thought that this will be handled by
code, which deals with null keys and op_exprs together, somewhere
else.

The proposed comment also describes why the condition looks like that.

/*
-         * Note that for IS NOT NULL clauses, simply having step suffices;
-         * there is no need to propagate the exact details of which keys are
-         * required to be NOT NULL.  Hash partitioning expects to see actual
-         * values to perform any pruning.
+         * There are no OpExpr's, but there are IS NOT NULL clauses, which
+         * can be used to eliminate the null-partition-key-only partition.

I don't understand this. When there are IS NOT NULL clauses for all the
partition keys, it's only then that we could eliminate the partition containing
NULL values, not otherwise.

Actually, there is only one case where the pruning step generated by that
block of code is useful -- to prune a list partition that accepts only
nulls. List partitioning only allows one partition, key so this works,
but let's say only accidentally. I modified the condition as follows:

+    else if (!generate_opsteps && !bms_is_empty(notnullkeys) &&
+             bms_num_members(notnullkeys) == part_scheme->partnatts)

Hmm. Ok. May be it's better to explicitly say that it's only useful in
case of list partitioned table.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Ashutosh Bapat (#12)
1 attachment(s)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On 2018-Jul-13, Ashutosh Bapat wrote:

On Fri, Jul 13, 2018 at 1:15 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

I don't think this is true. When equality conditions and IS NULL clauses cover
all partition keys of a hash partitioned table and do not have contradictory
clauses, we should be able to find the partition which will remain unpruned.

I was trying to say the same thing, but maybe the comment doesn't like it.
How about this:

+     * For hash partitioning, if we found IS NULL clauses for some keys and
+     * OpExpr's for others, gen_prune_steps_from_opexps() will generate the
+     * necessary pruning steps if all partition keys are taken care of.
+     * If we found only IS NULL clauses, then we can generate the pruning
+     * step here but only if all partition keys are covered.

It's still confusing a bit. I think "taken care of" doesn't convey the
same meaning as "covered". I don't think the second sentence is
necessary, it talks about one special case of the first sentence. But
this is better than the prior version.

Hmm, let me reword this comment completely. How about the attached?

I also changed the order of the tests, putting the 'generate_opsteps'
one in the middle instead of at the end (so the not-null one doesn't
test that boolean again). AFAICS it's logically the same, and it makes
more sense to me that way.

I also changed the tests so that they apply to the existing mc2p table,
which is identical to the one Amit was adding.

How about if we explicitly spell out the strategies like this:

+    if (!bms_is_empty(nullkeys) &&
+        (part_scheme->strategy == PARTITION_STRATEGY_LIST ||
+         part_scheme->strategy == PARTITION_STRATEGY_RANGE ||
+         (part_scheme->strategy == PARTITION_STRATEGY_HASH &&
+          bms_num_members(nullkeys) == part_scheme->partnatts)))

I still do not understand why do we need (part_scheme->strategy ==
PARTITION_STRATEGY_HASH && bms_num_members(nullkeys) ==
part_scheme->partnatts) there. I thought that this will be handled by
code, which deals with null keys and op_exprs together, somewhere
else.

I'm not sure I understand this question. This strategy applies to hash
partitioning only if we have null tests for all keys, so there are no
op_exprs. If there are any, there's no point in checking them.

Now this case is a fun one:
select * from mc2p where a in (1, 2) and b is null;
Note that no partitions are pruned in that case; yet if you do this:
select * from mc2p where a in (1) and b is null;
then it does prune. I think the reason for this is that the
PARTCLAUSE_MATCH_STEPS is ... pretty special.

Actually, there is only one case where the pruning step generated by that
block of code is useful -- to prune a list partition that accepts only
nulls. List partitioning only allows one partition, key so this works,
but let's say only accidentally. I modified the condition as follows:

+    else if (!generate_opsteps && !bms_is_empty(notnullkeys) &&
+             bms_num_members(notnullkeys) == part_scheme->partnatts)

Hmm. Ok. May be it's better to explicitly say that it's only useful in
case of list partitioned table.

Yeah, maybe.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

generate_prunestep_for_isnull-v5.patchtext/plain; charset=us-asciiDownload
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index cdc61a8997..e44df8ac94 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -853,54 +853,62 @@ gen_partprune_steps_internal(GeneratePruningStepsContext *context,
 		}
 	}
 
-	/*
-	 * If generate_opsteps is set to false it means no OpExprs were directly
-	 * present in the input list.
+	/*-----------
+	 * Now generate some (more) pruning steps.
+	 *
+	 * We may be able to (1) generate pruning steps based on IS NULL clauses,
+	 * if there are enough of them:
+	 * a) For list partitioning, null partition keys can only be found in the
+	 *    designated partition, so if there are IS NULL clauses containing
+	 *    partition keys we should generate a pruning step that gets rid of
+	 *    all partitions but the special null-accepting partitions.
+	 * b) For range partitioning, the same applies.  Doing this check first
+	 *    means we'll disregard OpExprs that may have been found for other
+	 *    keys, but that's fine, because it is not possible to prune range
+	 *    partitions with a combination of null and non-null keys.
+	 * c) For hash partitioning, we only apply this strategy if we have
+	 *    IS NULL clauses for all the keys.  Strategy 2 will take care of the
+	 *    case where some keys have OpExprs and others have IS NULL clauses.
+	 *
+	 * If not, then (2) generate steps based on OpExprs we have (if any).
+	 *
+	 * If this doesn't work either, we may be able to (3) generate steps to
+	 * prune (just) the null-accepting partition (if one exists), if we have
+	 * IS NOT NULL clauses for all partition keys.
 	 */
-	if (!generate_opsteps)
-	{
-		/*
-		 * Generate one prune step for the information derived from IS NULL,
-		 * if any.  To prune hash partitions, we must have found IS NULL
-		 * clauses for all partition keys.
-		 */
-		if (!bms_is_empty(nullkeys) &&
-			(part_scheme->strategy != PARTITION_STRATEGY_HASH ||
-			 bms_num_members(nullkeys) == part_scheme->partnatts))
-		{
-			PartitionPruneStep *step;
-
-			step = gen_prune_step_op(context, InvalidStrategy,
-									 false, NIL, NIL, nullkeys);
-			result = lappend(result, step);
-		}
-
-		/*
-		 * Note that for IS NOT NULL clauses, simply having step suffices;
-		 * there is no need to propagate the exact details of which keys are
-		 * required to be NOT NULL.  Hash partitioning expects to see actual
-		 * values to perform any pruning.
-		 */
-		if (!bms_is_empty(notnullkeys) &&
-			part_scheme->strategy != PARTITION_STRATEGY_HASH)
-		{
-			PartitionPruneStep *step;
-
-			step = gen_prune_step_op(context, InvalidStrategy,
-									 false, NIL, NIL, NULL);
-			result = lappend(result, step);
-		}
-	}
-	else
+	if (!bms_is_empty(nullkeys) &&
+		(part_scheme->strategy == PARTITION_STRATEGY_LIST ||
+		 part_scheme->strategy == PARTITION_STRATEGY_RANGE ||
+		 (part_scheme->strategy == PARTITION_STRATEGY_HASH &&
+		  bms_num_members(nullkeys) == part_scheme->partnatts)))
 	{
 		PartitionPruneStep *step;
 
-		/* Generate pruning steps from OpExpr clauses in keyclauses. */
+		/* Strategy 1 */
+		step = gen_prune_step_op(context, InvalidStrategy,
+								 false, NIL, NIL, nullkeys);
+		result = lappend(result, step);
+	}
+	else if (generate_opsteps)
+	{
+		PartitionPruneStep *step;
+
+		/* Strategy 2 */
 		step = gen_prune_steps_from_opexps(part_scheme, context,
 										   keyclauses, nullkeys);
 		if (step != NULL)
 			result = lappend(result, step);
 	}
+	else if (!bms_is_empty(notnullkeys) &&
+			 bms_num_members(notnullkeys) == part_scheme->partnatts)
+	{
+		PartitionPruneStep *step;
+
+		/* Strategy 3 */
+		step = gen_prune_step_op(context, InvalidStrategy,
+								 false, NIL, NIL, NULL);
+		result = lappend(result, step);
+	}
 
 	/*
 	 * Finally, results from all entries appearing in result should be
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index d15f1d37f1..022b7c55c7 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -993,6 +993,47 @@ explain (costs off) select * from mc2p where a = 1 and b > 1;
          Filter: ((b > 1) AND (a = 1))
 (3 rows)
 
+-- all partitions but the default one should be pruned
+explain (costs off) select * from mc2p where a = 1 and b is null;
+                QUERY PLAN                 
+-------------------------------------------
+ Append
+   ->  Seq Scan on mc2p_default
+         Filter: ((b IS NULL) AND (a = 1))
+(3 rows)
+
+explain (costs off) select * from mc2p where a is null and b is null;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Append
+   ->  Seq Scan on mc2p_default
+         Filter: ((a IS NULL) AND (b IS NULL))
+(3 rows)
+
+explain (costs off) select * from mc2p where a is null and b = 1;
+                QUERY PLAN                 
+-------------------------------------------
+ Append
+   ->  Seq Scan on mc2p_default
+         Filter: ((a IS NULL) AND (b = 1))
+(3 rows)
+
+explain (costs off) select * from mc2p where a is null;
+           QUERY PLAN           
+--------------------------------
+ Append
+   ->  Seq Scan on mc2p_default
+         Filter: (a IS NULL)
+(3 rows)
+
+explain (costs off) select * from mc2p where b is null;
+           QUERY PLAN           
+--------------------------------
+ Append
+   ->  Seq Scan on mc2p_default
+         Filter: (b IS NULL)
+(3 rows)
+
 -- boolean partitioning
 create table boolpart (a bool) partition by list (a);
 create table boolpart_default partition of boolpart default;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index b8e823d562..2357f02cde 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -137,6 +137,13 @@ explain (costs off) select * from mc2p where a = 2 and b < 1;
 explain (costs off) select * from mc2p where a > 1;
 explain (costs off) select * from mc2p where a = 1 and b > 1;
 
+-- all partitions but the default one should be pruned
+explain (costs off) select * from mc2p where a = 1 and b is null;
+explain (costs off) select * from mc2p where a is null and b is null;
+explain (costs off) select * from mc2p where a is null and b = 1;
+explain (costs off) select * from mc2p where a is null;
+explain (costs off) select * from mc2p where b is null;
+
 -- boolean partitioning
 create table boolpart (a bool) partition by list (a);
 create table boolpart_default partition of boolpart default;
#14Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Alvaro Herrera (#13)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On Sat, Jul 14, 2018 at 2:41 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

On 2018-Jul-13, Ashutosh Bapat wrote:

On Fri, Jul 13, 2018 at 1:15 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

I don't think this is true. When equality conditions and IS NULL clauses cover
all partition keys of a hash partitioned table and do not have contradictory
clauses, we should be able to find the partition which will remain unpruned.

I was trying to say the same thing, but maybe the comment doesn't like it.
How about this:

+     * For hash partitioning, if we found IS NULL clauses for some keys and
+     * OpExpr's for others, gen_prune_steps_from_opexps() will generate the
+     * necessary pruning steps if all partition keys are taken care of.
+     * If we found only IS NULL clauses, then we can generate the pruning
+     * step here but only if all partition keys are covered.

It's still confusing a bit. I think "taken care of" doesn't convey the
same meaning as "covered". I don't think the second sentence is
necessary, it talks about one special case of the first sentence. But
this is better than the prior version.

Hmm, let me reword this comment completely. How about the attached?

I also changed the order of the tests, putting the 'generate_opsteps'
one in the middle instead of at the end (so the not-null one doesn't
test that boolean again). AFAICS it's logically the same, and it makes
more sense to me that way.

That looks much better. However it took me a small while to understand
that (1), (2) and (3) correspond to strategies.

I also changed the tests so that they apply to the existing mc2p table,
which is identical to the one Amit was adding.

+1 for that.

How about if we explicitly spell out the strategies like this:

+    if (!bms_is_empty(nullkeys) &&
+        (part_scheme->strategy == PARTITION_STRATEGY_LIST ||
+         part_scheme->strategy == PARTITION_STRATEGY_RANGE ||
+         (part_scheme->strategy == PARTITION_STRATEGY_HASH &&
+          bms_num_members(nullkeys) == part_scheme->partnatts)))

I still do not understand why do we need (part_scheme->strategy ==
PARTITION_STRATEGY_HASH && bms_num_members(nullkeys) ==
part_scheme->partnatts) there. I thought that this will be handled by
code, which deals with null keys and op_exprs together, somewhere
else.

I'm not sure I understand this question. This strategy applies to hash
partitioning only if we have null tests for all keys, so there are no
op_exprs. If there are any, there's no point in checking them.

With the rearranged code, it's much more simple to understand this
code. No change needed.

Hmm. Ok. May be it's better to explicitly say that it's only useful in
case of list partitioned table.

Yeah, maybe.

No need with the re-written code.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#15Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Ashutosh Bapat (#14)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On 2018-Jul-16, Ashutosh Bapat wrote:

Hmm, let me reword this comment completely. How about the attached?

That looks much better. However it took me a small while to understand
that (1), (2) and (3) correspond to strategies.

You're right. Amended again and pushed. I also marked the open item
closed.

Thanks everyone

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#16Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Alvaro Herrera (#15)
Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

On 2018/07/17 8:17, Alvaro Herrera wrote:

On 2018-Jul-16, Ashutosh Bapat wrote:

Hmm, let me reword this comment completely. How about the attached?

That looks much better. However it took me a small while to understand
that (1), (2) and (3) correspond to strategies.

You're right. Amended again and pushed. I also marked the open item
closed.

Thanks everyone

Thank you Ashutosh for further review and Alvaro for improving it a quite
a bit before committing.

Thanks,
Amit