A problem about partitionwise join

Started by Richard Guoover 6 years ago48 messageshackers
Jump to latest
#1Richard Guo
guofenglinux@gmail.com

Hi All,

To generate partitionwise join, we need to make sure there exists an
equi-join condition for each pair of partition keys, which is performed
by have_partkey_equi_join(). This makes sense and works well.

But if, let's say, one certain pair of partition keys (foo.k = bar.k)
has formed an equivalence class containing consts, no join clause would
be generated for it, since we have already generated 'foo.k = const' and
'bar.k = const' and pushed them into the proper restrictions earlier.

This will make partitionwise join fail to be planned if there are
multiple partition keys and the pushed-down restrictions 'xxx = const'
fail to prune away any partitions.

Consider the examples below:

create table p (k1 int, k2 int, val int) partition by range(k1,k2);
create table p_1 partition of p for values from (1,1) to (10,100);
create table p_2 partition of p for values from (10,100) to (20,200);

If we are joining on each pair of partition keys, we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2;
QUERY PLAN
----------------------------------------------------------------------
Append
-> Hash Join
Hash Cond: ((foo.k1 = bar.k1) AND (foo.k2 = bar.k2))
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
-> Hash Join
Hash Cond: ((foo_1.k1 = bar_1.k1) AND (foo_1.k2 = bar_1.k2))
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
(11 rows)

But if we add another qual 'foo.k2 = const', we will be unable to
generate partitionwise join any more, because have_partkey_equi_join()
thinks not every partition key has an equi-join condition.

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2
and foo.k2 = 16;
QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k1 = bar.k1)
-> Append
-> Seq Scan on p_1 foo
Filter: (k2 = 16)
-> Seq Scan on p_2 foo_1
Filter: (k2 = 16)
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (k2 = 16)
-> Seq Scan on p_2 bar_1
Filter: (k2 = 16)
(13 rows)

Is this a problem?

Thanks
Richard

#2Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Richard Guo (#1)
Re: A problem about partitionwise join

Hi Richard,

On Mon, Aug 26, 2019 at 6:33 PM Richard Guo <riguo@pivotal.io> wrote:

Hi All,

To generate partitionwise join, we need to make sure there exists an
equi-join condition for each pair of partition keys, which is performed
by have_partkey_equi_join(). This makes sense and works well.

But if, let's say, one certain pair of partition keys (foo.k = bar.k)
has formed an equivalence class containing consts, no join clause would
be generated for it, since we have already generated 'foo.k = const' and
'bar.k = const' and pushed them into the proper restrictions earlier.

This will make partitionwise join fail to be planned if there are
multiple partition keys and the pushed-down restrictions 'xxx = const'
fail to prune away any partitions.

Consider the examples below:

create table p (k1 int, k2 int, val int) partition by range(k1,k2);
create table p_1 partition of p for values from (1,1) to (10,100);
create table p_2 partition of p for values from (10,100) to (20,200);

If we are joining on each pair of partition keys, we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2;
QUERY PLAN
----------------------------------------------------------------------
Append
-> Hash Join
Hash Cond: ((foo.k1 = bar.k1) AND (foo.k2 = bar.k2))
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
-> Hash Join
Hash Cond: ((foo_1.k1 = bar_1.k1) AND (foo_1.k2 = bar_1.k2))
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
(11 rows)

But if we add another qual 'foo.k2 = const', we will be unable to
generate partitionwise join any more, because have_partkey_equi_join()
thinks not every partition key has an equi-join condition.

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2 and foo.k2 = 16;
QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k1 = bar.k1)
-> Append
-> Seq Scan on p_1 foo
Filter: (k2 = 16)
-> Seq Scan on p_2 foo_1
Filter: (k2 = 16)
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (k2 = 16)
-> Seq Scan on p_2 bar_1
Filter: (k2 = 16)
(13 rows)

Is this a problem?

Perhaps. Maybe it has to do with the way have_partkey_equi_join() has
been coded. If it was coded such that it figured out on its own that
the equivalence (foo.k2, bar.k2, ...) does exist, then that would
allow partitionwise join to occur, which I think would be OK to do.
But maybe I'm missing something.

Thanks,
Amit

#3Richard Guo
guofenglinux@gmail.com
In reply to: Amit Langote (#2)
Re: A problem about partitionwise join

On Tue, Aug 27, 2019 at 8:51 AM Amit Langote <amitlangote09@gmail.com>
wrote:

Hi Richard,

On Mon, Aug 26, 2019 at 6:33 PM Richard Guo <riguo@pivotal.io> wrote:

Hi All,

To generate partitionwise join, we need to make sure there exists an
equi-join condition for each pair of partition keys, which is performed
by have_partkey_equi_join(). This makes sense and works well.

But if, let's say, one certain pair of partition keys (foo.k = bar.k)
has formed an equivalence class containing consts, no join clause would
be generated for it, since we have already generated 'foo.k = const' and
'bar.k = const' and pushed them into the proper restrictions earlier.

This will make partitionwise join fail to be planned if there are
multiple partition keys and the pushed-down restrictions 'xxx = const'
fail to prune away any partitions.

Consider the examples below:

create table p (k1 int, k2 int, val int) partition by range(k1,k2);
create table p_1 partition of p for values from (1,1) to (10,100);
create table p_2 partition of p for values from (10,100) to (20,200);

If we are joining on each pair of partition keys, we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 =

bar.k2;

QUERY PLAN
----------------------------------------------------------------------
Append
-> Hash Join
Hash Cond: ((foo.k1 = bar.k1) AND (foo.k2 = bar.k2))
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
-> Hash Join
Hash Cond: ((foo_1.k1 = bar_1.k1) AND (foo_1.k2 = bar_1.k2))
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
(11 rows)

But if we add another qual 'foo.k2 = const', we will be unable to
generate partitionwise join any more, because have_partkey_equi_join()
thinks not every partition key has an equi-join condition.

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 =

bar.k2 and foo.k2 = 16;

QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k1 = bar.k1)
-> Append
-> Seq Scan on p_1 foo
Filter: (k2 = 16)
-> Seq Scan on p_2 foo_1
Filter: (k2 = 16)
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (k2 = 16)
-> Seq Scan on p_2 bar_1
Filter: (k2 = 16)
(13 rows)

Is this a problem?

Perhaps. Maybe it has to do with the way have_partkey_equi_join() has
been coded. If it was coded such that it figured out on its own that
the equivalence (foo.k2, bar.k2, ...) does exist, then that would
allow partitionwise join to occur, which I think would be OK to do.
But maybe I'm missing something.

This should be caused by how we deduce join clauses from equivalence
classes. ECs containing consts will not be considered so we cannot
generate (foo.k2 = bar.k2) for the query above.

In addition, when generating join clauses from equivalence classes, we
only select the joinclause with the 'best score', or the first
joinclause with a score of 3. This may make us miss some joinclause on
partition keys.

Check the query below as a more illustrative example:

create table p (k int, val int) partition by range(k);
create table p_1 partition of p for values from (1) to (10);
create table p_2 partition of p for values from (10) to (100);

If we use quals 'foo.k = bar.k and foo.k = bar.val', we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.k and foo.k = bar.val;
QUERY PLAN
-----------------------------------------
Append
-> Hash Join
Hash Cond: (foo.k = bar.k)
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
Filter: (k = val)
-> Hash Join
Hash Cond: (foo_1.k = bar_1.k)
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
Filter: (k = val)
(13 rows)

But if we exchange the order of the two quals to 'foo.k = bar.val and
foo.k = bar.k', then partitionwise join cannot be generated any more,
because we only have joinclause 'foo.k = bar.val' as it first reached
score of 3. We have missed the joinclause on the partition key although
it does exist.

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.val and foo.k = bar.k;
QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k = bar.val)
-> Append
-> Seq Scan on p_1 foo
-> Seq Scan on p_2 foo_1
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (val = k)
-> Seq Scan on p_2 bar_1
Filter: (val = k)
(11 rows)

Thanks
Richard

#4Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Richard Guo (#3)
Re: A problem about partitionwise join

Hi,

On Tue, Aug 27, 2019 at 4:57 PM Richard Guo <riguo@pivotal.io> wrote:

Check the query below as a more illustrative example:

create table p (k int, val int) partition by range(k);
create table p_1 partition of p for values from (1) to (10);
create table p_2 partition of p for values from (10) to (100);

If we use quals 'foo.k = bar.k and foo.k = bar.val', we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.k and foo.k = bar.val;
QUERY PLAN
-----------------------------------------
Append
-> Hash Join
Hash Cond: (foo.k = bar.k)
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
Filter: (k = val)
-> Hash Join
Hash Cond: (foo_1.k = bar_1.k)
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
Filter: (k = val)
(13 rows)

But if we exchange the order of the two quals to 'foo.k = bar.val and
foo.k = bar.k', then partitionwise join cannot be generated any more,
because we only have joinclause 'foo.k = bar.val' as it first reached
score of 3. We have missed the joinclause on the partition key although
it does exist.

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.val and foo.k = bar.k;
QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k = bar.val)
-> Append
-> Seq Scan on p_1 foo
-> Seq Scan on p_2 foo_1
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (val = k)
-> Seq Scan on p_2 bar_1
Filter: (val = k)
(11 rows)

I think it would be nice if we can address this issue.

Best regards,
Etsuro Fujita

#5Richard Guo
guofenglinux@gmail.com
In reply to: Etsuro Fujita (#4)
Re: A problem about partitionwise join

On Wed, Aug 28, 2019 at 6:49 PM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

Hi,

On Tue, Aug 27, 2019 at 4:57 PM Richard Guo <riguo@pivotal.io> wrote:

Check the query below as a more illustrative example:

create table p (k int, val int) partition by range(k);
create table p_1 partition of p for values from (1) to (10);
create table p_2 partition of p for values from (10) to (100);

If we use quals 'foo.k = bar.k and foo.k = bar.val', we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.k and foo.k =

bar.val;

QUERY PLAN
-----------------------------------------
Append
-> Hash Join
Hash Cond: (foo.k = bar.k)
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
Filter: (k = val)
-> Hash Join
Hash Cond: (foo_1.k = bar_1.k)
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
Filter: (k = val)
(13 rows)

But if we exchange the order of the two quals to 'foo.k = bar.val and
foo.k = bar.k', then partitionwise join cannot be generated any more,
because we only have joinclause 'foo.k = bar.val' as it first reached
score of 3. We have missed the joinclause on the partition key although
it does exist.

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.val and foo.k =

bar.k;

QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k = bar.val)
-> Append
-> Seq Scan on p_1 foo
-> Seq Scan on p_2 foo_1
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (val = k)
-> Seq Scan on p_2 bar_1
Filter: (val = k)
(11 rows)

I think it would be nice if we can address this issue.

Thank you.

Attached is a patch as an attempt to address this issue. The idea is
quite straightforward. When building partition info for joinrel, we
generate any possible EC-derived joinclauses of form 'outer_em =
inner_em', which will be used together with the original restrictlist to
check if there exists an equi-join condition for each pair of partition
keys.

Any comments are welcome!

Thanks
Richard

Attachments:

v1-0001-Fix-up-partitionwise-join.patchapplication/octet-stream; name=v1-0001-Fix-up-partitionwise-join.patchDownload+161-7
#6Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Richard Guo (#5)
Re: A problem about partitionwise join

On Thu, Aug 29, 2019 at 6:45 PM Richard Guo <riguo@pivotal.io> wrote:

On Wed, Aug 28, 2019 at 6:49 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:

On Tue, Aug 27, 2019 at 4:57 PM Richard Guo <riguo@pivotal.io> wrote:

Check the query below as a more illustrative example:

create table p (k int, val int) partition by range(k);
create table p_1 partition of p for values from (1) to (10);
create table p_2 partition of p for values from (10) to (100);

If we use quals 'foo.k = bar.k and foo.k = bar.val', we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.k and foo.k = bar.val;
QUERY PLAN
-----------------------------------------
Append
-> Hash Join
Hash Cond: (foo.k = bar.k)
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
Filter: (k = val)
-> Hash Join
Hash Cond: (foo_1.k = bar_1.k)
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
Filter: (k = val)
(13 rows)

But if we exchange the order of the two quals to 'foo.k = bar.val and
foo.k = bar.k', then partitionwise join cannot be generated any more,
because we only have joinclause 'foo.k = bar.val' as it first reached
score of 3. We have missed the joinclause on the partition key although
it does exist.

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.val and foo.k = bar.k;
QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k = bar.val)
-> Append
-> Seq Scan on p_1 foo
-> Seq Scan on p_2 foo_1
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (val = k)
-> Seq Scan on p_2 bar_1
Filter: (val = k)
(11 rows)

I think it would be nice if we can address this issue.

Attached is a patch as an attempt to address this issue. The idea is
quite straightforward. When building partition info for joinrel, we
generate any possible EC-derived joinclauses of form 'outer_em =
inner_em', which will be used together with the original restrictlist to
check if there exists an equi-join condition for each pair of partition
keys.

Thank you for the patch! Will review. Could you add the patch to the
upcoming CF so that it doesn’t get lost?

Best regards,
Etsuro Fujita

#7Richard Guo
guofenglinux@gmail.com
In reply to: Etsuro Fujita (#6)
Re: A problem about partitionwise join

On Fri, Aug 30, 2019 at 2:08 AM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

On Thu, Aug 29, 2019 at 6:45 PM Richard Guo <riguo@pivotal.io> wrote:

On Wed, Aug 28, 2019 at 6:49 PM Etsuro Fujita <etsuro.fujita@gmail.com>

wrote:

On Tue, Aug 27, 2019 at 4:57 PM Richard Guo <riguo@pivotal.io> wrote:

Check the query below as a more illustrative example:

create table p (k int, val int) partition by range(k);
create table p_1 partition of p for values from (1) to (10);
create table p_2 partition of p for values from (10) to (100);

If we use quals 'foo.k = bar.k and foo.k = bar.val', we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.k and foo.k =

bar.val;

QUERY PLAN
-----------------------------------------
Append
-> Hash Join
Hash Cond: (foo.k = bar.k)
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
Filter: (k = val)
-> Hash Join
Hash Cond: (foo_1.k = bar_1.k)
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
Filter: (k = val)
(13 rows)

But if we exchange the order of the two quals to 'foo.k = bar.val and
foo.k = bar.k', then partitionwise join cannot be generated any more,
because we only have joinclause 'foo.k = bar.val' as it first reached
score of 3. We have missed the joinclause on the partition key

although

it does exist.

# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.val and foo.k =

bar.k;

QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k = bar.val)
-> Append
-> Seq Scan on p_1 foo
-> Seq Scan on p_2 foo_1
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (val = k)
-> Seq Scan on p_2 bar_1
Filter: (val = k)
(11 rows)

I think it would be nice if we can address this issue.

Attached is a patch as an attempt to address this issue. The idea is
quite straightforward. When building partition info for joinrel, we
generate any possible EC-derived joinclauses of form 'outer_em =
inner_em', which will be used together with the original restrictlist to
check if there exists an equi-join condition for each pair of partition
keys.

Thank you for the patch! Will review. Could you add the patch to the
upcoming CF so that it doesn’t get lost?

Added this patch: https://commitfest.postgresql.org/24/2266/

Thanks
Richard

#8Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Richard Guo (#7)
Re: A problem about partitionwise join

On Fri, Aug 30, 2019 at 12:15 PM Richard Guo <riguo@pivotal.io> wrote:

On Fri, Aug 30, 2019 at 2:08 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:

On Thu, Aug 29, 2019 at 6:45 PM Richard Guo <riguo@pivotal.io> wrote:

Attached is a patch as an attempt to address this issue. The idea is
quite straightforward. When building partition info for joinrel, we
generate any possible EC-derived joinclauses of form 'outer_em =
inner_em', which will be used together with the original restrictlist to
check if there exists an equi-join condition for each pair of partition
keys.

Could you add the patch to the
upcoming CF so that it doesn’t get lost?

Added this patch: https://commitfest.postgresql.org/24/2266/

Thanks!

Best regards,
Etsuro Fujita

#9Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Richard Guo (#5)
Re: A problem about partitionwise join

So in this patch, the input restrictlist is modified to include the
clauses generated by generate_join_implied_equalities_for_all. That
doesn't seem okay -- doesn't it affect downstream usage of the
restrictlist in the caller of set_joinrel_size_estimates?

I wonder if it's possible to do this by using the ECs directly in
have_partkey_equi_join instead of using them to create fake join
clauses.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10Richard Guo
guofenglinux@gmail.com
In reply to: Alvaro Herrera (#9)
Re: A problem about partitionwise join

Hi Alvaro,

Thank you for reviewing this patch.

On Wed, Sep 11, 2019 at 4:48 AM Alvaro Herrera from 2ndQuadrant <
alvherre@alvh.no-ip.org> wrote:

So in this patch, the input restrictlist is modified to include the
clauses generated by generate_join_implied_equalities_for_all. That
doesn't seem okay -- doesn't it affect downstream usage of the
restrictlist in the caller of set_joinrel_size_estimates?

Actually the joinclauses generated by
generate_join_implied_equalities_for_all only affects the restrictlist
used in have_partkey_equi_join to check equi-join conditions for
partition keys. The input restrictlist would not be altered.

I wonder if it's possible to do this by using the ECs directly in
have_partkey_equi_join instead of using them to create fake join
clauses.

Hmm.. I thought about this option and at last figured that what we need
to do in have_partkey_equi_join with the ECs is actually the same as in
generate_join_implied_equalities_for_all. Maybe I didn't think it
correctly.

Thanks
Richard

#11Dilip Kumar
dilipbalaut@gmail.com
In reply to: Richard Guo (#5)
Re: A problem about partitionwise join

On Thu, Aug 29, 2019 at 3:15 PM Richard Guo <riguo@pivotal.io> wrote:

Attached is a patch as an attempt to address this issue. The idea is
quite straightforward. When building partition info for joinrel, we
generate any possible EC-derived joinclauses of form 'outer_em =
inner_em', which will be used together with the original restrictlist to
check if there exists an equi-join condition for each pair of partition
keys.

Any comments are welcome!

 /*
+ * generate_join_implied_equalities_for_all
+ *   Create any EC-derived joinclauses of form 'outer_em = inner_em'.
+ *
+ * This is used when building partition info for joinrel.
+ */
+List *
+generate_join_implied_equalities_for_all(PlannerInfo *root,
+ Relids join_relids,
+ Relids outer_relids,
+ Relids inner_relids)

I think we need to have more detailed comments about why we need this
separate function, we can also explain that
generate_join_implied_equalities function will avoid
the join clause if EC has the constant but for partition-wise join, we
need that clause too.

+ while ((i = bms_next_member(matching_ecs, i)) >= 0)
+ {
+ EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes, i);
+ List    *outer_members = NIL;
+ List    *inner_members = NIL;
+ ListCell   *lc1;
+
+ /* Do not consider this EC if it's ec_broken */
+ if (ec->ec_broken)
+ continue;
+
+ /* Single-member ECs won't generate any deductions */
+ if (list_length(ec->ec_members) <= 1)
+ continue;
+

I am wondering isn't it possible to just process the missing join
clause? I mean 'generate_join_implied_equalities' has only skipped
the ECs which has const so
can't we create join clause only for those ECs and append it the
"Restrictlist" we already have? I might be missing something?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#12Richard Guo
guofenglinux@gmail.com
In reply to: Dilip Kumar (#11)
Re: A problem about partitionwise join

Hi Dilip,

Thank you for reviewing this patch.

On Fri, Sep 20, 2019 at 12:48 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Aug 29, 2019 at 3:15 PM Richard Guo <riguo@pivotal.io> wrote:

Attached is a patch as an attempt to address this issue. The idea is
quite straightforward. When building partition info for joinrel, we
generate any possible EC-derived joinclauses of form 'outer_em =
inner_em', which will be used together with the original restrictlist to
check if there exists an equi-join condition for each pair of partition
keys.

Any comments are welcome!

/*
+ * generate_join_implied_equalities_for_all
+ *   Create any EC-derived joinclauses of form 'outer_em = inner_em'.
+ *
+ * This is used when building partition info for joinrel.
+ */
+List *
+generate_join_implied_equalities_for_all(PlannerInfo *root,
+ Relids join_relids,
+ Relids outer_relids,
+ Relids inner_relids)

I think we need to have more detailed comments about why we need this
separate function, we can also explain that
generate_join_implied_equalities function will avoid
the join clause if EC has the constant but for partition-wise join, we
need that clause too.

Thank you for the suggestion.

+ while ((i = bms_next_member(matching_ecs, i)) >= 0)
+ {
+ EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes,
i);
+ List    *outer_members = NIL;
+ List    *inner_members = NIL;
+ ListCell   *lc1;
+
+ /* Do not consider this EC if it's ec_broken */
+ if (ec->ec_broken)
+ continue;
+
+ /* Single-member ECs won't generate any deductions */
+ if (list_length(ec->ec_members) <= 1)
+ continue;
+

I am wondering isn't it possible to just process the missing join
clause? I mean 'generate_join_implied_equalities' has only skipped
the ECs which has const so
can't we create join clause only for those ECs and append it the
"Restrictlist" we already have? I might be missing something?

For ECs without const, 'generate_join_implied_equalities' also has
skipped some join clauses since it only selects the joinclause with
'best_score' between outer members and inner members. And the missing
join clauses are needed to generate partitionwise join. That's why the
query below cannot be planned as partitionwise join, as we have missed
joinclause 'foo.k = bar.k'.

select * from p as foo join p as bar on foo.k = bar.val and foo.k = bar.k;

And yes 'generate_join_implied_equalities_for_all' will create join
clauses that have existed in restrictlist. I think it's OK since the
same RestrictInfo deduced from EC will share the same pointer and
list_concat_unique_ptr will make sure there are no duplicates in the
restrictlist used by have_partkey_equi_join.

Thanks
Richard

#13Dilip Kumar
dilipbalaut@gmail.com
In reply to: Richard Guo (#12)
Re: A problem about partitionwise join

On Fri, Sep 20, 2019 at 2:33 PM Richard Guo <riguo@pivotal.io> wrote:

Hi Dilip,

Thank you for reviewing this patch.

On Fri, Sep 20, 2019 at 12:48 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Thu, Aug 29, 2019 at 3:15 PM Richard Guo <riguo@pivotal.io> wrote:

Attached is a patch as an attempt to address this issue. The idea is
quite straightforward. When building partition info for joinrel, we
generate any possible EC-derived joinclauses of form 'outer_em =
inner_em', which will be used together with the original restrictlist to
check if there exists an equi-join condition for each pair of partition
keys.

Any comments are welcome!

/*
+ * generate_join_implied_equalities_for_all
+ *   Create any EC-derived joinclauses of form 'outer_em = inner_em'.
+ *
+ * This is used when building partition info for joinrel.
+ */
+List *
+generate_join_implied_equalities_for_all(PlannerInfo *root,
+ Relids join_relids,
+ Relids outer_relids,
+ Relids inner_relids)

I think we need to have more detailed comments about why we need this
separate function, we can also explain that
generate_join_implied_equalities function will avoid
the join clause if EC has the constant but for partition-wise join, we
need that clause too.

Thank you for the suggestion.

+ while ((i = bms_next_member(matching_ecs, i)) >= 0)
+ {
+ EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes, i);
+ List    *outer_members = NIL;
+ List    *inner_members = NIL;
+ ListCell   *lc1;
+
+ /* Do not consider this EC if it's ec_broken */
+ if (ec->ec_broken)
+ continue;
+
+ /* Single-member ECs won't generate any deductions */
+ if (list_length(ec->ec_members) <= 1)
+ continue;
+

I am wondering isn't it possible to just process the missing join
clause? I mean 'generate_join_implied_equalities' has only skipped
the ECs which has const so
can't we create join clause only for those ECs and append it the
"Restrictlist" we already have? I might be missing something?

For ECs without const, 'generate_join_implied_equalities' also has
skipped some join clauses since it only selects the joinclause with
'best_score' between outer members and inner members. And the missing
join clauses are needed to generate partitionwise join. That's why the
query below cannot be planned as partitionwise join, as we have missed
joinclause 'foo.k = bar.k'.

oh right. I missed that part.

select * from p as foo join p as bar on foo.k = bar.val and foo.k = bar.k;

And yes 'generate_join_implied_equalities_for_all' will create join
clauses that have existed in restrictlist. I think it's OK since the
same RestrictInfo deduced from EC will share the same pointer and
list_concat_unique_ptr will make sure there are no duplicates in the
restrictlist used by have_partkey_equi_join.

ok

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#14Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#6)
Re: A problem about partitionwise join

Hi Richard,

On Fri, Aug 30, 2019 at 3:08 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:

On Thu, Aug 29, 2019 at 6:45 PM Richard Guo <riguo@pivotal.io> wrote:

Attached is a patch as an attempt to address this issue. The idea is
quite straightforward. When building partition info for joinrel, we
generate any possible EC-derived joinclauses of form 'outer_em =
inner_em', which will be used together with the original restrictlist to
check if there exists an equi-join condition for each pair of partition
keys.

Will review.

I've just started reviewing this patch. One comment I have for now
is: this is categorized into Bug Fixes, but we have a workaround at
least to the regression test case in the patch (ie, just reorder join
clauses), so this seems to me more like an improvement than a bug fix.

Sorry for the delay.

Best regards,
Etsuro Fujita

#15Michael Paquier
michael@paquier.xyz
In reply to: Etsuro Fujita (#14)
Re: A problem about partitionwise join

On Tue, Nov 26, 2019 at 08:35:33PM +0900, Etsuro Fujita wrote:

I've just started reviewing this patch. One comment I have for now
is: this is categorized into Bug Fixes, but we have a workaround at
least to the regression test case in the patch (ie, just reorder join
clauses), so this seems to me more like an improvement than a bug fix.

Hmm. Agreed. Changed the category and moved to next CF.
--
Michael

#16Richard Guo
guofenglinux@gmail.com
In reply to: Michael Paquier (#15)
Re: A problem about partitionwise join

On Fri, Nov 29, 2019 at 11:03 AM Michael Paquier <michael@paquier.xyz>
wrote:

On Tue, Nov 26, 2019 at 08:35:33PM +0900, Etsuro Fujita wrote:

I've just started reviewing this patch. One comment I have for now
is: this is categorized into Bug Fixes, but we have a workaround at
least to the regression test case in the patch (ie, just reorder join
clauses), so this seems to me more like an improvement than a bug fix.

Hmm. Agreed. Changed the category and moved to next CF.

Thanks Etsuro for the comment and thanks Michael for the change.

Thanks
Richard

#17Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Richard Guo (#16)
Re: A problem about partitionwise join

On Fri, Nov 29, 2019 at 12:08 PM Richard Guo <riguo@pivotal.io> wrote:

On Fri, Nov 29, 2019 at 11:03 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Nov 26, 2019 at 08:35:33PM +0900, Etsuro Fujita wrote:

I've just started reviewing this patch. One comment I have for now
is: this is categorized into Bug Fixes, but we have a workaround at
least to the regression test case in the patch (ie, just reorder join
clauses), so this seems to me more like an improvement than a bug fix.

Hmm. Agreed. Changed the category and moved to next CF.

thanks Michael for the change.

+1

Best regards,
Etsuro Fujita

#18Richard Guo
guofenglinux@gmail.com
In reply to: Etsuro Fujita (#17)
Re: A problem about partitionwise join

Rebased the patch with latest master and also addressed the test case
failure reported by PostgreSQL Patch Tester.

Thanks
Richard

On Fri, Nov 29, 2019 at 11:35 AM Etsuro Fujita <etsuro.fujita@gmail.com>
wrote:

Show quoted text

On Fri, Nov 29, 2019 at 12:08 PM Richard Guo <riguo@pivotal.io> wrote:

On Fri, Nov 29, 2019 at 11:03 AM Michael Paquier <michael@paquier.xyz>

wrote:

On Tue, Nov 26, 2019 at 08:35:33PM +0900, Etsuro Fujita wrote:

I've just started reviewing this patch. One comment I have for now
is: this is categorized into Bug Fixes, but we have a workaround at
least to the regression test case in the patch (ie, just reorder join
clauses), so this seems to me more like an improvement than a bug fix.

Hmm. Agreed. Changed the category and moved to next CF.

thanks Michael for the change.

+1

Best regards,
Etsuro Fujita

Attachments:

v2-0001-Fix-up-partitionwise-join.patchapplication/octet-stream; name=v2-0001-Fix-up-partitionwise-join.patchDownload+161-7
#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#18)
Re: A problem about partitionwise join

Richard Guo <riguo@pivotal.io> writes:

Rebased the patch with latest master and also addressed the test case
failure reported by PostgreSQL Patch Tester.

I looked this patch over, but I don't like it too much: it seems very
brute-force (and badly under-commented). Creating all those extra
RestrictInfos isn't too cheap in itself, plus they'll jam up the
equivalence-class machinery for future tests.

There is already something in equivclass.c that would almost do what
we want here: exprs_known_equal() would tell us whether the partkeys
can be found in the same eclass, without having to generate data
structures along the way. The current implementation is not watertight
because it doesn't check opclass semantics, but that consideration
can be bolted on readily enough. So that leads me to something like
the attached.

One argument that could be made against this approach is that if there
are a lot of partkey expressions, this requires O(N^2) calls to
exprs_known_equal, something that's already not too cheap. I think
that that's not a big problem because the number of partkey expressions
would only be equal to the join degree (ie it doesn't scale with the
number of partitions of the baserels) ... but maybe I'm wrong about
that? I also wonder if it's really necessary to check every pair
of partkey expressions. It seems at least plausible that in the
cases we care about, all the partkeys on each side would be in the same
eclasses anyway, so that comparing the first members of each list would
be sufficient. But I haven't beat on that point.

regards, tom lane

Attachments:

v3-0001-Fix-up-partitionwise-join.patchtext/x-diff; charset=us-ascii; name=v3-0001-Fix-up-partitionwise-join.patchDownload+169-30
#20Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#19)
Re: A problem about partitionwise join

On Sun, Apr 5, 2020 at 4:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <riguo@pivotal.io> writes:

Rebased the patch with latest master and also addressed the test case
failure reported by PostgreSQL Patch Tester.

I looked this patch over, but I don't like it too much: it seems very
brute-force (and badly under-commented). Creating all those extra
RestrictInfos isn't too cheap in itself, plus they'll jam up the
equivalence-class machinery for future tests.

Thanks for the review.

There is already something in equivclass.c that would almost do what
we want here: exprs_known_equal() would tell us whether the partkeys
can be found in the same eclass, without having to generate data
structures along the way. The current implementation is not watertight
because it doesn't check opclass semantics, but that consideration
can be bolted on readily enough. So that leads me to something like
the attached.

I looked through this patch and it's much more elegant than the previous
one. Thank you for working on it.

For partkeys which fail to be identified as equal by looking through
restrictlist, it's a good idea to check them in ECs with the help of
exprs_known_equal().

I have some concern about we only check non-nullable partexprs. Is it
possible that two nullable partexprs come from the same EC? I tried to
give an example but failed.

One argument that could be made against this approach is that if there
are a lot of partkey expressions, this requires O(N^2) calls to
exprs_known_equal, something that's already not too cheap. I think
that that's not a big problem because the number of partkey expressions
would only be equal to the join degree (ie it doesn't scale with the
number of partitions of the baserels) ... but maybe I'm wrong about
that?

You are right. According to how partexpr is formed for joinrel in
set_joinrel_partition_key_exprs(), each base relation within the join
contributes one partexpr, so the number of partexprs would be equal to
the join degree.

I also wonder if it's really necessary to check every pair
of partkey expressions. It seems at least plausible that in the
cases we care about, all the partkeys on each side would be in the same
eclasses anyway, so that comparing the first members of each list would
be sufficient. But I haven't beat on that point.

Not sure about it. But cannot come out with a counterexample.

Thanks
Richard

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#20)
#22Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#22)
#24Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Richard Guo (#22)
#25Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Richard Guo (#22)
#26Richard Guo
guofenglinux@gmail.com
In reply to: Anastasia Lubennikova (#25)
#27Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Richard Guo (#26)
#28David Steele
david@pgmasters.net
In reply to: Ashutosh Bapat (#27)
#29Richard Guo
guofenglinux@gmail.com
In reply to: Ashutosh Bapat (#27)
#30Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Richard Guo (#29)
#31Richard Guo
guofenglinux@gmail.com
In reply to: Jaime Casanova (#30)
#32Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#31)
#33Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Richard Guo (#32)
#34Richard Guo
guofenglinux@gmail.com
In reply to: Jacob Champion (#33)
#35Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Richard Guo (#34)
#36Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Ashutosh Bapat (#35)
#37Richard Guo
guofenglinux@gmail.com
In reply to: Ashutosh Bapat (#36)
#38Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Richard Guo (#37)
#39Richard Guo
guofenglinux@gmail.com
In reply to: Ashutosh Bapat (#38)
#40Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Richard Guo (#39)
#41Robert Haas
robertmhaas@gmail.com
In reply to: Richard Guo (#34)
#42Richard Guo
guofenglinux@gmail.com
In reply to: Robert Haas (#41)
#43Robert Haas
robertmhaas@gmail.com
In reply to: Richard Guo (#42)
#44Richard Guo
guofenglinux@gmail.com
In reply to: Robert Haas (#43)
#45Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#44)
#46Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#1)
#47Richard Guo
guofenglinux@gmail.com
In reply to: Ashutosh Bapat (#40)
#48Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Richard Guo (#46)