BUG #15430: partition-wise join only works in combination with pruning on 1 partition

Started by PG Bug reporting formover 7 years ago4 messagesbugs

noreply@postgresql.org

over 7 years ago

The following bug has been logged on the website:

Bug reference: 15430
Logged by: Bart Debersaques
Email address: agile.data.analytics@gmail.com
PostgreSQL version: 11rc1
Operating system: Centos 7
Description:

psql <<-EOF
create schema part_test
;
CREATE TABLE part_test.measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales
int
) PARTITION BY RANGE (logdate)
;
CREATE TABLE part_test.measurement_y2006m02 PARTITION OF
part_test.measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
;
CREATE TABLE part_test.measurement_y2006m03 PARTITION OF
part_test.measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
;
EOF

psql <<-EOF
CREATE TABLE part_test.measurement2 (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales
int
) PARTITION BY RANGE (logdate)
;
CREATE TABLE part_test.measurement2_y2006m02 PARTITION OF
part_test.measurement2
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
;
CREATE TABLE part_test.measurement2_y2006m03 PARTITION OF
part_test.measurement2
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
;
EOF

psql <<-EOF
explain
select *
from
part_test.measurement m1
inner join part_test.measurement2 m2
on m1.city_id = m2.city_id and m1.logdate = m2.logdate
where m1.logdate = '2006-02-01'::date EOF
# Hash Join (cost=33.28..66.58 rows=9 width=32)
# Hash Cond: (m1.city_id = m2.city_id)
# -> Append (cost=0.00..33.17 rows=9 width=16)
# -> Seq Scan on measurement_y2006m02 m1 (cost=0.00..33.12 rows=9
width=16)
# Filter: (logdate = '2006-02-01'::date)
# -> Hash (cost=33.17..33.17 rows=9 width=16)
# -> Append (cost=0.00..33.17 rows=9 width=16)
# -> Seq Scan on measurement2_y2006m02 m2 (cost=0.00..33.12
rows=9 width=16)
# Filter: (logdate = '2006-02-01'::date)

# conclusion: pruning applied to both tables = OK

psql <<-EOF
explain
select *
from
part_test.measurement m1
inner join part_test.measurement2 m2
on m1.city_id = m2.city_id and m1.logdate = m2.logdate
where m1.logdate in ('2006-02-01'::date, '2006-03-01'::date) EOF
# Hash Join (cost=66.97..170.25 rows=3 width=32)
# Hash Cond: ((m2.city_id = m1.city_id) AND (m2.logdate = m1.logdate))
# -> Append (cost=0.00..75.50 rows=3700 width=16)
# -> Seq Scan on measurement2_y2006m02 m2 (cost=0.00..28.50
rows=1850 width=16)
# -> Seq Scan on measurement2_y2006m03 m2_1 (cost=0.00..28.50
rows=1850 width=16)
# -> Hash (cost=66.43..66.43 rows=36 width=16)
# -> Append (cost=0.00..66.43 rows=36 width=16)
# -> Seq Scan on measurement_y2006m02 m1 (cost=0.00..33.12
rows=18 width=16)
# Filter: (logdate = ANY
('{2006-02-01,2006-03-01}'::date[]))
# -> Seq Scan on measurement_y2006m03 m1_1 (cost=0.00..33.12
rows=18 width=16)
# Filter: (logdate = ANY
('{2006-02-01,2006-03-01}'::date[]))

# conclusion: join performed on full tab scans, join not performed on a
per-partion basis

psql <<-EOF
explain
select *
from
part_test.measurement m1
inner join part_test.measurement2 m2
on m1.city_id = m2.city_id and m1.logdate = m2.logdate
EOF
# Merge Join (cost=589.57..648.49 rows=342 width=32)
# Merge Cond: ((m1.city_id = m2.city_id) AND (m1.logdate = m2.logdate))
# -> Sort (cost=294.79..304.04 rows=3700 width=16)
# Sort Key: m1.city_id, m1.logdate
# -> Append (cost=0.00..75.50 rows=3700 width=16)
# -> Seq Scan on measurement_y2006m02 m1 (cost=0.00..28.50
rows=1850 width=16)
# -> Seq Scan on measurement_y2006m03 m1_1 (cost=0.00..28.50
rows=1850 width=16)
# -> Sort (cost=294.79..304.04 rows=3700 width=16)
# Sort Key: m2.city_id, m2.logdate
# -> Append (cost=0.00..75.50 rows=3700 width=16)
# -> Seq Scan on measurement2_y2006m02 m2 (cost=0.00..28.50
rows=1850 width=16)
# -> Seq Scan on measurement2_y2006m03 m2_1
(cost=0.00..28.50 rows=1850 width=16)

# conclusion: join performed on full tab scans, join not performed on a
per-partion basis

# desired behaviour:

# in serial execution and for an equi join: partitions in m1 should be
iterated and joined with the equivaluent in m2
# parallel execution: the above should be done with several partitions in
parallel

# please assign to Ashutosh Bapat as described in
https://www.enterprisedb.com/blog/partition-wise-joins-%E2%80%9Cdivide-and-conquer-joins-between-partitioned-table

reading

https://www.enterprisedb.com/blog/partition-wise-joins-%E2%80%9Cdivide-and-conquer-joins-between-partitioned-table

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 7 years ago

In reply to: PG Bug reporting form (#1)

Re: BUG #15430: partition-wise join only works in combination with pruning on 1 partition

Hi,

On 2018/10/15 0:20, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 15430
Logged by: Bart Debersaques
Email address: agile.data.analytics@gmail.com
PostgreSQL version: 11rc1
Operating system: Centos 7
Description:

[ ... ]

psql <<-EOF
explain
select *
from
part_test.measurement m1
inner join part_test.measurement2 m2
on m1.city_id = m2.city_id and m1.logdate = m2.logdate
where m1.logdate in ('2006-02-01'::date, '2006-03-01'::date) EOF
# Hash Join (cost=66.97..170.25 rows=3 width=32)
# Hash Cond: ((m2.city_id = m1.city_id) AND (m2.logdate = m1.logdate))
# -> Append (cost=0.00..75.50 rows=3700 width=16)
# -> Seq Scan on measurement2_y2006m02 m2 (cost=0.00..28.50
rows=1850 width=16)
# -> Seq Scan on measurement2_y2006m03 m2_1 (cost=0.00..28.50
rows=1850 width=16)
# -> Hash (cost=66.43..66.43 rows=36 width=16)
# -> Append (cost=0.00..66.43 rows=36 width=16)
# -> Seq Scan on measurement_y2006m02 m1 (cost=0.00..33.12
rows=18 width=16)
# Filter: (logdate = ANY
('{2006-02-01,2006-03-01}'::date[]))
# -> Seq Scan on measurement_y2006m03 m1_1 (cost=0.00..33.12
rows=18 width=16)
# Filter: (logdate = ANY
('{2006-02-01,2006-03-01}'::date[]))

# conclusion: join performed on full tab scans, join not performed on a
per-partion basis

Have you changed the value of enable_partitionwise_join parameter to 'on'?
It's 'off' by default, so join is not performed on per-partition basis.

Thanks,
Amit

Bart D

agile.data.analytics@gmail.com

over 7 years ago

In reply to: Amit Langote (#2)

Re: BUG #15430: partition-wise join only works in combination with pruning on 1 partition

Amit,

You are right, I think the feature should be enabled by default though.

This feature guarantees linear scalability.

My apologies, please close the bug.

Please find results below.

Thanks in advance,

psql <<-EOF
show enable_partitionwise_join;
set enable_partitionwise_join = on;
show enable_partitionwise_join;
explain
select *
from
part_test.measurement m1
inner join part_test.measurement2 m2
on m1.city_id = m2.city_id and m1.logdate = m2.logdate
EOF
# Append (cost=257.79..573.65 rows=172 width=32)
# -> Merge Join (cost=257.79..286.40 rows=86 width=32)
# Merge Cond: ((m1.city_id = m2.city_id) AND (m1.logdate = m2.logdate))
# -> Sort (cost=128.89..133.52 rows=1850 width=16)
# Sort Key: m1.city_id, m1.logdate
# -> Seq Scan on measurement_y2006m02 m1 (cost=0.00..28.50 rows=1850 width=16)
# -> Sort (cost=128.89..133.52 rows=1850 width=16)
# Sort Key: m2.city_id, m2.logdate
# -> Seq Scan on measurement2_y2006m02 m2 (cost=0.00..28.50 rows=1850 width=16)
# -> Merge Join (cost=257.79..286.40 rows=86 width=32)
# Merge Cond: ((m1_1.city_id = m2_1.city_id) AND (m1_1.logdate = m2_1.logdate))
# -> Sort (cost=128.89..133.52 rows=1850 width=16)
# Sort Key: m1_1.city_id, m1_1.logdate
# -> Seq Scan on measurement_y2006m03 m1_1 (cost=0.00..28.50 rows=1850 width=16)
# -> Sort (cost=128.89..133.52 rows=1850 width=16)
# Sort Key: m2_1.city_id, m2_1.logdate
# -> Seq Scan on measurement2_y2006m03 m2_1 (cost=0.00..28.50 rows=1850 width=16)

psql <<-EOF
show enable_partitionwise_join;
set enable_partitionwise_join = on;
show enable_partitionwise_join;
explain
select *
from
part_test.measurement m1
inner join part_test.measurement2 m2
on m1.city_id = m2.city_id and m1.logdate = m2.logdate
where m1.logdate in ('2006-02-01'::date, '2006-03-01'::date)
EOF
# Append (cost=33.40..160.82 rows=2 width=32)
# -> Hash Join (cost=33.40..80.41 rows=1 width=32)
# Hash Cond: ((m2.city_id = m1.city_id) AND (m2.logdate = m1.logdate))
# -> Seq Scan on measurement2_y2006m02 m2 (cost=0.00..28.50 rows=1850 width=16)
# -> Hash (cost=33.12..33.12 rows=18 width=16)
# -> Seq Scan on measurement_y2006m02 m1 (cost=0.00..33.12 rows=18 width=16)
# Filter: (logdate = ANY ('{2006-02-01,2006-03-01}'::date[]))
# -> Hash Join (cost=33.40..80.41 rows=1 width=32)
# Hash Cond: ((m2_1.city_id = m1_1.city_id) AND (m2_1.logdate = m1_1.logdate))
# -> Seq Scan on measurement2_y2006m03 m2_1 (cost=0.00..28.50 rows=1850 width=16)
# -> Hash (cost=33.12..33.12 rows=18 width=16)
# -> Seq Scan on measurement_y2006m03 m1_1 (cost=0.00..33.12 rows=18 width=16)
# Filter: (logdate = ANY ('{2006-02-01,2006-03-01}'::date[]))

Show quoted text

On 15 Oct 2018, at 06:18, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:

Hi,

On 2018/10/15 0:20, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 15430
Logged by: Bart Debersaques
Email address: agile.data.analytics@gmail.com
PostgreSQL version: 11rc1
Operating system: Centos 7
Description:

[ ... ]

psql <<-EOF
explain
select *
from
part_test.measurement m1
inner join part_test.measurement2 m2
on m1.city_id = m2.city_id and m1.logdate = m2.logdate
where m1.logdate in ('2006-02-01'::date, '2006-03-01'::date)

EOF

# Hash Join (cost=66.97..170.25 rows=3 width=32)
# Hash Cond: ((m2.city_id = m1.city_id) AND (m2.logdate = m1.logdate))
# -> Append (cost=0.00..75.50 rows=3700 width=16)
# -> Seq Scan on measurement2_y2006m02 m2 (cost=0.00..28.50
rows=1850 width=16)
# -> Seq Scan on measurement2_y2006m03 m2_1 (cost=0.00..28.50
rows=1850 width=16)
# -> Hash (cost=66.43..66.43 rows=36 width=16)
# -> Append (cost=0.00..66.43 rows=36 width=16)
# -> Seq Scan on measurement_y2006m02 m1 (cost=0.00..33.12
rows=18 width=16)
# Filter: (logdate = ANY
('{2006-02-01,2006-03-01}'::date[]))
# -> Seq Scan on measurement_y2006m03 m1_1 (cost=0.00..33.12
rows=18 width=16)
# Filter: (logdate = ANY
('{2006-02-01,2006-03-01}'::date[]))

# conclusion: join performed on full tab scans, join not performed on a
per-partion basis

Have you changed the value of enable_partitionwise_join parameter to 'on'?
It's 'off' by default, so join is not performed on per-partition basis.

Thanks,
Amit

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 7 years ago

In reply to: Bart D (#3)

Re: BUG #15430: partition-wise join only works in combination with pruning on 1 partition

On 2018/10/16 4:30, Bart D wrote:

Amit,

You are right, I think the feature should be enabled by default though.

This feature guarantees linear scalability.

It isn't enabled by default in PG 11, because its current implementation
is CPU and memory consuming. If and when that's fixed in some future
version, it will be enabled.

Thanks,
Amit