No hash join across partitioned tables?

Started by Kris Jurka · April 2009 · 43 messages · hackers
#1 Kris Jurka
books@ejurka.com

PG (8.3.7) doesn't seem to want to do a hash join across two partitioned
tables. I have two partition hierarchies: impounds (with different
impound sources) and liens (with vehicle liens from different companies).
Trying to match those up gives:

EXPLAIN SELECT COUNT(*)
FROM impounds i
JOIN liens l ON (i.vin = l.vin);

Aggregate  (cost=11164042.66..11164042.67 rows=1 width=0)
  ->  Nested Loop  (cost=0.27..3420012.94 rows=3097611886 width=0)
        Join Filter: ((i.vin)::text = (l.vin)::text)
        ->  Append  (cost=0.00..1072.77 rows=33577 width=21)
              ->  Seq Scan on impounds i  (cost=0.00..11.40 rows=140 width=21)
              ->  Seq Scan on impounds_s1 i  (cost=0.00..926.87 rows=29587 width=18)
              ->  Seq Scan on impounds_s2 i  (cost=0.00..99.96 rows=3296 width=18)
              ->  Seq Scan on impounds_s3 i  (cost=0.00..23.14 rows=414 width=18)
              ->  Seq Scan on impounds_s4 i  (cost=0.00..11.40 rows=140 width=21)
        ->  Append  (cost=0.27..101.64 rows=15 width=21)
              ->  Bitmap Heap Scan on liens l  (cost=0.27..5.60 rows=2 width=21)
                    Recheck Cond: ((l.vin)::text = (i.vin)::text)
                    ->  Bitmap Index Scan on liens_pk  (cost=0.00..0.27 rows=2 width=0)
                          Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using liens_s1_pk on liens_s1 l  (cost=0.00..7.02 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using liens_s2_pk on liens_s2 l  (cost=0.00..3.47 rows=1 width=21)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s3_pk on liens_s3 l  (cost=0.00..7.52 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s4_pk on liens_s4 l  (cost=0.00..7.67 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s5_pk on liens_s5 l  (cost=0.00..7.62 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s6_pk on liens_s6 l  (cost=0.00..7.61 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s7_pk on liens_s7 l  (cost=0.00..7.50 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s8_pk on liens_s8 l  (cost=0.00..7.36 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s9_pk on liens_s9 l  (cost=0.00..7.43 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s10_pk on liens_s10 l  (cost=0.00..7.79 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s11_pk on liens_s11 l  (cost=0.00..8.07 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s12_pk on liens_s12 l  (cost=0.00..8.45 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)
              ->  Index Scan using newliens_s13_pk on liens_s13 l  (cost=0.00..8.53 rows=1 width=18)
                    Index Cond: ((l.vin)::text = (i.vin)::text)

This takes quite a while, as it's got to do tons of index probes, which
results in tons of random I/O. I killed this after five minutes of
running.

But if I do:

CREATE TABLE i1 AS SELECT * FROM impounds;
CREATE TABLE l1 AS SELECT * FROM liens;

I get a reasonable plan, which runs in about 15 seconds, from:

EXPLAIN SELECT COUNT(*)
FROM i1 i
JOIN l1 l ON (i.vin = l.vin);

Aggregate  (cost=749054.78..749054.79 rows=1 width=0)
  ->  Hash Join  (cost=1444.18..748971.43 rows=33338 width=0)
        Hash Cond: ((l.vin)::text = (i.vin)::text)
        ->  Seq Scan on l1 l  (cost=0.00..332068.96 rows=18449996 width=18)
        ->  Hash  (cost=1027.97..1027.97 rows=33297 width=18)
              ->  Seq Scan on i1 i  (cost=0.00..1027.97 rows=33297 width=18)

I've tried to force the hash join plan on the partitioned tables via:

set enable_nestloop to off;

This results in a merge join plan which needs to do a giant sort, again
killed after five minutes.

Aggregate  (cost=58285765.20..58285765.21 rows=1 width=0)
  ->  Merge Join  (cost=4077389.31..50541735.48 rows=3097611886 width=0)
        Merge Cond: ((i.vin)::text = (l.vin)::text)
        ->  Sort  (cost=4286.45..4370.39 rows=33577 width=21)
              Sort Key: i.vin
              ->  Append  (cost=0.00..1072.77 rows=33577 width=21)
                    ->  Seq Scan on impounds i  (cost=0.00..11.40 rows=140 width=21)
                    ->  [Seq Scans on other partitions]
        ->  Materialize  (cost=4073102.86..4303737.81 rows=18450796 width=21)
              ->  Sort  (cost=4073102.86..4119229.85 rows=18450796 width=21)
                    Sort Key: l.vin
                    ->  Append  (cost=0.00..332797.96 rows=18450796 width=21)
                          ->  Seq Scan on liens l  (cost=0.00..14.00 rows=400 width=21)
                          ->  [Seq Scans on other partitions]

Disabling mergejoin pushes it back to a nestloop join. Why can't it hash
join these two together?

Kris Jurka
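The planner-method settings tried above are ordinary session GUCs; for reference, a minimal sketch of steering this particular query (table names as in the message above):

```sql
-- Disable the two join methods the planner keeps choosing; the settings
-- are session-local and can be reset afterwards.  As the message above
-- notes, even with both of these off the planner here fell back to a
-- nested loop rather than producing a hash join.
SET enable_nestloop = off;
SET enable_mergejoin = off;

EXPLAIN SELECT COUNT(*)
FROM impounds i
JOIN liens l ON (i.vin = l.vin);

RESET enable_nestloop;
RESET enable_mergejoin;
```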

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kris Jurka (#1)
Re: No hash join across partitioned tables?

Kris Jurka <books@ejurka.com> writes:

PG (8.3.7) doesn't seem to want to do a hash join across two partitioned
tables.

Could we see the whole declaration of these tables? (pg_dump -s output
would be convenient)

regards, tom lane

#3 Kris Jurka
books@ejurka.com
In reply to: Tom Lane (#2)
Re: No hash join across partitioned tables?

On Thu, 16 Apr 2009, Tom Lane wrote:

Kris Jurka <books@ejurka.com> writes:

PG (8.3.7) doesn't seem to want to do a hash join across two partitioned
tables.

Could we see the whole declaration of these tables? (pg_dump -s output
would be convenient)

The attached table definition with no data wants to mergejoin first, but
after disabling mergejoin it does indeed do a hashjoin.

Looking back at the cost estimates for the merge and nestloop joins, it
seems to be estimating the join size as the number of rows in the
cartesian product * .005, while the actual number of output rows in this
case is 2437 (cartesian product * 4e-9). Perhaps the cost estimates for
the real data are so high because of this bogus row count that the fudge
factor to disable mergejoin isn't enough?

Kris Jurka

Attachments:

hash-join-partition.sql (text/plain; charset=US-ASCII)
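The estimate Kris describes can be checked against the row counts in the first message's plan: with no statistics for the Append relations, the planner falls back to its default equality-join selectivity of 0.005 (DEFAULT_EQ_SEL in the planner source), and applying that to the cartesian product reproduces the nested loop's row estimate exactly:

```sql
-- 33577 rows (impounds Append) * 18450796 rows (liens Append) * 0.005,
-- the default selectivity for an equality join without usable stats:
SELECT (33577::bigint * 18450796 * 0.005)::bigint AS estimated_join_rows;
-- = 3097611886, matching rows=3097611886 on the Nested Loop node
```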
#4 Kris Jurka
books@ejurka.com
In reply to: Kris Jurka (#3)
Re: No hash join across partitioned tables?

On Thu, 16 Apr 2009, Kris Jurka wrote:

Perhaps the cost estimates for the real data are so high because of this
bogus row count that the fudge factor to disable mergejoin isn't enough?

Indeed, I get these cost estimates on 8.4b1 with an increased
disable_cost value:

nestloop: 11171206.18
merge: 58377401.39
hash: 116763544.76

So the default disable_cost isn't enough to push it to use the hash join
plan and goes back to nestloop. Since disable_cost hasn't been touched
since January 2000, perhaps it's time to bump that up to match today's
hardware and problem sizes? This isn't even a particularly big problem,
it's joining 18M rows against 30k.

The real problem is getting reasonable stats to pass through the partition
Append step, so it can make a reasonable estimate of the join output size.

Kris Jurka
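The arithmetic behind that conclusion, using the costs above and the disable_cost constant (1.0e8 in those releases, defined in costsize.c), can be spelled out:

```sql
-- Adding disable_cost to both "disabled" join paths still leaves the
-- nested loop cheaper than the hash join, so the planner picks it anyway:
SELECT 11171206.18 + 1.0e8 AS nestloop_disabled,   -- 111171206.18
       58377401.39 + 1.0e8 AS merge_disabled,      -- 158377401.39
       116763544.76        AS hash_join;           -- loses by ~5.6M
```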

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kris Jurka (#4)
Re: No hash join across partitioned tables?

Kris Jurka <books@ejurka.com> writes:

So the default disable_cost isn't enough to push it to use the hash join
plan and goes back to nestloop. Since disable_cost hasn't been touched
since January 2000, perhaps it's time to bump that up to match today's
hardware and problem sizes?

I think disable_cost was originally set at a time when costs were
integers :-(. Yeah, there's probably no reason not to throw another
zero or two on it.

Is there another issue here besides that one? I think you were hoping
that the hash join would be faster than the alternatives, but the cost
estimate says it's a lot slower. Is that actually the case?

regards, tom lane

#6 Kris Jurka
books@ejurka.com
In reply to: Tom Lane (#5)
Re: No hash join across partitioned tables?

Tom Lane wrote:

Is there another issue here besides that one? I think you were hoping
that the hash join would be faster than the alternatives, but the cost
estimate says it's a lot slower. Is that actually the case?

The hash join takes less than twenty seconds, the other two joins I
killed after five minutes. I can try to collect explain analyze results
later today if you'd like.

Kris Jurka

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kris Jurka (#6)
Re: No hash join across partitioned tables?

Kris Jurka <books@ejurka.com> writes:

The hash join takes less than twenty seconds, the other two joins I
killed after five minutes. I can try to collect explain analyze results
later today if you'd like.

Please, unless the test case you already posted has similar behavior.

regards, tom lane

#8 Kris Jurka
books@ejurka.com
In reply to: Tom Lane (#7)
Re: No hash join across partitioned tables?

Tom Lane wrote:

Kris Jurka <books@ejurka.com> writes:

The hash join takes less than twenty seconds, the other two joins I
killed after five minutes. I can try to collect explain analyze results
later today if you'd like.

Attached are the explain analyze results. The analyze part hits the
hash join worst of all, so I've also included the timings without analyzing.

Method      Time (ms)    Time w/Analyze (ms)
nestloop       304853                 319060
merge          514517                 683757
hash            18957                 143731

Kris Jurka

Attachments:

hash.txt (text/plain)
merge.txt (text/plain)
nestloop.txt (text/plain)
#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kris Jurka (#4)
Re: No hash join across partitioned tables?

Kris Jurka <books@ejurka.com> writes:

The real problem is getting reasonable stats to pass through the partition
Append step, so it can make a reasonable estimate of the join output size.

I dug around a bit and concluded that the lack of stats for the Append
relation is indeed the main problem. It's not so much the bad join size
estimate (although that could hurt for cases where you need to join this
result to another table). Rather, it's that the planner is deliberately
biased against picking hash joins in the absence of stats for the inner
relation. Per the comments for estimate_hash_bucketsize:

* If no statistics are available, use a default estimate of 0.1. This will
* discourage use of a hash rather strongly if the inner relation is large,
* which is what we want. We do not want to hash unless we know that the
* inner rel is well-dispersed (or the alternatives seem much worse).

While we could back off the default a bit here, I think it'd be better
to fix it by not punting on the stats-for-append-relations problem.
That doesn't seem like material for 8.4 at this point, though.

regards, tom lane

#10 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#9)
Re: No hash join across partitioned tables?

Did this get addressed?

---------------------------------------------------------------------------

Tom Lane wrote:

Kris Jurka <books@ejurka.com> writes:

The real problem is getting reasonable stats to pass through the partition
Append step, so it can make a reasonable estimate of the join output size.

I dug around a bit and concluded that the lack of stats for the Append
relation is indeed the main problem. It's not so much the bad join size
estimate (although that could hurt for cases where you need to join this
result to another table). Rather, it's that the planner is deliberately
biased against picking hash joins in the absence of stats for the inner
relation. Per the comments for estimate_hash_bucketsize:

* If no statistics are available, use a default estimate of 0.1. This will
* discourage use of a hash rather strongly if the inner relation is large,
* which is what we want. We do not want to hash unless we know that the
* inner rel is well-dispersed (or the alternatives seem much worse).

While we could back off the default a bit here, I think it'd be better
to fix it by not punting on the stats-for-append-relations problem.
That doesn't seem like material for 8.4 at this point, though.

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do
  + If your life is a hard drive, Christ can be your backup. +
#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#10)
Re: No hash join across partitioned tables?

Bruce Momjian <bruce@momjian.us> writes:

Did this get addressed?

Partially. There are stats now but autovacuum is not bright about
when to update them.

regards, tom lane

#12 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#11)
Re: No hash join across partitioned tables?

On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Bruce Momjian <bruce@momjian.us> writes:

Did this get addressed?

Partially.  There are stats now but autovacuum is not bright about
when to update them.

Is that something you're planning to fix for 9.0? If not, we at least
need to document what we intend for people to do about it.

...Robert

#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#12)
Re: No hash join across partitioned tables?

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Partially. There are stats now but autovacuum is not bright about
when to update them.

Is that something you're planning to fix for 9.0? If not, we at least
need to document what we intend for people to do about it.

I want to look at it, but I'm not sure whether the fix will be small
enough that we want to put it in during beta.

regards, tom lane

#14 Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Tom Lane (#13)
Re: No hash join across partitioned tables?

On Tue, Mar 2, 2010 at 4:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Partially. There are stats now but autovacuum is not bright about
when to update them.

Is that something you're planning to fix for 9.0? If not, we at least
need to document what we intend for people to do about it.

I want to look at it, but I'm not sure whether the fix will be small
enough that we want to put it in during beta.

I am pretty sure many people will appreciate it, even if it isn't going
to be small.

Is that stat collection across child tables useful by itself?

--
GJ

#15 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#13)
Re: No hash join across partitioned tables?

On Tue, Mar 2, 2010 at 12:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Partially.  There are stats now but autovacuum is not bright about
when to update them.

Is that something you're planning to fix for 9.0?  If not, we at least
need to document what we intend for people to do about it.

I want to look at it, but I'm not sure whether the fix will be small
enough that we want to put it in during beta.

In going back through emails I had marked as possibly needing another
look before 9.0 is released, I came across this issue again. As I
understand it, analyze (or analyse) now collects statistics for both
the parent individually, and for the parent and its children together.
However, as I further understand it, autovacuum won't actually fire
off an analyze unless there's enough activity on the parent table
considered individually to warrant it. So if you have an empty parent
and a bunch of children with data in it, your stats will still stink,
unless you analyze by hand.

Assuming my understanding of the problem is correct, we could:

(a) fix it,
(b) document that you should consider periodic manual analyze commands
in this situation, or
(c) do nothing.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
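For anyone hitting this in practice, option (b) amounts to the following sketch; under the 9.0-era behavior described earlier in the thread, analyzing a parent collects both the parent-only and the parent-plus-children statistics, but autovacuum won't trigger it for an empty parent, so it has to be scheduled by hand (table names from the original report):

```sql
-- Manual workaround: analyze each partition parent periodically
-- (e.g. from cron) so the whole-hierarchy statistics stay current.
ANALYZE impounds;
ANALYZE liens;
```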

#16 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#15)
Re: No hash join across partitioned tables?

Robert Haas <robertmhaas@gmail.com> writes:

In going back through emails I had marked as possibly needing another
look before 9.0 is released, I came across this issue again. As I
understand it, analyze (or analyse) now collects statistics for both
the parent individually, and for the parent and its children together.
However, as I further understand it, autovacuum won't actually fire
off an analyze unless there's enough activity on the parent table
considered individually to warrant it. So if you have an empty parent
and a bunch of children with data in it, your stats will still stink,
unless you analyze by hand.

Check.

Assuming my understanding of the problem is correct, we could:

(a) fix it,
(b) document that you should consider periodic manual analyze commands
in this situation, or
(c) do nothing.

Thoughts?

The objections to (a) are that it might result in excessive ANALYZE work
if not done intelligently, and that we haven't got a patch ready anyway.
I would have liked to get to this for 9.0 but I feel it's a bit late
now.

regards, tom lane

#17 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#16)
Re: [PERFORM] No hash join across partitioned tables?

(moving to -hackers)

On Wed, Jun 9, 2010 at 4:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

In going back through emails I had marked as possibly needing another
look before 9.0 is released, I came across this issue again.  As I
understand it, analyze (or analyse) now collects statistics for both
the parent individually, and for the parent and its children together.
 However, as I further understand it, autovacuum won't actually fire
off an analyze unless there's enough activity on the parent table
considered individually to warrant it.  So if you have an empty parent
and a bunch of children with data in it, your stats will still stink,
unless you analyze by hand.

Check.

Assuming my understanding of the problem is correct, we could:

(a) fix it,
(b) document that you should consider periodic manual analyze commands
in this situation, or
(c) do nothing.

Thoughts?

The objections to (a) are that it might result in excessive ANALYZE work
if not done intelligently, and that we haven't got a patch ready anyway.
I would have liked to get to this for 9.0 but I feel it's a bit late
now.

I guess I can't really disagree with that. Should we try to document
this in some way?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#18 Luxenberg, Scott I.
Scott.Luxenberg@noblis.org
In reply to: Robert Haas (#17)
Error with GIT Repository

Greetings all,

I have been trying to create/run a build farm as part of a project I am
working on. However, I have noticed the primary git repository,
git.postgresql.org/git, does not seem to be working. Namely, whenever I
try to clone the directory, I receive this error:

Error: Unable to find 5e4933c31d3cd2750ee1793efe6eca43055fb273e under
http://git.postgresql.org/git/postgresql.git
Cannot obtain needed blob 5e4933c31d3cd2750ee1793efe6eca4305fb273e while
processing commit c5609c66ce2ee4fdb180be95721252b47f90499
Error: fetch failed.

I thought it would be prudent to notify the list so someone could
possibly check into this.

Thanks!

Scott Luxenberg

#19 Andrew Dunstan
andrew@dunslane.net
In reply to: Luxenberg, Scott I. (#18)
Re: Error with GIT Repository

Luxenberg, Scott I. wrote:

Greetings all,

I have been trying to create/run a build farm as part of a project I am
working on.

That seems an odd thing to do since we have one ...

However, I have noticed the primary git repository,
git.postgresql.org/git, does not seem to be working. Namely, whenever I
try to clone the directory, I receive this error:

Error: Unable to find 5e4933c31d3cd2750ee1793efe6eca43055fb273e under
http://git.postgresql.org/git/postgresql.git
Cannot obtain needed blob 5e4933c31d3cd2750ee1793efe6eca4305fb273e while
processing commit c5609c66ce2ee4fdb180be95721252b47f90499
Error: fetch failed.

I thought it would be prudent to notify the list so someone could
possibly check into this.

Why are you cloning over http? Here is the best way to clone, which
seems to be working:

[andrew@sophia ]$ git clone --mirror
git://git.postgresql.org/git/postgresql.git
Initialized empty Git repository in /home/andrew/postgresql.git/
remote: Counting objects: 376865, done.
remote: Compressing objects: 100% (87569/87569), done.
remote: Total 376865 (delta 310187), reused 352950 (delta 287485)
Receiving objects: 100% (376865/376865), 178.73 MiB | 251 KiB/s, done.
Resolving deltas: 100% (310187/310187), done.
[andrew@sophia ]$

cheers

andrew

#20 Stephen Frost
sfrost@snowman.net
In reply to: Andrew Dunstan (#19)
Re: Error with GIT Repository

* Andrew Dunstan (andrew@dunslane.net) wrote:

Luxenberg, Scott I. wrote:

I have been trying to create/run a build farm as part of a project I am
working on.

That seems an odd thing to do since we have one ...

To clarify, he's setting up a build farm *member*. :)

However, I have noticed the primary git repository,
git.postgresql.org/git, does not seem to be working. Namely, whenever I
try to clone the directory, I receive this error:

Error: Unable to find 5e4933c31d3cd2750ee1793efe6eca43055fb273e under
http://git.postgresql.org/git/postgresql.git
Cannot obtain needed blob 5e4933c31d3cd2750ee1793efe6eca4305fb273e while
processing commit c5609c66ce2ee4fdb180be95721252b47f90499
Error: fetch failed.

I thought it would be prudent to notify the list so someone could
possibly check into this.

Why are you cloning over http? Here is the best way to clone, which
seems to be working:

Unfortunately for us, the port that git uses isn't currently allowed
outbound by our corporate firewall. I expect that to be true for other
PG users who want git and for some build-farm members, so I think we
really need to support git cloning over http.

As a side-note, it works just fine from git-hub's http mirror and that's
what we've been playing with, but I don't know if we want to recommend
that for build-farm members..

Thanks!

Stephen

#21 Andrew Dunstan
andrew@dunslane.net
In reply to: Stephen Frost (#20)
#22 Stephen Frost
sfrost@snowman.net
In reply to: Andrew Dunstan (#21)
#23 Magnus Hagander
magnus@hagander.net
In reply to: Stephen Frost (#22)
#24 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrew Dunstan (#19)
#25 Leonardo Francalanci
m_lists@yahoo.it
In reply to: Andrew Dunstan (#19)
#26 Andres Freund
andres@anarazel.de
In reply to: Magnus Hagander (#23)
#27 Magnus Hagander
magnus@hagander.net
In reply to: Andres Freund (#26)
#28 Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#17)
#29 Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#28)
#30 Daniel Farina
drfarina@acm.org
In reply to: Magnus Hagander (#27)
#31 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#16)
#32 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#31)
#33 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#32)
#34 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#33)
#35 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#34)
#36 Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#35)
#37 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#15)
#38 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#37)
#39 Sam Gendler
sgendler@ideasculptor.com
In reply to: Tom Lane (#38)
#40 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Sam Gendler (#39)
#41 Sam Gendler
sgendler@ideasculptor.com
In reply to: Alvaro Herrera (#40)
#42 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Sam Gendler (#41)
#43 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#38)