Join Filter vs. Index Cond (performance regression 9.1->9.2+/HEAD)
This is distilled down from a performance regression problem that came
past on IRC earlier today:
create table t1 (a integer, b integer, c integer, primary key (a,b,c));
create table t2 (k2 integer, a integer, primary key (k2,a));
create table t3 (k3 integer, b integer, primary key (k3,b));
create table t4 (k4 integer, c integer, primary key (k4,c));
insert into t1 select i,i,i from generate_series(1,1000,20) i;
insert into t1 select 2,2,i from generate_series(1,500) i;
insert into t2 select i,i from generate_series(1,1000) i;
insert into t3 select i,i from generate_series(1,1000) i;
insert into t4 select i,i from generate_series(1,1000) i;
analyze;
explain analyze
select * from t4
left join t3 on (t4.c=t3.k3)
left join t2 on (t3.b=t2.k2)
left join t1 on (t1.a=t2.a and t1.b=t3.b and t1.c=t4.c)
where t4.k4=2;
The plan for this on 9.4.2 comes out like this:
 Nested Loop Left Join  (cost=1.10..17.28 rows=1 width=36) (actual time=0.089..0.448 rows=1 loops=1)
   Join Filter: (t1.c = t4.c)
   Rows Removed by Join Filter: 499
   ->  Nested Loop Left Join  (cost=0.83..16.94 rows=1 width=24) (actual time=0.056..0.059 rows=1 loops=1)
         ->  Nested Loop Left Join  (cost=0.55..16.60 rows=1 width=16) (actual time=0.044..0.046 rows=1 loops=1)
               ->  Index Only Scan using t4_pkey on t4  (cost=0.28..8.29 rows=1 width=8) (actual time=0.024..0.025 rows=1 loops=1)
                     Index Cond: (k4 = 2)
                     Heap Fetches: 1
               ->  Index Only Scan using t3_pkey on t3  (cost=0.28..8.29 rows=1 width=8) (actual time=0.011..0.012 rows=1 loops=1)
                     Index Cond: (k3 = t4.c)
                     Heap Fetches: 1
         ->  Index Only Scan using t2_pkey on t2  (cost=0.28..0.33 rows=1 width=8) (actual time=0.010..0.011 rows=1 loops=1)
               Index Cond: (k2 = t3.b)
               Heap Fetches: 1
   ->  Index Only Scan using t1_pkey on t1  (cost=0.28..0.33 rows=1 width=12) (actual time=0.025..0.281 rows=500 loops=1)
         Index Cond: ((a = t2.a) AND (b = t3.b))
         Heap Fetches: 500
Whereas 9.1 gives this:
 Nested Loop Left Join  (cost=0.00..33.12 rows=1 width=36) (actual time=0.074..0.096 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..24.83 rows=1 width=24) (actual time=0.054..0.069 rows=1 loops=1)
         ->  Nested Loop Left Join  (cost=0.00..16.55 rows=1 width=16) (actual time=0.039..0.048 rows=1 loops=1)
               ->  Index Scan using t4_pkey on t4  (cost=0.00..8.27 rows=1 width=8) (actual time=0.020..0.022 rows=1 loops=1)
                     Index Cond: (k4 = 2)
               ->  Index Scan using t3_pkey on t3  (cost=0.00..8.27 rows=1 width=8) (actual time=0.009..0.011 rows=1 loops=1)
                     Index Cond: (t4.c = k3)
         ->  Index Scan using t2_pkey on t2  (cost=0.00..8.27 rows=1 width=8) (actual time=0.008..0.010 rows=1 loops=1)
               Index Cond: (t3.b = k2)
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=12) (actual time=0.013..0.016 rows=1 loops=1)
         Index Cond: ((a = t2.a) AND (b = t3.b) AND (c = t4.c))
In the real example, the join filter in the 9.4.2 plan was discarding 40
million rows, not just 500, so the performance impact was quite serious.
Obviously it makes little sense to use an (a,b,c) index to look up just
(a,b) and then filter on c; the question is, what is the planner doing
that leads it to get this so wrong? Finding a workaround for it was not
easy, either - the only thing that I found that worked was replacing the
t1 join with a lateral join with an OFFSET 0 clause to nobble the
planner entirely.
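For reference, the workaround looked roughly like this (a sketch only; the
real query differed, and Andrew's exact rewrite is not shown here):

```sql
-- Hypothetical sketch of the LATERAL + OFFSET 0 workaround. The OFFSET 0
-- acts as an optimization fence: it prevents subquery pull-up, so the t1
-- lookup is planned with all three join keys available as lateral
-- parameters and can use the full (a,b,c) index.
select *
from t4
left join t3 on (t4.c = t3.k3)
left join t2 on (t3.b = t2.k2)
left join lateral (
    select *
    from t1
    where t1.a = t2.a and t1.b = t3.b and t1.c = t4.c
    offset 0  -- fence: stops the planner from flattening this subquery
) t1 on true
where t4.k4 = 2;
```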
--
Andrew (irc:RhodiumToad)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 29, 2015 at 6:45 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
> Obviously it makes little sense to use an (a,b,c) index to look up just
> (a,b) and then filter on c; the question is, what is the planner doing
> that leads it to get this so wrong? Finding a workaround for it was not
> easy, either - the only thing that I found that worked was replacing the
> t1 join with a lateral join with an OFFSET 0 clause to nobble the
> planner entirely.
I agree. That looks like a bug.
The fact that it chooses index-only scans is also a little strange,
considering that they seem to be entirely ineffective. The tables
have never been vacuumed, so presumably, the all-visible bits are all
0, and the planner knows it.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> Obviously it makes little sense to use an (a,b,c) index to look up just
> (a,b) and then filter on c; the question is, what is the planner doing
> that leads it to get this so wrong?
It's not so astonishing as all that; compare
regression=# explain select * from t1 where a=3 and b=4;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Only Scan using t1_pkey on t1  (cost=0.28..8.29 rows=1 width=12)
   Index Cond: ((a = 3) AND (b = 4))
(2 rows)

regression=# explain select * from t1 where a=3 and b=4 and c=5;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Only Scan using t1_pkey on t1  (cost=0.28..8.30 rows=1 width=12)
   Index Cond: ((a = 3) AND (b = 4) AND (c = 5))
(2 rows)
Once you're down to an estimate of one row retrieved, adding additional
index conditions simply increases the cost (not by much, but it increases)
without delivering any visible benefit.
I believe what probably happened in this case is that the planner
considered both forms of the indexscan path and concluded that they were
fuzzily the same cost and rowcount, yet the path using only t2.a and t3.b
clearly dominated by requiring strictly fewer outer relations for
parameters. So it threw away the path that also had the c = t4.c
comparison before it ever got to the join stage. Even had it kept that
path, the join cost estimate wouldn't have looked any better than the one
for the join it did pick, so there would have been no certainty of picking
the "correct" plan.
The real problem in your example is thus the incorrect rowcount estimate;
with better rowcount estimates the two cases wouldn't have appeared to
have the same output rowcount.
For the toy data in your example, this can probably be blamed on the fact
that eqjoinsel_inner doesn't have any smarts for the case of having an MCV
list for only one side (though as noted in the comments, it's not obvious
what it should do instead). However, it's not very clear what was
happening in the real-world case.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Once you're down to an estimate of one row retrieved, adding
Tom> additional index conditions simply increases the cost (not by
Tom> much, but it increases) without delivering any visible benefit.
OK, but this is a serious problem because "estimate of one row" is a
very common estimation failure mode, and isn't always solvable in the
sense of arranging for better estimates (in the absence of hints, ugh).
Tom> I believe what probably happened in this case is that the planner
Tom> considered both forms of the indexscan path and concluded that
Tom> they were fuzzily the same cost and rowcount, yet the path using
Tom> only t2.a and t3.b clearly dominated by requiring strictly fewer
Tom> outer relations for parameters. So it threw away the path that
Tom> also had the c = t4.c comparison before it ever got to the join
Tom> stage. Even had it kept that path, the join cost estimate
Tom> wouldn't have looked any better than the one for the join it did
Tom> pick, so there would have been no certainty of picking the
Tom> "correct" plan.
Tom> The real problem in your example is thus the incorrect rowcount
Tom> estimate; with better rowcount estimates the two cases wouldn't
Tom> have appeared to have the same output rowcount.
Tom> For the toy data in your example, this can probably be blamed on
Tom> the fact that eqjoinsel_inner doesn't have any smarts for the case
Tom> of having an MCV list for only one side (though as noted in the
Tom> comments, it's not obvious what it should do instead). However,
Tom> it's not very clear what was happening in the real-world case.
In the real-world case, t1 was something like an "overrides" table for
data otherwise obtained from the other tables, i.e. special-case
exceptions to general rules. As such it is highly skewed, with many
possible (a,b) values having no row at all, but others having hundreds
of matches on (a,b) (but only one at most on (a,b,c) since this was the
pkey in the real data as well as the testcase).
Accordingly, we could not identify any way of getting a better rowcount
estimate.
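The shape of the real-world table can be sketched like this (hypothetical
names and column types; the actual schema is not shown in this thread):

```sql
-- Illustrative only: an "overrides" table keyed on (a,b,c), where most
-- (a,b) pairs have no override at all, but a few have hundreds of rows
-- differing only in c. The primary key makes (a,b,c) unique, yet the
-- per-(a,b) row counts are wildly nonuniform, which defeats the
-- planner's generic selectivity estimates for a parameterized (a,b) probe.
create table overrides (
    a integer,
    b integer,
    c integer,
    value text,
    primary key (a, b, c)
);
```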
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Once you're down to an estimate of one row retrieved, adding
Tom> additional index conditions simply increases the cost (not by
Tom> much, but it increases) without delivering any visible benefit.
OK, but this is a serious problem because "estimate of one row" is a
very common estimation failure mode, and isn't always solvable in the
sense of arranging for better estimates (in the absence of hints, ugh).
Yeah. I've occasionally wondered about removing the clamp-to-one-row
behavior, so that additional conditions would still look like they
contributed something (ie, 0.1 row is better than 1 row). However,
that seems likely to break about as many cases as it fixes :-(.
A variant of that would be to only allow the minimum to be 1 row if
we are absolutely certain that's what we'll get (eg, we're searching
on a unique-key equality condition), and otherwise clamp to at least
2 rows. Again though, this would destabilize lots of cases that
work well today.
I doubt there are any simple solutions here.
regards, tom lane
On Sun, May 31, 2015 at 12:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> Obviously it makes little sense to use an (a,b,c) index to look up just
>> (a,b) and then filter on c; the question is, what is the planner doing
>> that leads it to get this so wrong?
>
> It's not so astonishing as all that; compare
>
> regression=# explain select * from t1 where a=3 and b=4;
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Index Only Scan using t1_pkey on t1  (cost=0.28..8.29 rows=1 width=12)
>    Index Cond: ((a = 3) AND (b = 4))
> (2 rows)
>
> regression=# explain select * from t1 where a=3 and b=4 and c=5;
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Index Only Scan using t1_pkey on t1  (cost=0.28..8.30 rows=1 width=12)
>    Index Cond: ((a = 3) AND (b = 4) AND (c = 5))
> (2 rows)
>
> Once you're down to an estimate of one row retrieved, adding additional
> index conditions simply increases the cost (not by much, but it increases)
> without delivering any visible benefit.
But Andrew's example is equivalent to planning the second query by
putting the quals on a and b into the index qual and treating c=5 as a
post-filter condition even though the index we're using is on (a, b,
c), which I don't believe we'd ever do. If we did, I'd find that
astonishing, too.
> I believe what probably happened in this case is that the planner
> considered both forms of the indexscan path and concluded that they were
> fuzzily the same cost and rowcount, yet the path using only t2.a and t3.b
> clearly dominated by requiring strictly fewer outer relations for
> parameters. So it threw away the path that also had the c = t4.c
> comparison before it ever got to the join stage. Even had it kept that
> path, the join cost estimate wouldn't have looked any better than the one
> for the join it did pick, so there would have been no certainty of picking
> the "correct" plan.
It's just hard to believe that it's ever better to treat something as
a join filter than as an index condition. Yes, checking the index
condition isn't free, either. But it doesn't seem like it should be
particularly more expensive than checking the same thing as a join
filter. And if there are a lot of rows involved, it's going to be a whole
lot LESS expensive.
I guess it's hard for me to credit the idea that a parameterized index
path constraining a superset of the columns present in some other
parameterized path on the same index has insufficient additional
selectivity to justify its existence. Even in a world where planner
estimates are never wrong, I doubt treating the qual as a join filter
is ever meaningfully better. The absolute best case - or so it seems
to me - is a tie. If we underestimate the row counts, it's a loss.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company