strange plan with bitmap heap scan and multiple partial indexes

Started by Tomas Vondraover 10 years ago6 messages
#1Tomas Vondra
tomas.vondra@2ndquadrant.com

Hi,

While working on the "IOS with partial indexes" patch, I've noticed a
bit strange plan. It's unrelated to that particular patch (reproducible
on master), so I'm starting a new thread for it.

To reproduce it, all you have to do is this (on a new cluster, all
settings on default):

CREATE TABLE t AS SELECT i AS a, i AS b
FROM generate_series(1,10000000) s(i);

CREATE INDEX idx001 ON t (a) where b < 100;
CREATE INDEX idx002 ON t (a) where b < 200;
CREATE INDEX idx003 ON t (a) where b < 300;

ANALYZE t;

EXPLAIN SELECT a FROM t WHERE b < 100;

QUERY PLAN
--------------------------------------------------------------------
Bitmap Heap Scan on t (cost=9.01..13.02 rows=1000 width=4)
Recheck Cond: ((b < 300) AND (b < 200))
Filter: (b < 100)
-> BitmapAnd (cost=9.01..9.01 rows=1 width=0)
-> Bitmap Index Scan on idx003
(cost=0.00..4.13 rows=1000 width=0)
-> Bitmap Index Scan on idx002
(cost=0.00..4.13 rows=1000 width=0)

Now, that's strange IMHO. There's a perfectly matching partial index,
with exactly the same predicate (b<100), but we instead choose the two
other indexes, and combine them using BitmapAnd. That seems a bit
strange - choosing one of them over the perfectly matching one would be
strange too, but why use two and combine them?

Another thing is that this gets fixed by a simple VACUUM on the table.

EXPLAIN SELECT a FROM t WHERE b < 100;

QUERY PLAN
--------------------------------------------------------------------
Index Scan using idx001 on t (cost=0.14..29.14 rows=1000 width=4)

Any idea what's going on here? FWIW all this was on 51d0fe5d (July 23).

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Andres Freund
andres@anarazel.de
In reply to: Tomas Vondra (#1)
Re: strange plan with bitmap heap scan and multiple partial indexes

On 2015-07-11 14:31:25 +0200, Tomas Vondra wrote:

While working on the "IOS with partial indexes" patch, I've noticed a bit
strange plan. It's unrelated to that particular patch (reproducible on
master), so I'm starting a new thread for it.

To reproduce it, all you have to do is this (on a new cluster, all settings
on default):

CREATE TABLE t AS SELECT i AS a, i AS b
FROM generate_series(1,10000000) s(i);

CREATE INDEX idx001 ON t (a) where b < 100;
CREATE INDEX idx002 ON t (a) where b < 200;
CREATE INDEX idx003 ON t (a) where b < 300;

ANALYZE t;

EXPLAIN SELECT a FROM t WHERE b < 100;

QUERY PLAN

It's indeed interesting. Running
ANALYZE t;EXPLAIN SELECT a FROM t WHERE b < 100;
repeatedly switches back and forth between the plans.

Note that
Bitmap Heap Scan on t (cost=9.01..13.02 rows=1000 width=4)
is actually costed significantly cheaper than
Index Scan using idx001 on t (cost=0.15..30.32 rows=1000 width=4)
which means yet another wierd thing is that it's not consistently
choosing that plan.

When the index scan plan is chosen you interestingly get the bitmapscan
plan, but at a slightly higher cost:
postgres[32046][1]=# EXPLAIN SELECT a FROM t WHERE b < 100;
┌────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├────────────────────────────────────────────────────────────────────┤
│ Index Scan using idx001 on t (cost=0.15..30.32 rows=1000 width=4) │
└────────────────────────────────────────────────────────────────────┘
(1 row)

Time: 0.460 ms
postgres[32046][1]=# SET enable_indexscan = false;
SET
Time: 0.119 ms
postgres[32046][1]=# EXPLAIN SELECT a FROM t WHERE b < 100;
┌──────────────────────────────────────────────────────────────────────────────
│ QUERY PLAN

├──────────────────────────────────────────────────────────────────────────────
│ Bitmap Heap Scan on t (cost=27.05..31.06 rows=1000 width=4)
│ Recheck Cond: ((b < 300) AND (b < 200))
│ Filter: (b < 100)
│ -> BitmapAnd (cost=27.05..27.05 rows=1 width=0)
│ -> Bitmap Index Scan on idx003 (cost=0.00..13.15 rows=1000 width=0)
│ -> Bitmap Index Scan on idx002 (cost=0.00..13.15 rows=1000 width=0)
└──────────────────────────────────────────────────────────────────────────────

Looking at the stats is interesting:

postgres[32046][1]=# ANALYZE t;SELECT relpages, reltuples, relallvisible FROM pg_class WHERE relname IN ('t', 'idx003', 'idx002', 'idx001');EXPLAIN SELECT a FROM t WHERE b < 100;
ANALYZE
Time: 123.469 ms
┌──────────┬───────────┬───────────────┐
│ relpages │ reltuples │ relallvisible │
├──────────┼───────────┼───────────────┤
│ 44248 │ 1e+07 │ 44248 │
│ 2 │ 0 │ 0 │
│ 2 │ 667 │ 0 │
│ 2 │ 667 │ 0 │
└──────────┴───────────┴───────────────┘
(4 rows)

Time: 0.405 ms
┌────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├────────────────────────────────────────────────────────────────────┤
│ Index Scan using idx001 on t (cost=0.12..24.63 rows=1000 width=4) │
└────────────────────────────────────────────────────────────────────┘
(1 row)

Time: 0.269 ms
postgres[32046][1]=# ANALYZE t;SELECT relpages, reltuples, relallvisible FROM pg_class WHERE relname IN ('t', 'idx003', 'idx002', 'idx001');EXPLAIN SELECT a FROM t WHERE b < 100;
ANALYZE
Time: 124.430 ms
┌──────────┬───────────┬───────────────┐
│ relpages │ reltuples │ relallvisible │
├──────────┼───────────┼───────────────┤
│ 44248 │ 1e+07 │ 44248 │
│ 2 │ 0 │ 0 │
│ 2 │ 0 │ 0 │
│ 2 │ 0 │ 0 │
└──────────┴───────────┴───────────────┘
(4 rows)

Time: 0.272 ms
┌──────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├──────────────────────────────────────────────────────────────────────────────┤
│ Bitmap Heap Scan on t (cost=9.01..13.02 rows=1000 width=4) │
│ Recheck Cond: ((b < 300) AND (b < 200)) │
│ Filter: (b < 100) │
│ -> BitmapAnd (cost=9.01..9.01 rows=1 width=0) │
│ -> Bitmap Index Scan on idx003 (cost=0.00..4.13 rows=1000 width=0) │
│ -> Bitmap Index Scan on idx002 (cost=0.00..4.13 rows=1000 width=0) │
└──────────────────────────────────────────────────────────────────────────────┘
(6 rows)

Time: 0.275 ms
postgres[32046][1]=# ANALYZE t;SELECT relpages, reltuples, relallvisible FROM pg_class WHERE relname IN ('t', 'idx003', 'idx002', 'idx001');EXPLAIN SELECT a FROM t WHERE b < 100;
ANALYZE
Time: 112.592 ms
┌──────────┬─────────────┬───────────────┐
│ relpages │ reltuples │ relallvisible │
├──────────┼─────────────┼───────────────┤
│ 44248 │ 9.99998e+06 │ 44248 │
│ 2 │ 0 │ 0 │
│ 2 │ 334 │ 0 │
│ 2 │ 334 │ 0 │
└──────────┴─────────────┴───────────────┘
(4 rows)

Time: 0.330 ms
┌────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├────────────────────────────────────────────────────────────────────┤
│ Index Scan using idx001 on t (cost=0.12..24.63 rows=1000 width=4) │
└────────────────────────────────────────────────────────────────────┘
(1 row)

Time: 0.320 ms

So this seems to be stats related.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#2)
Re: strange plan with bitmap heap scan and multiple partial indexes

Andres Freund <andres@anarazel.de> writes:

On 2015-07-11 14:31:25 +0200, Tomas Vondra wrote:

While working on the "IOS with partial indexes" patch, I've noticed a bit
strange plan. It's unrelated to that particular patch (reproducible on
master), so I'm starting a new thread for it.

It's indeed interesting. Running
ANALYZE t;EXPLAIN SELECT a FROM t WHERE b < 100;
repeatedly switches back and forth between the plans.

The issue basically is that ANALYZE is putting quasi-random numbers into
the reltuples estimates for the indexes. Things seem to be consistently
sane after a VACUUM:

regression=# vacuum t;
VACUUM
regression=# select relname,relpages,reltuples from pg_class where relname in ( 't', 'idx001', 'idx002', 'idx003');
relname | relpages | reltuples
---------+----------+-------------
t | 44248 | 9.99998e+06
idx001 | 2 | 99
idx002 | 2 | 199
idx003 | 2 | 299
(4 rows)

but not so much after ANALYZE:

regression=# analyze t;
ANALYZE
regression=# select relname,relpages,reltuples from pg_class where relname in ( 't', 'idx001', 'idx002', 'idx003');
relname | relpages | reltuples
---------+----------+-----------
t | 44248 | 1e+07
idx001 | 2 | 0
idx002 | 2 | 0
idx003 | 2 | 0
(4 rows)

I've also seen states like

relname | relpages | reltuples
---------+----------+-------------
t | 44248 | 9.99998e+06
idx001 | 2 | 0
idx002 | 2 | 334
idx003 | 2 | 334
(4 rows)

Presumably, this is happening because the numbers of rows actually
satisfying the index predicates are so small that it's a matter of luck
whether any of them are included in ANALYZE's sample.

Given this bad data for the index sizes, it's not totally surprising that
choose_bitmap_and() does something wacko. I'm not sure whether we should
try to make it smarter, or write this off as "garbage in, garbage out".

Another idea is to not trust any individual ANALYZE's estimate of the
index rowcount so completely. (I'd thought that the moving-average logic
would get applied to that, but it doesn't seem to be kicking in for some
reason.)

We could probably make this smarter if we were willing to apply the
predicate-proof machinery in more situations; in this example, once we know
that idx001 is applicable, we really should disregard idx002 and idx003
altogether because their predicates are implied by idx001's. I've always
been hesitant to do that because the cost of checking seemed likely to
greatly outweigh the benefits. But since Tomas is nosing around in this
territory already, maybe he'd like to investigate that further.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#3)
Re: strange plan with bitmap heap scan and multiple partial indexes

On 07/11/2015 06:32 PM, Tom Lane wrote:
...

Presumably, this is happening because the numbers of rows actually
satisfying the index predicates are so small that it's a matter of
luck whether any of them are included in ANALYZE's sample.

Given this bad data for the index sizes, it's not totally surprising
that choose_bitmap_and() does something wacko. I'm not sure whether
we should try to make it smarter, or write this off as "garbage in,
garbage out".

I think we should make it smarter, if possible - while this example is
somewhat artificial, partial indexes are often used exactly like this,
i.e. to index only very small subset of data. A good example may be an
index on "active invoices", i.e. invoices that were yet sorted out.
There may be a lot of invoices in the table, but only very small
fraction of them will be active (and thus in the index).

So I don't think is an artificial problem, and we should not write it
off as "garbage in".

Another idea is to not trust any individual ANALYZE's estimate of
the index rowcount so completely. (I'd thought that the
moving-average logic would get applied to that, but it doesn't seem
to be kicking in for some reason.)

We could probably make this smarter if we were willing to apply the
predicate-proof machinery in more situations; in this example, once
we know that idx001 is applicable, we really should disregard idx002
and idx003 altogether because their predicates are implied by
idx001's. I've always been hesitant to do that because the cost of
checking seemed likely to greatly outweigh the benefits. But since
Tomas is nosing around in this territory already, maybe he'd like to
investigate that further.

I think there are two possible approaches in general - we may improve
the statistics somehow, or we may start doing the predicate proofing.

I doubt approaching this at the statistics level alone is sufficient,
because even with statistics target 10k (i.e. the most detailed one),
the sample is still fixed-size. So there will always exist a combination
of a sufficiently large data set and selective partial index, causing
trouble with the sampling.

Moreover, I can't really think of a way to fix this at the statistics
level. Maybe there's a clever trick guarding against this particular
issue, but my personal experience is that whenever I used such a smart
hack, it eventually caused strange issues elsewhere.

So I think the predicate proofing is a better approach, but of course
the planning cost may be an issue. But maybe we can make this cheaper by
some clever tricks? For example, given two predicates A and B, it seems
that if A => B, then selectivity(A) <= selectivity(B). Could we use this
to skip some of the expensive stuff? We should have the selectivities
anyway, no?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#4)
Re: strange plan with bitmap heap scan and multiple partial indexes

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

So I think the predicate proofing is a better approach, but of course
the planning cost may be an issue. But maybe we can make this cheaper by
some clever tricks? For example, given two predicates A and B, it seems
that if A => B, then selectivity(A) <= selectivity(B). Could we use this
to skip some of the expensive stuff? We should have the selectivities
anyway, no?

We do. The existing logic in choose_bitmap_and essentially uses the
selectivity as a heuristic to indicate which partial indexes might have
predicates that imply another index's predicate. The expectation is that
the former would have selectivity strictly smaller than the latter,
causing the former to be processed first, and then the existing rules
about what indexes can be "added onto" a potential AND combination will
do the trick.

The reason this fails in your example is that if the two indexes have
exactly identical selectivities (due to identical reltuples values),
there's no certainty what order they get sorted in, and the adding-on
rules don't catch the case where the new index would actually imply
the old one rather than vice versa.

Conceivably, we could fix this at relatively small cost in the normal case
by considering predicate proof rules in the sort comparator, and only if
the estimated selectivities are identical. Sure seems like a kluge
though, and I remain unconvinced that it's really a case that arises that
much in the real world. The short description of the set of indexes you
showed is "redundantly silly". Useful sets of indexes would not likely
all have indistinguishably small selectivities.

Perhaps a less klugy answer is to tweak the adding-on rules some more,
but I haven't thought about exactly how.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#5)
Re: strange plan with bitmap heap scan and multiple partial indexes

Hi,

On 07/11/2015 11:40 PM, Tom Lane wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

So I think the predicate proofing is a better approach, but of
course the planning cost may be an issue. But maybe we can make
this cheaper by some clever tricks? For example, given two
predicates A and B, it seems that if A => B, then selectivity(A) <=
selectivity(B). Could we use this to skip some of the expensive
stuff? We should have the selectivities anyway, no?

We do. The existing logic in choose_bitmap_and essentially uses the
selectivity as a heuristic to indicate which partial indexes might
have predicates that imply another index's predicate. The expectation
is that the former would have selectivity strictly smaller than the
latter, causing the former to be processed first, and then the
existing rules about what indexes can be "added onto" a potential AND
combination will do the trick.

The reason this fails in your example is that if the two indexes
have exactly identical selectivities (due to identical reltuples
values), there's no certainty what order they get sorted in, and the
adding-on rules don't catch the case where the new index would
actually imply the old one rather than vice versa.

Ah, OK. Thanks for the explanation.

Conceivably, we could fix this at relatively small cost in the normal
case by considering predicate proof rules in the sort comparator, and
only if the estimated selectivities are identical.

Yep, that's what basically I had in mind.

Sure seems like a kluge though, and I remain unconvinced that it's
really a case that arises that much in the real world. The short
description of the set of indexes you showed is "redundantly silly".
Useful sets of indexes would not likely all have indistinguishably
small selectivities.

Fair point - the example really is artificial, and was constructed to
demonstrate planning costs of the extended predicate-proofing for
partial indexes. And while in reality small partial indexes are quite
common, they probably use predicates tailored for individual queries
(and not the nested predicates as in the example).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers