bitmap scan issues 8.1 devel

Started by Merlin Moncureover 20 years ago4 messages
#1Merlin Moncure
merlin.moncure@rcsonline.com

Hello,
Doing some testing on upcoming 8.1 devel and am having serious issues
with new bitmap index scan feature. It is easy to work around (just
disable it) but IMO the planner is using it when a regular index scan
should be strongly favored. The performance of the bitmapscan in my
usage is actually quite a bit worse than a full sequential scan.

here is a query which does this:
explain analyze execute
data1_read_next_product_structure_file_0('012241', '', '', '002', 1);

Here is the 8.0/bitmap off plan:
Limit (cost=0.00..45805.23 rows=5722 width=288) (actual
time=0.070..0.072 rows=1 loops=1)
-> Index Scan using product_structure_file_pkey on
product_structure_file (cost=0.00..45805.23 rows=5722 width=288)
(actual time=0.063..0.063 row
s=1 loops=1)
Index Cond: ((ps_parent_code)::text >= ($1)::text)
Filter: ((((ps_parent_code)::text > ($1)::text) OR
(ps_group_code >= $2)) AND (((ps_parent_code)::text > ($1)::text) OR
(ps_group_code > $2)
OR ((ps_section_code)::text >= ($3)::text)) AND (((ps_parent_code)::text

($1)::text) OR (ps_group_code > $2) OR ((ps_section_code)::text >

($3)::tex
t) OR ((ps_seq_no)::smallint > $4)))
Total runtime: 0.185 ms

Here is the 8.1 with bitamp on:
Limit (cost=3768.32..3782.63 rows=5722 width=288) (actual
time=2287.488..2287.490 rows=1 loops=1)
-> Sort (cost=3768.32..3782.63 rows=5722 width=288) (actual
time=2287.480..2287.480 rows=1 loops=1)
Sort Key: ps_parent_code, ps_group_code, ps_section_code,
ps_seq_no
-> Bitmap Heap Scan on product_structure_file
(cost=187.84..3411.20 rows=5722 width=288) (actual time=19.977..514.532
rows=47355 loops=1)
Recheck Cond: ((ps_parent_code)::text >= ($1)::text)
Filter: ((((ps_parent_code)::text > ($1)::text) OR
(ps_group_code >= $2)) AND (((ps_parent_code)::text > ($1)::text) OR
(ps_group_code

$2) OR ((ps_section_code)::text >= ($3)::text)) AND

(((ps_parent_code)::text > ($1)::text) OR (ps_group_code > $2) OR
((ps_section_code)::text > ($3
)::text) OR ((ps_seq_no)::smallint > $4)))
-> Bitmap Index Scan on product_structure_file_pkey
(cost=0.00..187.84 rows=18239 width=0) (actual time=19.059..19.059
rows=47356 loo
ps=1)
Index Cond: ((ps_parent_code)::text >= ($1)::text)
Total runtime: 2664.034 ms

Here is the prepared statement definition:
prepare data1_read_next_product_structure_file_0 (character varying,
character, character varying, int4, int4)
as select 1::int4, * from data1.product_structure_file
where ps_parent_code >= $1 and
(ps_parent_code > $1 or ps_group_code >= $2) and
(ps_parent_code > $1 or ps_group_code > $2 or
ps_section_code >= $3) and
(ps_parent_code > $1 or ps_group_code > $2 or
ps_section_code > $3 or ps_seq_no > $4)
order by ps_parent_code, ps_group_code, ps_section_code,
ps_seq_no
limit $5

Aside: this is the long way of writing
select 1::int4, * from data1.product_structure_file where
(ps_parent_code, ps_group_code, ps_section_code, ps_seq_no) > ($1, $2,
$3, $4) limit %5

which is allowed in pg but returns the wrong answer.

Merlin

#2Josh Berkus
josh@agliodbs.com
In reply to: Merlin Moncure (#1)
Re: [HACKERS] bitmap scan issues 8.1 devel

Merlin,

   ->  Index Scan using product_structure_file_pkey on
product_structure_file  (cost=0.00..45805.23 rows=5722 width=288)
(actual time=0.063..0.063 row
s=1 loops=1)

It appears that your DB is estimating the number of rows returned much too
high (5722 instead of 1). Please raise the statistics on all columns to
about 500, analyze, and try your test again.

Thanks!

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#1)
Re: [HACKERS] bitmap scan issues 8.1 devel

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:

Doing some testing on upcoming 8.1 devel and am having serious issues
with new bitmap index scan feature. It is easy to work around (just
disable it) but IMO the planner is using it when a regular index scan
should be strongly favored.

I think blaming the bitmap code is the wrong response. What I see in
your example is that the planner doesn't know what the LIMIT value is,
and accordingly is favoring a plan that isn't going to get blown out of
the water if the LIMIT is large. I'd suggest not parameterizing the
LIMIT.

(But hmm ... I wonder if we could use estimate_expression_value for
LIMIT items, instead of handling only simple Consts as the code does
now?)

regards, tom lane

#4Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Tom Lane (#3)
Re: [PERFORM] bitmap scan issues 8.1 devel

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:

Doing some testing on upcoming 8.1 devel and am having serious

issues

with new bitmap index scan feature. It is easy to work around (just
disable it) but IMO the planner is using it when a regular index

scan

should be strongly favored.

I think blaming the bitmap code is the wrong response. What I see in
your example is that the planner doesn't know what the LIMIT value is,
and accordingly is favoring a plan that isn't going to get blown out

of

the water if the LIMIT is large. I'd suggest not parameterizing the
LIMIT.

You nailed it...I hard coded the limit and everything was cool. In
fact, the same problem contributes to the fact that I've had to run
seqscan=false on all my production systems. It seemed the planner would
randomly seqscan the table...now I know why.

(But hmm ... I wonder if we could use estimate_expression_value for
LIMIT items, instead of handling only simple Consts as the code does
now?)

I absolutely support this :) In normal usage, the supplied limit is
quite small, say 100 or less. Anyways, planner issues aside,
parameterizing the limit is an elegant way to read records off a table
when you don't know how many you are going to read in advance...I make
heavy use of it :(.

Merlin