Bitmap scans vs. the statistics views

Started by Tom Laneover 20 years ago12 messages
#1Tom Lane
tgl@sss.pgh.pa.us

I'm currently finding that the stats regression test fails with
bitmapped index scans enabled:

*** 62,68 ****
WHERE st.relname='tenk2' AND cl.relname='tenk2';
?column? | ?column? | ?column? | ?column?
----------+----------+----------+----------
! t | t | t | t
(1 row)

  SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,
--- 62,68 ----
   WHERE st.relname='tenk2' AND cl.relname='tenk2';
   ?column? | ?column? | ?column? | ?column?
  ----------+----------+----------+----------
!  t        | t        | f        | f
  (1 row)

SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,

The reason for this appears to be that the standard stats views
disregard tuples_fetched for tables, but tuples_fetched is the only
counter that's getting bumped in a bitmap scan.

I could probably add some code to bump tuples_returned as well,
but I wonder whether something more drastic isn't needed. The
stats views seem to be designed around the assumption that there
are seqscans and indexscans and nothing else. Do we need to
expand the number of functions and rows to allow for a third basic
scan type, or do we want to fuzz things up?

Comments anyone?

regards, tom lane

#2Jim C. Nasby
decibel@decibel.org
In reply to: Tom Lane (#1)
Re: Bitmap scans vs. the statistics views

I would like to know if bitmap scans are being used on a table. I think
it's worth adding to the stats views, though I'm not sure on the best
way. For example, this means that a query on one table can now scan
multiple indexes, but it doesn't seem right to lump that in with
'traditional' index scans. An extreme view would be to add new bitmap
scan columns to pg_stat(io)_(tables|indexes), but that might be
overkill.
--
Jim C. Nasby, Database Consultant decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

#3Josh Berkus
josh@agliodbs.com
In reply to: Jim C. Nasby (#2)
Re: Bitmap scans vs. the statistics views

Tom,

The reason for this appears to be that the standard stats views
disregard tuples_fetched for tables, but tuples_fetched is the only
counter that's getting bumped in a bitmap scan.

I could probably add some code to bump tuples_returned as well,
but I wonder whether something more drastic isn't needed.  The
stats views seem to be designed around the assumption that there
are seqscans and indexscans and nothing else.  Do we need to
expand the number of functions and rows to allow for a third basic
scan type, or do we want to fuzz things up?

Comments anyone?

Well, technically a bitmapscan is a different operation. So it should
probably have its own columns. Unless you're talking about an overhaul of
the stats views more drastic than that? If so, what?

I'm not clear on why bitmapscan doesn't bump tuples_returned. Can you
explain?

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#3)
Re: Bitmap scans vs. the statistics views

Josh Berkus <josh@agliodbs.com> writes:

Well, technically a bitmapscan is a different operation. So it should
probably have its own columns. Unless you're talking about an overhaul of
the stats views more drastic than that? If so, what?

That was basically what I was asking: do we expand all the stats support
to handle this as a separate path, and if so what does it look like
exactly? I'm particularly unclear what to do at the level of the
functions described in
http://www.postgresql.org/docs/8.0/static/monitoring-stats.html#MONITORING-STATS-FUNCS-TABLE

I've never fully understood the distinction the stats make between
"tuples fetched" and "tuples returned", and it's even less obvious how
to apply it when the index and heap operations are decoupled. In
particular the function design assumes that heap tuple fetches can be
tied to particular indexes, which is now a broken assumption. You
might be amused by this test case I just finished debugging:

regression=# explain analyze select * from tenk1, int4_tbl where unique1 in (40,50,f1) and unique2 >= 3999 and unique2 < 9999;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=9.07..66.56 rows=4 width=248) (actual time=16.266..77.452 rows=6 loops=1)
-> Seq Scan on int4_tbl (cost=0.00..1.05 rows=5 width=4) (actual time=0.055..0.105 rows=5 loops=1)
-> Bitmap Heap Scan on tenk1 (cost=9.07..13.08 rows=1 width=244) (actual time=15.387..15.396 rows=1 loops=5)
Recheck Cond: (((tenk1.unique1 = 40) OR (tenk1.unique1 = 50) OR (tenk1.unique1 = "outer".f1)) AND (tenk1.unique2 >= 3999) AND (tenk1.unique2 < 9999))
-> BitmapAnd (cost=9.07..9.07 rows=50 width=0) (actual time=15.353..15.353 rows=0 loops=5)
-> BitmapOr (cost=6.52..6.52 rows=50 width=0) (actual time=0.152..0.152 rows=0 loops=5)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.059..0.059 rows=1 loops=5)
Index Cond: (unique1 = 40)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.032..0.032 rows=1 loops=5)
Index Cond: (unique1 = 50)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.029..0.029 rows=0 loops=5)
Index Cond: (tenk1.unique1 = "outer".f1)
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..2.30 rows=50 width=0) (actual time=15.148..15.148 rows=6000 loops=5)
Index Cond: ((unique2 >= 3999) AND (unique2 < 9999))
Total runtime: 78.369 ms
(15 rows)

What exactly do we want to count here? The 6000 TIDs pulled from
tenk1_unique2 don't translate into much of anything at the heap
access stage. (Shortly I'm going to add some logic to not bother
using very nonselective index conditions in BitAnd, but it's not there
right now.)

I'm not clear on why bitmapscan doesn't bump tuples_returned. Can you
explain?

Well, there was no such bump in the bits of code I cribbed to make
nodeBitmapHeapScan and friends ;-). It is easy enough to add, once
we have a clear idea of what we want to count, but I don't feel that
I have that idea yet.

regards, tom lane

#5Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#4)
Re: Bitmap scans vs. the statistics views

On 4/22/2005 3:30 PM, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

Well, technically a bitmapscan is a different operation. So it should
probably have its own columns. Unless you're talking about an overhaul of
the stats views more drastic than that? If so, what?

That was basically what I was asking: do we expand all the stats support
to handle this as a separate path, and if so what does it look like
exactly? I'm particularly unclear what to do at the level of the
functions described in
http://www.postgresql.org/docs/8.0/static/monitoring-stats.html#MONITORING-STATS-FUNCS-TABLE

I've never fully understood the distinction the stats make between
"tuples fetched" and "tuples returned", and it's even less obvious how
to apply it when the index and heap operations are decoupled. In
particular the function design assumes that heap tuple fetches can be
tied to particular indexes, which is now a broken assumption. You
might be amused by this test case I just finished debugging:

tuples fetched is the number of raw, possibly dead tuples fetched from
the heap. Tuples returned is the number of alive tuples ... IIRC.

Jan

regression=# explain analyze select * from tenk1, int4_tbl where unique1 in (40,50,f1) and unique2 >= 3999 and unique2 < 9999;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=9.07..66.56 rows=4 width=248) (actual time=16.266..77.452 rows=6 loops=1)
-> Seq Scan on int4_tbl (cost=0.00..1.05 rows=5 width=4) (actual time=0.055..0.105 rows=5 loops=1)
-> Bitmap Heap Scan on tenk1 (cost=9.07..13.08 rows=1 width=244) (actual time=15.387..15.396 rows=1 loops=5)
Recheck Cond: (((tenk1.unique1 = 40) OR (tenk1.unique1 = 50) OR (tenk1.unique1 = "outer".f1)) AND (tenk1.unique2 >= 3999) AND (tenk1.unique2 < 9999))
-> BitmapAnd (cost=9.07..9.07 rows=50 width=0) (actual time=15.353..15.353 rows=0 loops=5)
-> BitmapOr (cost=6.52..6.52 rows=50 width=0) (actual time=0.152..0.152 rows=0 loops=5)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.059..0.059 rows=1 loops=5)
Index Cond: (unique1 = 40)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.032..0.032 rows=1 loops=5)
Index Cond: (unique1 = 50)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.029..0.029 rows=0 loops=5)
Index Cond: (tenk1.unique1 = "outer".f1)
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..2.30 rows=50 width=0) (actual time=15.148..15.148 rows=6000 loops=5)
Index Cond: ((unique2 >= 3999) AND (unique2 < 9999))
Total runtime: 78.369 ms
(15 rows)

What exactly do we want to count here? The 6000 TIDs pulled from
tenk1_unique2 don't translate into much of anything at the heap
access stage. (Shortly I'm going to add some logic to not bother
using very nonselective index conditions in BitAnd, but it's not there
right now.)

I'm not clear on why bitmapscan doesn't bump tuples_returned. Can you
explain?

Well, there was no such bump in the bits of code I cribbed to make
nodeBitmapHeapScan and friends ;-). It is easy enough to add, once
we have a clear idea of what we want to count, but I don't feel that
I have that idea yet.

regards, tom lane

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#6Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#4)
Re: Bitmap scans vs. the statistics views

Tom,

I've never fully understood the distinction the stats make between
"tuples fetched" and "tuples returned", and it's even less obvious how
to apply it when the index and heap operations are decoupled.

Well, it's mainly a counter to measure how many dead rows are in your active
data set. It probably belongs elsewhere, such as pg_stats_all_tables with
the overall fetch counts.

In
particular the function design assumes that heap tuple fetches can be
tied to particular indexes, which is now a broken assumption. You
might be amused by this test case I just finished debugging:

Well, I'd be in favor of moving the "useful" version of tuples_returned to
pg_stat_*_tables as an overall count. However, we cant drop the column from
pg_stat_indexes without breaking apps; at best, we'd have to warn people of
the accuracy issues and give it a few versions before we dropped it.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Wieck (#5)
Re: Bitmap scans vs. the statistics views

Jan Wieck <JanWieck@Yahoo.com> writes:

tuples fetched is the number of raw, possibly dead tuples fetched from
the heap. Tuples returned is the number of alive tuples ... IIRC.

No, count_heap_fetch only counts tuples that have already passed the
snapshot test. It could be that the places where the counts are
actually bumped don't line up with your original vision for the
stats design.

For a regular index scan, it seems to make sense to count (a) number of
TIDs returned by the index AM, and (b) number of tuples returned by the
IndexScan node. There are several intermediate steps
* does the tuple pass the snapshot test
* does the tuple pass any indexqual rechecks (for lossy indexes)
* does the tuple pass any additional non-index restriction
conditions that are being enforced at the scan level
but I'm not sure that counting all the intermediate steps is interesting.

The places where the counts are actually taken don't quite line up with
this ... there are also some other macros like count_heap_scan and
count_index_scan that seem a bit randomly placed. The description in
table 23-2 makes it sound like those should count once per scan, but
they are bumped inside the per-tuple loops ...

regards, tom lane

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#6)
Re: Bitmap scans vs. the statistics views

Josh Berkus <josh@agliodbs.com> writes:

I've never fully understood the distinction the stats make between
"tuples fetched" and "tuples returned", and it's even less obvious how
to apply it when the index and heap operations are decoupled.

Well, it's mainly a counter to measure how many dead rows are in your active
data set.

You may think that's what it is, but given where the counts are actually
placed in the code, it does no such thing. AFAICS there is no count of
dead tuples at all in the seqscan case, and in the indexscan case the
only counter that advances before the snapshot test is
pgstat_count_index_scan, which isn't counting tuples so much as
amgetnext calls, and is documented in a way that suggests it does
something completely different :-(

regards, tom lane

#9Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#7)
Re: Bitmap scans vs. the statistics views

On 4/22/2005 3:53 PM, Tom Lane wrote:

Jan Wieck <JanWieck@Yahoo.com> writes:

tuples fetched is the number of raw, possibly dead tuples fetched from
the heap. Tuples returned is the number of alive tuples ... IIRC.

No, count_heap_fetch only counts tuples that have already passed the
snapshot test. It could be that the places where the counts are
actually bumped don't line up with your original vision for the
stats design.

For a regular index scan, it seems to make sense to count (a) number of
TIDs returned by the index AM, and (b) number of tuples returned by the
IndexScan node. There are several intermediate steps
* does the tuple pass the snapshot test
* does the tuple pass any indexqual rechecks (for lossy indexes)
* does the tuple pass any additional non-index restriction
conditions that are being enforced at the scan level

Now that you say it ... yes. The whole stats stuff was intended
originally to find "DB tuning hints". A large number of tuples returned
by index scan and filtered out by additional non-index restrictions
indicate that there might be another multicolumn index missing.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#10Josh Berkus
josh@agliodbs.com
In reply to: Jan Wieck (#9)
Re: Bitmap scans vs. the statistics views

Tom,

Hmmm ... we need to flag *something* in pg_stat_*_indexes, whether it is a new
column or the tuplefetch column. People use that view to find indexes they
can drop.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

#11Hannu Krosing
hannu@tm.ee
In reply to: Josh Berkus (#10)
Re: Bitmap scans vs. the statistics views

On R, 2005-04-22 at 13:46 -0700, Josh Berkus wrote:

Tom,

Hmmm ... we need to flag *something* in pg_stat_*_indexes, whether it is a new
column or the tuplefetch column. People use that view to find indexes they
can drop.

I think that "idx_scan" and "idx_tup_read" can have the same meaning as
for other index types, i.e. of number of scans and number of index
tuples read (and copied into bitmap).

and as idx_tup_fetch seems to be == idx_tup_read, it can be the same for
bitmap indexes as well :)

but if the view will be changed, I'd like to see column idx_pages_read
added there.

--
Hannu Krosing <hannu@tm.ee>

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Wieck (#9)
Re: Bitmap scans vs. the statistics views

Quite some time ago I complained about the fact that bitmap index scans
weren't being counted sanely by the statistics mechanism:
http://archives.postgresql.org/pgsql-hackers/2005-04/msg00675.php
That discussion trailed off without deciding how to fix it, but we
really can't let this go without fixing it in 8.1.

I studied the code some more and realized that we had been operating
under some fundamental misconceptions. The distinction made in the
existing stats code between "tuples fetched" and "tuples returned" has
nothing whatever to do with live vs. dead tuples --- all these counts
are made only after determining that a tuple is visible. The way it
really works in 8.0 is:

table "tuples_returned": tuples returned by heap_getnext, ie,
live tuples found by seqscans
table "tuples_fetched": tuples returned by heap_fetch under
conditions other than being invoked by an indexscan
(this covers various random cases like ANALYZE and
TID scans)
index "tuples_fetched": tuples returned by heap_fetch when
invoked by an indexscan on this index
index "tuples_returned": actually, exactly the same as
tuples_fetched.

This possibly explains why the original design of the pg_stat_all_tables
view exposed only two of the seemingly four interesting counts.

I have just committed changes that redefine the counts like this:

table "tuples_returned": same as before, ie,
live tuples found by seqscans
table "tuples_fetched": tuples returned by heap_fetch when
invoked by a bitmap scan (the random other cases
no longer get counted at all)
index "tuples_fetched": same as before, ie, live tuples
fetched by simple indexscans using this index
index "tuples_returned": number of index entries returned
from the index AM, counting both simple and bitmap
scans.

The pg_stat_all_tables view is modified to add the table's
tuples_fetched count to the sum of the per-index tuples_fetched counts,
so that idx_tup_fetch counts both simple and bitmap index scans.
It's possible to break these out by looking at the low-level statistics
functions, however.

With the new definitions you can get some weak information about the
numbers of dead tuples fetched by indexscans, which was not possible
at all before. (It's weak because it's not easy to distinguish
differences due to dead tuples from differences due to bitmap scanning.)
In the earlier discussion, Josh commented that getting stats about dead
tuples probably belongs somewhere else anyway, and I'm inclined to agree
with that; so I don't feel too bad about not having provided more
complete information.

regards, tom lane