understanding bitmap index benefit
Tom,
I noticed that along with many improvements in join operations bitmap
index speed up execution of first time query. It's known complain about
slow full text searching when query runs for the first time. But in CVS
version I see very nice behaviour I'd like to understand.
Query below is full text search on titles from pgsql mailing list archive:
8.0.3:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('vacuum&analyze');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Scan using fts_idx on titles (cost=0.00..1124.34 rows=378 width=41) (actual time=20.884..806.073 rows=291 loops=1)
Index Cond: (fts_index @@ '\'vacuum\' & \'analyz\''::tsquery)
Total runtime: 808.095 ms
(3 rows)
8.1dev:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('vacuum&analyze');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on titles (cost=5.32..1349.79 rows=378 width=41) (actual time=207.304..258.964 rows=291 loops=1)
Filter: (fts_index @@ '\'vacuum\' & \'analyz\''::tsquery)
-> Bitmap Index Scan on fts_idx (cost=0.00..5.32 rows=378 width=0) (actual time=206.993..206.993 rows=291 loops=1)
Index Cond: (fts_index @@ '\'vacuum\' & \'analyz\''::tsquery)
Total runtime: 260.946 ms
(5 rows)
Now, more frequent terms:
8.0.3:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('create&table');
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Index Scan using fts_idx on titles (cost=0.00..1124.34 rows=378 width=41) (actual time=117.904..4463.131 rows=551 loops=1)
Index Cond: (fts_index @@ '\'create\' & \'table\''::tsquery)
Total runtime: 4467.814 ms
(3 rows)
8.1dev:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('create&table');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on titles (cost=5.32..1349.79 rows=378 width=41) (actual time=296.310..423.622 rows=551 loops=1)
Filter: (fts_index @@ '\'create\' & \'table\''::tsquery)
-> Bitmap Index Scan on fts_idx (cost=0.00..5.32 rows=378 width=0) (actual time=296.001..296.001 rows=551 loops=1)
Index Cond: (fts_index @@ '\'create\' & \'table\''::tsquery)
Total runtime: 426.573 ms
(5 rows)
Configurations for two postmasters are the same and they're running on the
same machine (my notebook IBM X40, Linux 2.6.8.1).
my applause, Tom !
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
yOn Thu, 19 May 2005, Oleg Bartunov wrote:
Tom,
I noticed that along with many improvements in join operations bitmap
index speed up execution of first time query. It's known complain about
slow full text searching when query runs for the first time. But in CVS
version I see very nice behaviour I'd like to understand.
Hmm, after restarting postmasters I don't see any benefit :(
Sometimes, I see reversed behaviour. Seems, combination of system's and
postgres cacheing.
I tried to see io statistics, but it was weird in 8.0X and in 8.1dev I still
don't understand it :)
Below is a stats for fully cached query:
heap_blk, idx_blk show stats in form of 'blks_read:blks_hit' and
seq_tup,idx_tup - in form of 'number of scan:tuples fetched'.
8.0.3:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('list');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Scan using fts_idx on titles (cost=0.00..1124.34 rows=378 width=42) (actual time=0.088..160.026 rows=4153 loops=1)
Index Cond: (fts_index @@ '\'list\''::tsquery)
Total runtime: 184.834 ms
(3 rows)
mw=# select * from iostat where relname='titles';
relname | heap_blk | idx_blk | seq_tup | idx_tup
---------+----------+---------+---------+---------
titles | 0:3078 | 0:6925 | 0:0 | 1:4154
(1 row)
8.1dev:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('list');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on titles (cost=5.32..1349.79 rows=378 width=41) (actual time=105.663..136.422 rows=4153 loops=1)
Filter: (fts_index @@ '\'list\''::tsquery)
-> Bitmap Index Scan on fts_idx (cost=0.00..5.32 rows=378 width=0) (actual time=104.012..104.012 rows=4154 loops=1)
Index Cond: (fts_index @@ '\'list\''::tsquery)
Total runtime: 158.258 ms
(5 rows)
mw=# select * from iostat where relname='titles'; relname | heap_blk | idx_blk | seq_tup | idx_tup
relname | heap_blk | idx_blk | seq_tup | idx_tup
---------+----------+---------+---------+---------
titles | 0:2704 | 0:2797 | 0:0 | 0:0
(1 row)
I see that in 8.1dev, there is no stats for idx_tup !
If I disable bitmap indices in 8.1dev I got
mw=# select * from iostat where relname='titles'; relname | heap_blk | idx_blk | seq_tup | idx_tup
relname | heap_blk | idx_blk | seq_tup | idx_tup
---------+----------+---------+---------+---------
titles | 0:3446 | 0:2797 | 0:0 | 1:4154
(1 row)
notice, stats is different from 8.0.3 as one could expect, especially
in the numbers of idx_blk.
Since, everything is cached there is no visible difference in timings.
after restarting postmasters (pg_ctl restart):
8.0.3:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('list');
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using fts_idx on titles (cost=0.00..1124.34 rows=378 width=42) (actual time=0.159..5845.181 rows=4153 loops=1)
Index Cond: (fts_index @@ '\'list\''::tsquery)
Total runtime: 5870.105 ms
(3 rows)
mw=# select * from iostat where relname='titles';
relname | heap_blk | idx_blk | seq_tup | idx_tup
---------+-----------+-----------+---------+---------
titles | 2072:1006 | 1386:5539 | 0:0 | 1:4154
(1 row)
8.1dev:
mw=# explain analyze select title from titles where fts_index @@ to_tsquery('list');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on titles (cost=5.32..1349.79 rows=378 width=41) (actual time=577.541..2630.110 rows=4153 loops=1)
Filter: (fts_index @@ '\'list\''::tsquery)
-> Bitmap Index Scan on fts_idx (cost=0.00..5.32 rows=378 width=0) (actual time=575.808..575.808 rows=4154 loops=1)
Index Cond: (fts_index @@ '\'list\''::tsquery)
Total runtime: 2654.472 ms
(5 rows)
mw=# select * from iostat where relname='titles';
relname | heap_blk | idx_blk | seq_tup | idx_tup
---------+----------+-----------+---------+---------
titles | 2704:0 | 1399:1398 | 0:0 | 0:0
(1 row)
iostat is my view defined as
View "public.iostat"
Column | Type | Modifiers
----------+------+-----------
relname | name |
heap_blk | text |
idx_blk | text |
seq_tup | text |
idx_tup | text |
View definition:
SELECT blk.relname, (blk.heap_blks_read::text || ':'::text) || blk.heap_blks_hit::text AS heap_blk, (blk.idx_blks_read::text || ':'::text) || blk.idx_blks_hit::text AS idx_blk, (tpl.seq_scan::text || ':'::text) || tpl.seq_tup_read::text AS seq_tup, (tpl.idx_scan::text || ':'::text) || tpl.idx_tup_fetch::text AS idx_tup
FROM pg_statio_user_tables blk
JOIN pg_stat_user_tables tpl USING (relname);
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes:
I tried to see io statistics, but it was weird in 8.0X and in 8.1dev I still
don't understand it :)
We aren't yet updating the io statistics for bitmap scans properly.
There was a thread about this but it petered out without any resolution
about what we should do ...
regards, tom lane
On Thu, 19 May 2005, Tom Lane wrote:
Oleg Bartunov <oleg@sai.msu.su> writes:
I tried to see io statistics, but it was weird in 8.0X and in 8.1dev I still
don't understand it :)We aren't yet updating the io statistics for bitmap scans properly.
There was a thread about this but it petered out without any resolution
about what we should do ...
what's 'Bitmap Heap Scan' ?
Bitmap Heap Scan on titles (cost=5.32..1349.79 rows=378 width=41) (actual time=76.601..111.207 rows=4153 loops=1)
Filter: (fts_index @@ '\'list\''::tsquery)
-> Bitmap Index Scan on fts_idx (cost=0.00..5.32 rows=378 width=0) (actual time=74.943..74.943 rows=4154 loops=1)
Index Cond: (fts_index @@ '\'list\''::tsquery)
regards, tom lane
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83