performance of bitmap scans in nested loop joins
Hello All,
I'd like to report about suprising (for me) results of performance testing of
bitmap indexes in nested loop join plans.
I have the table q3c
test=# \d q3c
Table "public.q3c"
Column | Type | Modifiers
--------+--------+-----------
ipix | bigint |
errbox | box |
ra | real |
dec | real |
Indexes:
"ipix_idx" UNIQUE, btree (ipix)
####################
test=# SELECT count(*) from q3c;
count
---------
3000000
(1 row)
First, two join plans with just simple index scan:
############# 1)
test=# EXPLAIN ANALYZE SELECT * FROM q3c,q3c AS q3cs WHERE
q3c.ipix>=q3cs.ipix-3 AND q3c.ipix<=q3cs.ipix+3;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.01..49059519577.21 rows=1000010666695 width=96) (actual time=0.137..39385.725 rows=3000000 loops=1)
-> Seq Scan on q3c q3cs (cost=0.00..60928.16 rows=3000016 width=48) (actual time=0.007..3659.171 rows=3000000 loops=1)
-> Index Scan using ipix_idx on q3c (cost=0.01..9686.37 rows=333335 width=48) (actual time=0.007..0.009 rows=1 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 3)) AND (q3c.ipix <= ("outer".ipix + 3)))
Total runtime: 40755.390 ms
(5 rows)
############# 2)
test=# EXPLAIN ANALYZE SELECT * FROM q3c,q3c as q3cs WHERE
(q3c.ipix>=q3cs.ipix-1000 AND q3c.ipix<=q3cs.ipix-993);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.01..49059519577.21 rows=1000010666695 width=96) (actual time=28058.983..28058.983 rows=0 loops=1)
-> Seq Scan on q3c q3cs (cost=0.00..60928.16 rows=3000016 width=48) (actual time=0.061..3803.598 rows=3000000 loops=1)
-> Index Scan using ipix_idx on q3c (cost=0.01..9686.37 rows=333335 width=48) (actual time=0.006..0.006 rows=0 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix <= ("outer".ipix - 993)))
Total runtime: 28059.066 ms
(5 rows)
#############
And now I combine them in one:
test=# EXPLAIN ANALYZE SELECT * FROM q3c,q3c AS q3cs WHERE
(q3c.ipix>=q3cs.ipix-3 AND q3c.ipix<=q3cs.ipix+3) OR
(q3c.ipix>=q3cs.ipix-1000 AND q3c.ipix<=q3cs.ipix-993);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=5832.03..190281569757.46 rows=1888909037091 width=96) (actual time=0.180..122444.610 rows=3000000 loops=1)
-> Seq Scan on q3c q3cs (cost=0.00..60928.16 rows=3000016 width=48) (actual time=0.063..3871.731 rows=3000000 loops=1)
-> Bitmap Heap Scan on q3c (cost=5832.03..43426.73 rows=666670 width=48) (actual time=0.033..0.034 rows=1 loops=3000000)
Recheck Cond: (((q3c.ipix >= ("outer".ipix - 3)) AND (q3c.ipix <= ("outer".ipix + 3))) OR ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix <= ("outer".ipix - 993))))
-> BitmapOr (cost=5832.03..5832.03 rows=666670 width=0) (actual time=0.029..0.029 rows=0 loops=3000000)
-> Bitmap Index Scan on ipix_idx (cost=0.00..2916.02 rows=333335 width=0) (actual time=0.014..0.014 rows=1 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 3)) AND (q3c.ipix <= ("outer".ipix + 3)))
-> Bitmap Index Scan on ipix_idx (cost=0.00..2916.02 rows=333335 width=0) (actual time=0.011..0.011 rows=0 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix <= ("outer".ipix - 993)))
Total runtime: 124366.545 ms
(10 rows)
So, we see that total time of the plan with bitmap scan is roughly two times
greater than the sum of times of individual index scans.
I see that in my case even simple union is significantly faster than bitmap
indexes.
test=# EXPLAIN ANALYZE SELECT * FROM q3c,q3c AS q3cs WHERE
(q3c.ipix>=q3cs.ipix-3 AND q3c.ipix<=q3cs.ipix+3)
UNION ALL
SELECT * FROM q3c,q3c AS q3cs WHERE
(q3c.ipix>=q3cs.ipix-1000 AND q3c.ipix<=q3cs.ipix-993);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.01..118119252488.32 rows=2000021333390 width=96) (actual time=0.139..75048.897 rows=3000000 loops=1)
-> Subquery Scan "*SELECT* 1" (cost=0.01..59059626244.16 rows=1000010666695 width=96) (actual time=0.139..44221.409 rows=3000000 loops=1)
-> Nested Loop (cost=0.01..49059519577.21 rows=1000010666695 width=96) (actual time=0.137..40735.906 rows=3000000 loops=1)
-> Seq Scan on q3c q3cs (cost=0.00..60928.16 rows=3000016 width=48) (actual time=0.063..3719.637 rows=3000000 loops=1)
-> Index Scan using ipix_idx on q3c (cost=0.01..9686.37 rows=333335 width=48) (actual time=0.007..0.009 rows=1 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 3)) AND (q3c.ipix <= ("outer".ipix + 3)))
-> Subquery Scan "*SELECT* 2" (cost=0.01..59059626244.16 rows=1000010666695 width=96) (actual time=28120.449..28120.449 rows=0 loops=1)
-> Nested Loop (cost=0.01..49059519577.21 rows=1000010666695 width=96) (actual time=28120.447..28120.447 rows=0 loops=1)
-> Seq Scan on q3c q3cs (cost=0.00..60928.16 rows=3000016 width=48) (actual time=0.023..3649.739 rows=3000000 loops=1)
-> Index Scan using ipix_idx on q3c (cost=0.01..9686.37 rows=333335 width=48) (actual time=0.006..0.006 rows=0 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix <= ("outer".ipix - 993)))
Total runtime: 76413.737 ms
(12 rows)
Last note: all those queries were run in fully cached regime on P4 2.8Ghz.
I used the yesterday's CVS snapshot.
Are those performance results expected for the bitmap index scans ?
Thanks in advance for any comments.
With Best Regards,
Sergey Koposov
------------------------------------------------------------
Sergey E. Koposov
Sternberg Astronomical Institute, Moscow University (Russia)
Max-Planck Institute for Astronomy (Germany)
Internet: math@sai.msu.ru, http://lnfm1.sai.msu.su/~math/
"Sergey E. Koposov" <math@sai.msu.ru> writes:
I'd like to report about suprising (for me) results of performance testing of
bitmap indexes in nested loop join plans.
I'm surprised too. There's something weird in the indexscans
themselves:
-> Index Scan using ipix_idx on q3c (cost=0.01..9686.37 rows=333335 width=48) (actual time=0.006..0.006 rows=0 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix <= ("outer".ipix - 993)))
-> Bitmap Index Scan on ipix_idx (cost=0.00..2916.02 rows=333335 width=0) (actual time=0.011..0.011 rows=0 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix <= ("outer".ipix - 993)))
The latter is (or should be) doing slightly *less* work, so why is it
taking almost twice as much time? Can you get gprof profiles of the
two cases?
regards, tom lane
On Fri, 29 Apr 2005, Tom Lane wrote:
-> Index Scan using ipix_idx on q3c (cost=0.01..9686.37 rows=333335 width=48) (actual time=0.006..0.006 rows=0 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix <= ("outer".ipix - 993)))-> Bitmap Index Scan on ipix_idx (cost=0.00..2916.02 rows=333335 width=0) (actual time=0.011..0.011 rows=0 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix <= ("outer".ipix - 993)))The latter is (or should be) doing slightly *less* work, so why is it
taking almost twice as much time? Can you get gprof profiles of the
two cases?
I've got them. Here there are two gprof profiles:
http://lnfm1.sai.msu.ru/~math/public_misc/idxscan.gprof
http://lnfm1.sai.msu.ru/~math/public_misc/bitmap.gprof
(now as links, because the previous letter with those files as attachements
haven't passed on -hackers (due to size, I think))
bitmap.gprof is the profiling of the:
test=# explain analyze select * from q3c,q3c as q3cs where
(q3c.ipix>=q3cs.ipix-3 AND q3c.ipix<=q3cs.ipix+3) OR
(q3c.ipix>=q3cs.ipix-1000 AND q3c.ipix<=q3cs.ipix-993);
QUERY PLAN
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-----------------
Nested Loop (cost=5832.01..190280130928.00 rows=1888888888889 width=96)
(actual time=0.435..374743.591 rows=3000000 loops=1)
-> Seq Scan on q3c q3cs (cost=0.00..60928.00 rows=3000000 width=48)
(actual time=0.079..10632.570 rows=3000000 loops=1)
-> Bitmap Heap Scan on q3c (cost=5832.01..43426.68 rows=666667
width=48)
(actual time=0.102..0.104 rows=1 loops=3000000)
Recheck Cond: (((q3c.ipix >= ("outer".ipix - 3)) AND (q3c.ipix <=
("outer".ipix + 3))) OR ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix
<=
("outer".ipix - 993))))
-> BitmapOr (cost=5832.01..5832.01 rows=666667 width=0) (actual
time=0.094..0.094 rows=0 loops=3000000)
-> Bitmap Index Scan on ipix_idx (cost=0.00..2916.01
rows=333333 width=0) (actual time=0.045..0.045 rows=1 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 3)) AND
(q3c.ipix <= ("outer".ipix + 3)))
-> Bitmap Index Scan on ipix_idx (cost=0.00..2916.01
rows=333333 width=0) (actual time=0.041..0.041 rows=0 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 1000)) AND
(q3c.ipix <= ("outer".ipix - 993)))
Total runtime: 377551.805 ms
(10 rows)
And idxscan.gprof is the profiling of the:
test=# EXPLAIN ANALYZE SELECT * FROM q3c,q3c as q3cs WHERE
(q3c.ipix>=q3cs.ipix-1000 AND q3c.ipix<=q3cs.ipix-993);
QUERY PLAN
--------------------------------------------------------------------------------
---------------------------------------------------
Nested Loop (cost=0.01..49059045928.00 rows=1000000000000 width=96)
(actual
time=104991.950..104991.950 rows=0 loops=1)
-> Seq Scan on q3c q3cs (cost=0.00..60928.00 rows=3000000 width=48)
(actual time=0.069..10465.514 rows=3000000 loops=1)
-> Index Scan using ipix_idx on q3c (cost=0.01..9686.33 rows=333333
width=48) (actual time=0.025..0.025 rows=0 loops=3000000)
Index Cond: ((q3c.ipix >= ("outer".ipix - 1000)) AND (q3c.ipix <=
("outer".ipix - 993)))
Total runtime: 104992.202 ms
(5 rows)
With Best regards,
Sergey Koposov