bitmap scan much slower than index scan, hash_search_with_hash_value

Started by Sergey Koposov · over 13 years ago · 8 messages
#1 Sergey Koposov
koposov@ast.cam.ac.uk

Hi,

I'm running into a case where a bitmap scan is ~70 times slower than an
index scan, which seems to be caused by 1) a very big table and 2) some hash
search logic (hash_search_with_hash_value).

Here is the explain analyze of the query with bitmap scans allowed:

wsdb=> explain analyze select * from test as t, crts.data as d1
where d1.objid=t.objid and d1.mjd=t.mjd limit 10000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=11514.04..115493165.44 rows=10000 width=68) (actual time=27.512..66620.231 rows=10000 loops=1)
   ->  Nested Loop  (cost=11514.04..1799585184.18 rows=155832 width=68) (actual time=27.511..66616.807 rows=10000 loops=1)
         ->  Seq Scan on test t  (cost=0.00..2678.40 rows=156240 width=28) (actual time=0.010..4.685 rows=11456 loops=1)
         ->  Bitmap Heap Scan on data d1  (cost=11514.04..11518.05 rows=1 width=40) (actual time=5.807..5.807 rows=1 loops=11456)
               Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid))
               ->  BitmapAnd  (cost=11514.04..11514.04 rows=1 width=0) (actual time=5.777..5.777 rows=0 loops=11456)
                     ->  Bitmap Index Scan on data_mjd_idx  (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241 loops=11456)
                           Index Cond: (mjd = t.mjd)
                     ->  Bitmap Index Scan on data_objid_idx  (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248 loops=11456)
                           Index Cond: (objid = t.objid)
 Total runtime: 66622.026 ms
(11 rows)

Here is the output when bitmap scans are disabled:
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..329631941.65 rows=10000 width=68) (actual time=0.082..906.876 rows=10000 loops=1)
   ->  Nested Loop  (cost=0.00..4979486036.95 rows=151062 width=68) (actual time=0.081..905.683 rows=10000 loops=1)
         Join Filter: (t.mjd = d1.mjd)
         ->  Seq Scan on test t  (cost=0.00..2632.77 rows=151677 width=28) (actual time=0.009..1.679 rows=11456 loops=1)
         ->  Index Scan using data_objid_idx on data d1  (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
               Index Cond: (objid = t.objid)
 Total runtime: 907.462 ms
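
The plan above was presumably obtained by disabling bitmap scans for the
session only, along these lines (the exact command is not shown here, so this
is just a sketch):

set enable_bitmapscan = off;   -- planner will avoid bitmap heap/index scans
explain analyze select * from test as t, crts.data as d1
where d1.objid=t.objid and d1.mjd=t.mjd limit 10000;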

When the bitmap scans are enabled, the "prof" output of postgres shows:
     47.10%  postmaster  postgres           [.] hash_search_with_hash_value
             |
             --- hash_search_with_hash_value
     11.06%  postmaster  postgres           [.] hash_seq_search
             |
             --- hash_seq_search
      6.95%  postmaster  postgres           [.] hash_any
             |
             --- hash_any
      5.17%  postmaster  postgres           [.] _bt_checkkeys
             |
             --- _bt_checkkeys
      4.07%  postmaster  postgres           [.] tbm_add_tuples
             |
             --- tbm_add_tuples
      3.41%  postmaster  postgres           [.] hash_search
             |
             --- hash_search

And one last note: the crts.data table which is being bitmap scanned is
a 1.1 TB table with ~20e9 rows. My feeling is that the bitmap index scan code
is somehow unprepared to combine two bitmaps for such a big table, and this
leads to the terrible performance.

Regards,
Sergey

PS Here are the schemas of the tables, just in case:
wsdb=> \d test
Table "koposov.test"
 Column  |       Type       | Modifiers
---------+------------------+-----------
 mjd     | double precision |
 fieldid | bigint           |
 intmag  | integer          |
 objid   | bigint           |

wsdb=> \d crts.data
Table "crts.data"
 Column |       Type       | Modifiers
--------+------------------+-----------
 objid  | bigint           |
 mjd    | double precision |
 mag    | real             |
 emag   | real             |
 ra     | double precision |
 dec    | double precision |
Indexes:
    "data_mjd_idx" btree (mjd) WITH (fillfactor=100)
    "data_objid_idx" btree (objid) WITH (fillfactor=100)
    "data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH (fillfactor=100)

PPS shared_buffers=10GB, work_mem=1GB
All the tests shown here were done in a fully cached regime.

PPPS I can believe that what I'm seeing is a feature of bitmap scans rather
than a bug, and I can live with disabling them, but I still thought it was
worth reporting.

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551

#2 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Sergey Koposov (#1)
Re: bitmap scan much slower than index scan, hash_search_with_hash_value

Hello

2012/9/2 Sergey Koposov <koposov@ast.cam.ac.uk>:

Hi,

I'm running into a case where a bitmap scan is ~70 times slower than an
index scan, which seems to be caused by 1) a very big table and 2) some hash
search logic (hash_search_with_hash_value).

Here is the explain analyze of the query with bitmap scans allowed:

wsdb=> explain analyze select * from test as t, crts.data as d1
where d1.objid=t.objid and d1.mjd=t.mjd limit 10000;
                                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=11514.04..115493165.44 rows=10000 width=68) (actual time=27.512..66620.231 rows=10000 loops=1)
   ->  Nested Loop  (cost=11514.04..1799585184.18 rows=155832 width=68) (actual time=27.511..66616.807 rows=10000 loops=1)
         ->  Seq Scan on test t  (cost=0.00..2678.40 rows=156240 width=28) (actual time=0.010..4.685 rows=11456 loops=1)
         ->  Bitmap Heap Scan on data d1  (cost=11514.04..11518.05 rows=1 width=40) (actual time=5.807..5.807 rows=1 loops=11456)
               Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid))
               ->  BitmapAnd  (cost=11514.04..11514.04 rows=1 width=0) (actual time=5.777..5.777 rows=0 loops=11456)
                     ->  Bitmap Index Scan on data_mjd_idx  (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241 loops=11456)
                           Index Cond: (mjd = t.mjd)
                     ->  Bitmap Index Scan on data_objid_idx  (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248 loops=11456)

The statistics for data_objid_idx are absolutely off, so the planner cannot
find an optimal plan.

Regards

Pavel Stehule

#3 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Sergey Koposov (#1)
Re: bitmap scan much slower than index scan, hash_search_with_hash_value

On 2 September 2012 06:21, Sergey Koposov <koposov@ast.cam.ac.uk> wrote:

Hi,

I'm running into a case where a bitmap scan is ~70 times slower than an
index scan, which seems to be caused by 1) a very big table and 2) some hash
search logic (hash_search_with_hash_value).

I think that this kind of question is better suited to the
pgsql-performance list. Granted, it was presented as a bug report
(though they're generally sent to -bugs rather than -hackers), but I
don't think that this is a valid bug.

One obvious red flag from your query plans is that there is a
misestimation of the row return count of a few orders of magnitude in
the Bitmap Index Scan node. Did you try performing an ANALYZE to
see if that helped? It may also be helpful to show pg_stats entries
for both the data.mjd and test.mjd columns. You may find, prior to
doing an ANALYZE, that there are no entries for one of those tables.
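
Something along these lines would show it (just a sketch, assuming the table
and column names from the original report):

analyze test;
analyze crts.data;

-- per-column statistics as seen by the planner
select schemaname, tablename, attname, n_distinct, correlation
from pg_stats
where attname in ('mjd', 'objid')
  and tablename in ('test', 'data');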

Turning off the enable_* planner options in production is generally
discouraged. Certainly, you'd be crazy to do that on a server-wide
basis.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#4 Sergey Koposov
koposov@ast.cam.ac.uk
In reply to: Pavel Stehule (#2)
Re: bitmap scan much slower than index scan, hash_search_with_hash_value

On Sun, 2 Sep 2012, Pavel Stehule wrote:

The statistics for data_objid_idx are absolutely off, so the planner cannot
find an optimal plan.

That doesn't have anything to do with the problem, AFAIU.
First, the data table is static and was analysed.
Second, the query in question is a join, and AFAIK the row-count estimate is
known to be incorrect in the case of column correlation.
Third, at least according to my understanding, in the fully cached regime a
bitmap scan should not take two orders of magnitude more CPU time than an
index scan.

Sergey

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

#5 Sergey Koposov
koposov@ast.cam.ac.uk
In reply to: Peter Geoghegan (#3)
Re: bitmap scan much slower than index scan, hash_search_with_hash_value

Thanks for your comments.
On Sun, 2 Sep 2012, Peter Geoghegan wrote:

I think that this kind of question is better suited to the
pgsql-performance list. Granted, it was presented as a bug report
(though they're generally sent to -bugs rather than -hackers), but I
don't think that this is a valid bug.

The reason I was inclined to think that it is a bug is that I
encountered a similar problem before with bitmap scans and very big
tables:
http://archives.postgresql.org/pgsql-hackers/2011-08/msg00958.php
Furthermore, two orders of magnitude more CPU time for the bitmap scan
compared to the index scan didn't sound right to me (although obviously
I'm not in a position to claim that it's 100% a bug).

One obvious red flag from your query plans is that there is a
misestimation of the row return count of a few orders of magnitude in
the Bitmap Index Scan node. Did you try performing an ANALYZE to
see if that helped? It may also be helpful to show pg_stats entries
for both the data.mjd and test.mjd columns. You may find, prior to
doing an ANALYZE, that there are no entries for one of those tables.

The main large table is static and was analyzed. The test table was as
well. But as mentioned in another recent email, the query is a join, so
column correlation is a problem.

Turning off the enable_* planner options in production is generally
discouraged. Certainly, you'd be crazy to do that on a server-wide
basis.

I'm using PG for data mining and data analysis purposes, with very few
clients connected and very large tables, so enable_* is used quite often to
fix incorrect plans caused by incorrect selectivities.
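
For what it's worth, such overrides can be kept local to a single transaction
instead of being set server-wide; a sketch, reusing the query from the first
message:

begin;
set local enable_bitmapscan = off;   -- reverts automatically at commit/rollback
explain analyze select * from test as t, crts.data as d1
where d1.objid=t.objid and d1.mjd=t.mjd limit 10000;
commit;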

Regards,
Sergey

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sergey Koposov (#5)
Re: bitmap scan much slower than index scan, hash_search_with_hash_value

Sergey Koposov <koposov@ast.cam.ac.uk> writes:

On Sun, 2 Sep 2012, Peter Geoghegan wrote:

One obvious red flag from your query plans is that there is a
misestimation of the row return count of a few orders of magnitude in
the Bitmap Index Scan node. Did you try performing an ANALYZE to
see if that helped? It may also be helpful to show pg_stats entries
for both the data.mjd and test.mjd columns. You may find, prior to
doing an ANALYZE, that there are no entries for one of those tables.

The main large table is static and was analyzed. The test table was as
well. But as mentioned in another recent email, the query is a join, so
column correlation is a problem.

The problem is definitely the misestimation here:

-> Index Scan using data_objid_idx on data d1 (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
Index Cond: (objid = t.objid)

The planner thinks that indexscan will be 2000 times more expensive than
it really is (assuming that the cost per retrieved row is linear, which
it isn't entirely, but close enough for the moment). Of course, it's
also thinking the bitmap component scan on the same index will be 2000
times more expensive than reality, but that has only perhaps a 4X impact
on the total cost of the bitmap scan, since the use of the other index
is what dominates there. With a more accurate idea of this join
condition's selectivity, I'm pretty certain it would have gone for a
plain indexscan, or else a bitmap scan using only this index.

So if there's a bug here, it's in eqjoinsel(). Can you pinpoint
anything unusual about the stats of the objid columns?

regards, tom lane

#7 Sergey Koposov
koposov@ast.cam.ac.uk
In reply to: Tom Lane (#6)
Re: bitmap scan much slower than index scan, hash_search_with_hash_value

On Sun, 2 Sep 2012, Tom Lane wrote:

The problem is definitely the misestimation here:

-> Index Scan using data_objid_idx on data d1 (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
Index Cond: (objid = t.objid)

The planner thinks that indexscan will be 2000 times more expensive than
it really is (assuming that the cost per retrieved row is linear, which
it isn't entirely, but close enough for the moment). Of course, it's
also thinking the bitmap component scan on the same index will be 2000
times more expensive than reality, but that has only perhaps a 4X impact
on the total cost of the bitmap scan, since the use of the other index
is what dominates there. With a more accurate idea of this join
condition's selectivity, I'm pretty certain it would have gone for a
plain indexscan, or else a bitmap scan using only this index.
So if there's a bug here, it's in eqjoinsel(). Can you pinpoint
anything unusual about the stats of the objid columns?

Here are the pg_stats rows for the two tables (most_common_vals,
most_common_freqs and histogram_bounds are very long arrays of objid values
and frequencies, elided here for readability):

 schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | correlation
------------+-----------+---------+-----------+-----------+-----------+------------+-------------
 koposov    | test      | objid   | f         |         0 |         8 |      12871 |  0.00388717
 crts       | data      | objid   | f         |         0 |         8 |      42984 |           1

After looking at them, I think I understand the reason -- the n_distinct
estimate for crts.data is terribly wrong. In reality it should be ~130 million.
I already ran into this problem at some point when doing "group by objid",
and PG was exhausting all the memory because of that wrong estimate. But I
know that estimating n_distinct is a known hard problem.

So probably that's the main reason for the problem...
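
One way to cross-check that figure (a sketch; the count(DISTINCT ...) below is
an expensive one-off on a ~20e9-row table):

-- what the collected statistics currently claim
select n_distinct from pg_stats
where schemaname = 'crts' and tablename = 'data' and attname = 'objid';

-- the actual number of distinct objid values
select count(distinct objid) from crts.data;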

Regards,
Sergey

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

Peter Geoghegan
peter@2ndquadrant.com
In reply to: Sergey Koposov (#7)
Re: bitmap scan much slower than index scan, hash_search_with_hash_value

On 2 September 2012 16:26, Sergey Koposov <koposov@ast.cam.ac.uk> wrote:

After looking at them, I think I understand the reason -- the n_distinct
estimate for crts.data is terribly wrong. In reality it should be ~130 million.
I already ran into this problem at some point when doing "group by objid",
and PG was exhausting all the memory because of that wrong estimate. But I
know that estimating n_distinct is a known hard problem.

So probably that's the main reason for the problem...

That's why we support altering that value with an ALTER TABLE...ALTER
COLUMN DDL statement. You might at least consider increasing the
statistics target for the column first though, to see if that will
make the n_distinct value better accord with reality.
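
Something along these lines, as a sketch based on the column from the report
(note that an n_distinct override only takes effect at the next ANALYZE):

-- raise the per-column sample size first, then re-analyze:
alter table crts.data alter column objid set statistics 1000;
analyze crts.data;

-- or, if the true cardinality is already known, set it explicitly:
alter table crts.data alter column objid set (n_distinct = 130000000);
analyze crts.data;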

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#8 Sergey Koposov
koposov@ast.cam.ac.uk
In reply to: Sergey Koposov (#1)
Re: bitmap scan much slower than index scan, hash_search_with_hash_value

On Sun, 2 Sep 2012, Peter Geoghegan wrote:

That's why we support altering that value with an ALTER TABLE...ALTER
COLUMN DDL statement. You might at least consider increasing the
statistics target for the column first though, to see if that will
make the n_distinct value better accord with reality.

Thanks, I'd forgotten about that possibility.

After fixing n_distinct to the correct value, the index scan became the
preferred option, and the row-count estimate for the index scan became
more realistic.

Sorry for the noise.
Sergey

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/