Slow query: bitmap scan troubles
Hi guys (and girls)
I've been banging my head over this for a few days now, so if any of you kind
souls could take a minute to look at this I would be eternally grateful.
I have a pretty straightforward query that is very slow by default, and
about 70 times faster when I set enable_bitmapscan=off. I would like to
convince the planner to use my lovely indexes.
The scenario is this: I have two tables, trade and position_effect. A trade
is a deal we do with somebody to exchange something for something else. It
has a time it was done, and is associated with a particular book for
accounting purposes. A position effect records changes to our position (i.e.
how much we have) of a particular asset. One trade can have many position
effects (usually only 1, 2 or 3).
For example, if I do a trade of USD/GBP I get two position effects: +1000
GBP and -1200 USD.
SCHEMA:
-------
The actual schema is a bit more complicated but I will put the important
parts here (if you think it important, the full schema for the two tables is
here: http://pastebin.com/6Y52aDFL):
CREATE TABLE trade
(
    id bigserial NOT NULL,
    time_executed timestamp with time zone NOT NULL,
    id_book integer NOT NULL,
    CONSTRAINT cons_trade_primary_key PRIMARY KEY (id)
);
CREATE INDEX idx_trade_id_book
ON trade
USING btree
(id_book, time_executed, id);
CREATE TABLE position_effect
(
    id bigserial NOT NULL,
    id_trade bigint NOT NULL,
    id_asset integer NOT NULL,
    quantity double precision NOT NULL,
    CONSTRAINT cons_pe_primary_key PRIMARY KEY (id_trade, id_asset)
);
SETUP:
------
These tables are relatively large (~100 million rows in position effect).
The box is a pretty beastly affair with 512MB of RAM and 4x10 2GHz cores.
The postgres configuration is here:
I am using a 64-bit PostgreSQL 9.2.1, hand-compiled on a RedHat 6.2 box.
QUERY:
------
What I want to do is sum all of the position effects, for a particular asset
while joined to the trade table to filter for the time it was executed and
the book it was traded into:
SELECT sum(position_effect.quantity)
FROM trade, position_effect
WHERE trade.id = position_effect.id_trade
AND position_effect.id_asset = 1837
AND trade.time_executed >= '2012-10-28 00:00:00'
AND trade.id_book = 41
In this case there are only 11 rows that need to be summed. If I just let
postgres do its thing, that query takes 5000ms (which, when multiplied over
many books and assets, gets very slow). I think this is because it is
bitmapping the whole position_effect table, which is very large. If I disable
bitmap scans:
set enable_bitmapscan = off;
The query takes 43ms, and properly uses the indexes I have set up.
Slow version with bitmapscan enabled: http://explain.depesz.com/s/6I7
Fast version with bitmapscan disabled: http://explain.depesz.com/s/4MWG
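(For anyone following along, the plans behind those links can be regenerated with something like the sketch below; the BUFFERS option, available in 9.2, additionally shows whether blocks came from shared buffers or from disk:)

```sql
-- Sketch: capture the plan plus buffer hit/read counts for the problem query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT sum(position_effect.quantity)
FROM trade, position_effect
WHERE trade.id = position_effect.id_trade
  AND position_effect.id_asset = 1837
  AND trade.time_executed >= '2012-10-28 00:00:00'
  AND trade.id_book = 41;
```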
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Bad form to reply to yourself, I know, but check-reading that for the
third time I noticed two mistakes:
- The box has 128GB of RAM, not 512MB
- There is an additional constraint on the position_effect table (though I
don't think it matters for this discussion):
CONSTRAINT cons_pe_trade FOREIGN KEY (id_trade) REFERENCES trade (id)
Sorry to clog your inboxes further!
Regards,
Philip
-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of
postgresql@foo.me.uk
Sent: 04 December 2012 15:07
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Slow query: bitmap scan troubles
On Tue, Dec 4, 2012 at 12:06 PM, <postgresql@foo.me.uk> wrote:
Slow version with bitmapscan enabled: http://explain.depesz.com/s/6I7
Fast version with bitmapscan disabled: http://explain.depesz.com/s/4MWG
If you check the "fast" plan, it has a higher cost compared against
the "slow" plan.
The difference between cost estimation and actual cost of your
queries, under relatively precise row estimates, seems to suggest your
e_c_s or r_p_c aren't a reflection of your hardware's performance.
First, make sure caching isn't interfering with your results. Run each
query several times.
Then, if the difference persists, you may have to tweak
effective_cache_size first, and maybe random_page_cost too, to better
match your I/O subsystem's actual performance.
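For example (a sketch; the values below are placeholders to be adjusted to the hardware, not recommendations):

```sql
-- Try planner cost settings per-session first; nothing is persisted.
SET effective_cache_size = '88GB';  -- roughly the RAM available for caching
SET random_page_cost = 1.5;         -- lower if random IO is cheap/cached
-- then re-run the query under EXPLAIN ANALYZE and compare the plans
```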
On Tue, Dec 4, 2012 at 7:27 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
On Tue, Dec 4, 2012 at 12:06 PM, <postgresql@foo.me.uk> wrote:
Slow version with bitmapscan enabled: http://explain.depesz.com/s/6I7
Fast version with bitmapscan disabled: http://explain.depesz.com/s/4MWG
If you check the "fast" plan, it has a higher cost compared against
the "slow" plan.
The difference between cost estimation and actual cost of your
queries, under relatively precise row estimates, seems to suggest your
e_c_s or r_p_c aren't a reflection of your hardware's performance.
But the row estimates are not precise at the top of the join/filter.
It thinks there will be 2120 rows, but there are only 11.
So it seems like there is a negative correlation between the two
tables which is not recognized.
First, make sure caching isn't interfering with your results. Run each
query several times.
If that is not how the production system works (running the same query
over and over) then you want to model the cold cache, not the hot one.
But in any case, the posted explains indicate that all buffers were
cached.
Cheers,
Jeff
On Tue, Dec 4, 2012 at 2:22 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
But the row estimates are not precise at the top of the join/filter.
It thinks there will be 2120 rows, but there are only 11.
Ah... I didn't spot that one...
But the row estimates are not precise at the top of the join/filter.
It thinks there will be 2120 rows, but there are only 11.
Ah... I didn't spot that one...
Yes, you are right there - this is probably a slightly atypical query of
this sort, actually; 2120 is a pretty good guess.
On Claudio's suggestion I have found lots more things to read up on, and am
eagerly awaiting 6pm when I can bring the DB down and start tweaking. The
effective_cache_size setting is going from 6GB -> 88GB, which I think will make
quite a difference.
I still can't quite wrap my head around why accessing an index is expected
to use more disk access than doing a bitmap scan of the table itself, but I
guess it does make a bit of sense if postgres assumes the table is more
likely to be cached.
It's all quite, quite fascinating :)
I'll let you know how it goes.
- Phil
But the row estimates are not precise at the top of the join/filter.
It thinks there will be 2120 rows, but there are only 11.
So it seems like there is a negative correlation between the two tables
which is not recognized.
Yes, you are right there. I am only just beginning to understand how to
parse these explain reports. As I mentioned above, I probably picked a bad
example to run that query on: 11 is an unusually low number of results to get
back; a few thousand would be more normal.
Though that doesn't account for the 70x difference between the speed of the
two queries in actuality, given a pretty similar expected speed (does it?).
It does go some way to explaining why a bad choice of plan was made.
Is there some nice bit of literature somewhere that explains what sort of
costs are associated with the different types of lookup? I have found bits
and bobs online, but I still don't have a really clear idea in my head of what
the difference between a bitmap index scan and an index-only scan is, and
though I can sort of guess, I don't see why one would be considered more
likely to use the disk than the other.
On the 'slow' query (with the better predicted score):
First, make sure caching isn't interfering with your results. Run each
query several times.
If that is not how the production system works (running the same query
over and over) then you want to model the cold cache, not the hot one.
But in any case, the posted explains indicate that all buffers were
cached.
We are in the rather pleasant situation here that we are willing to spend
money on the box (up to a point, but quite a large point) to get it up to
a spec where it should hardly ever need to touch the disk; the trick is
figuring out how to let our favourite database server know that.
I've just discovered pgtune and am having some fun with that too.
Cheers,
Phil
The difference between cost estimation and actual cost of your queries,
under relatively precise row estimates, seems to suggest your e_c_s or r_p_c
aren't a reflection of your hardware's performance.
Wow, so tweaking these has fixed it and then some. It now picks a slightly
different plan than the 'fast' one previously:
New super fast version with e_c_s 6GB -> 88GB and r_p_c 2 -> 1 (s_p_c 1 -> 0.5):
http://explain.depesz.com/s/ECk
For reference:
Slow version with bitmapscan enabled: http://explain.depesz.com/s/6I7
Fast version with bitmapscan disabled: http://explain.depesz.com/s/4MWG
On Tue, Dec 4, 2012 at 3:03 PM, <postgresql@foo.me.uk> wrote:
Though that doesn't account for the 70x difference between the speed of the
two queries in actuality given a pretty similar expected speed (does it?).
It does go some way to explaining why a bad choice of plan was made.
I still don't think it does. I still think the problem is the GUC settings.
The slow plan joins in a way that processes all 3M rows on both sides
of the join, and pg knows it.
The fast plan only processes 5k of them. And pg knows it. Why is it
choosing to process 3M rows?
If there's negative correlation, it only means fewer rows will be
produced, but the nested loop and the right-hand index scan still
happen.
On Tue, Dec 4, 2012 at 3:31 PM, Philip Scott <pscott@foo.me.uk> wrote:
r_p_c 2-> 1 (s_p_c 1->0.5):
Is this really necessary?
(looks like a no-op, unless your CPU is slow)
Well, you don't need to take anything down. Most settings that change
planner decisions can be tuned on a per-query basis by issuing SET commands in
a given session. This should not affect other queries more than is needed
to run the query in the way the planner chooses.
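A sketch of how that looks in practice, using the query from the original post (SET LOCAL scopes the override to one transaction, so nothing leaks into the rest of the session):

```sql
BEGIN;
SET LOCAL enable_bitmapscan = off;  -- planner override for this transaction only
SELECT sum(position_effect.quantity)
FROM trade, position_effect
WHERE trade.id = position_effect.id_trade
  AND position_effect.id_asset = 1837
  AND trade.time_executed >= '2012-10-28 00:00:00'
  AND trade.id_book = 41;
COMMIT;  -- the setting reverts at transaction end
```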
Best regards, Vitalii Tymchyshyn
Ah, okay - my reasoning was that there's a big fancy-pants RAID array behind it
that makes disk operations fast relative to CPU ones.
I'll test it and see if it actually makes any difference.
Ah okay, thanks. I knew I could set various things, but not
effective_cache_size (I tried reloading the edited config file but it didn't
seem to pick it up).
From: Vitalii Tymchyshyn [mailto:tivv00@gmail.com]
Sent: 04 December 2012 18:51
To: postgresql@foo.me.uk
Cc: postgres performance list
Subject: Re: [PERFORM] Slow query: bitmap scan troubles
On Tue, Dec 4, 2012 at 9:47 AM, <postgresql@foo.me.uk> wrote:
eagerly awaiting 6pm when I can bring the DB down and start tweaking. The
effective_cache_size setting is going from 6GB -> 88GB which I think will make
quite a difference.
I also wonder if increasing (say x10) default_statistics_target, or
just doing ALTER TABLE ... SET STATISTICS for particular tables, will help.
It will make the planner produce more precise estimations. Do not
forget to ANALYZE after changing it.
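A sketch of both variants (the value 1000 is illustrative; the default in this era is 100):

```sql
-- Session-wide default (can also be set in postgresql.conf):
SET default_statistics_target = 1000;
-- Or target just the column whose estimate is off:
ALTER TABLE position_effect ALTER COLUMN id_asset SET STATISTICS 1000;
ANALYZE position_effect;  -- required before the planner sees the new stats
```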
--
Sergey Konoplev
Database and Software Architect
http://www.linkedin.com/in/grayhemp
Phones:
USA +1 415 867 9984
Russia, Moscow +7 901 903 0499
Russia, Krasnodar +7 988 888 1979
Skype: gray-hemp
Jabber: gray.ru@gmail.com
On Tue, Dec 4, 2012 at 9:47 AM, <postgresql@foo.me.uk> wrote:
But the row estimates are not precise at the top of the join/filter.
It thinks there will be 2120 rows, but there are only 11.
Ah... I didn't spot that one...
Yes, you are right there - this is probably a slightly atypical query of
this sort, actually; 2120 is a pretty good guess.
What do the timings look like on a more realistic example?
On Claudio's suggestion I have found lots more things to read up on and am
eagerly awaiting 6pm when I can bring the DB down and start tweaking. The
effective_cache_size setting is going from 6GB -> 88GB which I think will make
quite a difference.
You can change effective_cache_size just in your own session, or do it
globally with a "reload" or SIGHUP; no need to bring down the server.
However, I don't think it will make much difference. Even though it
thinks it is hitting the index 14,085 times, that is still small
compared to the overall size of the table.
I still can't quite wrap around my head why accessing an index is expected
to use more disk access than doing a bitmap scan of the table itself,
It is only doing a bitmap scan of those parts of the table which
contain relevant data, and it is doing them in physical order, so it
thinks that much of the IO it is going to do is largely sequential.
but I
guess it does make a bit of sense if postgres assumes the table is more
likely to be cached.
Unfortunately, postgres's planner doesn't know anything about that.
From your "explain" I can see in hindsight that everything you needed
was cached, but that is not information that the planner can use
(currently). And I don't know if *everything* is cached, or if just
those particular blocks are because you already ran the same query
with the same parameters recently.
Also, your work_mem is pretty low given the amount of RAM you have.
work_mem = 1MB
I don't think the current planner attempts to take account of the fact
that a bitmap scan which overflows work_mem and so becomes "lossy" is
quite a performance set-back. Nor does it look like explain analyze
informs you of this happening. But maybe I'm just looking in the
wrong places.
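A sketch of the corresponding tweak (the value is illustrative; how much to give work_mem depends on concurrency, since it is allocated per sort/hash/bitmap operation, not per connection):

```sql
-- A larger work_mem lets the bitmap stay "exact" (one bit per tuple) rather
-- than degrading to lossy per-page entries when it overflows.
SET work_mem = '256MB';
```

(If memory serves, later releases do surface this in EXPLAIN ANALYZE as "Heap Blocks: exact=... lossy=..."; 9.2 does not.)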
Cheers,
Jeff
On Tue, Dec 4, 2012 at 10:03 AM, <postgresql@foo.me.uk> wrote:
Though that doesn't account for the 70x difference between the speed of the
two queries in actuality given a pretty similar expected speed (does it?).
It kind of does. The expected speed is predicated on the number of
rows being 200 fold higher. If the number of rows actually was that
much higher, the two speeds might be closer together. That is why it
would be interesting to see a more typical case where the actual
number of rows is closer to the 2000 estimate.
But I am curious about how the cost estimate for the primary key look
up is arrived at:
Index Scan using cons_pe_primary_key on position_effect
(cost=0.00..42.96 rows=1 width=16)
There should be a random page for the index leaf page, and a random
page for the heap page. Since you set random_page_cost to 2, that
comes up to 4. Then there would be some almost negligible CPU costs.
Where the heck is the extra 38 cost coming from?
It does go some way to explaining why a bad choice of plan was made.
Is there some nice bit of literature somewhere that explains what sort of
costs are associated with the different types of lookup?
I've heard good things about Greg Smith's book, but I don't know if it
covers this particular thing.
Otherwise, I don't know of a good single place which is a tutorial
rather than a reference (or the code itself)
First, make sure caching isn't interfering with your results. Run each
query several times.
If that is not how the production system works (running the same query
over and over) then you want to model the cold cache, not the hot one.
But in any case, the posted explains indicate that all buffers were
cached.
We are in the rather pleasant situation here in that we are willing to spend
money on the box (up to a point, but quite a large point) to get it up to
the spec so that it should hardly ever need to touch the disk, the trick is
figuring out how to let our favourite database server know that.
Well, that part is fairly easy. Make random_page_cost and
seq_page_cost much smaller than their defaults. Like, 0.04 and 0.03,
for example.
I think the *_page_cost should be strictly an estimate of actually doing
IO, with a separate parameter to reflect the likelihood of needing to do
the IO, like *_page_cachedness. But that isn't the way it is done
currently.
Cheers,
Jeff
On Tue, Dec 4, 2012 at 3:42 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
(Regarding http://explain.depesz.com/s/4MWG, I wrote:)
But I am curious about how the cost estimate for the primary key look
up is arrived at:
Index Scan using cons_pe_primary_key on position_effect
(cost=0.00..42.96 rows=1 width=16)
There should be a random page for the index leaf page, and a random
page for the heap page. Since you set random_page_cost to 2, that
comes up to 4. Then there would be some almost negligible CPU costs.
Where the heck is the extra 38 cost coming from?
I now see where the cost is coming from. In commit 21a39de5809 (first
appearing in 9.2) the "fudge factor" cost estimate for large indexes
was increased by about 10 fold, which really hits this index hard.
This was fixed in commit bf01e34b556 "Tweak genericcostestimate's
fudge factor for index size", by changing it to use the log of the
index size. But that commit probably won't be shipped until 9.3.
I'm not sure that this change would fix your problem, because it might
also change the costs of the alternative plans in a way that
neutralizes things. But I suspect it would fix it. Of course, a
correct estimate of the join size would also fix it--you have kind of
a perfect storm here.
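Working backwards from the numbers above, the arithmetic roughly checks out (a back-of-envelope sketch: the index page count is a guess, not a figure from the thread). In 9.2 the genericcostestimate fudge adds on the order of index_pages/10000 times random_page_cost:

```latex
% 9.2-era fudge factor (sketch); an index of roughly 190{,}000 pages is assumed
\mathrm{fudge} \approx \frac{\mathrm{index\ pages}}{10000} \times \mathrm{random\_page\_cost}
               \approx \frac{190000}{10000} \times 2 = 38
% plus about two random page fetches (index leaf + heap) at cost 2 each:
\mathrm{total} \approx 2 \times 2 + 38 = 42 \approx 42.96
```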
Cheers,
Jeff
On Wed, Dec 5, 2012 at 2:39 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
I'm not sure that this change would fix your problem, because it might
also change the costs of the alternative plans in a way that
neutralizes things. But I suspect it would fix it. Of course, a
correct estimate of the join size would also fix it--you have kind of
a perfect storm here.
As far as I can see in the explain, the misestimation is 3x~4x, not 200x.