optimizer cost calculation problem
Kenji Sugita has identified a problem with cost_sort() in costsize.c.
In the following code fragment, sortmembytes is defined as long, so

    double nruns = nbytes / (sortmembytes * 2);

may cause an integer overflow if sortmembytes exceeds 2^30, which in
turn makes the optimizer produce a wrong query plan (this actually
happened in a large PostgreSQL installation that has tons of memory).
Here is a proposed fix against current:
*** optimizer/path/costsize.c	16 Feb 2003 02:30:38 -0000	1.107
--- optimizer/path/costsize.c	31 Mar 2003 22:25:06 -0000
***************
*** 490,496 ****
  	Cost		startup_cost = input_cost;
  	Cost		run_cost = 0;
  	double		nbytes = relation_byte_size(tuples, width);
! 	long		sortmembytes = SortMem * 1024L;
  
  	if (!enable_sort)
  		startup_cost += disable_cost;
--- 490,496 ----
  	Cost		startup_cost = input_cost;
  	Cost		run_cost = 0;
  	double		nbytes = relation_byte_size(tuples, width);
! 	double		sortmembytes = SortMem * 1024.0;
  
  	if (!enable_sort)
  		startup_cost += disable_cost;
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Kenji Sugita has identified a problem with cost_sort() in costsize.c.
> In the following code fragment, sortmembytes is defined as long, so
>     double nruns = nbytes / (sortmembytes * 2);
> may cause an integer overflow if sortmembytes exceeds 2^30, which in
> turn makes the optimizer produce a wrong query plan (this actually
> happened in a large PostgreSQL installation that has tons of memory).
I find it really really hard to believe that it's wise to run with
sort_mem exceeding 2 gig ;-). Does that installation have so much
RAM that it can afford to run multiple many-Gb sorts concurrently?
This is far from being the only place that multiplies SortMem by 1024.
My inclination is that a safer fix is to alter guc.c's entry for
SortMem to establish a maximum value of INT_MAX/1024 for the variable.
Probably some of the other GUC variables like shared_buffers ought to
have overflow-related maxima established, too.
regards, tom lane
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> Kenji Sugita has identified a problem with cost_sort() in costsize.c.
>> In the following code fragment, sortmembytes is defined as long, so
>>     double nruns = nbytes / (sortmembytes * 2);
>> may cause an integer overflow if sortmembytes exceeds 2^30, which in
>> turn makes the optimizer produce a wrong query plan (this actually
>> happened in a large PostgreSQL installation that has tons of memory).
> I find it really really hard to believe that it's wise to run with
> sort_mem exceeding 2 gig ;-). Does that installation have so much
> RAM that it can afford to run multiple many-Gb sorts concurrently?

The process is assigned 1 gig of sort mem to speed up a batch job by
using a backend-process-only sort_mem setting, and they do not modify
postgresql.conf for ordinary users.

BTW it's not 2 gig, but 1 gig (remember that we do sortmembytes * 2).
> This is far from being the only place that multiplies SortMem by 1024.
> My inclination is that a safer fix is to alter guc.c's entry for
> SortMem to establish a maximum value of INT_MAX/1024 for the variable.
> Probably some of the other GUC variables like shared_buffers ought to
> have overflow-related maxima established, too.
>
> 			regards, tom lane
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> BTW it's not 2 gig, but 1 gig (remember that we do sortmembytes * 2).
Good point. Probably that particular calculation should be
"sortmembytes * 2.0" to force it to double before it can overflow.
But I still think we'd better limit SortMem so that the basic
SortMem*1024 calculation can't overflow (or even come close to overflow,
likely).
regards, tom lane
Tom wrote:
> I find it really really hard to believe that it's wise to run with
> sort_mem exceeding 2 gig ;-). Does that installation have so much
> RAM that it can afford to run multiple many-Gb sorts concurrently?
I don't do 2 gig... but I found 0.3 gig helped on a not-too-large system.
In a nightly load of a data warehousing application I have
a number of places where I do:

    set sort_mem=300000;
    create table new_whatevers as select distinct whatever from import_table;
    set sort_mem=10000;

and

    set sort_mem=100000;
    select count(*),b from a group by b;
    set sort_mem=10000;

when the optimizer likes hash aggregates.
It significantly (8 hours -> 6 hours) reduces the nightly processing
of my log-file analysis database. If my modest system benefited
from 1/3 gig, it wouldn't surprise me if a large system benefits
from 2 gig.
If more info's useful, I'd be happy to provide some.
Ron
On Mon, 31 Mar 2003, Tom Lane wrote:
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> BTW it's not 2 gig, but 1 gig (remember that we do sortmembytes * 2).
> Good point. Probably that particular calculation should be
> "sortmembytes * 2.0" to force it to double before it can overflow.
> But I still think we'd better limit SortMem so that the basic
> SortMem*1024 calculation can't overflow (or even come close to
> overflow, likely).
This isn't really an issue for 64 bit hardware is it?
"scott.marlowe" <scott.marlowe@ihs.com> writes:
This isn't really an issue for 64 bit hardware is it?
Is "int" 64 bits on such a machine? The ones I've dealt with chose to
set int = 32bits, long = 64bits. If they don't do that then they have
a problem with not having any native C 32bit type (and promoting short
int up to 32 bits just moves the problem...)
At some point we should probably try to regularize the backend code so
that all memory-size-related calcs are consistently done in size_t or
ssize_t arithmetic; but we're a long way from that at present. For now
I think it's prudent to keep sort_mem small enough to avoid overflow in
int32 arithmetic.
regards, tom lane
On Tue, Apr 01, 2003 at 11:24:01AM -0500, Tom Lane wrote:
"scott.marlowe" <scott.marlowe@ihs.com> writes:
This isn't really an issue for 64 bit hardware is it?
Is "int" 64 bits on such a machine? The ones I've dealt with chose to
It was for the regression tests we did recently on a 64-bit
gcc-3.2-compiled Solaris 8 binary. Piles of regression failures
because int was not 32 bits.
A
--
----
Andrew Sullivan 204-4141 Yonge Street
Liberty RMS Toronto, Ontario Canada
<andrew@libertyrms.info> M2P 2A8
+1 416 646 3304 x110