indexes and floats
Is this a bug that the index doesn't work on floats or is it a datatype
mismatch thing?
-----
template1=> create table foo (x float4, y float4);
CREATE
template1=> create index foox on foo(x);
CREATE
template1=> create index fooy on foo(y);
CREATE
template1=> insert into foo values(123.45,678.90);
INSERT 239376 1
template1=> insert into foo values(123.25,678.95);
INSERT 239377 1
template1=> explain select * from foo where x = 123.4;
NOTICE: QUERY PLAN:
Seq Scan on foo (cost=0.00 size=0 width=8)
EXPLAIN
template1=>
-----
Vince.
--
==========================================================================
Vince Vielhaber -- KA8CSH email: vev@michvhf.com flame-mail: /dev/null
# include <std/disclaimers.h> TEAM-OS2
Online Searchable Campground Listings http://www.camping-usa.com
"There is no outfit less entitled to lecture me about bloat
than the federal government" -- Tony Snow
==========================================================================
Is this a bug that the index doesn't work on floats or is it a
datatype mismatch thing?
Seq Scan on foo (cost=0.00 size=0 width=8)
Likely neither. Notice the cost of a sequential scan; it is zero. Why
bother doing an index scan?
You need at least two more things to happen to provoke Postgres into
using your index:
1) Have more entries in the table; rule of thumb is O(100) before an
index will be used.
2) Run vacuum after getting that many entries to make sure the optimizer
will know about the configuration of the tables and indices.
Good luck.
- Tom
On 03-Aug-98 Thomas G. Lockhart wrote:
Is this a bug that the index doesn't work on floats or is it a
datatype mismatch thing?
Seq Scan on foo (cost=0.00 size=0 width=8)Likely neither. Notice the cost of a sequential scan; it is zero. Why
bother doing an index scan?You need at least two more things to happen to provoke Postgres into
using your index:1) Have more entries in the table; rule of thumb is O(100) before an
index will be used.2) Run vacuum after getting that many entries to make sure the optimizer
will know about the configuration of the tables and indices.
Here's why I'm bringing it up. I ran vacuum on the database yesterday.
The table consists of city, state (both text), lon and lat (both float4).
The text index is working but not the floats. I have another table with
12,000 - 14,000 rows and I'm getting the same thing.
-----
campsites=> explain select lon from locations where lon = 83.5;
NOTICE: QUERY PLAN:
Seq Scan on locations (cost=7263.30 size=2 width=4)
EXPLAIN
campsites=> select count(*) from locations;
count
------
169797
(1 row)
campsites=>
-----
Vince.
--
==========================================================================
Vince Vielhaber -- KA8CSH email: vev@michvhf.com flame-mail: /dev/null
# include <std/disclaimers.h> TEAM-OS2
Online Searchable Campground Listings http://www.camping-usa.com
"There is no outfit less entitled to lecture me about bloat
than the federal government" -- Tony Snow
==========================================================================
Is this a bug that the index doesn't work on floats or is it a
datatype mismatch thing?
Seq Scan on foo (cost=0.00 size=0 width=8)The table consists of city, state (both text), lon and lat (both
float4). The text index is working but not the floats. I have another
table with 12,000 - 14,000 rows and I'm getting the same thing.
Hmm. You are right. I'm glad you insisted on pursuing this, but I might
have been able to see the problem right away if you had described it in
detail (e.g. "floats" is not clearly float4) and if your initial example
had actually been with float4 data (it showed a width of 8 which is
likely to be a float8) :/
Anyway, the problem seems to be specific to float4. The workaround is to
use float8 for now, and we'll look into fixing the problem for the next
release...
- Tom
Test cases from the regression database follow:
regression=> create table test (f4 float4);
CREATE
regression=> insert into test select seqno from bt_f8_heap;
INSERT 0 10000
regression=> create index ti on test using btree (f4);
CREATE
regression=> vacuum;
VACUUM
regression=> explain select f4 from test where f4 = 500.0;
NOTICE: QUERY PLAN:
Seq Scan on test (cost=394.00 size=1 width=4)
EXPLAIN
regression=> create table test8 as select seqno from bt_f8_heap;
SELECT
regression=> create index t8 on test8 using btree (seqno);
CREATE
regression=> vacuum;
VACUUM
regression=> explain select seqno from test8 where seqno = 500.0;
NOTICE: QUERY PLAN:
Index Scan using t8 on test8 (cost=2.05 size=1 width=8)
EXPLAIN
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
The table consists of city, state (both text), lon and lat (both
float4). The text index is working but not the floats. I have another
table with 12,000 - 14,000 rows and I'm getting the same thing.
Anyway, the problem seems to be specific to float4. The workaround is to
use float8 for now, and we'll look into fixing the problem for the next
release...
Ah-hah, all of a sudden this looks *real* familiar. I bet it's because
pgsql is not noticing that "500.0" can be interpreted as a float4.
Let's try it.
play=> create table f8 (x float8);
CREATE
play=> \copy f8 from countup; // 0..999 in a text file
Successfully copied.
play=> create index f8_i on f8 using btree (x);
CREATE
play=> create table f4 (x float4);
CREATE
play=> \copy f4 from countup;
Successfully copied.
play=> create index f4_i on f4 using btree (x);
CREATE
play=> vacuum verbose analyze;
.... blah blah
play=> explain select x from f8 where x = 500;
NOTICE: QUERY PLAN:
Seq Scan on f8 (cost=40.00 size=100 width=8)
EXPLAIN
play=> explain select x from f8 where x = 500.0;
NOTICE: QUERY PLAN:
Index Scan using f8_i on f8 (cost=2.05 size=1 width=8)
EXPLAIN
play=> explain select x from f4 where x = 500;
NOTICE: QUERY PLAN:
Seq Scan on f4 (cost=40.00 size=100 width=4)
EXPLAIN
play=> explain select x from f4 where x = 500.0;
NOTICE: QUERY PLAN:
Seq Scan on f4 (cost=40.00 size=1 width=4)
EXPLAIN
play=> explain select x from f4 where x = 500 :: float4;
NOTICE: QUERY PLAN:
Index Scan using f4_i on f4 (cost=2.05 size=1 width=4)
EXPLAIN
play=> explain select x from f4 where x = 500.0 :: float4;
ERROR: parser_typecast: cannot cast this expression to type 'float4'
play=>
(This is with cvs sources of a few days ago.)
I see two different bugs here:
1. select ... where x = constant; is optimized to an index scan
only if the constant is of the exact type of the field x.
Apparently, the coercion of the constant to match the type of x
happens only after the optimizer decides it doesn't know what to do.
The coercion ought to happen *before* the optimizer runs.
(I have no idea whether this is a new bug caused by the recent
type-system changes, or whether it existed in 6.3.2 and earlier.)
2. Type coercion fails for "500.0 :: float4" (ditto for "500.0 :: float8"
btw). Presumably this is a simple localized bug in the parser or the type
coercion logic for floats.
I had previously complained of bug #1 in connection with OIDs;
with the present sources, "where oid = 123456" will not use
an index on OID, while "where oid = 123456::oid" will.
I will bet lunch (at the nearest McD's, I'm not rich ;-)) that
Vince Vielhaber's recent gripe about
select city from locations where lower(city) = lower('st. ignace');
failing to use an index
create index locations_city on locations(lower(city) text_ops);
is an artifact of the same sort of type-mismatch problem.
regards, tom lane
Import Notes
Reply to msg id not found: YourmessageofMon03Aug1998142838+000035C5C916.55BE0381@alumni.caltech.edu | Resolved by subject fallback
1. select ... where x = constant; is optimized to an index scan
only if the constant is of the exact type of the field x.
Apparently, the coercion of the constant to match the type of x
happens only after the optimizer decides it doesn't know what to do.
The coercion ought to happen *before* the optimizer runs.
(I have no idea whether this is a new bug caused by the recent
type-system changes, or whether it existed in 6.3.2 and earlier.)
I think the current sources Thomas has modified work better, perhaps?
2. Type coercion fails for "500.0 :: float4" (ditto for "500.0 :: float8"
btw). Presumably this is a simple localized bug in the parser or the type
coercion logic for floats.I had previously complained of bug #1 in connection with OIDs;
with the present sources, "where oid = 123456" will not use
an index on OID, while "where oid = 123456::oid" will.I will bet lunch (at the nearest McD's, I'm not rich ;-)) that
Vince Vielhaber's recent gripe about
select city from locations where lower(city) = lower('st. ignace');
failing to use an index
create index locations_city on locations(lower(city) text_ops);
is an artifact of the same sort of type-mismatch problem.regards, tom lane
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
Tom Lane wrote:
I will bet lunch (at the nearest McD's, I'm not rich ;-)) that
Vince Vielhaber's recent gripe about
select city from locations where lower(city) = lower('st. ignace');
failing to use an index
create index locations_city on locations(lower(city) text_ops);
is an artifact of the same sort of type-mismatch problem.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
No. This is the result of using lower('st. ignace') - function:
optimizer considers clause as usable for index only for
constants and parameters!
We discussed this ~ month ago. lower('st. ignace') could be
replaced by parameter of PARAM_EXEC type (implemented
for subqueries) to be 1. considered by optimizer as index key
value, 2. evaluated _ONCE_ by executor.
As I mentioned before, lower('st. ignace') will be evaluated
FOR EACH tuple in SeqScan!!!
PARAM_EXEC was implemented to handle queries like this:
select * from A where A.x = (select max(B.y) from B)
- subquery will be executed once and index on A (x) will be
used (if exists).
Optimizer just rewrites this query as
select * from A where A.x = _parameter_
and stores information about _parameter_ in InitPlan of
execution plan.
Look:
vac=> explain select * from test where x = lower('a');
NOTICE: QUERY PLAN:
Seq Scan on test (cost=40.00 size=100 width=12)
^^^^^^^^
EXPLAIN
vac=> explain select * from test where x = (select lower('a'));
NOTICE: QUERY PLAN:
Index Scan using itest2 on test (cost=2.05 size=1 width=12)
^^^^^^^^^^
InitPlan
-> Result (cost=0.00 size=0 width=0)
Nevertheless,
vac=> explain select * from test where lower(x) = (select lower('a'));
NOTICE: IndexSelectivity: no key -1 in index 20305
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NOTICE: QUERY PLAN:
Seq Scan on test (cost=40.00 size=100 width=12)
^^^^^^^^
InitPlan
-> Result (cost=0.00 size=0 width=0)
- something is broken for functional indices...
Vadim
Ah-hah, all of a sudden this looks *real* familiar. I bet it's
because pgsql is not noticing that "500.0" can be interpreted as a
float4. Let's try it.Oh, you have nailed it! This is interesting because (probably) a query
likeselect f4 from t4 where f4 = 500.0;
is being automatically "upgraded" in the parser backend to
select f4 from t4 where float8(f4) = 500.0;
So, since there is no functional index float8(f4) on the table we cannot
use an existing index on f4 to advantage.This may be a result of my recent enhancements to the automatic type
coersion code. But I'm a little suprised that v6.3.x doesn't just
complain about a type mismatch but instead actually works. It may be
that the old code which converted constants using intermediate strings
worked (sort of) for this case. In general, the pre-enhancement code
_only_ tried to convert constants, and complained about type mismatches
when non-constants were involved.
Yes, it did numeric constants, I think.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
Import Notes
Reply to msg id not found: 35C69298.4EE5669@alumni.caltech.edu | Resolved by subject fallback
Ah-hah, all of a sudden this looks *real* familiar. I bet it's
because pgsql is not noticing that "500.0" can be interpreted as a
float4. Let's try it.
Oh, you have nailed it! This is interesting because (probably) a query
like
select f4 from t4 where f4 = 500.0;
is being automatically "upgraded" in the parser backend to
select f4 from t4 where float8(f4) = 500.0;
So, since there is no functional index float8(f4) on the table we cannot
use an existing index on f4 to advantage.
This may be a result of my recent enhancements to the automatic type
coersion code. But I'm a little suprised that v6.3.x doesn't just
complain about a type mismatch but instead actually works. It may be
that the old code which converted constants using intermediate strings
worked (sort of) for this case. In general, the pre-enhancement code
_only_ tried to convert constants, and complained about type mismatches
when non-constants were involved.
So far in the examples, imho this (new?) feature isn't a _fatal_
problem, because you want to handle the following query correctly (I'll
switch to an int column to make it clearer):
select i4 from t4 where i4 < 500.1;
Now, if we do the "optimizable thing" blindly, then we would transform
this to
select i4 from t4 where i4 < int4(500.1);
But of course this would produce the wrong result if the table contains
a value of 500. Perhaps something a bit different could be implemented,
but it probably wouldn't generalize very well with the extensible type
system.
So, is there a problem to fix, or just documentation to write? I've
already written a new "Type Conversion" chapter for the User's Guide,
but haven't mentioned this specific issue.
- Tom
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
Oh, you have nailed it! This is interesting because (probably) a query
like
select f4 from t4 where f4 = 500.0;
is being automatically "upgraded" in the parser backend to
select f4 from t4 where float8(f4) = 500.0;
So, since there is no functional index float8(f4) on the table we cannot
use an existing index on f4 to advantage.
OK, that sounds plausible. But in my examples,
play=> explain select x from f8 where x = 500;
NOTICE: QUERY PLAN:
Seq Scan on f8 (cost=40.00 size=100 width=8)
Your explanation implies that here, the parser is converting to
select x from f8 where int4(x) = 500;
which is wrong for the same accuracy-loss reasons you cite later.
(And if that isn't what it's doing, what then?)
I think it would be a good idea if someone actually dug into this
and verified what's going on. I have found some other cases that
lead me to think there's more to this than we understand just yet.
With an index on an int4 field, I get
tree=> explain select * from marketorderhistory where sequenceNo = 140000;
NOTICE: QUERY PLAN:
Index Scan using marketorderhistory_sequenceno_i on marketorderhistory
(cost=2.05 size=1 width=100)
tree=> explain select * from marketorderhistory where sequenceNo > 140000;
NOTICE: QUERY PLAN:
Seq Scan on marketorderhistory (cost=63.38 size=449 width=100)
which doesn't look like it could be explained by parser type coercions.
Perhaps this one just indicates an omission from the list of
type-specific routines that can be used for index comparisons? If so,
maybe there are other omissions affecting the results for other types.
... you want to handle the following query correctly (I'll
switch to an int column to make it clearer):
select i4 from t4 where i4 < 500.1;
Now, if we do the "optimizable thing" blindly, then we would transform
this to
select i4 from t4 where i4 < int4(500.1);
But of course this would produce the wrong result if the table contains
a value of 500. Perhaps something a bit different could be implemented,
but it probably wouldn't generalize very well with the extensible type
system.
That's a good point. Still, it would be nice if the system had some
reasonable amount of smarts about the "primitive" types that the parser
has constant syntax for. In particular I think an automatic coercion of
an int constant to float where needed would be a reasonable thing to
expect. That's not happening now, see my example above.
So, is there a problem to fix, or just documentation to write?
This one is most certainly a bug:
play=> select x from f4 where x = 500.0 :: float4;
ERROR: parser_typecast: cannot cast this expression to type 'float4'
Beyond that, if I can force the right thing to happen by casting
the constant to the type of the field, then I can live with it.
I have seen a number of cases where the system wouldn't use an index
even with a cast, however, so I'm not a happy camper yet.
regards, tom lane
Import Notes
Reply to msg id not found: YourmessageofTue04Aug1998044824+000035C69298.4EE5669@alumni.caltech.edu | Resolved by subject fallback
On 04-Aug-98 Tom Lane wrote:
OK, that sounds plausible. But in my examples,
play=> explain select x from f8 where x = 500;
NOTICE: QUERY PLAN:
Seq Scan on f8 (cost=40.00 size=100 width=8)Your explanation implies that here, the parser is converting to
select x from f8 where int4(x) = 500;
which is wrong for the same accuracy-loss reasons you cite later.
(And if that isn't what it's doing, what then?)I think it would be a good idea if someone actually dug into this
and verified what's going on. I have found some other cases that
lead me to think there's more to this than we understand just yet.
With an index on an int4 field, I get
What file(s) are these decisions actually made in?
Vince.
--
==========================================================================
Vince Vielhaber -- KA8CSH email: vev@michvhf.com flame-mail: /dev/null
# include <std/disclaimers.h> TEAM-OS2
Online Searchable Campground Listings http://www.camping-usa.com
"There is no outfit less entitled to lecture me about bloat
than the federal government" -- Tony Snow
==========================================================================
Your explanation implies that here, the parser is converting to
select x from f8 where int4(x) = 500;
which is wrong for the same accuracy-loss reasons you cite later.
(And if that isn't what it's doing, what then?)
No, that isn't what it would do. The parser does know about a heirarchy
of built-in data types, and would not downgrade a float8 to an int4.
I've sent some e-mail about the new parser features, and have written
them up in doc/src/sgml/typeconv.sgml (but haven't generated a new html
version yet).
I think it would be a good idea if someone actually dug into this
and verified what's going on. I have found some other cases that
lead me to think there's more to this than we understand just yet.
I'm pretty sure that there is another thing happening, which is getting
in the way of using indices with your example using
where lowercase(col) = lowercase('const')
That's a good point. Still, it would be nice if the system had some
reasonable amount of smarts about the "primitive" types that the
parser has constant syntax for. In particular I think an automatic
coercion of an int constant to float where needed would be a
reasonable thing to expect. That's not happening now, see my example
above.
I agree with your points, but it already does exactly what you would
want. I don't see an example above illustrating this problem.
So, is there a problem to fix, or just documentation to write?
This one is most certainly a bug:
play=> select x from f4 where x = 500.0 :: float4;
ERROR: parser_typecast: cannot cast this expression to type 'float4'
Might be easy. I'll look at it... Oh, the current workaround is to
specify it as
where x = '500.0'::float4;
or
where x = float4(500.0);
Beyond that, if I can force the right thing to happen by casting
the constant to the type of the field, then I can live with it.
I have seen a number of cases where the system wouldn't use an index
even with a cast, however, so I'm not a happy camper yet.
Yeah, that was Vadim's example and he sez we need to add a bit of code
to get things working the way you'd expect.
- Tom
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
No, that isn't what it would do. The parser does know about a heirarchy
of built-in data types, and would not downgrade a float8 to an int4.
OK, that's good. But if it knows that much, I'd expect it to be able
to *upgrade* an int4 to a float8 where appropriate. That's not
happening, as in my prior example:
play=> explain select x from f8 where x = 500.0;
NOTICE: QUERY PLAN:
Index Scan using f8_i on f8 (cost=2.05 size=1 width=8)play=> explain select x from f8 where x = 500;
NOTICE: QUERY PLAN:
Seq Scan on f8 (cost=40.00 size=100 width=8)
The thing that I find peculiar about all this is that the system doesn't
reject the query completely, as you'd expect it to if there were really
a type mismatch problem. It takes it and just does it a lot slower than
it should. It seems the type coercion will happen, but too late for the
optimizer to notice.
regards, tom lane
Import Notes
Reply to msg id not found: YourmessageofTue04Aug1998163754+000035C738E2.9C77CD5@alumni.caltech.edu | Resolved by subject fallback
The parser does know about a heirarchy
of built-in data types, and would not downgrade a float8 to an int4.OK, that's good. But if it knows that much, I'd expect it to be able
to *upgrade* an int4 to a float8 where appropriate. That's not
happening
Ah, but it _is_ happening. The problem lies later. Read on...
, as in my prior example:
play=> explain select x from f8 where x = 500;
NOTICE: QUERY PLAN:
Seq Scan on f8 (cost=40.00 size=100 width=8)It seems the type coercion will happen, but too late for the
optimizer to notice.
This is the specific case discussed by Vadim, and is, I believe,
directly related to the
select city from locations where lower(city) = lower('st. ignace');
example given by Vince.
The parser is converting this query to become
select x from f8 where x = float8(500);
The problem appears to be that the optimizer/executor does not know how
to evaluate the constant string once-only, and insists on doing a
sequential scan for some reason. Not really in the parser's hands.
Remember, the types are all set and the functions are all defined by the
time the actual optimizer/executor sees anything. There may be a slight
bit of "optimization" which happens in the parser, but only to shoehorn
SQL92 into the Postgres backend. Oh, there are some heuristics in the
parser to fudge the distinctions between char/varchar/text types to
allow reuse of some of the support code, but that is a Bad Thing in
principle.
The downside to putting more heuristics into the parser (as opposed to
upgrading the optimizer/executor to handle the case) is that it embeds
more assumptions about the _behavior_ of, for example, int4->float8
conversions and would reduce the flexibility of the extensible type
system.
istm that we should be focusing on Vadim's hints on what it would take
to use indices with function calls on constants...
- Tom
istm that we should be focusing on Vadim's hints on what it would take
to use indices with function calls on constants...
Looking at Vadim's note again, maybe it will be the parser's duty to
insert the PARAM_EXEC node; will need more details or some time to look
at it...
- Tom
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
The parser is converting this query to become
select x from f8 where x = float8(500);
The problem appears to be that the optimizer/executor does not know how
to evaluate the constant string once-only, and insists on doing a
sequential scan for some reason.
Ah, it's finally starting to make some sense to me. What you're saying
is that this is a failure to do constant-folding.
Doing a sequential scan would be appropriate if the righthand side of
the comparison needed to be evaluated afresh at each row. If the
optimizer thinks that, then that explains a lot.
The question then is why the righthand side doesn't look like a
constant. I'd have expected that any expression not involving a table
attribute would be evaluated once (folded into a constant) before any
decisions are made on how to perform the scan. Is that reasonable, or
is there some aspect of SQL semantics that makes it wrong?
If it is supposed to be happening, could it be that float8() is for
some reason not marked as a safely foldable function?
While I'm asking dumb questions: are "float8(500)" and "500::float8"
treated differently? Actually, I can see that they are:
play=> explain select x from f8 where x = 500::float8;
NOTICE: QUERY PLAN:
Index Scan using f8_i on f8 (cost=2.05 size=1 width=8)
play=> explain select x from f8 where x = float8(500);
NOTICE: QUERY PLAN:
Seq Scan on f8 (cost=40.00 size=100 width=8)
But why? Is there a difference in semantics?
regards, tom lane
Import Notes
Reply to msg id not found: YourmessageofWed05Aug1998152938+000035C87A62.C11A66EF@alumni.caltech.edu | Resolved by subject fallback
The problem appears to be that the optimizer/executor does not know how
to evaluate the constant string once-only, and insists on doing a
sequential scan for some reason. Not really in the parser's hands.
Remember, the types are all set and the functions are all defined by the
time the actual optimizer/executor sees anything. There may be a slight
bit of "optimization" which happens in the parser, but only to shoehorn
SQL92 into the Postgres backend. Oh, there are some heuristics in the
parser to fudge the distinctions between char/varchar/text types to
allow reuse of some of the support code, but that is a Bad Thing in
principle.The downside to putting more heuristics into the parser (as opposed to
upgrading the optimizer/executor to handle the case) is that it embeds
more assumptions about the _behavior_ of, for example, int4->float8
conversions and would reduce the flexibility of the extensible type
system.istm that we should be focusing on Vadim's hints on what it would take
to use indices with function calls on constants...
I understand the problem now. There is code in the optimizer to handle
functions on constants, but it does not appear to be in a place where
index matching can use it.
I will be looking over the code in the next few weeks, but not sure if I
can figure it all out by then. I have added it to my personal TODO
list.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
The parser is converting this query to become
select x from f8 where x = float8(500);The problem appears to be that the optimizer/executor does not know how
to evaluate the constant string once-only, and insists on doing a
sequential scan for some reason.Ah, it's finally starting to make some sense to me. What you're saying
is that this is a failure to do constant-folding.
Yep. I believe it happens in the executor, but doesn't appear to happen
in the optimizer at a time when it would be useful. You can create
functional indexes, and I think they are matched by the function, just
not constants.
Doing a sequential scan would be appropriate if the righthand side of
the comparison needed to be evaluated afresh at each row. If the
optimizer thinks that, then that explains a lot.
Yep. I think that is the issue, and index matching does not
pre-evaluate a function on a constant.
The question then is why the righthand side doesn't look like a
constant. I'd have expected that any expression not involving a table
attribute would be evaluated once (folded into a constant) before any
decisions are made on how to perform the scan. Is that reasonable, or
is there some aspect of SQL semantics that makes it wrong?If it is supposed to be happening, could it be that float8() is for
some reason not marked as a safely foldable function?While I'm asking dumb questions: are "float8(500)" and "500::float8"
treated differently? Actually, I can see that they are:play=> explain select x from f8 where x = 500::float8;
NOTICE: QUERY PLAN:
Index Scan using f8_i on f8 (cost=2.05 size=1 width=8)play=> explain select x from f8 where x = float8(500);
NOTICE: QUERY PLAN:
Seq Scan on f8 (cost=40.00 size=100 width=8)But why? Is there a difference in semantics?
Sure. In the :: case (or CAST (const AS type)), the parser actually
converts the type INSIDE the parser to the proper type. In the float8()
case, the value conversion is delayed until the executor.
I may be wrong in some of this, but that is what I think is happening.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
The parser is converting this query to become
select x from f8 where x = float8(500);The problem appears to be that the optimizer/executor does not
know how
to evaluate the constant string once-only, and insists on doing a
sequential scan for some reason.Ah, it's finally starting to make some sense to me. What you're
saying
is that this is a failure to do constant-folding.
Yep. I believe it happens in the executor, but doesn't appear to
happen
in the optimizer at a time when it would be useful. You can create
functional indexes, and I think they are matched by the function, just
not constants.Doing a sequential scan would be appropriate if the righthand side
of
the comparison needed to be evaluated afresh at each row. If the
optimizer thinks that, then that explains a lot.Yep. I think that is the issue, and index matching does not
pre-evaluate a function on a constant.The question then is why the righthand side doesn't look like a
constant. I'd have expected that any expression not involving atable
attribute would be evaluated once (folded into a constant) before
any
decisions are made on how to perform the scan. Is that reasonable,
or
is there some aspect of SQL semantics that makes it wrong?
If it is supposed to be happening, could it be that float8() is for
some reason not marked as a safely foldable function?While I'm asking dumb questions: are "float8(500)" and "500::float8"
treated differently? Actually, I can see that they are:play=> explain select x from f8 where x = 500::float8;
NOTICE: QUERY PLAN:
Index Scan using f8_i on f8 (cost=2.05 size=1 width=8)play=> explain select x from f8 where x = float8(500);
NOTICE: QUERY PLAN:
Seq Scan on f8 (cost=40.00 size=100 width=8)But why? Is there a difference in semantics?
Sure. In the :: case (or CAST (const AS type)), the parser actually
converts the type INSIDE the parser to the proper type. In the
float8()
case, the value conversion is delayed until the executor.I may be wrong in some of this, but that is what I think is happening.
This would be logical to me because we allow function creation.
Consider, if someone creates an SQL function on a table being updated
that returns a count of all rows who's index matches an int. Which
would mean that an insert using that function to select rows could need
to be evaluated for each row.
Well, I guess what I'm trying to say is that the current behavior could
be desired/being used by someone. It could also be used as a substitute
for a select trigger.
Of course as always I could be completely wrong.
-DEJ
Show quoted text
-- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
Import Notes
Resolved by subject fallback
This would be logical to me because we allow function creation.
Consider, if someone creates an SQL function on a table being updated
that returns a count of all rows who's index matches an int. Which
would mean that an insert using that function to select rows could
need to be evaluated for each row.
Well, I guess what I'm trying to say is that the current behavior
could be desired/being used by someone.
As you point out, the current behavior allows functions with
side-effects to work, where if we change it then functions must be
side-effect-free. Most functions do not have side effects, though one
using the SPI interface for example might be prone to them.
We _could_ add a column attribute for each function which declares a
function as being "safe" or not.
- Tom