Sequential Scan Index Bug

Started by Gabriel Weinbergabout 22 years ago9 messagesbugs
Jump to latest
#1Gabriel Weinberg
yegg@alum.mit.edu

I have a table with an integer column with about 10M rows in it.

This column has an index (btree).

When I try to select a row using this column with an integer, e.g. select *
from table where id=4, it always uses the index. However, if I select try
to select a row using this column with a decimal, e.g. select * from table
where id=4.343, it skips the index entirely and does a sequential scan of
the table.

I am using v7.4.2 on Freebsd 4.9.

Gabriel

_________________
Gabriel Weinberg
yegg@alum.mit.edu

#2Bruno Wolff III
bruno@wolff.to
In reply to: Gabriel Weinberg (#1)
Re: Sequential Scan Index Bug

On Sat, Apr 03, 2004 at 13:51:56 -0500,
Gabriel Weinberg <yegg@alum.mit.edu> wrote:

I have a table with an integer column with about 10M rows in it.

This column has an index (btree).

When I try to select a row using this column with an integer, e.g. select *
from table where id=4, it always uses the index. However, if I select try
to select a row using this column with a decimal, e.g. select * from table
where id=4.343, it skips the index entirely and does a sequential scan of
the table.

I am using v7.4.2 on Freebsd 4.9.

Depending on what you want to do, you probably either want to cast the
value to an int explicitly or combine that with a test (using a stable
function) to make sure the number is actually an integer.

#3Gabriel Weinberg
yegg@alum.mit.edu
In reply to: Bruno Wolff III (#2)
Re: Sequential Scan Index Bug

Yes, I thought I had done that, but now that I figured out what was going
on, I did it for all cases. So it is no longer occurring for me, but it
still seems like a bug in PostgreSQL. I would expect it to throw an error
immediately, instead of scanning the table for a value of a different type.
In my case, the table is huge, so it really put a hamper on the system.

Gabriel

_________________
Gabriel Weinberg
yegg@alum.mit.edu

-----Original Message-----
From: Bruno Wolff III [mailto:bruno@wolff.to]
Sent: Wednesday, April 07, 2004 1:38 AM
To: Gabriel Weinberg
Cc: pgsql-bugs@postgresql.org
Subject: Re: Sequential Scan Index Bug

On Sat, Apr 03, 2004 at 13:51:56 -0500,
Gabriel Weinberg <yegg@alum.mit.edu> wrote:

I have a table with an integer column with about 10M rows in it.

This column has an index (btree).

When I try to select a row using this column with an integer, e.g.
select * from table where id=4, it always uses the index. However, if
I select try to select a row using this column with a decimal, e.g.
select * from table where id=4.343, it skips the index entirely and
does a sequential scan of the table.

I am using v7.4.2 on Freebsd 4.9.

Depending on what you want to do, you probably either want to cast the value
to an int explicitly or combine that with a test (using a stable
function) to make sure the number is actually an integer.

#4Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Gabriel Weinberg (#3)
Re: Sequential Scan Index Bug

On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

Yes, I thought I had done that, but now that I figured out what was going
on, I did it for all cases. So it is no longer occurring for me, but it
still seems like a bug in PostgreSQL. I would expect it to throw an error
immediately, instead of scanning the table for a value of a different type.

But what if you said id = 4.0? Would you want it to find the id=4 row?

#5Gabriel Weinberg
yegg@alum.mit.edu
In reply to: Stephan Szabo (#4)
Re: Sequential Scan Index Bug

Presumably, but that is not what I was doing.

I was taking a number from a user, which was supposed to be divisible by an
integer. Sometimes the user left off the last digit when typing in the
number or otherwise typed it in wrong, rendering the input not divisible by
that number.

So it was looking for something like 4.345.

Gabriel

_________________
Gabriel Weinberg
yegg@alum.mit.edu

-----Original Message-----
From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]
Sent: Wednesday, April 07, 2004 11:35 AM
To: Gabriel Weinberg
Cc: 'Bruno Wolff III'; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Sequential Scan Index Bug

On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

Yes, I thought I had done that, but now that I figured out what was
going on, I did it for all cases. So it is no longer occurring for
me, but it still seems like a bug in PostgreSQL. I would expect it to
throw an error immediately, instead of scanning the table for a value
of a different type.

But what if you said id = 4.0? Would you want it to find the id=4 row?

#6Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Gabriel Weinberg (#5)
Re: Sequential Scan Index Bug

On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

Presumably, but that is not what I was doing.

I was responding to the part that was: "I would expect it to throw an
error immediately, instead of scanning the table for a value of a
different type." If say intcol = 4.345 is an error, is intcol = 4.0 an
error as well given that 4.345 and 4.0 are presumably the same type? I'm
not sure what error you would expect.

#7Gabriel Weinberg
yegg@alum.mit.edu
In reply to: Stephan Szabo (#6)
Re: Sequential Scan Index Bug

I would expect if I did intcol = 4.35 or intcol = 'abc', it would throw a
type mismatch error.

Now whether you want to be clever and make intcol = 4.0 not throw an error
and instead convert it to intcol = 4 is another thing. In that case, I
still wouldn't scan the table. And now that I think of it, 4 is different
than 4.0 in terms of precision. If you are querying an intcol, maybe that
doesn't matter, but I would probably err on the side of precaution and throw
a type mismatch error as well.

Gabriel

_________________
Gabriel Weinberg
yegg@alum.mit.edu

-----Original Message-----
From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]
Sent: Wednesday, April 07, 2004 12:58 PM
To: Gabriel Weinberg
Cc: 'Bruno Wolff III'; pgsql-bugs@postgresql.org
Subject: RE: [BUGS] Sequential Scan Index Bug

On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

Presumably, but that is not what I was doing.

I was responding to the part that was: "I would expect it to throw an error
immediately, instead of scanning the table for a value of a different type."
If say intcol = 4.345 is an error, is intcol = 4.0 an error as well given
that 4.345 and 4.0 are presumably the same type? I'm not sure what error
you would expect.

#8Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Gabriel Weinberg (#7)
Re: Sequential Scan Index Bug

On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

I would expect if I did intcol = 4.35 or intcol = 'abc', it would throw a
type mismatch error.

Well, in practice, the former is AFAICS required to do something "right"
by the SQL spec because it explicitly states that all all numbers are
mutually comparable, so erroring would technically be against spec. It's
possible that we could do something more intelligent than the current
behavior for that case but I can't come up with a particularly good
choice that wouldn't have bad effects elsewhere.

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#8)
Re: Sequential Scan Index Bug

Stephan Szabo <sszabo@megazone.bigpanda.com> writes:

It's possible that we could do something more intelligent than the current
behavior for that case but I can't come up with a particularly good
choice that wouldn't have bad effects elsewhere.

In theory we could recognize that "integer_column = 4.35" will yield a
constant false. If the expression were replaced by "false" during
constant folding then the planner would produce a short-circuited plan
that won't actually examine the table.

In practice, though, I don't see any way to do that that wouldn't be a
horrendous kluge. I don't like putting special-case type-specific
knowledge into the planner; yet here we have knowledge that's not only
type-specific but specific to the combination of two different types.
Yech. I don't see any hope for a catalog-driven, extensible approach
for such things.

You'd also have to ask questions about whether the planner time spent
testing for such cases would really be a good investment...

regards, tom lane