full table scan on 'select max(value) from table'?

Started by Sean Hardingabout 25 years ago3 messagesgeneral

sharding@dogcow.org

about 25 years ago

I have a table, 'mesg_headers', which holds headers from email
messages. Each message has a unique integer ID within the system,
'mesgid'. mesgid is the primary key for mesg_headers, so it has index
mesg_headers_pkey. This index is used if I do 'select * from mesg_headers
where mesgnum = whatever', but if I do 'select max(mesgnum) from
mesg_headers', I get a full table scan, which takes a long time (there are
currently over 370,000 rows). Explains:

email=# explain select * from mesg_headers where mesgnum = 100;
NOTICE: QUERY PLAN:

Index Scan using mesg_headers_pkey on mesg_headers (cost=0.00..4.99 rows=1 width=92)

EXPLAIN
email=# explain select max(mesgnum) from mesg_headers;
NOTICE: QUERY PLAN:

Aggregate (cost=80319.44..80319.44 rows=1 width=4)
-> Seq Scan on mesg_headers (cost=0.00..79392.55 rows=370755 width=4)

EXPLAIN

So is there anything I can do about this, or will max(mesgnum) never use an
index? I'm migrating this db from MySQL, where the same query returns almost
instantanously, so some of my code makes the assumption that it's a cheap
operation. I could work around it, but it would definitely be nicer to find
a way to just make it use an index.

Thanks.

sean

--
Sean Harding sharding@dogcow.org | "I am the captain and I have been told
http://www.dogcow.org/sean/ | that tomorrow we land and my ship has
| been sold." --Dar Williams

Randall F. Kern

randy@spoke.net

about 25 years ago

In reply to: Sean Harding (#1)

RE: full table scan on 'select max(value) from table'?

I've had this problem also (sounds like a bug to me).

My work around was a query like this:
select id from mytable order by id desc limit 1;

This query used the index, whereas the max(id) query did not.
-Randy

Show quoted text

-----Original Message-----
From: Sean Harding [mailto:sharding@dogcow.org]
Sent: Wednesday, March 28, 2001 8:42 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] full table scan on 'select max(value) from table'?

I have a table, 'mesg_headers', which holds headers from email
messages. Each message has a unique integer ID within the system,
'mesgid'. mesgid is the primary key for mesg_headers, so it has index
mesg_headers_pkey. This index is used if I do 'select * from
mesg_headers
where mesgnum = whatever', but if I do 'select max(mesgnum) from
mesg_headers', I get a full table scan, which takes a long
time (there are
currently over 370,000 rows). Explains:

email=# explain select * from mesg_headers where mesgnum = 100;
NOTICE: QUERY PLAN:

Index Scan using mesg_headers_pkey on mesg_headers
(cost=0.00..4.99 rows=1 width=92)

EXPLAIN
email=# explain select max(mesgnum) from mesg_headers;
NOTICE: QUERY PLAN:

Aggregate (cost=80319.44..80319.44 rows=1 width=4)
-> Seq Scan on mesg_headers (cost=0.00..79392.55
rows=370755 width=4)

EXPLAIN

So is there anything I can do about this, or will
max(mesgnum) never use an
index? I'm migrating this db from MySQL, where the same query
returns almost
instantanously, so some of my code makes the assumption that
it's a cheap
operation. I could work around it, but it would definitely be
nicer to find
a way to just make it use an index.

Thanks.

sean

--
Sean Harding sharding@dogcow.org | "I am the captain and I
have been told
http://www.dogcow.org/sean/ | that tomorrow we land
and my ship has
| been sold." --Dar Williams

---------------------------(end of
broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Import Notes

Resolved by subject fallback

Brent R. Matzelle

bmatzelle@yahoo.com

about 25 years ago

In reply to: Sean Harding (#1)

Re: full table scan on 'select max(value) from table'?

--- Sean Harding <sharding@dogcow.org> wrote:

I have a table, 'mesg_headers', which holds headers from email

messages. Each message has a unique integer ID within the
system,
'mesgid'. mesgid is the primary key for mesg_headers, so it
has index
mesg_headers_pkey. This index is used if I do 'select * from
mesg_headers
where mesgnum = whatever', but if I do 'select max(mesgnum)
from
mesg_headers', I get a full table scan, which takes a long
time (there are
currently over 370,000 rows). Explains:

email=# explain select * from mesg_headers where mesgnum =
100;
NOTICE: QUERY PLAN:

Index Scan using mesg_headers_pkey on mesg_headers
(cost=0.00..4.99 rows=1 width=92)

EXPLAIN
email=# explain select max(mesgnum) from mesg_headers;
NOTICE: QUERY PLAN:

Aggregate (cost=80319.44..80319.44 rows=1 width=4)
-> Seq Scan on mesg_headers (cost=0.00..79392.55
rows=370755 width=4)

EXPLAIN

So is there anything I can do about this, or will max(mesgnum)
never use an
index? I'm migrating this db from MySQL, where the same query
returns almost
instantanously, so some of my code makes the assumption that
it's a cheap
operation. I could work around it, but it would definitely be
nicer to find
a way to just make it use an index.

That functionality has been placed on the TODO list
(http://www.postgresql.org/docs/todo.html) under INDEXES.

Brent

__________________________________________________
Do You Yahoo!?
Get email at your own domain with Yahoo! Mail.
http://personal.mail.yahoo.com/?.refer=text