Re: Lack of use of indexes

Started by Doug McNaughtover 23 years ago4 messagesgeneral

doug@mcnaught.org

over 23 years ago

Don Isgitt <djisgitt@soundenergy.com> writes:

Hello.

I have a question regarding the lack of use of indexes on a table; I
have included what I hope is all the relevant information. Your help
is appreciated.

Use "varchar" or "text" instead of "character" for your column types
and it should work. IIRC, "character" is treated slightly differently
from other text types in the query optimizer, and string constants
default to type "text".

As far as why the integer index isn't being used, I'm not sure. Does
it use the index if you change the WHERE clase to " section = '14' "?

-Doug

Import Notes

Reply to msg id not found: DonIsgitt'smessageofFri22Nov2002101138-0600Reference msg id not found: 3DDE573A.6030302@soundenergy.com

Stephan Szabo

sszabo@megazone23.bigpanda.com

over 23 years ago

In reply to: Doug McNaught (#1)

On Fri, 22 Nov 2002, Don Isgitt wrote:

gds2=# select count(*) from lg;
count
--------
138459
(1 row)

gds2=# explain select * from lg where state='NM';
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=14890 width=73)

EXPLAIN

gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)

My guess would be that if you turned off seq_scan (enable_seqscan=off)
and explained, you'd get a larger estimate for the cost of the index
scan. Assuming those row estimates are correct and the width is around
73 and that the data isn't very clustered, it's probably guessing that
it's going to be reading most of the datafile anyway and so the sequence
scan is faster. If it gives a higher estimate, but a lower real time with
enable_seqscan=off your data might be more clustered than it seems to be
expecting or maybe the default cost for random page reads is higher than
necessary on your machine (there are some settings in postgresql.conf that
you can play with)

Import Notes

Reply to msg id not found: 3DDE573A.6030302@soundenergy.com | Resolved by subject fallback

Doug McNaught

doug@mcnaught.org

over 23 years ago

In reply to: Stephan Szabo (#2)

Don Isgitt <djisgitt@soundenergy.com> writes:

Sequential scan is obviously not off; how do you turn it off? I tried
enable_seqscan=off at the psql prompt, but it did not like that. Is
that a postgresql.conf variable?

I think it's 'set enable_seqscan off' (or try '=off') at the prompt.
You can definitely change it on the floy.

-Doug

Import Notes

Reply to msg id not found: DonIsgitt'smessageofFri22Nov2002113419-0600

Don Isgitt

djisgitt@soundenergy.com

over 23 years ago

In reply to: Stephan Szabo (#2)

Hi Stephan. Thank you for your quick reply.

Pursuant to your suggestions, I tried the following.

gds2-# \set seqscan off
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)

EXPLAIN
gds2=# \set
VERSION = 'PostgreSQL 7.2 on i686-pc-linux-gnu, compiled by GCC 2.96'
DBNAME = 'gds2'
USER = 'djisgitt'
PORT = '5432'
ENCODING = 'SQL_ASCII'
PROMPT1 = '%/%R%# '
PROMPT2 = '%/%R%# '
PROMPT3 = '>> '
HISTSIZE = '500'
LASTOID = '0'
seqscan = 'off'

Sequential scan is obviously not off; how do you turn it off? I tried
enable_seqscan=off at the psql prompt, but it did not like that. Is that
a postgresql.conf variable?

Don

Stephan Szabo wrote:

Show quoted text

On Fri, 22 Nov 2002, Don Isgitt wrote:

gds2=# select count(*) from lg;
count
--------
138459
(1 row)

gds2=# explain select * from lg where state='NM';
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=14890 width=73)

EXPLAIN

gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)

My guess would be that if you turned off seq_scan (enable_seqscan=off)
and explained, you'd get a larger estimate for the cost of the index
scan. Assuming those row estimates are correct and the width is around
73 and that the data isn't very clustered, it's probably guessing that
it's going to be reading most of the datafile anyway and so the sequence
scan is faster. If it gives a higher estimate, but a lower real time with
enable_seqscan=off your data might be more clustered than it seems to be
expecting or maybe the default cost for random page reads is higher than
necessary on your machine (there are some settings in postgresql.conf that
you can play with)