Lack of use of indexes

Started by Don Isgittover 23 years ago8 messagesgeneral
Jump to latest
#1Don Isgitt
djisgitt@soundenergy.com

Hello.

I have a question regarding the lack of use of indexes on a table; I
have included what I hope is all the relevant information. Your help is
appreciated.

Thank you.

Don

gds2=# create index lgtwn on lg (township);
CREATE
gds2=# create index lgrng on lg (range);
CREATE
gds2=# create index lgsec on lg (section);
CREATE
gds2=# create index lgst on lg (state);
CREATE
gds2=# analyze lg;
ANALYZE

gds2=# \d lg
Table "lg"
Column | Type | Modifiers
----------+--------------+-----------
state | character(2) | not null
county | text | not null
township | character(5) |
range | character(5) |
section | integer |
meridian | integer |
boundary | polygon |
Indexes: lgrng,
lgsec,
lgst,
lgtwn

gds2=# select count(*) from lg;
count
--------
138459
(1 row)

gds2=# explain select * from lg where state='NM';
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=14890 width=73)

EXPLAIN

gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)

EXPLAIN

gds2=# select state,count(*) from lg group by state;
state | count
-------+-------
KS | 36866
NM | 15201
OA | 13797
OK | 72595
(4 rows)

#2scott.marlowe
scott.marlowe@ihs.com
In reply to: Don Isgitt (#1)
Re: Lack of use of indexes

On Fri, 22 Nov 2002, Don Isgitt wrote:

Thanks, Doug for your very prompt reply. This newsgroup is wonderful.

It will take a while, but I will create the table with text fields to
see if that helps. The query with ...section='14' did not use the index.

Note that you can also "cast" the field you're searching on...

select * from sometable where field='14'::char(5)

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Don Isgitt (#1)
Re: Lack of use of indexes

Don Isgitt <djisgitt@soundenergy.com> writes:

gds2=# select count(*) from lg;
count
--------
138459
(1 row)

gds2=# explain select * from lg where state='NM';
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=14890 width=73)

An indexscan is usually a poor choice for retrieving 10% of the data in
a table (unless the index order and physical order are highly
correlated, as for instance after a CLUSTER command).

If you don't think the planner is guessing correctly here, you can force
an indexscan (do "set enable_seqscan = off") ... but I'll bet it gets
slower.

For randomly-ordered rows the cutoff point for indexscan effectiveness
is surprisingly low --- typically around 1% of the rows.

regards, tom lane

#4Don Isgitt
djisgitt@soundenergy.com
In reply to: Don Isgitt (#1)
Re: Lack of use of indexes

Thanks, Doug for your very prompt reply. This newsgroup is wonderful.

It will take a while, but I will create the table with text fields to
see if that helps. The query with ...section='14' did not use the index.

Don

Doug McNaught wrote:

Show quoted text

Don Isgitt <djisgitt@soundenergy.com> writes:

Hello.

I have a question regarding the lack of use of indexes on a table; I
have included what I hope is all the relevant information. Your help
is appreciated.

Use "varchar" or "text" instead of "character" for your column types
and it should work. IIRC, "character" is treated slightly differently
from other text types in the query optimizer, and string constants
default to type "text".

As far as why the integer index isn't being used, I'm not sure. Does
it use the index if you change the WHERE clase to " section = '14' "?

-Doug

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Don Isgitt (#1)
Re: Lack of use of indexes

Don Isgitt <djisgitt@soundenergy.com> writes:

gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)

EXPLAIN
gds2=# set enable_seqscan=off;
SET VARIABLE
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Index Scan using lgsec on lg (cost=0.00..12167.45 rows=3836 width=73)

EXPLAIN

Of course, the above only proves that the planner thinks the indexscan
will be slower ;-). You should try EXPLAIN ANALYZE to see how well the
planner estimates square up with reality ...

regards, tom lane

#6Don Isgitt
djisgitt@soundenergy.com
In reply to: Don Isgitt (#1)
Re: Lack of use of indexes

Thank you, Tom, for your reply.

As usual (from my observation of this newsgroup), you are quite correct,
as was Stephan. To wit,

gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)

EXPLAIN
gds2=# set enable_seqscan=off;
SET VARIABLE
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Index Scan using lgsec on lg (cost=0.00..12167.45 rows=3836 width=73)

EXPLAIN

I am surprised at the low cutoff percentage, but it is nice to know for
future reference.

Thank you also to Doug and Stephan for your help. I remain quite
pleasantly amazed at the quality of the software and of the support.

Don

Tom Lane wrote:

Show quoted text

Don Isgitt <djisgitt@soundenergy.com> writes:

gds2=# select count(*) from lg;
count
--------
138459
(1 row)

gds2=# explain select * from lg where state='NM';
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=14890 width=73)

An indexscan is usually a poor choice for retrieving 10% of the data in
a table (unless the index order and physical order are highly
correlated, as for instance after a CLUSTER command).

If you don't think the planner is guessing correctly here, you can force
an indexscan (do "set enable_seqscan = off") ... but I'll bet it gets
slower.

For randomly-ordered rows the cutoff point for indexscan effectiveness
is surprisingly low --- typically around 1% of the rows.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Don Isgitt (#1)
Re: Lack of use of indexes

Don Isgitt <djisgitt@soundenergy.com> writes:

Smart planner...curiosity: what is that first number following cost= and
actual time=?

Startup time (effectively, the time to return the first row).

regards, tom lane

#8Don Isgitt
djisgitt@soundenergy.com
In reply to: Don Isgitt (#1)
Re: Lack of use of indexes

Quite so! Therefore,

gds2=# explain analyze select boundary from lg where section=14;
NOTICE: QUERY PLAN:

Index Scan using lgsec on lg (cost=0.00..12167.45 rows=3836 width=32)
(actual time=44.98..18325.87 rows=3759 loops=1)
Total runtime: 18344.06 msec

EXPLAIN
gds2=# set enable_seqscan=on;
SET VARIABLE
gds2=# explain analyze select boundary from lg where section=14;
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=32) (actual
time=0.24..2974.65 rows=3759 loops=1)
Total runtime: 2987.61 msec

EXPLAIN

Smart planner...curiosity: what is that first number following cost= and
actual time=?

Thank you again.
Don

Tom Lane wrote:

Show quoted text

Don Isgitt <djisgitt@soundenergy.com> writes:

gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)

EXPLAIN
gds2=# set enable_seqscan=off;
SET VARIABLE
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:

Index Scan using lgsec on lg (cost=0.00..12167.45 rows=3836 width=73)

EXPLAIN

Of course, the above only proves that the planner thinks the indexscan
will be slower ;-). You should try EXPLAIN ANALYZE to see how well the
planner estimates square up with reality ...

regards, tom lane