Lack of use of indexes
Hello.
I have a question regarding the lack of use of indexes on a table; I
have included what I hope is all the relevant information. Your help is
appreciated.
Thank you.
Don
gds2=# create index lgtwn on lg (township);
CREATE
gds2=# create index lgrng on lg (range);
CREATE
gds2=# create index lgsec on lg (section);
CREATE
gds2=# create index lgst on lg (state);
CREATE
gds2=# analyze lg;
ANALYZE
gds2=# \d lg
Table "lg"
Column | Type | Modifiers
----------+--------------+-----------
state | character(2) | not null
county | text | not null
township | character(5) |
range | character(5) |
section | integer |
meridian | integer |
boundary | polygon |
Indexes: lgrng,
lgsec,
lgst,
lgtwn
gds2=# select count(*) from lg;
count
--------
138459
(1 row)
gds2=# explain select * from lg where state='NM';
NOTICE: QUERY PLAN:
Seq Scan on lg (cost=0.00..5885.77 rows=14890 width=73)
EXPLAIN
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:
Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)
EXPLAIN
gds2=# select state,count(*) from lg group by state;
state | count
-------+-------
KS | 36866
NM | 15201
OA | 13797
OK | 72595
(4 rows)
On Fri, 22 Nov 2002, Don Isgitt wrote:
Thanks, Doug for your very prompt reply. This newsgroup is wonderful.
It will take a while, but I will create the table with text fields to
see if that helps. The query with ...section='14' did not use the index.
Note that you can also "cast" the field you're searching on...
select * from sometable where field='14'::char(5)
Import Notes
Reply to msg id not found: 3DDE6645.7070405@soundenergy.com | Resolved by subject fallback
Don Isgitt <djisgitt@soundenergy.com> writes:
gds2=# select count(*) from lg;
count
--------
138459
(1 row)
gds2=# explain select * from lg where state='NM';
NOTICE: QUERY PLAN:
Seq Scan on lg (cost=0.00..5885.77 rows=14890 width=73)
An indexscan is usually a poor choice for retrieving 10% of the data in
a table (unless the index order and physical order are highly
correlated, as for instance after a CLUSTER command).
If you don't think the planner is guessing correctly here, you can force
an indexscan (do "set enable_seqscan = off") ... but I'll bet it gets
slower.
For randomly-ordered rows the cutoff point for indexscan effectiveness
is surprisingly low --- typically around 1% of the rows.
regards, tom lane
Thanks, Doug for your very prompt reply. This newsgroup is wonderful.
It will take a while, but I will create the table with text fields to
see if that helps. The query with ...section='14' did not use the index.
Don
Doug McNaught wrote:
Show quoted text
Don Isgitt <djisgitt@soundenergy.com> writes:
Hello.
I have a question regarding the lack of use of indexes on a table; I
have included what I hope is all the relevant information. Your help
is appreciated.Use "varchar" or "text" instead of "character" for your column types
and it should work. IIRC, "character" is treated slightly differently
from other text types in the query optimizer, and string constants
default to type "text".As far as why the integer index isn't being used, I'm not sure. Does
it use the index if you change the WHERE clase to " section = '14' "?-Doug
Don Isgitt <djisgitt@soundenergy.com> writes:
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:
Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)
EXPLAIN
gds2=# set enable_seqscan=off;
SET VARIABLE
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:
Index Scan using lgsec on lg (cost=0.00..12167.45 rows=3836 width=73)
EXPLAIN
Of course, the above only proves that the planner thinks the indexscan
will be slower ;-). You should try EXPLAIN ANALYZE to see how well the
planner estimates square up with reality ...
regards, tom lane
Import Notes
Reply to msg id not found: 3DDE7356.5010208@soundenergy.com
Thank you, Tom, for your reply.
As usual (from my observation of this newsgroup), you are quite correct,
as was Stephan. To wit,
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:
Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)
EXPLAIN
gds2=# set enable_seqscan=off;
SET VARIABLE
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:
Index Scan using lgsec on lg (cost=0.00..12167.45 rows=3836 width=73)
EXPLAIN
I am surprised at the low cutoff percentage, but it is nice to know for
future reference.
Thank you also to Doug and Stephan for your help. I remain quite
pleasantly amazed at the quality of the software and of the support.
Don
Tom Lane wrote:
Show quoted text
Don Isgitt <djisgitt@soundenergy.com> writes:
gds2=# select count(*) from lg;
count
--------
138459
(1 row)gds2=# explain select * from lg where state='NM';
NOTICE: QUERY PLAN:Seq Scan on lg (cost=0.00..5885.77 rows=14890 width=73)
An indexscan is usually a poor choice for retrieving 10% of the data in
a table (unless the index order and physical order are highly
correlated, as for instance after a CLUSTER command).If you don't think the planner is guessing correctly here, you can force
an indexscan (do "set enable_seqscan = off") ... but I'll bet it gets
slower.For randomly-ordered rows the cutoff point for indexscan effectiveness
is surprisingly low --- typically around 1% of the rows.regards, tom lane
Don Isgitt <djisgitt@soundenergy.com> writes:
Smart planner...curiosity: what is that first number following cost= and
actual time=?
Startup time (effectively, the time to return the first row).
regards, tom lane
Import Notes
Reply to msg id not found: 3DDE8212.3050609@soundenergy.com
Quite so! Therefore,
gds2=# explain analyze select boundary from lg where section=14;
NOTICE: QUERY PLAN:
Index Scan using lgsec on lg (cost=0.00..12167.45 rows=3836 width=32)
(actual time=44.98..18325.87 rows=3759 loops=1)
Total runtime: 18344.06 msec
EXPLAIN
gds2=# set enable_seqscan=on;
SET VARIABLE
gds2=# explain analyze select boundary from lg where section=14;
NOTICE: QUERY PLAN:
Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=32) (actual
time=0.24..2974.65 rows=3759 loops=1)
Total runtime: 2987.61 msec
EXPLAIN
Smart planner...curiosity: what is that first number following cost= and
actual time=?
Thank you again.
Don
Tom Lane wrote:
Show quoted text
Don Isgitt <djisgitt@soundenergy.com> writes:
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:Seq Scan on lg (cost=0.00..5885.77 rows=3836 width=73)
EXPLAIN
gds2=# set enable_seqscan=off;
SET VARIABLE
gds2=# explain select * from lg where section=14;
NOTICE: QUERY PLAN:Index Scan using lgsec on lg (cost=0.00..12167.45 rows=3836 width=73)
EXPLAIN
Of course, the above only proves that the planner thinks the indexscan
will be slower ;-). You should try EXPLAIN ANALYZE to see how well the
planner estimates square up with reality ...regards, tom lane