select distinct
I've noticed that adding an index on a column has no effect on "select
distinct" (index or no index, the entire table is sequence scanned; see
example below). Couldn't you just traverse the index to get the
distinct values?
Tim
keitt=# vacuum analyze test;
VACUUM
keitt=# select distinct a from test;
a
---
a
b
c
d
f
s
z
(7 rows)
keitt=# explain select distinct a from test;
NOTICE: QUERY PLAN:
Unique (cost=1.67..1.72 rows=2 width=12)
-> Sort (cost=1.67..1.67 rows=21 width=12)
-> Seq Scan on test (cost=0.00..1.21 rows=21 width=12)
EXPLAIN
keitt=# create index test_index on test (a);
CREATE
keitt=# explain select distinct a from test;
NOTICE: QUERY PLAN:
Unique (cost=1.67..1.72 rows=2 width=12)
-> Sort (cost=1.67..1.67 rows=21 width=12)
-> Seq Scan on test (cost=0.00..1.21 rows=21 width=12)
EXPLAIN
--
Timothy H. Keitt
National Center for Ecological Analysis and Synthesis
735 State Street, Suite 300, Santa Barbara, CA 93101
Phone: 805-892-2519, FAX: 805-892-2510
http://www.nceas.ucsb.edu/~keitt/
"Timothy H. Keitt" <keitt@nceas.ucsb.edu> writes:
Couldn't you just traverse the index to get the distinct values?
No, because the index doesn't contain commit status; you can't tell
which values are actually valid without visiting the main table.
7.0 does consider both indexscan | unique and seqscan | sort | unique
plans for this problem, but the explicit sort is usually faster
(according to the system's cost models, anyway).
regards, tom lane