Possible bug in query planner when using DISTINCT

Started by Nick Howdenover 23 years ago2 messagesbugs
Jump to latest
#1Nick Howden
n.howden@eris.qinetiq.com

Using postgres 7.2.3 the query planner seems to lose interest in using
indexes when DISTINCT is used.

Sample query without DISTINCT:

SELECT
ps_7.id,
FROM
ps_7,
page
WHERE
page.id = ps_7.id
ORDER BY
ps_7.id DESC
LIMIT 5;

NOTICE: QUERY PLAN:

Limit (cost=0.00..33.22 rows=5 width=16)
-> Nested Loop (cost=0.00..230302.91 rows=34668 width=16)
-> Index Scan Backward using ps7_key on edsol_ps_page_7
(cost=0.00..62379.02 rows=34668 width=12)
-> Index Scan using page_key on edsol_page
(cost=0.00..4.83 rows=1 width=4)

SELECT
DISTINCT ps_7.id,
FROM
ps_7,
page
WHERE
page.id = ps_7.id
ORDER BY
ps_7.id DESC
LIMIT 5;

NOTICE: QUERY PLAN:

Limit (cost=11363.68..11363.93 rows=5 width=16)
-> Unique (cost=11363.68..11537.02 rows=3467 width=16)
-> Sort (cost=11363.68..11363.68 rows=34668 width=16)
-> Hash Join (cost=2422.35..8404.58 rows=34668 width=16)
-> Seq Scan on page (cost=0.00..3602.75
rows=82075 width=4)
-> Hash (cost=2182.68..2182.68 rows=34668 width=12)
-> Seq Scan on ps_7
(cost=0.00..2182.68 rows=34668 width=12)

which is considerably more expensive
--
Nick Howden - Senior IT Analyst
Trusted Information Management Group
Woodward Building, Room B105
QinetiQ Malvern Technology Park, WR14 3PS
Telephone 01684 895566, Fax 4303

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nick Howden (#1)
Re: Possible bug in query planner when using DISTINCT

Nick Howden <n.howden@eris.qinetiq.com> writes:

Using postgres 7.2.3 the query planner seems to lose interest in using
indexes when DISTINCT is used.

I don't think this is a bug. DISTINCT requires a sort step, so even
though you have a LIMIT, the planner has to plan on the basis of reading
the entire join table. The nestloop plan loses badly on that basis.

regards, tom lane