space for optimalization: DISTINCT without index
Hello
I did some test and I can see so DISTINCT works well on indexed columns, but
is slow on derived tables without indexes. If I use without distinct group
by I get much better times.
SELECT DISTINCT a, b FROM tab
SELECT a,b FROM tab GROUP BY a, b.
Can You Explain it.
Thank You
Pavel Stehule
_________________________________________________________________
Citite se osamele? Poznejte nekoho vyjmecneho diky Match.com.
http://www.msn.cz/
On Mon, Dec 12, 2005 at 18:35:07 +0100,
Pavel Stehule <pavel.stehule@hotmail.com> wrote:
Hello
I did some test and I can see so DISTINCT works well on indexed columns,
but is slow on derived tables without indexes. If I use without distinct
group by I get much better times.SELECT DISTINCT a, b FROM tab
SELECT a,b FROM tab GROUP BY a, b.Can You Explain it.
DISTINCT will require a sort step to remove duplicates. GROUP BY can use
either a sort or hash aggregate plan. If there are few distinct values,
the hash aggregate plan can be much faster.
Bruno Wolff III <bruno@wolff.to> writes:
DISTINCT will require a sort step to remove duplicates. GROUP BY can use
either a sort or hash aggregate plan. If there are few distinct values,
the hash aggregate plan can be much faster.
The DISTINCT code hasn't been revisited in a long time. One obstacle to
improving it is that it's very tightly intertwined with ORDER BY. While
fixing that might be just a Small Matter Of Programming, it's not clear
how to decouple them without breaking DISTINCT ON.
regards, tom lane