group by query plan on already clustered index
On a table T with two fields, f1 and f2, the sql command
select count(f2), f1
from T
group by f1
result is seq scan followed by a sort on f1 (see the query plan below):
GroupAggregate (cost=21566127.88..22326004.09 rows=987621 width=8)
-> Sort (cost=21566127.88..21816127.88 rows=100000000 width=8)
Sort Key: pid
-> Seq Scan on tc (cost=0.00..1442478.00 rows=100000000 width=8)
However, the table is already has clustered index on f1. My question is
why doesn't
postgres simply scan the table and compute the group? Why it needs to sort
on f1 again?
How can I force postgres to use the clustered index for group by?
Thanks.
Sandeep
Hello
2013/7/25 Sandeep Gupta <gupta.sandeep@gmail.com>:
On a table T with two fields, f1 and f2, the sql command
select count(f2), f1
from T
group by f1result is seq scan followed by a sort on f1 (see the query plan below):
GroupAggregate (cost=21566127.88..22326004.09 rows=987621 width=8)
-> Sort (cost=21566127.88..21816127.88 rows=100000000 width=8)
Sort Key: pid
-> Seq Scan on tc (cost=0.00..1442478.00 rows=100000000 width=8)However, the table is already has clustered index on f1. My question is why
doesn't
postgres simply scan the table and compute the group? Why it needs to sort
on f1 again?
How can I force postgres to use the clustered index for group by?
PostgreSQL doesn't support index organized tables. Statement CLUSTER
doesn't ensure order of heap for ever. After any INSERT, UPDATE you
can lost a order.
So every time, you have to check order.
PostgreSQL 9.2 support a index only scan - but it is not used in your plan.
Regards
Pavel Stehule
Thanks.
Sandeep
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general