group by
Hi,
I notices a weird thing here.
version 7.2.1
on Solaris
table "test", has a field "state".
There are 4 "state" values, 1, 2, 3, 4.
select count(*) from test group by state;
took 11500 msec
but
select count(*) from test where state = 1;
select count(*) from test where state = 2;
select count(*) from test where state = 3;
select count(*) from test where state = 4;
total took 626 msec
Why ??
thanks,
kathy
On Fri, 3 Oct 2003, Kathy Zhu wrote:
Hi,
I notices a weird thing here.
version 7.2.1
on Solaristable "test", has a field "state".
There are 4 "state" values, 1, 2, 3, 4.select count(*) from test group by state;
took 11500 msecbut
select count(*) from test where state = 1;
select count(*) from test where state = 2;
select count(*) from test where state = 3;
select count(*) from test where state = 4;
total took 626 msecWhy ??
First thought was caching: the disk blocks are cached in memory after the first
qery so all the others just fetch from there.
Second thought: well I haven't really had it but indexes and sorting is sort of
sloshing around in my mind. Not sure how that applies to here since I can't
remember if the first would require the sort step having retrieved all the
tuples and the others would just use the index pages.
It is Friday though.
--
Nigel J. Andrews
When was the last time you vacuumed full?
Kathy Zhu wrote:
Show quoted text
Hi,
I notices a weird thing here.
version 7.2.1
on Solaristable "test", has a field "state".
There are 4 "state" values, 1, 2, 3, 4.select count(*) from test group by state;
took 11500 msecbut
select count(*) from test where state = 1;
select count(*) from test where state = 2;
select count(*) from test where state = 3;
select count(*) from test where state = 4;
total took 626 msecWhy ??
thanks,
kathy---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
On Fri, 3 Oct 2003, Kathy Zhu wrote:
Hi,
I notices a weird thing here.
version 7.2.1
on Solaristable "test", has a field "state".
There are 4 "state" values, 1, 2, 3, 4.select count(*) from test group by state;
took 11500 msecbut
select count(*) from test where state = 1;
select count(*) from test where state = 2;
select count(*) from test where state = 3;
select count(*) from test where state = 4;
total took 626 msec
I believe Solaris's qsort implementation (which is used for the group by)
has problems dealing with large numbers of similar values.
I think in later versions of pg our own qsort is used.
Kris Jurka
I did a vacuum and got the same result.
I think the problem lies in there is swapping going for groupby when there is a
large number of rows in the table, 5000 in this case.
I guess I have to use group by with caution.
thanks for the all the replies though,
kathy
Date: Fri, 03 Oct 2003 15:53:02 -0700
From: Dennis Gearon <gearond@fireserve.net>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4)
Gecko/20030624
Show quoted text
X-Accept-Language: en-us, ru, es-mx
To: Kathy Zhu <Kathy.Zhu@sun.com>
CC: pgsql-general@postgresql.org
Subject: Re: [GENERAL] group by
Content-Transfer-Encoding: 7bitWhen was the last time you vacuumed full?
Kathy Zhu wrote:
Hi,
I notices a weird thing here.
version 7.2.1
on Solaristable "test", has a field "state".
There are 4 "state" values, 1, 2, 3, 4.select count(*) from test group by state;
took 11500 msecbut
select count(*) from test where state = 1;
select count(*) from test where state = 2;
select count(*) from test where state = 3;
select count(*) from test where state = 4;
total took 626 msecWhy ??
thanks,
kathy---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
Import Notes
Resolved by subject fallback
On Fri, Oct 03, 2003 at 17:32:22 -0600,
Kathy Zhu <Kathy.Zhu@Sun.COM> wrote:
I did a vacuum and got the same result.
I think the problem lies in there is swapping going for groupby when there is a
large number of rows in the table, 5000 in this case.I guess I have to use group by with caution.
In 7.4 you will probably find the results more compatible.
For 7.3 and lower, the first case will use a sort to do a group by.
No sort will be done in the second example.
In 7.4 the groub by will use the new hash aggregate method and will
probably be significantly faster than the second way of doing things.
That was also one of my guesses why groupby takes longer, although it is not
mentioned in the doc.
thanks !!!
kathy
Date: Sat, 4 Oct 2003 11:23:48 -0500
From: Bruno Wolff III <bruno@wolff.to>
To: Kathy Zhu <Kathy.Zhu@sun.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] group by
Mail-Followup-To: Kathy Zhu <Kathy.Zhu@Sun.COM>, pgsql-general@postgresql.org
Content-Disposition: inline
User-Agent: Mutt/1.5.4iOn Fri, Oct 03, 2003 at 17:32:22 -0600,
Kathy Zhu <Kathy.Zhu@Sun.COM> wrote:I did a vacuum and got the same result.
I think the problem lies in there is swapping going for groupby when there
is a
Show quoted text
large number of rows in the table, 5000 in this case.
I guess I have to use group by with caution.
In 7.4 you will probably find the results more compatible.
For 7.3 and lower, the first case will use a sort to do a group by.
No sort will be done in the second example.
In 7.4 the groub by will use the new hash aggregate method and will
probably be significantly faster than the second way of doing things.
Import Notes
Resolved by subject fallback
On Mon, Oct 06, 2003 at 09:23:05 -0600,
Kathy Zhu <Kathy.Zhu@sun.com> wrote:
That was also one of my guesses why groupby takes longer, although it is not
mentioned in the doc.
If you want to know how a query is being done, you can use explain
(or explain analyze if you want timing information) to see the
plan.