An issue with max() and order by ... limit 1 in postgresql8.3-beta3

Started by zxo102 ouyang, about 16 years ago, 8 messages, general
#1 zxo102 ouyang
zxo102@gmail.com

Hi everyone,
I am using postgresql 8.3-beta3. I have a table 'test' with three fields:
sid  data  date
1    1.1   2009-09-01 1:00:00
1    2.1   2010-01-01 1:00:20
2    3.1   2009-09-01 1:00:10
2    0.1   2010-01-01 1:00:30

I created an index on the data field.
Each sid may have millions of rows.
I want to get the maximum data value and the corresponding "time" (the date
field) for each group of sid. Here is my query:
########################
select t1.sid, max(t1.data),
  (select t2.date
   from test t2
   where t2.sid = t1.sid and
         t2.date between '2009-08-01' and '2010-01-02'
   order by t2.data DESC limit 1
  )
from test t1
where t1.date between '2009-08-01' and '2010-01-08'
group by t1.sid
##########################
But max() in PostgreSQL may slow down the search when there are
millions of rows for each sid.
So I use "order by t2.data DESC limit 1" to find the maximum instead:
########################
select t1.sid,
  (select t2.data
   from test t2
   where t2.sid = t1.sid and
         t2.date between '2009-08-01' and '2010-01-02'
   order by t2.data DESC limit 1
  ),
  (select t2.date
   from test t2
   where t2.sid = t1.sid and
         t2.date between '2009-08-01' and '2010-01-02'
   order by t2.data DESC limit 1
  )
from test t1
where t1.date between '2009-08-01' and '2010-01-08'
group by t1.sid
##########################
The second query looks "strange" because the same search is done twice.
Because it returns two fields, the following subquery cannot be used
directly in the above query:

(select t2.date, t2.data
 from test t2
 where t2.sid = t1.sid and
       t2.date between '2009-08-01' and '2010-01-02'
 order by t2.data DESC limit 1
)

Any suggestions for the best way to get maximum data value and corresponding
"time" for each group of sid in my case?

Thanks a lot.

ouyang

#2 Raymond O'Donnell
rod@iol.ie
In reply to: zxo102 ouyang (#1)
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

On 09/01/2010 16:43, zxo102 ouyang wrote:

Hi everyone,
I am using postgresql 8.3-beta3. I have a table 'test' with three fields:

Without meaning to sound unhelpful, why on earth are you using a beta
version when 8.3 was released *ages* ago and has had several bug-fix
updates since?

I'd look first in the release notes for the updates to see if there's
anything that addresses your problem, and then upgrade to the latest
release version at the first opportunity.

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie

#3 Andreas Kretschmer
akretschmer@spamfence.net
In reply to: zxo102 ouyang (#1)
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

zxo102 ouyang <zxo102@gmail.com> wrote:

Hi everyone,
I am using postgresql 8.3-beta3. I have a table 'test' with three fields:

I'm guessing you mean 8.4-beta3, right?

Any suggestions for the best way to get maximum data value and corresponding
"time" for each group of sid in my case?

Based on your data:

test=*# select * from test;
 sid | data |        date
-----+------+---------------------
   1 |  1.1 | 2009-09-01 01:00:00
   1 |  2.1 | 2010-01-01 01:00:20
   2 |  3.1 | 2009-09-01 01:00:10
   2 |  0.1 | 2010-01-01 01:00:30
(4 rows)

Time: 0.227 ms

Try:

test=*# select distinct on (sid) sid, data, date from test order by sid, data desc, date;
 sid | data |        date
-----+------+---------------------
   1 |  2.1 | 2010-01-01 01:00:20
   2 |  3.1 | 2009-09-01 01:00:10
(2 rows)
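
To reproduce this, here is a minimal sketch. The column types are an assumption (the thread never states them), and the date range from the original question has been added as a filter. DISTINCT ON keeps the first row of each sid group in ORDER BY order, so sorting by data DESC picks the row with the maximum data, and that row carries its own date. The trailing date in the ORDER BY only decides ties, preferring the earliest date.

```sql
-- Assumed schema: the thread does not give the column types.
CREATE TABLE test (
    sid  integer,
    data numeric,
    date timestamp
);

-- The four sample rows from the original question.
INSERT INTO test (sid, data, date) VALUES
    (1, 1.1, '2009-09-01 01:00:00'),
    (1, 2.1, '2010-01-01 01:00:20'),
    (2, 3.1, '2009-09-01 01:00:10'),
    (2, 0.1, '2010-01-01 01:00:30');

-- One row per sid: the row with the largest data value in the date range.
-- The leading ORDER BY column must match the DISTINCT ON expression.
SELECT DISTINCT ON (sid) sid, data, date
FROM test
WHERE date BETWEEN '2009-08-01' AND '2010-01-08'
ORDER BY sid, data DESC, date;

-- Same pattern for the minimum: sort data ascending instead.
SELECT DISTINCT ON (sid) sid, data, date
FROM test
WHERE date BETWEEN '2009-08-01' AND '2010-01-08'
ORDER BY sid, data ASC, date;
```

On the four sample rows, the first SELECT returns the same two rows as the query above, since every row falls inside the date range.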

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°

#4 Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Andreas Kretschmer (#3)
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

Andreas Kretschmer wrote:

zxo102 ouyang <zxo102@gmail.com> wrote:

Hi everyone,
I am using postgresql 8.3-beta3. I have a table 'test' with three fields:

I'm guessing you mean 8.4-beta3, right?

either of those are unsuitable for any kind of production use...

Stefan

#5 Andreas Kretschmer
akretschmer@spamfence.net
In reply to: Stefan Kaltenbrunner (#4)
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:

Andreas Kretschmer wrote:

zxo102 ouyang <zxo102@gmail.com> wrote:

Hi everyone, I am using postgresql 8.3-beta3. I have a table
'test' with three fields:

I'm guessing you mean 8.4-beta3, right?

either of those are unsuitable for any kind of production use...

Hey, we need beta-testers, right? And yes, read again, the table is
called 'test' ...

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°

#6 Scott Marlowe
scott.marlowe@gmail.com
In reply to: Andreas Kretschmer (#5)
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

On Sat, Jan 9, 2010 at 2:46 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:

Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:

Andreas Kretschmer wrote:

zxo102 ouyang <zxo102@gmail.com> wrote:

Hi everyone, I am using postgresql 8.3-beta3. I have a table
'test' with three fields:

I'm guessing you mean 8.4-beta3, right?

either of those are unsuitable for any kind of production use...

Hey, we need beta-testers, right? And yes, read again, the table is
called 'test' ...

True, but if you're gonna test betas / alphas, I'd think 8.5 alpha
would be the choice for testing. 8.4's beta ended quite some time
ago.

#7 zxo102 ouyang
zxo102@gmail.com
In reply to: Scott Marlowe (#6)
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

Thanks for your help, guys. I did not know what betas / alphas meant
before and simply downloaded one to use. Now my 8.3-beta3 installation
is actually in production and receives 100 rows of monitoring data per
minute. So far so good. Anyway, I will upgrade it to the latest stable
version.
Following Andreas's example query:
select distinct on (sid) sid, data, date from test order by sid, data desc, date;
I rewrote my query. It now takes just 20 seconds to finish the search, which
is much better than my old query, which took 400 seconds to return the results.
Thanks again for Andreas's example query. For reference, the following is
my new query (I created an index on the two fields
(rd.sensor_id, rd.sensor_channel)).

Thanks a lot again.

Ouyang

#############################################
select rt_data.r_flowmeter_caliber as r_flowmeter_caliber,
rt_data.r_max01_sloc as r_max01_sloc,
rt_data.r_max01_sdata as r_max01_sdata,
rt_data.r_max01_sdate as r_max01_sdate,
rt_data.r_min01_sdata as r_min01_sdata,
rt_data.r_min01_sdate as r_min01_sdate,
rt_data.r_avg01_sdata as r_avg01_sdata,
acc_data.r_end_sdate as r_end_sdate,
acc_data.r_end_sdata as r_end_sdata,
acc_data.r_start_sdate as r_start_sdate,
acc_data.r_start_sdata as r_start_sdata,
acc_data.r_acc_sdata as r_acc_sdata
from ( select ec.flowmeter_caliber as r_flowmeter_caliber,
max01.r_sloc as r_max01_sloc,
round(max01.r_sdata*100)/100 as r_max01_sdata,
max01.r_sdate as r_max01_sdate,
round(min01.r_sdata*100)/100 as r_min01_sdata,
min01.r_sdate as r_min01_sdate,
round(avg01.r_sdata*100)/100 as r_avg01_sdata,
max01.r_channel as r_channel,
max01.r_sid as r_sid,
max01.r_sloc as r_sloc
from (select distinct on (rd.sensor_id,rd.sensor_channel)
rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc,
energy_classification02 ec
where rd.sensor_date between '2009-08-01'
and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.instantaneous_channel and
sc.channel = rd.sensor_channel and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel,
rd.sensor_data DESC, rd.sensor_date
) max01,
( select distinct on (rd.sensor_id,rd.sensor_channel)
rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc,
energy_classification02 ec
where rd.sensor_date between '2009-08-01'
and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.instantaneous_channel and
sc.channel = rd.sensor_channel and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel,
rd.sensor_data ASC, rd.sensor_date
) min01,
( select avg(rd01.sensor_data) as r_sdata,
rd01.sensor_id as r_sid,
rd01.sensor_channel as r_channel
from record_data rd01,
sensor_cfg sc,
energy_classification02 ec
where rd01.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd01.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.instantaneous_channel and
sc.channel=rd01.sensor_channel and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
group by rd01.sensor_id,rd01.sensor_channel
) avg01,
energy_classification02 ec,
sensor_cfg sc
where max01.r_sid=min01.r_sid and
min01.r_sid=avg01.r_sid and
max01.r_sid=sc.sensor_id and
sc.channel = ec.instantaneous_channel and
sc.channel= min01.r_channel and
sc.channel= max01.r_channel and
sc.channel=avg01.r_channel and
sc.external_ins=ec.measure_name and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
) rt_data,
(select round((max01.r_sdata-min01.r_sdata)*100)/100 as r_acc_sdata,
max01.r_sid as r_sid,
max01.r_sloc as r_sloc,
max01.r_sdate as r_end_sdate,
max01.r_sdata as r_end_sdata,
min01.r_sdate as r_start_sdate,
min01.r_sdata as r_start_sdata
from
(select distinct on (rd.sensor_id,rd.sensor_channel)
rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc,
energy_classification02 ec
where rd.sensor_date between '2009-08-01'
and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.cumulative_channel and
sc.channel = rd.sensor_channel and
sc.remarks='累积值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel,
rd.sensor_data DESC, rd.sensor_date
) max01,
(select distinct on (rd.sensor_id,rd.sensor_channel)
rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc,
energy_classification02 ec
where rd.sensor_date between '2009-08-01' and
'2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.cumulative_channel and
sc.channel = rd.sensor_channel and
sc.remarks='累积值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel,
rd.sensor_data ASC, rd.sensor_date
) min01,
energy_classification02 ec,
sensor_cfg sc
where max01.r_sid=min01.r_sid and
max01.r_sid=sc.sensor_id and
sc.channel = ec.cumulative_channel and
sc.channel= min01.r_channel and
sc.channel=max01.r_channel and
sc.external_ins=ec.measure_name and
sc.remarks='累积值' and
ec.flowmeter_caliber='流量'
) acc_data
where acc_data.r_sloc = rt_data.r_sloc
order by r_max01_sloc desc
##########################################
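
The message mentions an index on (rd.sensor_id, rd.sensor_channel) but does not show its definition. A minimal sketch of what it might look like follows; both index names are hypothetical. The second, wider index is an assumption and not something reported in this thread: it mirrors the full ORDER BY of the max01 subquery, which may or may not let the planner skip a separate sort. Comparing EXPLAIN ANALYZE output with and without it would be the way to find out.

```sql
-- Hypothetical index name; the columns are the two named in the message.
CREATE INDEX record_data_sensor_idx
    ON record_data (sensor_id, sensor_channel);

-- Assumption, not from the thread: an index whose column order and sort
-- direction match the ORDER BY of the max01 subquery
-- (sensor_id, sensor_channel, sensor_data DESC, sensor_date).
CREATE INDEX record_data_sensor_max_idx
    ON record_data (sensor_id, sensor_channel, sensor_data DESC, sensor_date);

-- Check which plan is actually chosen for the per-sensor maximum.
EXPLAIN ANALYZE
SELECT DISTINCT ON (rd.sensor_id, rd.sensor_channel)
       rd.sensor_id, rd.sensor_channel, rd.sensor_data, rd.sensor_date
FROM record_data rd
WHERE rd.sensor_date BETWEEN '2009-08-01' AND '2010-01-08'
ORDER BY rd.sensor_id, rd.sensor_channel, rd.sensor_data DESC, rd.sensor_date;
```

The EXPLAIN ANALYZE statement leaves out the joins to sensor_cfg and energy_classification02 to keep the sketch short, so its timings will not match the full query.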

#8 A. Kretschmer
andreas.kretschmer@schollglas.com
In reply to: Scott Marlowe (#6)
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

In response to Scott Marlowe :

On Sat, Jan 9, 2010 at 2:46 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:

Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:

Andreas Kretschmer wrote:

zxo102 ouyang <zxo102@gmail.com> wrote:

Hi everyone, I am using postgresql 8.3-beta3. I have a table
'test' with three fields:

I'm guessing you mean 8.4-beta3, right?

either of those are unsuitable for any kind of production use...

Hey, we need beta-testers, right? And yes, read again, the table is
called 'test' ...

True, but if you're gonna test betas / alphas, I'd think 8.5 alpha
would be the choice for testing. 8.4's beta ended quite some time
ago.

I'm stupid, I meant 8.5 ...

Andreas
--
Andreas Kretschmer
Contact: Heynitz: 035242/47150, D1: 0160/7141639 (more: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99