Thread: An issue with max() and order by ... limit 1 in postgresql8.3-beta3
Hi everyone,
I am using postgresql 8.3-beta3. I have a table 'test' with three fields:
sid data date
1 1.1 2009-09-01 1:00:00
1 2.1 2010-01-01 1:00:20
2 3.1 2009-09-01 1:00:10
2 0.1 2010-01-01 1:00:30
I create index for data field.
Each sid may have millions of rows.
I want to get maximum data value and corresponding "time" for each group of sid. Here is my query:
########################
select t1.sid , max(t1.data)
(select t2.date
from test t2,
where t2.sid = t1.sid and
t2.date between '2009-08-01' and '2010-01-02' and
order by t2.data DESC limit 1
)
from test t1
where t1.date between '2009-08-01' and '2010-01-08' and
group by t1.sid
##########################
But when max() in postgresql may slow down the search when there are millions of rows for each sid.
So I use " order by t2.data DESC limit 1" to find max:
########################
select t1.sid ,
(select t2.data
from test t2,
where t2.sid = t1.sid and
t2.date between '2009-08-01' and '2010-01-02' and
order by t2.data DESC limit 1
) (select t2.date
from test t2,
where t2.sid = t1.sid and
t2.date between '2009-08-01' and '2010-01-02' and
order by t2.data DESC limit 1
)
from test t1
where t1.date between '2009-08-01' and '2010-01-08' and
group by t1.sid
##########################
The second query looks "strange" since similar search is done twice.
Because of two fields, the following can not be used directly in the above query.
(select t2.date, t2.data
from test t2,
where t2.sid = t1.sid and
t2.date between '2009-08-01' and '2010-01-02' and
order by t2.data DESC limit 1
)
Any suggestions for the best way to get maximum data value and corresponding "time" for each group of sid in my case?
Thanks a lot.
ouyang
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3
From
Raymond O'Donnell
Date:
On 09/01/2010 16:43, zxo102 ouyang wrote: > Hi everyone, > I am using postgresql 8.3-beta3. I have a table 'test' with three fields: Without meaning to sound unhelpful, why on earth are you using a beta version when 8.3 was released *ages* ago and has had several bug-fix updates since? I'd look first in the release notes for the updates to see if there's anything that addresses your problem, and then upgrade to the latest release version at the first opportunity. Ray. -- Raymond O'Donnell :: Galway :: Ireland rod@iol.ie
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3
From
Andreas Kretschmer
Date:
zxo102 ouyang <zxo102@gmail.com> wrote: > Hi everyone, > I am using postgresql 8.3-beta3. I have a table 'test' with three fields: I'm guessing you mean 8.4-beta3, right? > Any suggestions for the best way to get maximum data value and corresponding > "time" for each group of sid in my case? Based on your data: test=*# select * from test; sid | data | date -----+------+--------------------- 1 | 1.1 | 2009-09-01 01:00:00 1 | 2.1 | 2010-01-01 01:00:20 2 | 3.1 | 2009-09-01 01:00:10 2 | 0.1 | 2010-01-01 01:00:30 (4 Zeilen) Try: Zeit: 0,227 ms test=*# select distinct on (sid) sid, data, date from test order by sid, data desc, date; sid | data | date -----+------+--------------------- 1 | 2.1 | 2010-01-01 01:00:20 2 | 3.1 | 2009-09-01 01:00:10 (2 Zeilen) Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3
From
Stefan Kaltenbrunner
Date:
Andreas Kretschmer wrote: > zxo102 ouyang <zxo102@gmail.com> wrote: > >> Hi everyone, >> I am using postgresql 8.3-beta3. I have a table 'test' with three fields: > > I'm guessing you mean 8.4-beta3, right? either of those are unsuitable for any kind of production use... Stefan
Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3
From
Andreas Kretschmer
Date:
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote: > Andreas Kretschmer wrote: >> zxo102 ouyang <zxo102@gmail.com> wrote: >> >>> Hi everyone, I am using postgresql 8.3-beta3. I have a table >>> 'test' with three fields: >> >> I'm guessing you mean 8.4-beta3, right? > > either of those are unsuitable for any kind of production use... Hey, we needs beta-testers, right? And yes, read again, the table is called 'test' ... Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Sat, Jan 9, 2010 at 2:46 PM, Andreas Kretschmer <akretschmer@spamfence.net> wrote: > Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote: > >> Andreas Kretschmer wrote: >>> zxo102 ouyang <zxo102@gmail.com> wrote: >>> >>>> Hi everyone, I am using postgresql 8.3-beta3. I have a table >>>> 'test' with three fields: >>> >>> I'm guessing you mean 8.4-beta3, right? >> >> either of those are unsuitable for any kind of production use... > > Hey, we needs beta-testers, right? And yes, read again, the table is > called 'test' ... True, but if you're gonna test betas / alphas, I'd think 8.5 alpha would be the choice for testing. 8.4's beta ended quite some time ago.
Thanks for your guys' help. I did not know the meaning of betas / alphas things before and just simply downloaded one to use. Now my 8.3 -beta3 version is really in production and get 100 rows of monitoring data per minutes. So far so good. Anyway, I will upgrade it to a latest stable version.
With Andreas's example query:
select distinct on (sid) sid, data, date from test order by sid, data desc, date;
I rewrite my query. It just take 20 seconds to finish the search and is much better than my old query which takes 400 seconds to return the results. Thanks again for Andreas's example query. For reference, the following is my new query (I create an index for two fileds (rd.sensor_id,rd.sensor_channel)).
Thanks a lot gain.
Ouyang
#############################################
select rt_data.r_flowmeter_caliber as r_flowmeter_caliber,
rt_data.r_max01_sloc as r_max01_sloc,
rt_data.r_max01_sdata as r_max01_sdata,
rt_data.r_max01_sdate as r_max01_sdate,
rt_data.r_min01_sdata as r_min01_sdata,
rt_data.r_min01_sdate as r_min01_sdate,
rt_data.r_avg01_sdata as r_avg01_sdata,
acc_data.r_end_sdate as r_end_sdate,
acc_data.r_end_sdata as r_end_sdata,
acc_data.r_start_sdate as r_start_sdate,
acc_data.r_start_sdata as r_start_sdata,
acc_data.r_acc_sdata as r_acc_sdata
from ( select ec.flowmeter_caliber as r_flowmeter_caliber,
max01.r_sloc as r_max01_sloc,
round(max01.r_sdata*100)/100 as r_max01_sdata,
max01.r_sdate as r_max01_sdate,
round(min01.r_sdata*100)/100 as r_min01_sdata,
min01.r_sdate as r_min01_sdate,
round(avg01.r_sdata*100)/100 as r_avg01_sdata,
max01.r_channel as r_channel,
max01.r_sid as r_sid,
max01.r_sloc as r_sloc
from (select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc, energy_classification02 ec
where rd.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.instantaneous_channel and
sc.channel = rd.sensor_channel and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel, rd.sensor_data DESC, rd.sensor_date
) max01,
( select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc, energy_classification02 ec
where rd.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.instantaneous_channel and
sc.channel = rd.sensor_channel and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel, rd.sensor_data ASC, rd.sensor_date
) min01,
( select avg(rd01.sensor_data) as r_sdata,
rd01.sensor_id as r_sid,
rd01.sensor_channel as r_channel
from record_data rd01,
sensor_cfg sc,
energy_classification02 ec
where rd01.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd01.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.instantaneous_channel and
sc.channel=rd01.sensor_channel and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
group by rd01.sensor_id,rd01.sensor_channel
) avg01,
energy_classification02 ec,
sensor_cfg sc
where max01.r_sid=min01.r_sid and
min01.r_sid=avg01.r_sid and
max01.r_sid=sc.sensor_id and
sc.channel = ec.instantaneous_channel and
sc.channel= min01.r_channel and
sc.channel= max01.r_channel and
sc.channel=avg01.r_channel and
sc.external_ins=ec.measure_name and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
) rt_data,
(select round(max01.r_sdata-min01.r_sdata)*100/100 as r_acc_sdata,
max01.r_sid as r_sid,
max01.r_sloc as r_sloc,
max01.r_sdate as r_end_sdate,
max01.r_sdata as r_end_sdata,
min01.r_sdate as r_start_sdate,
min01.r_sdata as r_start_sdata
from
(select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc, energy_classification02 ec
where rd.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.cumulative_channel and
sc.channel = rd.sensor_channel and
sc.remarks='累积值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel, rd.sensor_data DESC, rd.sensor_date
) max01,
(select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc, energy_classification02 ec
where rd.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.cumulative_channel and
sc.channel = rd.sensor_channel and
sc.remarks='累积值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel, rd.sensor_data ASC, rd.sensor_date
) min01,
energy_classification02 ec,
sensor_cfg sc
where max01.r_sid=min01.r_sid and
max01.r_sid=sc.sensor_id and
sc.channel = ec.cumulative_channel and
sc.channel= min01.r_channel and
sc.channel=max01.r_channel and
sc.external_ins=ec.measure_name and
sc.remarks='累积值' and
ec.flowmeter_caliber='流量'
) acc_data
where acc_data.r_sloc = rt_data.r_sloc
order by r_max01_sloc desc
rt_data.r_max01_sloc as r_max01_sloc,
rt_data.r_max01_sdata as r_max01_sdata,
rt_data.r_max01_sdate as r_max01_sdate,
rt_data.r_min01_sdata as r_min01_sdata,
rt_data.r_min01_sdate as r_min01_sdate,
rt_data.r_avg01_sdata as r_avg01_sdata,
acc_data.r_end_sdate as r_end_sdate,
acc_data.r_end_sdata as r_end_sdata,
acc_data.r_start_sdate as r_start_sdate,
acc_data.r_start_sdata as r_start_sdata,
acc_data.r_acc_sdata as r_acc_sdata
from ( select ec.flowmeter_caliber as r_flowmeter_caliber,
max01.r_sloc as r_max01_sloc,
round(max01.r_sdata*100)/100 as r_max01_sdata,
max01.r_sdate as r_max01_sdate,
round(min01.r_sdata*100)/100 as r_min01_sdata,
min01.r_sdate as r_min01_sdate,
round(avg01.r_sdata*100)/100 as r_avg01_sdata,
max01.r_channel as r_channel,
max01.r_sid as r_sid,
max01.r_sloc as r_sloc
from (select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc, energy_classification02 ec
where rd.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.instantaneous_channel and
sc.channel = rd.sensor_channel and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel, rd.sensor_data DESC, rd.sensor_date
) max01,
( select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc, energy_classification02 ec
where rd.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.instantaneous_channel and
sc.channel = rd.sensor_channel and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel, rd.sensor_data ASC, rd.sensor_date
) min01,
( select avg(rd01.sensor_data) as r_sdata,
rd01.sensor_id as r_sid,
rd01.sensor_channel as r_channel
from record_data rd01,
sensor_cfg sc,
energy_classification02 ec
where rd01.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd01.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.instantaneous_channel and
sc.channel=rd01.sensor_channel and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
group by rd01.sensor_id,rd01.sensor_channel
) avg01,
energy_classification02 ec,
sensor_cfg sc
where max01.r_sid=min01.r_sid and
min01.r_sid=avg01.r_sid and
max01.r_sid=sc.sensor_id and
sc.channel = ec.instantaneous_channel and
sc.channel= min01.r_channel and
sc.channel= max01.r_channel and
sc.channel=avg01.r_channel and
sc.external_ins=ec.measure_name and
sc.remarks='瞬时值' and
ec.flowmeter_caliber='流量'
) rt_data,
(select round(max01.r_sdata-min01.r_sdata)*100/100 as r_acc_sdata,
max01.r_sid as r_sid,
max01.r_sloc as r_sloc,
max01.r_sdate as r_end_sdate,
max01.r_sdata as r_end_sdata,
min01.r_sdate as r_start_sdate,
min01.r_sdata as r_start_sdata
from
(select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc, energy_classification02 ec
where rd.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.cumulative_channel and
sc.channel = rd.sensor_channel and
sc.remarks='累积值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel, rd.sensor_data DESC, rd.sensor_date
) max01,
(select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
rd.sensor_channel as r_channel,
rd.sensor_data as r_sdata,
rd.sensor_date as r_sdate,
sc.external_ins as r_sloc
from record_data rd, sensor_cfg sc, energy_classification02 ec
where rd.sensor_date between '2009-08-01' and '2010-01-08' and
sc.sensor_id = rd.sensor_id and
sc.external_ins=ec.measure_name and
sc.channel = ec.cumulative_channel and
sc.channel = rd.sensor_channel and
sc.remarks='累积值' and
ec.flowmeter_caliber='流量'
order by rd.sensor_id,rd.sensor_channel, rd.sensor_data ASC, rd.sensor_date
) min01,
energy_classification02 ec,
sensor_cfg sc
where max01.r_sid=min01.r_sid and
max01.r_sid=sc.sensor_id and
sc.channel = ec.cumulative_channel and
sc.channel= min01.r_channel and
sc.channel=max01.r_channel and
sc.external_ins=ec.measure_name and
sc.remarks='累积值' and
ec.flowmeter_caliber='流量'
) acc_data
where acc_data.r_sloc = rt_data.r_sloc
order by r_max01_sloc desc
##########################################
2010/1/10 Scott Marlowe <scott.marlowe@gmail.com>
On Sat, Jan 9, 2010 at 2:46 PM, Andreas KretschmerTrue, but if you're gonna test betas / alphas, I'd think 8.5 alpha
<akretschmer@spamfence.net> wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
>
>> Andreas Kretschmer wrote:
>>> zxo102 ouyang <zxo102@gmail.com> wrote:
>>>
>>>> Hi everyone, I am using postgresql 8.3-beta3. I have a table
>>>> 'test' with three fields:
>>>
>>> I'm guessing you mean 8.4-beta3, right?
>>
>> either of those are unsuitable for any kind of production use...
>
> Hey, we needs beta-testers, right? And yes, read again, the table is
> called 'test' ...
would be the choice for testing. 8.4's beta ended quite some time
ago.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
In response to Scott Marlowe : > On Sat, Jan 9, 2010 at 2:46 PM, Andreas Kretschmer > <akretschmer@spamfence.net> wrote: > > Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote: > > > >> Andreas Kretschmer wrote: > >>> zxo102 ouyang <zxo102@gmail.com> wrote: > >>> > >>>> Hi everyone, I am using postgresql 8.3-beta3. I have a table > >>>> 'test' with three fields: > >>> > >>> I'm guessing you mean 8.4-beta3, right? > >> > >> either of those are unsuitable for any kind of production use... > > > > Hey, we needs beta-testers, right? And yes, read again, the table is > > called 'test' ... > > True, but if you're gonna test betas / alphas, I'd think 8.5 alpha > would be the choice for testing. 8.4's beta ended quite some time > ago. I'm stupid, i meant 8.5 ... Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99