Thread: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

An issue with max() and order by ... limit 1 in postgresql8.3-beta3

From
zxo102 ouyang
Date:
Hi everyone, 
   I am using postgresql 8.3-beta3. I have a table 'test' with three fields:
     sid     data                     date
      1        1.1                  2009-09-01 1:00:00
      1        2.1                  2010-01-01 1:00:20
      2        3.1                  2009-09-01 1:00:10
      2        0.1                  2010-01-01 1:00:30

I create index for data field. 
Each sid may have millions of rows.
I want to get maximum data value and corresponding "time" for each group of sid. Here is my query:
########################
select  t1.sid ,  max(t1.data)  
          (select t2.date 
    from test t2, 
    where t2.sid = t1.sid and 
   t2.date between '2009-08-01' and '2010-01-02' and 
              order by t2.data DESC limit 1
           ) 
 from test t1 
 where  t1.date between '2009-08-01' and '2010-01-08' and 
 group by t1.sid
##########################
But when max() in postgresql may slow down the search when there are millions of rows for each sid. 
So I use " order by t2.data DESC limit 1" to find max:
########################
select  t1.sid ,
           (select t2.data 
    from test t2, 
    where t2.sid = t1.sid and 
   t2.date between '2009-08-01' and '2010-01-02' and 
              order by t2.data DESC limit 1
           )  
          (select t2.date 
    from test t2, 
    where t2.sid = t1.sid and 
   t2.date between '2009-08-01' and '2010-01-02' and 
              order by t2.data DESC limit 1
           ) 
 from test t1 
 where  t1.date between '2009-08-01' and '2010-01-08' and 
 group by t1.sid
##########################
The second query looks "strange" since similar search is done twice. 
Because of two fields, the following can not be used directly in the above query.

          (select t2.date, t2.data 
    from test t2, 
    where t2.sid = t1.sid and 
   t2.date between '2009-08-01' and '2010-01-02' and 
              order by t2.data DESC limit 1
           ) 

Any suggestions for the best way to get maximum data value and corresponding "time" for each group of sid in my case?

Thanks a lot.

ouyang

Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

From
Raymond O'Donnell
Date:
On 09/01/2010 16:43, zxo102 ouyang wrote:
> Hi everyone,
>    I am using postgresql 8.3-beta3. I have a table 'test' with three fields:

Without meaning to sound unhelpful, why on earth are you using a beta
version when 8.3 was released *ages* ago and has had several bug-fix
updates since?

I'd look first in the release notes for the updates to see if there's
anything that addresses your problem, and then upgrade to the latest
release version at the first opportunity.

Ray.


--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie

Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

From
Andreas Kretschmer
Date:
zxo102 ouyang <zxo102@gmail.com> wrote:

> Hi everyone,
>    I am using postgresql 8.3-beta3. I have a table 'test' with three fields:

I'm guessing you mean 8.4-beta3, right?


> Any suggestions for the best way to get maximum data value and corresponding
> "time" for each group of sid in my case?

Based on your data:

test=*# select * from test;
 sid | data |        date
-----+------+---------------------
   1 |  1.1 | 2009-09-01 01:00:00
   1 |  2.1 | 2010-01-01 01:00:20
   2 |  3.1 | 2009-09-01 01:00:10
   2 |  0.1 | 2010-01-01 01:00:30
(4 Zeilen)


Try:


Zeit: 0,227 ms
test=*# select distinct on (sid) sid, data, date from test order by sid, data desc, date;
 sid | data |        date
-----+------+---------------------
   1 |  2.1 | 2010-01-01 01:00:20
   2 |  3.1 | 2009-09-01 01:00:10
(2 Zeilen)




Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

From
Stefan Kaltenbrunner
Date:
Andreas Kretschmer wrote:
> zxo102 ouyang <zxo102@gmail.com> wrote:
>
>> Hi everyone,
>>    I am using postgresql 8.3-beta3. I have a table 'test' with three fields:
>
> I'm guessing you mean 8.4-beta3, right?

either of those are unsuitable for any kind of production use...



Stefan

Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

From
Andreas Kretschmer
Date:
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:

> Andreas Kretschmer wrote:
>> zxo102 ouyang <zxo102@gmail.com> wrote:
>>
>>> Hi everyone,    I am using postgresql 8.3-beta3. I have a table
>>> 'test' with three fields:
>>
>> I'm guessing you mean 8.4-beta3, right?
>
> either of those are unsuitable for any kind of production use...

Hey, we needs beta-testers, right? And yes, read again, the table is
called 'test' ...


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

From
Scott Marlowe
Date:
On Sat, Jan 9, 2010 at 2:46 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
>
>> Andreas Kretschmer wrote:
>>> zxo102 ouyang <zxo102@gmail.com> wrote:
>>>
>>>> Hi everyone,    I am using postgresql 8.3-beta3. I have a table
>>>> 'test' with three fields:
>>>
>>> I'm guessing you mean 8.4-beta3, right?
>>
>> either of those are unsuitable for any kind of production use...
>
> Hey, we needs beta-testers, right? And yes, read again, the table is
> called 'test' ...

True, but if you're gonna test betas / alphas, I'd think 8.5 alpha
would be the choice for testing.  8.4's beta ended quite some time
ago.

Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

From
zxo102 ouyang
Date:
Thanks for your guys' help. I did not know the meaning of betas / alphas things before and just simply downloaded one to use. Now my 8.3 -beta3 version is really in production and get 100 rows of monitoring data per minutes. So far so good. Anyway, I will upgrade it to a latest stable version. 
With Andreas's example query:
    select distinct on (sid) sid, data, date from test order by sid, data desc, date;
I rewrite my query. It just take 20 seconds to finish the search and is much better than  my old query which takes 400 seconds to return the results. Thanks again for Andreas's example query.   For reference, the following is my new query (I create an index for two fileds (rd.sensor_id,rd.sensor_channel)).
 
Thanks  a lot gain.
 
Ouyang
 
#############################################
select rt_data.r_flowmeter_caliber as  r_flowmeter_caliber,
       rt_data.r_max01_sloc as r_max01_sloc,
       rt_data.r_max01_sdata as r_max01_sdata,
       rt_data.r_max01_sdate as r_max01_sdate,
       rt_data.r_min01_sdata as r_min01_sdata,
       rt_data.r_min01_sdate as r_min01_sdate,
       rt_data.r_avg01_sdata as r_avg01_sdata,
       acc_data.r_end_sdate as r_end_sdate,
       acc_data.r_end_sdata as r_end_sdata,
       acc_data.r_start_sdate as r_start_sdate,
       acc_data.r_start_sdata as r_start_sdata,
       acc_data.r_acc_sdata as r_acc_sdata
       from ( select ec.flowmeter_caliber as r_flowmeter_caliber,
         max01.r_sloc as r_max01_sloc,
         round(max01.r_sdata*100)/100 as r_max01_sdata,
         max01.r_sdate as r_max01_sdate,
         round(min01.r_sdata*100)/100 as r_min01_sdata,
         min01.r_sdate as r_min01_sdate,
         round(avg01.r_sdata*100)/100 as r_avg01_sdata,
         max01.r_channel as r_channel,
         max01.r_sid as r_sid,
         max01.r_sloc as r_sloc
         from (select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
                                 rd.sensor_channel as r_channel,
                          rd.sensor_data as r_sdata,
                          rd.sensor_date as r_sdate,
                                 sc.external_ins as r_sloc 
                              from record_data rd, sensor_cfg sc, energy_classification02 ec 
                              where   rd.sensor_date between '2009-08-01' and '2010-01-08' and
                                 sc.sensor_id = rd.sensor_id and
                               sc.external_ins=ec.measure_name and
                               sc.channel = ec.instantaneous_channel and
                               sc.channel = rd.sensor_channel and
                               sc.remarks='瞬时值' and
                               ec.flowmeter_caliber='流量'  
                             order by rd.sensor_id,rd.sensor_channel, rd.sensor_data DESC, rd.sensor_date
               ) max01,
         ( select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
                                 rd.sensor_channel as r_channel,
                          rd.sensor_data as r_sdata,
                          rd.sensor_date as r_sdate,
                                 sc.external_ins as r_sloc 
                              from record_data rd, sensor_cfg sc, energy_classification02 ec 
                              where   rd.sensor_date between '2009-08-01' and '2010-01-08' and
                                 sc.sensor_id = rd.sensor_id and
                               sc.external_ins=ec.measure_name and
                               sc.channel = ec.instantaneous_channel and
                               sc.channel = rd.sensor_channel and
                               sc.remarks='瞬时值' and
                               ec.flowmeter_caliber='流量'  
                             order by rd.sensor_id,rd.sensor_channel, rd.sensor_data ASC, rd.sensor_date
          ) min01,
         ( select avg(rd01.sensor_data) as r_sdata,
           rd01.sensor_id as r_sid,
           rd01.sensor_channel as r_channel
           from record_data rd01,
           sensor_cfg sc,
           energy_classification02 ec
           where rd01.sensor_date between '2009-08-01' and '2010-01-08' and
           sc.sensor_id = rd01.sensor_id and
           sc.external_ins=ec.measure_name and
           sc.channel = ec.instantaneous_channel and
           sc.channel=rd01.sensor_channel and
           sc.remarks='瞬时值' and
           ec.flowmeter_caliber='流量'
           group by rd01.sensor_id,rd01.sensor_channel
           ) avg01,
         energy_classification02 ec,
         sensor_cfg sc
         where  max01.r_sid=min01.r_sid and
         min01.r_sid=avg01.r_sid and
         max01.r_sid=sc.sensor_id and
         sc.channel = ec.instantaneous_channel and
         sc.channel= min01.r_channel and
         sc.channel= max01.r_channel and
         sc.channel=avg01.r_channel and
         sc.external_ins=ec.measure_name and
         sc.remarks='瞬时值' and
         ec.flowmeter_caliber='流量'
                ) rt_data,
         (select round(max01.r_sdata-min01.r_sdata)*100/100 as r_acc_sdata,
          max01.r_sid as r_sid,
          max01.r_sloc as r_sloc,
          max01.r_sdate as r_end_sdate,
          max01.r_sdata as r_end_sdata,
          min01.r_sdate as r_start_sdate,
          min01.r_sdata as r_start_sdata
          from
         (select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
                                 rd.sensor_channel as r_channel,
                          rd.sensor_data as r_sdata,
                          rd.sensor_date as r_sdate,
                                 sc.external_ins as r_sloc 
                              from record_data rd, sensor_cfg sc, energy_classification02 ec 
                              where   rd.sensor_date between '2009-08-01' and '2010-01-08' and
                                 sc.sensor_id = rd.sensor_id and
                               sc.external_ins=ec.measure_name and
                               sc.channel = ec.cumulative_channel and
                               sc.channel = rd.sensor_channel and
                               sc.remarks='累积值' and
                               ec.flowmeter_caliber='流量'  
                             order by rd.sensor_id,rd.sensor_channel, rd.sensor_data DESC, rd.sensor_date
               ) max01,
         (select distinct on (rd.sensor_id,rd.sensor_channel) rd.sensor_id as r_sid,
                                 rd.sensor_channel as r_channel,
                          rd.sensor_data as r_sdata,
                          rd.sensor_date as r_sdate,
                                 sc.external_ins as r_sloc 
                              from record_data rd, sensor_cfg sc, energy_classification02 ec 
                              where rd.sensor_date between '2009-08-01' and '2010-01-08' and
                                 sc.sensor_id = rd.sensor_id and
                               sc.external_ins=ec.measure_name and
                               sc.channel = ec.cumulative_channel and
                               sc.channel = rd.sensor_channel and
                               sc.remarks='累积值' and
                               ec.flowmeter_caliber='流量'  
                             order by rd.sensor_id,rd.sensor_channel, rd.sensor_data ASC, rd.sensor_date
                 ) min01,
          energy_classification02 ec,
          sensor_cfg sc
          where  max01.r_sid=min01.r_sid and
          max01.r_sid=sc.sensor_id and
          sc.channel = ec.cumulative_channel and
          sc.channel= min01.r_channel and
          sc.channel=max01.r_channel and
          sc.external_ins=ec.measure_name and
          sc.remarks='累积值' and
          ec.flowmeter_caliber='流量'
          ) acc_data
          where acc_data.r_sloc = rt_data.r_sloc
          order by  r_max01_sloc desc
##########################################
 
 

 
2010/1/10 Scott Marlowe <scott.marlowe@gmail.com>
On Sat, Jan 9, 2010 at 2:46 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
>
>> Andreas Kretschmer wrote:
>>> zxo102 ouyang <zxo102@gmail.com> wrote:
>>>
>>>> Hi everyone,    I am using postgresql 8.3-beta3. I have a table
>>>> 'test' with three fields:
>>>
>>> I'm guessing you mean 8.4-beta3, right?
>>
>> either of those are unsuitable for any kind of production use...
>
> Hey, we needs beta-testers, right? And yes, read again, the table is
> called 'test' ...

True, but if you're gonna test betas / alphas, I'd think 8.5 alpha
would be the choice for testing.  8.4's beta ended quite some time
ago.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: An issue with max() and order by ... limit 1 in postgresql8.3-beta3

From
"A. Kretschmer"
Date:
In response to Scott Marlowe :
> On Sat, Jan 9, 2010 at 2:46 PM, Andreas Kretschmer
> <akretschmer@spamfence.net> wrote:
> > Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
> >
> >> Andreas Kretschmer wrote:
> >>> zxo102 ouyang <zxo102@gmail.com> wrote:
> >>>
> >>>> Hi everyone,    I am using postgresql 8.3-beta3. I have a table
> >>>> 'test' with three fields:
> >>>
> >>> I'm guessing you mean 8.4-beta3, right?
> >>
> >> either of those are unsuitable for any kind of production use...
> >
> > Hey, we needs beta-testers, right? And yes, read again, the table is
> > called 'test' ...
>
> True, but if you're gonna test betas / alphas, I'd think 8.5 alpha
> would be the choice for testing.  8.4's beta ended quite some time
> ago.

I'm stupid, i meant 8.5 ...


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99