Thread: return MAX and when it happened

return MAX and when it happened

From
Scara Maccai
Date:
Hi all,

suppose I have a table like:

CREATE TABLE mytab
(
     num integer,
     mydate timestamp
);

and I want to find MAX(num) and the "mydate" where it first happened.

I guess I could use

select * from mytab where num = (select MAX(num) from mytab) order by
mydate limit 1;

but that would scan the data twice (I guess...)

Do I have to write my own MAX function, something like:

select MYMAX(num, timestamp) from mytab

which would return a custom type?
Or is there a better way?






Re: return MAX and when it happened

From
Csaba Nagy
Date:
Hi Scara,

This should work just fine:

select num, min(mydate) from mytab group by num order by num desc limit
1;

If you have an index on 'num' it will also be fast.

Cheers,
Csaba.

On Wed, 2008-11-19 at 08:47 -0600, Scara Maccai wrote:
> Hi all,
>
> suppose I have a table like:
>
> CREATE TABLE mytab
> (
>      num integer,
>      mydate timestamp
> );
>
> and I want to find MAX(num) and the "mydate" where it first happened.
>
> I guess I could use
>
> select * from mytab where num = (select MAX(num) from mytab) order by
> mydate limit 1;
>
> but that would scan the data twice (I guess...)
>
> Do I have to write my own MAX function, something like:
>
> select MYMAX(num, timestamp) from mytab
>
> which would return a custom type?
> Or is there a better way?
>
>
>
>
>
>


Re: return MAX and when it happened

From
Sam Mason
Date:
On Wed, Nov 19, 2008 at 08:47:57AM -0600, Scara Maccai wrote:
> CREATE TABLE mytab
> (
>     num integer,
>     mydate timestamp
> );
>
> and I want to find MAX(num) and the "mydate" where it first happened.
>
> I guess I could use
>
> select * from mytab where num = (select MAX(num) from mytab) order by
> mydate limit 1;

Why not just do:

  SELECT * FROM mytab
  ORDER BY num, mydate
  LIMIT 1;

If you're trying to do more complicated things, DISTINCT ON may be more
useful.

> Do I have to write my own MAX function, something like:
>
> select MYMAX(num, timestamp) from mytab
>
> which would return a custom type?  Or is there a better way?

I've wanted MAX to support this sort of thing several times before, but
alas it doesn't.


  Sam

Re: return MAX and when it happened

From
Scara Maccai
Date:
Sam Mason wrote:
> Why not just do:
>
>   SELECT * FROM mytab
>   ORDER BY num, mydate
>   LIMIT 1;
>
> If you're trying to do more complicated things, DISTINCT ON may be more
> useful.
>

Well you guys are right; the problem is that the query is actually more
complex, I tried to simplify it for the question, but ended up with a
trivial question...
Let's see if this example is better:

CREATE TABLE mytab
(
     num integer,
     num1 integer,
     num2 integer,
     mydate timestamp
);


select MAX(num), SUM(num1)+SUM(num2), date_trunc('day', mydate) from
mytab group by date_trunc('day', mydate)

but I also want the timestamp where MAX(num) happened for each day.





Re: return MAX and when it happened

From
Sam Mason
Date:
On Wed, Nov 19, 2008 at 09:35:34AM -0600, Scara Maccai wrote:
> Well you guys are right; the problem is that the query is actually more
> complex, I tried to simplify it for the question, but ended up with a
> trivial question...

always a problem with simplification, I've done the same far too often
myself!  at least you tried :)

> Let's see if this example is better:
>
> CREATE TABLE mytab
> (
>     num integer,
>     num1 integer,
>     num2 integer,
>     mydate timestamp
> );
>
>
> select MAX(num), SUM(num1)+SUM(num2), date_trunc('day', mydate) from
> mytab group by date_trunc('day', mydate)
>
> but I also want the timestamp where MAX(num) happened for each day.

Do you really want the SUM of num1 and num2, because that makes it more
fiddly and would be where having MAX accept a record/tuple would be
best.  If you don't, maybe something like:

  SELECT DISTINCT ON (date_trunc('day', mydate))
      date_trunc('day', mydate), num, num1+num2
    FROM mytab
    ORDER BY date_trunc('day', mydate), num DESC;


  Sam

Re: return MAX and when it happened

From
Scara Maccai
Date:
Sam Mason wrote:
> Do you really want the SUM of num1 and num2, because that makes it more
> fiddly and would be where having MAX accept a record/tuple would be
> best.  If you don't, maybe something like:
>
>   SELECT DISTINCT ON (date_trunc('day', mydate))
>       date_trunc('day', mydate), num, num1+num2
>     FROM mytab
>     ORDER BY date_trunc('day', mydate), num DESC;
>
Unfortunately, I need:

- the SUM of some values, grouped per day
- the MAX of some other value, grouped per day
- the timestamp where the MAX above happened (per day, of course)

The "num" columns are events, and sometimes I need to know not only the
amount of a certain event per day, but also WHEN the MAX of some event
happened...

I guess I have to use a Composite Type (something like "numeric,
timestamp") + user defined aggregate?








Re: return MAX and when it happened

From
Sam Mason
Date:
On Wed, Nov 19, 2008 at 10:10:08AM -0600, Scara Maccai wrote:
> Sam Mason wrote:
> >Do you really want the SUM of num1 and num2, because that makes it more
> >fiddly and would be where having MAX accept a record/tuple would be
> >best.  If you don't, maybe something like:
> >
> >  SELECT DISTINCT ON (date_trunc('day', mydate))
> >      date_trunc('day', mydate), num, num1+num2
> >    FROM mytab
> >    ORDER BY date_trunc('day', mydate), num DESC;
> >
> Unfortunately, I need:
>
> - the SUM of some values, grouped per day
> - the MAX of some other value, grouped per day
> - the timestamp where the MAX above happened (per day, of course)
>
> The "num" columns are events, and sometimes I need to know not only the
> amount of a certain event per day, but also WHEN the MAX of some event
> happened...
>
> I guess I have to use a Composite Type (something like "numeric,
> timestamp") + user defined aggregate?

Yes, that sounds about right. It's all going to be a bit more of a
fiddle than it should be, but at least it's possible.  An alternative
would be to use a couple of sub-queries and put things back together
again after getting each piece of data.  The custom aggregate sounds the
most elegant, it's just annoying that it's so much fiddling to get it
all working to start with.


  Sam

Re: return MAX and when it happened

From
Scara Maccai
Date:
Sam Mason wrote:
> The custom aggregate sounds the
> most elegant, it's just annoying that it's so much fiddling to get it
> all working to start with
Thanks.

I think I wrote it, but there's something I don't get from the docs: do
I have to call

get_call_result_type(fcinfo, NULL, &tupdesc)

every time?

I mean: the only example I've found about returning Composite Types
talks about returning sets as well (34.9.10. Returning Sets). In that
example the get_call_result_type call is done once:

if (SRF_IS_FIRSTCALL())
{
  [...]
  if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
  [...]
}


Should I do something like that in my function or that only applies to
function returning sets?
I'm calling it every time now but I don't know if it's right...




Re: return MAX and when it happened

From
Sam Mason
Date:
On Wed, Nov 19, 2008 at 05:06:14PM -0600, Scara Maccai wrote:
> Sam Mason wrote:
> >The custom aggregate sounds the
> >most elegant, it's just annoying that it's so much fiddling to get it
> >all working to start with
> Thanks.
>
> I think I wrote it, but there's something I don't get from the docs: do
> I have to call
>
> get_call_result_type(fcinfo, NULL, &tupdesc)

I've always tried to stay away from C level extensions so far!  How
many records are you expecting to aggregate across?  If it's only a few
thousand a simple SQL language function may be ok:

  CREATE TYPE nt AS ( n INTEGER, t TIMESTAMP );

  CREATE FUNCTION maxnt(nt, nt) RETURNS nt IMMUTABLE LANGUAGE SQL AS $$
    SELECT CASE WHEN $1.n > $2.n THEN $1 ELSE COALESCE($2,$1) END $$;

  CREATE AGGREGATE MAX (nt) (
      SFUNC = maxnt,
      STYPE = nt
  );

This is about 20 times slower than a C function (80 vs ~1500 rows per
ms), but if you're only iterating over a few rows it's not going to
matter much.

Sorry I can't be of more help!


  Sam

Re: return MAX and when it happened

From
Scara Maccai
Date:
Sam Mason wrote:
> On Wed, Nov 19, 2008 at 05:06:14PM -0600, Scara Maccai wrote:
>
>> I think I wrote it, but there's something I don't get from the docs: do
>> I have to call
>>
>> get_call_result_type(fcinfo, NULL, &tupdesc)
>>
>
> I've always tried to stay away from C level extensions so far!  How
> many records are you expecting to aggregate across?  If it's only a few
> thousand a simple SQL language function may be ok:
>

It's going to be used in millions of records.
That's why I went straight to the C implementation. But your SQL
function is a lot easier to read.

> Sorry I can't be of more help!
>
You helped me a lot, thank you.


Anyone can answer me?
Do I have to call

get_call_result_type


every time or I can just call it the first time? I don't think the docs
are very clear in case of a function returning 1 Composite Type.