Thread: return MAX and when it happened
Hi all, suppose I have a table like: CREATE TABLE mytab ( num integer, mydate timestamp ); and I want to find MAX(num) and the "mydate" where it first happened. I guess I could use select * from mytab where num = (select MAX(num) from mytab) order by mydate limit 1; but that would scan the data twice (I guess...) Do I have to write my own MAX function, something like: select MYMAX(num, timestamp) from mytab which would return a custom type? Or is there a better way?
Hi Scara, This should work just fine: select num, min(mydate) from mytab group by num order by num desc limit 1; If you have an index on 'num' it will also be fast. Cheers, Csaba. On Wed, 2008-11-19 at 08:47 -0600, Scara Maccai wrote: > Hi all, > > suppose I have a table like: > > CREATE TABLE mytab > ( > num integer, > mydate timestamp > ); > > and I want to find MAX(num) and the "mydate" where it first happened. > > I guess I could use > > select * from mytab where num = (select MAX(num) from mytab) order by > mydate limit 1; > > but that would scan the data twice (I guess...) > > Do I have to write my own MAX function, something like: > > select MYMAX(num, timestamp) from mytab > > which would return a custom type? > Or is there a better way? > > > > > >
On Wed, Nov 19, 2008 at 08:47:57AM -0600, Scara Maccai wrote: > CREATE TABLE mytab > ( > num integer, > mydate timestamp > ); > > and I want to find MAX(num) and the "mydate" where it first happened. > > I guess I could use > > select * from mytab where num = (select MAX(num) from mytab) order by > mydate limit 1; Why not just do: SELECT * FROM mytab ORDER BY num, mydate LIMIT 1; If you're trying to do more complicated things, DISTINCT ON may be more useful. > Do I have to write my own MAX function, something like: > > select MYMAX(num, timestamp) from mytab > > which would return a custom type? Or is there a better way? I've wanted MAX to support this sort of thing several times before, but alas it doesn't. Sam
Sam Mason wrote: > Why not just do: > > SELECT * FROM mytab > ORDER BY num, mydate > LIMIT 1; > > If you're trying to do more complicated things, DISTINCT ON may be more > useful. > Well you guys are right; the problem is that the query is actually more complex, I tried to simplify it for the question, but ended up with a trivial question... Let's see if this example is better: CREATE TABLE mytab ( num integer, num1 integer, num2 integer, mydate timestamp ); select MAX(num), SUM(num1)+SUM(num2), date_trunc('day', mydate) from mytab group by date_trunc('day', mydate) but I also want the timestamp where MAX(num) happened for each day.
On Wed, Nov 19, 2008 at 09:35:34AM -0600, Scara Maccai wrote: > Well you guys are right; the problem is that the query is actually more > complex, I tried to simplify it for the question, but ended up with a > trivial question... always a problem with simplification, I've done the same far too often myself! at least you tried :) > Let's see if this example is better: > > CREATE TABLE mytab > ( > num integer, > num1 integer, > num2 integer, > mydate timestamp > ); > > > select MAX(num), SUM(num1)+SUM(num2), date_trunc('day', mydate) from > mytab group by date_trunc('day', mydate) > > but I also want the timestamp where MAX(num) happened for each day. Do you really want the SUM of num1 and num2, because that makes it more fiddly and would be where having MAX accept a record/tuple would be best. If you don't, maybe something like: SELECT DISTINCT ON (date_trunc('day', mydate)) date_trunc('day', mydate), num, num1+num2 FROM mytab ORDER BY date_trunc('day', mydate), num DESC; Sam
Sam Mason wrote: > Do you really want the SUM of num1 and num2, because that makes it more > fiddly and would be where having MAX accept a record/tuple would be > best. If you don't, maybe something like: > > SELECT DISTINCT ON (date_trunc('day', mydate)) > date_trunc('day', mydate), num, num1+num2 > FROM mytab > ORDER BY date_trunc('day', mydate), num DESC; > Unfortunately, I need: - the SUM of some values, grouped per day - the MAX of some other value, grouped per day - the timestamp where the MAX above happened (per day, of course) The "num" columns are events, and sometimes I need to know not only the amount of a certain event per day, but also WHEN the MAX of some event happened... I guess I have to use a Composite Type (something like "numeric, timestamp") + user defined aggregate?
On Wed, Nov 19, 2008 at 10:10:08AM -0600, Scara Maccai wrote: > Sam Mason wrote: > >Do you really want the SUM of num1 and num2, because that makes it more > >fiddly and would be where having MAX accept a record/tuple would be > >best. If you don't, maybe something like: > > > > SELECT DISTINCT ON (date_trunc('day', mydate)) > > date_trunc('day', mydate), num, num1+num2 > > FROM mytab > > ORDER BY date_trunc('day', mydate), num DESC; > > > Unfortunately, I need: > > - the SUM of some values, grouped per day > - the MAX of some other value, grouped per day > - the timestamp where the MAX above happened (per day, of course) > > The "num" columns are events, and sometimes I need to know not only the > amount of a certain event per day, but also WHEN the MAX of some event > happened... > > I guess I have to use a Composite Type (something like "numeric, > timestamp") + user defined aggregate? Yes, that sounds about right. It's all going to be a bit more of a fiddle than it should be, but at least it's possible. An alternative would be to use a couple of sub-queries and put things back together again after getting each piece of data. The custom aggregate sounds the most elegant, it's just annoying that it's so much fiddling to get it all working to start with. Sam
Sam Mason wrote: > The custom aggregate sounds the > most elegant, it's just annoying that it's so much fiddling to get it > all working to start with Thanks. I think I wrote it, but there's something I don't get from the docs: do I have to call get_call_result_type(fcinfo, NULL, &tupdesc) every time? I mean: the only example I've found about returning Composite Types talks about returning sets as well (34.9.10. Returning Sets). In that example the get_call_result_type call is done once: if (SRF_IS_FIRSTCALL()) { [...] if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) [...] } Should I do something like that in my function or that only applies to function returning sets? I'm calling it every time now but I don't know if it's right...
On Wed, Nov 19, 2008 at 05:06:14PM -0600, Scara Maccai wrote: > Sam Mason wrote: > >The custom aggregate sounds the > >most elegant, it's just annoying that it's so much fiddling to get it > >all working to start with > Thanks. > > I think I wrote it, but there's something I don't get from the docs: do > I have to call > > get_call_result_type(fcinfo, NULL, &tupdesc) I've always tried to stay away from C level extensions so far! How many records are you expecting to aggregate across? If it's only a few thousand a simple SQL language function may be ok: CREATE TYPE nt AS ( n INTEGER, t TIMESTAMP ); CREATE FUNCTION maxnt(nt, nt) RETURNS nt IMMUTABLE LANGUAGE SQL AS $$ SELECT CASE WHEN $1.n > $2.n THEN $1 ELSE COALESCE($2,$1) END $$; CREATE AGGREGATE MAX (nt) ( SFUNC = maxnt, STYPE = nt ); This is about 20 times slower than a C function (80 vs ~1500 rows per ms), but if you're only iterating over a few rows it's not going to matter much. Sorry I can't be of more help! Sam
Sam Mason wrote: > On Wed, Nov 19, 2008 at 05:06:14PM -0600, Scara Maccai wrote: > >> I think I wrote it, but there's something I don't get from the docs: do >> I have to call >> >> get_call_result_type(fcinfo, NULL, &tupdesc) >> > > I've always tried to stay away from C level extensions so far! How > many records are you expecting to aggregate across? If it's only a few > thousand a simple SQL language function may be ok: > It's going to be used in millions of records. That's why I went straight to the C implementation. But your SQL function is a lot easier to read. > Sorry I can't be of more help! > You helped me a lot, thank you. Anyone can answer me? Do I have to call get_call_result_type every time or I can just call it the first time? I don't think the docs are very clear in case of a function returning 1 Composite Type.