Thread: Restriction by grouping problem.
The query I have now: SELECT min(datettime), sid FROM logs GROUP by sid; This returns the first instance of an sid in the logs table, there can be many rows in the table for each sid. The problem I have is that I do not need the SID I just need to group the min(datetime) by it. I need to return the column getfile. So the question I have is how would I create the retstriction of the min(datetime), sid and only return the value of getfile? The english of what I am looking for is. Select the getfile for the earliest of each sid. Any help would be greatly appreciated. --Jeff
Jeff, > The query I have now: > > SELECT min(datettime), sid FROM logs GROUP by sid; > > This returns the first instance of an sid in the logs table, there > can be > many rows in the table for each sid. The problem I have is that I do > not > need the SID I just need to group the min(datetime) by it. The answer to this question is simple: SELECT min(datetime) as mintime FROM logs GROUP BY sid; However, I get the impression that your intended problem was more complicated. Can you re-explain it? -Josh ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete information technology josh@agliodbs.com and data management solutions (415) 565-7293 for law firms, small businesses fax 621-2533 and non-profit organizations. San Francisco
Yeah I have been having a bit of a problem expressing myself in terms of this query, sorry about that. What I need is: SELECT getfile FROM logs (and a restriction that results in finding the rows with the lowest datetime for each unique sid) To define this table a bit more: Logs table has a primary key of logid (int4) and serveral columns, of which I am deling with sid (text), getfile (text), and datetime (int4). Now a select getfile, datetime, logid from logs where sid = onevalue; would return a set of rows for that sid, the row I want is for the one with the smallest aka min(datetime) and I want this for every row in the table. Thanks for the quick response Josh, hopefully above will help out some. --Jeff ""Josh Berkus"" <josh@agliodbs.com> wrote in message news:web-92249@davinci.ethosmedia.com... > Jeff, > > > The query I have now: > > > > > > SELECT min(datettime), sid FROM logs GROUP by sid; > > > > > > This returns the first instance of an sid in the logs table, there > > > can be > > > many rows in the table for each sid. The problem I have is that I do > > > not > > > need the SID I just need to group the min(datetime) by it. > > The answer to this question is simple: > > SELECT min(datetime) as mintime FROM logs GROUP BY sid; > > However, I get the impression that your intended problem was more > complicated. Can you re-explain it? > > -Josh > > > ______AGLIO DATABASE SOLUTIONS___________________________ > > Josh Berkus > > Complete information technology josh@agliodbs.com > > and data management solutions (415) 565-7293 > > for law firms, small businesses fax 621-2533 > > and non-profit organizations. San Francisco > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly
A way to do this is: Select GETFILE From (Select Min(DATETIME),SID From LOGS Group by SID) as MDT, LOGS L Where L.SID =MDT.SID And L.DATETIME = MDT.DATETIME But this fails if there are more than one row for a given SID/DATETIME pair (doe you have a unique index on them?). Alternatively, you could try: Select (Select GETFILE From LOGS L Where L.SID = S.SID Order By DATETIME Asc Limit 1) as GETFILE From (Select Distinct SID From LOGS) as S ...the FROM statement would be a lot nicer if your had a table of SIDs, or at least had a good way of generating the list of SIDs. [Not sure if I got the PG dialect right] At 18:03 26/07/01 -0400, Jeff Barrett wrote: > >Logs table has a primary key of logid (int4) and serveral columns, of which >I am deling with sid (text), getfile (text), and datetime (int4). Now a >select getfile, datetime, logid from logs where sid = onevalue; would return >a set of rows for that sid, the row I want is for the one with the smallest >aka min(datetime) and I want this for every row in the table. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
At 11:47 27/07/01 +1000, Philip Warner wrote: > >...the FROM statement would be a lot nicer if your had a table of SIDs, or >at least had a good way of generating the list of SIDs. > A final (& possibly best) option, is: Select MDT.GETFILE, [anything else from 'best' record] From (Select Distinct SID From LOGS) as S, (Select* From LOGS L Where L.SID = S.SID Order By DATETIME Asc Limit 1) as MDT But unfortunately PG 7.1 does not allow this - the second select can not reference the first select. I have not tried it in 7.1.2. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Phillip, > What it effectively does is executes the second SELECT for each row > of the > first select, a bit like a column-select expression, but it allows > access > to all rows. Ah! I knew that there had to be a simple way to do what you want to do in PGSQL: SELECT MIN(getfile) as bestfile FROM logs, (SELECT SID, MAX(logtime) as lasttime FROM logs) MDT WHERE logs.SID = MDT.SID AND logs.logtime = MDT.lasttime GROUP BY SID This gives you the "getfile" from the logs with the latest timestamp. In the event of two with the exact same logtime, it selects the first one alphabetically so that you don't get two files for one SID. -Josh ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 565-7293 for law firms, small businesses fax 621-2533 and non-profit organizations. San Francisco
At 17:19 27/07/01 -0700, Josh Berkus wrote: >Phillip, > >> What it effectively does is executes the second SELECT for each row >> of the >> first select, a bit like a column-select expression, but it allows >> access >> to all rows. > >Ah! I knew that there had to be a simple way to do what you want to do >in PGSQL: The first two examples I sent do it 'simply' in PG. Unfortunately the example you give will only return one row, and since Jeff wanted one gettime for each SID, we'd need to modify both your and my first solution to: Select SID,Min(GETFILE) From (Select Min(DATETIME),SID From LOGS Group by SID) as MDT, LOGS L Where L.SID = MDT.SID And L.DATETIME = MDT.DATETIME This will work, but will not solve the larger problem of 'give me all the attrs of of the record with the least datetime for a given SID'. Jeff does not need this, but it is a common problem. Simply using min/max works for one attr, but fails to provide a consistent view of multiple attrs. PG has no elegant single-statement solution to this problem. The SQL solution I use in other systems is, as I mentioned earlier: Select [any list of attrs from the 'best' record] From (Select Distinct SID From LOGS) as S, -- Get the groupingattrs (Select * From LOGS L Where L.SID = S.SID -- For each group do a 'row-select' Order By DATETIME Asc Limit 1) as MDT The second select can be thought of as analagous to a column-select-expression, but allowing access to all attrs of the resulting row. If you want f1,f2,and f3 from LOGS, then a similar result would be achieved (inelegantly) by: Select (Select F1 From LOGS L Where L.SID = S.SID Order By DATETIME Asc Limit 1) as F1, (SelectF2 From LOGS L Where L.SID = S.SID Order By DATETIME Asc Limit 1) as F2, (Select F3 From LOGSL Where L.SID = S.SID Order By DATETIME Asc Limit 1) as F3 From (Select Distinct SID FromLOGS) as S, -- Get the grouping attrs Assuming DATETIME is unique then F1, F2, F3 will all come from the same row and you will have a consistent record. I have not checked, but I'd guess that PG will create a cross block with four entries, whereas the original syntax above should just use 2 entries. Like I said, it's just an optimizer hint. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > This will work, but will not solve the larger problem of 'give me all the > attrs of of the record with the least datetime for a given SID'. Jeff does > not need this, but it is a common problem. Simply using min/max works for > one attr, but fails to provide a consistent view of multiple attrs. PG has > no elegant single-statement solution to this problem. Yes it does: SELECT DISTINCT ON (sid) * FROM logs ORDER BY sid, datetime; This gives you just one output row per distinct "sid" value, and that row will be the one with least datetime. See the DISTINCT ON example in the SELECT reference page. regards, tom lane