Thread: Help with query
Hello, I need a little help with a query. We have some auditing tables. Now we want to see the values that were in a database at a special date. Here is our table: project_his_conno operation user_no stamp project_no project_name 1 I 1234 2006-01-15 16:58:41.218 1 TEST 2 I 1234 2006-01-15 16:59:16.703 2 TEST2 3 U 1234 2006-01-15 17:03:37.937 1 TESTxyz 4 U 1234 2006-01-15 17:03:37.937 2 TEST2xyz 5 D 1234 2006-01-15 17:04:09.234 1 TESTxyz 6 D 1234 2006-01-15 17:04:09.234 2 TEST2xyz I want to see the project_no and project_name at 2006-01-15 17:04:00 select project_no, project_name from p01_projects_his where stamp<'2006-01-15 17:04:00' But now I am getting row 1 to 4. But for every project_no I only want to get the newest (row 3 and 4 here). Can I use a normal query or will I have to write a function for that? When this is solved the next part is to return nothing if the row is already deleted. So when the query is: select project_no, project_name from p01_projects_his where stamp<'2006-01-15 17:06:00' The newest rows are 5 and 6. But these rows indicate that the row was deleted (operation = 'D') and then nothing should be returned. I think I have do go the following way: 1. Only return the newest row for every project_no 2. If operaton = D then delete this row from the result-set. I hope this is possible without using stored functions. I tried to use the 'limit' at the end of the statement (for example limit 2). But this will not work, because I can not know how much rows I will need. How would you solve this? Thanks, Christian
Christian Hofmann <christian.hofmann@gmx.de> schrieb: > Hello, I need a little help with a query. > > We have some auditing tables. Now we want to see the values that were in a > database at a special date. > > Here is our table: > > project_his_conno operation user_no stamp project_no > project_name > 1 I 1234 2006-01-15 16:58:41.218 1 TEST > 2 I 1234 2006-01-15 16:59:16.703 2 TEST2 > 3 U 1234 2006-01-15 17:03:37.937 1 TESTxyz > 4 U 1234 2006-01-15 17:03:37.937 2 TEST2xyz > 5 D 1234 2006-01-15 17:04:09.234 1 TESTxyz > 6 D 1234 2006-01-15 17:04:09.234 2 TEST2xyz > > > I want to see the project_no and project_name at 2006-01-15 17:04:00 > > select project_no, project_name from p01_projects_his where > stamp<'2006-01-15 17:04:00' > > But now I am getting row 1 to 4. But for every project_no I only want to get > the newest (row 3 and 4 here). Why is project_name for a given project_no different? I think, you have a wrong data-model. I have created a similar table and a test-case: test=# \d p Table "public.p" Column | Type | Modifiers --------+-----------------------------+----------- id | integer | op | character(1) | pnr | integer | ts | timestamp without time zone | test=# select * from p; id | op | pnr | ts ----+----+-----+------------------------- 1 | I | 1 | 2006-01-15 16:58:41.218 2 | I | 2 | 2006-01-15 16:59:16.703 3 | U | 1 | 2006-01-15 17:03:37.937 4 | U | 2 | 2006-01-15 17:03:37.937 5 | D | 1 | 2006-01-15 17:04:09.234 6 | D | 2 | 2006-01-15 17:04:09.234 (6 rows) test=# select pnr, max(ts) from p where ts < '2006-01-15 17:04:00' group by pnr; pnr | max -----+------------------------- 2 | 2006-01-15 17:03:37.937 1 | 2006-01-15 17:03:37.937 (2 rows) > > Can I use a normal query or will I have to write a function for that? > > When this is solved the next part is to return nothing if the row is already > deleted. So when the query is: > > select project_no, project_name from p01_projects_his where > stamp<'2006-01-15 17:06:00' > > The newest rows are 5 and 6. But these rows indicate that the row was > deleted (operation = 'D') and then nothing should be returned. select pnr, max(ts) from p where ts < '2006-01-15 17:06:00' and pnr not in (select pnr from p where op = 'D' and ts < '2006-01-15 17:06:00') group by pnr; HTH, Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
Hello Andreas, > Why is project_name for a given project_no different? I > think, you have a wrong data-model. I updated the project_name. It is the same when it were a address table customers storing their addresses and later I update some values when they move to an other city. > select pnr, max(ts) from p where ts < '2006-01-15 17:06:00' > and pnr not > in (select pnr from p where op = 'D' and ts < '2006-01-15 17:06:00') > group by pnr; I tried to use this on my table: select project_no, project_name, max(stamp) from p01_projects_his where stamp < '2006-01-15 17:06:00' and project_no not in (select project_no from p01_projects_his where operation = 'D' and stamp < '2006-01-15 17:06:00') group by project_no; I added project_name because I want to have this row too (to see what it is at this time). But now I get the following error: ERROR: column "p01_projects_his.project_name" must appear in the GROUP BY clause or be used in an aggregate function Do you know how to solve this error? I need this column. Adding project_name to the group by will not work, because grouping by project_name is not possible (they are different). Thank you, Christian
am 15.01.2006, um 20:26:30 +0100 mailte Christian Hofmann folgendes: > Hello Andreas, Morning, > > > > Why is project_name for a given project_no different? I > > think, you have a wrong data-model. > > I updated the project_name. It is the same when it were a address table > customers storing their addresses and later I update some values when they > move to an other city. > > > select pnr, max(ts) from p where ts < '2006-01-15 17:06:00' > > and pnr not > > in (select pnr from p where op = 'D' and ts < '2006-01-15 17:06:00') > > group by pnr; > > I tried to use this on my table: > > select project_no, project_name, max(stamp) from p01_projects_his where > stamp < '2006-01-15 17:06:00' and project_no not > in (select project_no from p01_projects_his where operation = 'D' and stamp > < '2006-01-15 17:06:00') group by project_no; > > > I added project_name because I want to have this row too (to see what it is > at this time). But now I get the following error: Then you *must* this row added to the 'GROUP BY'. Because you have for one project_no different project_name, you get for every project_no AND project_name a max(stamp). > Do you know how to solve this error? I need this column. normalize your table. Andreas -- Andreas Kretschmer (Kontakt: siehe Header) Heynitz: 035242/47212, D1: 0160/7141639 GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net === Schollglas Unternehmensgruppe ===
Hello Andreas, > > Do you know how to solve this error? I need this column. > > normalize your table. But my tables are normalized. In the normal table there is only one state for a given project_no. So your queries would success. But the table we are talking about is a auditing table. It it logging all events that are going to the normal table. So it saves when a row is inserted, updated or deleted. When in the normal table the name for the project_no 1 is altered there will be also only one row. But in the auditing table (the one I am talking about) there will be one row for every update, insert or delete in the normal table. This is based on the documentation: http://www.postgresql.org/docs/8.1/static/plpgsql-trigger.html So the example table would be the emp_audit and the normal table the table emp. But whzen there is no other possibility I think I will have to fetch all the rows and do the logic which row is the right one in business logig? Thank you, Christian
Christian Hofmann <christian.hofmann@gmx.de> schrieb: > But my tables are normalized. In the normal table there is only one state > for a given project_no. > So your queries would success. > But the table we are talking about is a auditing table. It it logging all > events that are going to the normal table. > So it saves when a row is inserted, updated or deleted. Okay. > When in the normal table the name for the project_no 1 is altered there will > be also only one row. But in the auditing table (the one I am talking about) > there will be one row for every update, insert or delete in the normal > table. Okay. Perhaps there are other solutions: you can create a new table (project_no, project_name), with a primary index on project_no. And create a trigger on the original table to insert or update this table. Then you can join this table to other querys. Perhaps there are other solutions with tricky joins to select the latest project_name for every project_no and joining this to your select. test=# select * from pname; id | name ----+---------- 1 | TESTxyz 2 | TEST2xyz (2 rows) test=# select pnr, pname.name, max(ts) from p left join pname on pnr=pname.id where ts < '2006-01-15 17:04:00' group by pnr, pname.name; pnr | name | max -----+----------+------------------------- 1 | TESTxyz | 2006-01-15 17:03:37.937 2 | TEST2xyz | 2006-01-15 17:03:37.937 (2 rows) Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°