Thread: Help with query

Help with query

From

"Christian Hofmann"

Date:

15 January 2006, 14:19:13

Hello, I need a little help with a query.

We have some auditing tables. Now we want to see the values that were in a
database at a special date.

Here is our table:

project_his_conno    operation    user_no    stamp    project_no
project_name
1    I    1234    2006-01-15 16:58:41.218    1    TEST
2    I    1234    2006-01-15 16:59:16.703    2    TEST2
3    U    1234    2006-01-15 17:03:37.937    1    TESTxyz
4    U    1234    2006-01-15 17:03:37.937    2    TEST2xyz
5    D    1234    2006-01-15 17:04:09.234    1    TESTxyz
6    D    1234    2006-01-15 17:04:09.234    2    TEST2xyz


I want to see the project_no and project_name at 2006-01-15 17:04:00

select project_no, project_name from p01_projects_his where
stamp<'2006-01-15 17:04:00'

But now I am getting row 1 to 4. But for every project_no I only want to get
the newest (row 3 and 4 here).

Can I use a normal query or will I have to write a function for that?

When this is solved the next part is to return nothing if the row is already
deleted. So when the query is:

select project_no, project_name from p01_projects_his where
stamp<'2006-01-15 17:06:00'

The newest rows are 5 and 6. But these rows indicate that the row was
deleted (operation = 'D') and then nothing should be returned.


I think I have do go the following way:

1. Only return the newest row for every project_no
2. If operaton = D then delete this row from the result-set.

I hope this is possible without using stored functions.

I tried to use the 'limit' at the end of the statement (for example limit
2). But this will not work, because I can not know how much rows I will
need.

How would you solve this?

Thanks,

Christian

Re: Help with query

From

Andreas Kretschmer

Date:

15 January 2006, 14:49:46

Christian Hofmann <christian.hofmann@gmx.de> schrieb:

> Hello, I need a little help with a query.
>
> We have some auditing tables. Now we want to see the values that were in a
> database at a special date.
>
> Here is our table:
>
> project_his_conno    operation    user_no    stamp    project_no
> project_name
> 1    I    1234    2006-01-15 16:58:41.218    1    TEST
> 2    I    1234    2006-01-15 16:59:16.703    2    TEST2
> 3    U    1234    2006-01-15 17:03:37.937    1    TESTxyz
> 4    U    1234    2006-01-15 17:03:37.937    2    TEST2xyz
> 5    D    1234    2006-01-15 17:04:09.234    1    TESTxyz
> 6    D    1234    2006-01-15 17:04:09.234    2    TEST2xyz
>
>
> I want to see the project_no and project_name at 2006-01-15 17:04:00
>
> select project_no, project_name from p01_projects_his where
> stamp<'2006-01-15 17:04:00'
>
> But now I am getting row 1 to 4. But for every project_no I only want to get
> the newest (row 3 and 4 here).

Why is project_name for a given project_no different? I think, you have
a wrong data-model.


I have created a similar table and a test-case:

test=# \d p
                 Table "public.p"
 Column |            Type             | Modifiers
--------+-----------------------------+-----------
 id     | integer                     |
 op     | character(1)                |
 pnr    | integer                     |
 ts     | timestamp without time zone |

test=# select * from p;
 id | op | pnr |           ts
----+----+-----+-------------------------
  1 | I  |   1 | 2006-01-15 16:58:41.218
  2 | I  |   2 | 2006-01-15 16:59:16.703
  3 | U  |   1 | 2006-01-15 17:03:37.937
  4 | U  |   2 | 2006-01-15 17:03:37.937
  5 | D  |   1 | 2006-01-15 17:04:09.234
  6 | D  |   2 | 2006-01-15 17:04:09.234
(6 rows)

test=# select pnr, max(ts) from p where ts < '2006-01-15 17:04:00' group by pnr;
 pnr |           max
-----+-------------------------
   2 | 2006-01-15 17:03:37.937
   1 | 2006-01-15 17:03:37.937
(2 rows)





>
> Can I use a normal query or will I have to write a function for that?
>
> When this is solved the next part is to return nothing if the row is already
> deleted. So when the query is:
>
> select project_no, project_name from p01_projects_his where
> stamp<'2006-01-15 17:06:00'
>
> The newest rows are 5 and 6. But these rows indicate that the row was
> deleted (operation = 'D') and then nothing should be returned.

select pnr, max(ts) from p where ts < '2006-01-15 17:06:00' and pnr not
in (select pnr from p where op = 'D' and ts < '2006-01-15 17:06:00')
group by pnr;


HTH, Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: Help with query

From

"Christian Hofmann"

Date:

15 January 2006, 15:26:42

Hello Andreas,


> Why is project_name for a given project_no different? I
> think, you have a wrong data-model.

I updated the project_name. It is the same when it were a address table
customers storing their addresses and later I update some values when they
move to an other city.

> select pnr, max(ts) from p where ts < '2006-01-15 17:06:00'
> and pnr not
> in (select pnr from p where op = 'D' and ts < '2006-01-15 17:06:00')
> group by pnr;

I tried to use this on my table:

select project_no, project_name, max(stamp) from p01_projects_his where
stamp < '2006-01-15 17:06:00' and project_no not
in (select project_no from p01_projects_his where operation = 'D' and stamp
< '2006-01-15 17:06:00') group by project_no;


I added project_name because I want to have this row too (to see what it is
at this time). But now I get the following error:


ERROR:  column "p01_projects_his.project_name" must appear in the GROUP BY
clause or be used in an aggregate function

Do you know how to solve this error? I need this column.
Adding project_name to the group by will not work, because grouping by
project_name is not possible (they are different).

Thank you,

Christian

Re: Help with query

From

"A. Kretschmer"

Date:

16 January 2006, 02:09:02

am  15.01.2006, um 20:26:30 +0100 mailte Christian Hofmann folgendes:
> Hello Andreas,

Morning,


>
>
> > Why is project_name for a given project_no different? I
> > think, you have a wrong data-model.
>
> I updated the project_name. It is the same when it were a address table
> customers storing their addresses and later I update some values when they
> move to an other city.
>
> > select pnr, max(ts) from p where ts < '2006-01-15 17:06:00'
> > and pnr not
> > in (select pnr from p where op = 'D' and ts < '2006-01-15 17:06:00')
> > group by pnr;
>
> I tried to use this on my table:
>
> select project_no, project_name, max(stamp) from p01_projects_his where
> stamp < '2006-01-15 17:06:00' and project_no not
> in (select project_no from p01_projects_his where operation = 'D' and stamp
> < '2006-01-15 17:06:00') group by project_no;
>
>
> I added project_name because I want to have this row too (to see what it is
> at this time). But now I get the following error:

Then you *must* this row added to the 'GROUP BY'. Because you have for
one project_no different project_name, you get for every project_no AND
project_name a max(stamp).


> Do you know how to solve this error? I need this column.

normalize your table.


Andreas
--
Andreas Kretschmer    (Kontakt: siehe Header)
Heynitz:  035242/47212,      D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
 ===    Schollglas Unternehmensgruppe    ===

Re: Help with query

From

"Christian Hofmann"

Date:

16 January 2006, 08:56:10

Hello Andreas,

> > Do you know how to solve this error? I need this column.
>
> normalize your table.

But my tables are normalized. In the normal table there is only one state
for a given project_no.
So your queries would success.
But the table we are talking about is a auditing table. It it logging all
events that are going to the normal table.
So it saves when a row is inserted, updated or deleted.
When in the normal table the name for the project_no 1 is altered there will
be also only one row. But in the auditing table (the one I am talking about)
there will be one row for every update, insert or delete in the normal
table.

This is based on the documentation:
http://www.postgresql.org/docs/8.1/static/plpgsql-trigger.html

So the example table would be the emp_audit and the normal table the table
emp.

But whzen there is no other possibility I think I will have to fetch all the
rows and do the logic which row is the right one in business logig?

Thank you,

Christian

Re: Help with query

From

Andreas Kretschmer

Date:

16 January 2006, 12:46:18

Christian Hofmann <christian.hofmann@gmx.de> schrieb:
> But my tables are normalized. In the normal table there is only one state
> for a given project_no.
> So your queries would success.
> But the table we are talking about is a auditing table. It it logging all
> events that are going to the normal table.
> So it saves when a row is inserted, updated or deleted.

Okay.


> When in the normal table the name for the project_no 1 is altered there will
> be also only one row. But in the auditing table (the one I am talking about)
> there will be one row for every update, insert or delete in the normal
> table.

Okay. Perhaps there are other solutions: you can create a new table
(project_no, project_name), with a primary index on project_no.
And create a trigger on the original table to insert or update this
table. Then you can join this table to other querys.

Perhaps there are other solutions with tricky joins to select the latest
project_name for every project_no and joining this to your select.

test=# select * from pname;
 id |   name
----+----------
  1 | TESTxyz
  2 | TEST2xyz
(2 rows)

test=# select pnr, pname.name, max(ts) from p left join pname on
pnr=pname.id where ts < '2006-01-15 17:04:00' group by pnr, pname.name;
 pnr |   name   |           max
-----+----------+-------------------------
   1 | TESTxyz  | 2006-01-15 17:03:37.937
   2 | TEST2xyz | 2006-01-15 17:03:37.937
(2 rows)


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°