Thread: History-based (or logged) database.

History-based (or logged) database.

From

"Octavio Alvarez"

Date:

04 January 2004, 23:44:15

Hi! I would like to implement a database which allows me to keep track of
changes from users, but I don't know if there is any model already used
for this. Let me show you what I mean.

Say I have a table t_table1 with 2 columns plus a PK. Normally my table
with some data would look like:

t_table1
------------------
 PK | col1 | col2
------------------
  1 |   3  |   4
  2 |   4  |   7
  3 |   6  |   9
 ... and so on...

If I make a change, I can't get the info about who made the change and
when did he do it, I can't do a "what was the value on a certain
date"-type query.

An UPDATE col1 = 9 WHERE pk = 1; would make t_table1 look like:

t_table1:
------------------
 PK | col1 | col2
------------------
  1 |   9  |   4
  2 |   4  |   7
  3 |   6  |   9
 ... and so on...

To solve my "who and when", and "what on a certain date" problem, I was
thinking on a platform like the following:

t_table1:
-------------------
 PK | col1 | col2 |     record_date           | who_created
-------------------
  1 |   3  |   4  |  2003-03-03 11:30:10 a.m. | alvarezp
  2 |   4  |   7  |  2003-03-03 11:30:10 a.m. | alvarezp
  3 |   6  |   9  |  2003-03-04 11:30:10 a.m. | alvarezp
 ... and so on...

Now, an UPDATE col1 = 9 WHERE pk = 1; (done on '2003-03-05 12:00:00 a.m.')
by 'ggarcia' would make t_table1 look like:

t_table1:
    --------------------
UID | PK | col1 | col2 |     record_date           | who_created
    --------------------
  1 |  1 |   3  |   4  |  2003-03-03 11:30:10 a.m. | alvarezp
  2 |  2 |   4  |   7  |  2003-03-03 11:30:10 a.m. | alvarezp
  3 |  3 |   6  |   9  |  2003-03-04 11:30:10 a.m. | alvarezp
  4 |  1 |   9  |   4  |  2003-03-05 12:00:00 a.m. | ggarcia
 ... and so on...

I would extend SQL to include a "WHEN" clause in SELECT statements. If
omitted, the query should use only the last valid records, using only UID
= {2, 3, 4}, which will make it completely transparent to not-yet-updated
applications.

Of course, may be a "deleted" column would be needed in order to DELETE
from t_table1;" and still have the data available for the hypothetical
"SELECT ... WHEN '2003-03-03 3:00:00 p. m.';"

Has anyone implemented something similar in PGSQL? If so, how have you
done it?

Thanks in advance.

--
Octavio Alvarez Piza.
E-mail: alvarezp@alvarezp.ods.org

Re: History-based (or logged) database.

From

"Chris Travers"

Date:

05 January 2004, 09:05:42

Hi Octavio;

I have had to do something like this (hotel reservation app I am developing)
and want to be able to reconstruct an accurate picture of the database from
any point in time for reporting purposes (suppose I change the configuration
of a room and want to see vacancy rate info for a certain configuration on a
certain date).

Here is what I did.  This is off the top of my head, and may not work as
written here, but it should at least show the idea and structure of my
solution ot the problem.
(simple excerpt that shows the example):

create table rooms (
room_id varchar(64) primary key,
class_id int4 references rclasses(class_id)
);

create table room_archive (
room_id varchar(64) NOT NULL,
class_id int4 NOT NULL,
valid_until TIMESTAMP NOT NULL);

CREATE FUNCTION archive_room_trig() RETURNS TRIGGER AS '
    BEGIN
        INSERT INTO room_archive(room_id, class_id, valid_until)
        VALUES (old.room_id, old.class_id, now());
    END;
' LANGUAGE PLPGSQL;

CREATE TRIGGER trig_archive_room
BEFORE INSERT OR UPDATE OR DELETE ON rooms
FOR EACH ROW archive_room_trig();

CREATE VIEW room_history AS
SELECT room_id, class_id, now() AS valid_until FROM rooms
UNION
SELECT room_id, class_id, valid_until FROM room_archive;

Best Wishes,
Chris Travers
----- Original Message -----
From: "Octavio Alvarez" <alvarezp@alvarezp.ods.org>
To: <pgsql-general@postgresql.org>
Sent: Monday, January 05, 2004 10:43 AM
Subject: [GENERAL] History-based (or logged) database.


>
> Hi! I would like to implement a database which allows me to keep track of
> changes from users, but I don't know if there is any model already used
> for this. Let me show you what I mean.
>
> Say I have a table t_table1 with 2 columns plus a PK. Normally my table
> with some data would look like:
>
> t_table1
> ------------------
>  PK | col1 | col2
> ------------------
>   1 |   3  |   4
>   2 |   4  |   7
>   3 |   6  |   9
>  ... and so on...
>
> If I make a change, I can't get the info about who made the change and
> when did he do it, I can't do a "what was the value on a certain
> date"-type query.
>
> An UPDATE col1 = 9 WHERE pk = 1; would make t_table1 look like:
>
> t_table1:
> ------------------
>  PK | col1 | col2
> ------------------
>   1 |   9  |   4
>   2 |   4  |   7
>   3 |   6  |   9
>  ... and so on...
>
> To solve my "who and when", and "what on a certain date" problem, I was
> thinking on a platform like the following:
>
> t_table1:
> -------------------
>  PK | col1 | col2 |     record_date           | who_created
> -------------------
>   1 |   3  |   4  |  2003-03-03 11:30:10 a.m. | alvarezp
>   2 |   4  |   7  |  2003-03-03 11:30:10 a.m. | alvarezp
>   3 |   6  |   9  |  2003-03-04 11:30:10 a.m. | alvarezp
>  ... and so on...
>
> Now, an UPDATE col1 = 9 WHERE pk = 1; (done on '2003-03-05 12:00:00 a.m.')
> by 'ggarcia' would make t_table1 look like:
>
> t_table1:
>     --------------------
> UID | PK | col1 | col2 |     record_date           | who_created
>     --------------------
>   1 |  1 |   3  |   4  |  2003-03-03 11:30:10 a.m. | alvarezp
>   2 |  2 |   4  |   7  |  2003-03-03 11:30:10 a.m. | alvarezp
>   3 |  3 |   6  |   9  |  2003-03-04 11:30:10 a.m. | alvarezp
>   4 |  1 |   9  |   4  |  2003-03-05 12:00:00 a.m. | ggarcia
>  ... and so on...
>
> I would extend SQL to include a "WHEN" clause in SELECT statements. If
> omitted, the query should use only the last valid records, using only UID
> = {2, 3, 4}, which will make it completely transparent to not-yet-updated
> applications.
>
> Of course, may be a "deleted" column would be needed in order to DELETE
> from t_table1;" and still have the data available for the hypothetical
> "SELECT ... WHEN '2003-03-03 3:00:00 p. m.';"
>
> Has anyone implemented something similar in PGSQL? If so, how have you
> done it?
>
> Thanks in advance.
>
> --
> Octavio Alvarez Piza.
> E-mail: alvarezp@alvarezp.ods.org
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>
>

Re: History-based (or logged) database.

From

elein

Date:

11 January 2004, 23:46:43

On Mon, Jan 05, 2004 at 05:16:23PM +0700, Chris Travers wrote:
> Hi Octavio;
>
> I have had to do something like this (hotel reservation app I am developing)
> and want to be able to reconstruct an accurate picture of the database from
> any point in time for reporting purposes (suppose I change the configuration
> of a room and want to see vacancy rate info for a certain configuration on a
> certain date).
>
> Here is what I did.  This is off the top of my head, and may not work as
> written here, but it should at least show the idea and structure of my
> solution ot the problem.
> (simple excerpt that shows the example):
>
> create table rooms (
> room_id varchar(64) primary key,
> class_id int4 references rclasses(class_id)
> );
>
> create table room_archive (
> room_id varchar(64) NOT NULL,
> class_id int4 NOT NULL,
> valid_until TIMESTAMP NOT NULL);
>
> CREATE FUNCTION archive_room_trig() RETURNS TRIGGER AS '
>     BEGIN
>         INSERT INTO room_archive(room_id, class_id, valid_until)
>         VALUES (old.room_id, old.class_id, now());
>     END;
> ' LANGUAGE PLPGSQL;
>
> CREATE TRIGGER trig_archive_room
> BEFORE INSERT OR UPDATE OR DELETE ON rooms
> FOR EACH ROW archive_room_trig();
>
> CREATE VIEW room_history AS
> SELECT room_id, class_id, now() AS valid_until FROM rooms
> UNION
> SELECT room_id, class_id, valid_until FROM room_archive;
>
> Best Wishes,
> Chris Travers
> ----- Original Message -----
> From: "Octavio Alvarez" <alvarezp@alvarezp.ods.org>
> To: <pgsql-general@postgresql.org>
> Sent: Monday, January 05, 2004 10:43 AM
> Subject: [GENERAL] History-based (or logged) database.
>
>
> >
> > Hi! I would like to implement a database which allows me to keep track of
> > changes from users, but I don't know if there is any model already used
> > for this. Let me show you what I mean.
> >
> > Say I have a table t_table1 with 2 columns plus a PK. Normally my table
> > with some data would look like:
> >
> > t_table1
> > ------------------
> >  PK | col1 | col2
> > ------------------
> >   1 |   3  |   4
> >   2 |   4  |   7
> >   3 |   6  |   9
> >  ... and so on...
> >
> > If I make a change, I can't get the info about who made the change and
> > when did he do it, I can't do a "what was the value on a certain
> > date"-type query.
> >
> > An UPDATE col1 = 9 WHERE pk = 1; would make t_table1 look like:
> >
> > t_table1:
> > ------------------
> >  PK | col1 | col2
> > ------------------
> >   1 |   9  |   4
> >   2 |   4  |   7
> >   3 |   6  |   9
> >  ... and so on...
> >
> > To solve my "who and when", and "what on a certain date" problem, I was
> > thinking on a platform like the following:
> >
> > t_table1:
> > -------------------
> >  PK | col1 | col2 |     record_date           | who_created
> > -------------------
> >   1 |   3  |   4  |  2003-03-03 11:30:10 a.m. | alvarezp
> >   2 |   4  |   7  |  2003-03-03 11:30:10 a.m. | alvarezp
> >   3 |   6  |   9  |  2003-03-04 11:30:10 a.m. | alvarezp
> >  ... and so on...
> >
> > Now, an UPDATE col1 = 9 WHERE pk = 1; (done on '2003-03-05 12:00:00 a.m.')
> > by 'ggarcia' would make t_table1 look like:
> >
> > t_table1:
> >     --------------------
> > UID | PK | col1 | col2 |     record_date           | who_created
> >     --------------------
> >   1 |  1 |   3  |   4  |  2003-03-03 11:30:10 a.m. | alvarezp
> >   2 |  2 |   4  |   7  |  2003-03-03 11:30:10 a.m. | alvarezp
> >   3 |  3 |   6  |   9  |  2003-03-04 11:30:10 a.m. | alvarezp
> >   4 |  1 |   9  |   4  |  2003-03-05 12:00:00 a.m. | ggarcia
> >  ... and so on...
> >
> > I would extend SQL to include a "WHEN" clause in SELECT statements. If
> > omitted, the query should use only the last valid records, using only UID
> > = {2, 3, 4}, which will make it completely transparent to not-yet-updated
> > applications.
> >
> > Of course, may be a "deleted" column would be needed in order to DELETE
> > from t_table1;" and still have the data available for the hypothetical
> > "SELECT ... WHEN '2003-03-03 3:00:00 p. m.';"
> >
> > Has anyone implemented something similar in PGSQL? If so, how have you
> > done it?
> >
> > Thanks in advance.
> >
> > --
> > Octavio Alvarez Piza.
> > E-mail: alvarezp@alvarezp.ods.org
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> >                http://archives.postgresql.org
> >
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly

In the current issue of PostgreSQL General Bits #57
http://www.varlena.com/GeneralBits
there is also a brief example of tracking updates with triggers.

Be careful out there!  I think Chris referenced OLD values
instead of NEW values in his version of the BEFORE TRIGGER.

elein
============================================================
elein@varlena.com        Varlena, LLC        www.varlena.com

          PostgreSQL Consulting, Support & Training

PostgreSQL General Bits   http://www.varlena.com/GeneralBits/
=============================================================
I have always depended on the [QA] of strangers.

Re: History-based (or logged) database.

From

"Chris Travers"

Date:

12 January 2004, 09:30:06

Hi Elein;

Nope, OLD is correct.  I track the OLD values and then use the view to
combine those with the current ones.  This allows the OLAP portions of the
code to hit against *all* the data, while archiving old, outdated
information in the archive table.  It also allows deleted tuples to be
tracked with the same trigger since a deleted row doesn't exactly have a NEW
tuple :-) Maybe you misunderstand what I am trying to do?

Best WIshes,
Chris Travers

Re: History-based (or logged) database.

From

elein

Date:

12 January 2004, 14:07:02

Yes, I did.  For just the simple updating, (not the
logging you are doing) NEW is what you want.  But OLD is proper
for archiving/logging.

--elein

On Mon, Jan 12, 2004 at 08:22:27PM +0700, Chris Travers wrote:
> Hi Elein;
>
> Nope, OLD is correct.  I track the OLD values and then use the view to
> combine those with the current ones.  This allows the OLAP portions of the
> code to hit against *all* the data, while archiving old, outdated
> information in the archive table.  It also allows deleted tuples to be
> tracked with the same trigger since a deleted row doesn't exactly have a NEW
> tuple :-) Maybe you misunderstand what I am trying to do?
>
> Best WIshes,
> Chris Travers