Thread: Rule causes baffling error

Rule causes baffling error

From

"Ken Winter"

Date:

16 December 2005, 17:07:32

I'm trying to figure out why a rule gives me a uniqueness violation when I
try to do an update.

I have a table, "my_data", defined as:

create table my_data (id INT8 not null default nextval('person_seq'),effective_date_and_time TIMESTAMP WITH TIME ZONE
notnull default
 
CURRENT_TIMESTAMP,expiration_date_and_time TIMESTAMP WITH TIME ZONE null default
'infinity',user_name VARCHAR(255)null,constraint PK_MY_DATA primary key
(effective_date_and_time, id)
);

I have a view, my_data_now, defined as:

SELECT my_data.id, my_data.user_name, my_data.effective_date_and_time,     my_data.expiration_date_and_time
FROM     my_data
WHERE my_data.effective_date_and_time <= 'now'::text::timestamp(6)
with time zone AND     my_data.expiration_date_and_time >=
'now'::text::timestamp(6) with time zone;

And I have this rule (among others):

CREATE OR REPLACE RULE upd_my_data_now ASON UPDATE TO my_data_nowDO INSTEAD    (    /* Update current record, and make
iteffective now. */    UPDATE my_data        SET id = NEW.id,             user_name = NEW.user_name,
effective_date_and_time=
 
('now'::text)::timestamp(6) with time zone        WHERE effective_date_and_time =
OLD.effective_date_and_time             AND id = OLD.id;    /* Insert a record containing the old values,     and
expireit as of now. */    INSERT INTO my_data (        effective_date_and_time,        expiration_date_and_time,
  id,         user_name)         VALUES (              OLD.effective_date_and_time,
('now'::text)::timestamp(6)with time zone,            OLD.id,             OLD.user_name)    ); 
 

This rule is supposed to (1) cause an update directed to the view
"my_data_now" to be made to the underlying table "my_data", (2) reset the
"effective_date_and_time" of that row to 'now', (3) insert a record
containing the old values into "my_data", and (4) expire that "old" record
by setting its "expiration_date_and_time" to 'now'.

But when I try to do an update against the view "my_data_now" with a query
such as:
    update my_data_now set user_name = 'Suzy' where id = 1;

I get:
ERROR:  duplicate key violates unique constraint "pk_my_data"

Presumably this happens when the rule tries to insert the new row. The new
row does indeed contain the "old" id and effective_date_and_time.  However,
the rule is structured so that the current row's "effective_date_and_time"
gets updated to 'now' *before* the new row is inserted, making its value
different from the old "effective_date_and_time".  So the uniqueness
conflict shouldn't occur.

I figure either there's some bug in my code that I can't see, or else the
PostgreSQL rule processor works in some way that I don't understand.  

In either case, help!

~ TIA
~ Ken

Re: Rule causes baffling error

From

Richard Huxton

Date:

19 December 2005, 05:08:46

Ken Winter wrote:
> This rule is supposed to (1) cause an update directed to the view
> "my_data_now" to be made to the underlying table "my_data", (2) reset the
> "effective_date_and_time" of that row to 'now', (3) insert a record
> containing the old values into "my_data", and (4) expire that "old" record
> by setting its "expiration_date_and_time" to 'now'.

I think you want a trigger rather than a rule.

Rules rewrite the query structure, triggers let you deal with values on 
a row-by-row basis (for row-level triggers).

--   Richard Huxton  Archonet Ltd

Re: Rule causes baffling error

From

"Ken Winter"

Date:

19 December 2005, 11:44:46

Richard ~

Thanks for your response.

Can a trigger be written on a *view*?  I can't find anything in the
PostgreSQL docs that answers this question.

I originally wrote these actions (described in my original message) as a
trigger on my base table, but then realized I was getting in deeper and
deeper trouble because (a) I was getting into cascading triggers that I
didn't want and (b) I need to enable some queries to access the base table
without triggering these actions.  That's why I set up the view, and then I
assumed that the only way I could implement these actions was as rules.  

~ Ken

> -----Original Message-----
> From: Richard Huxton [mailto:dev@archonet.com]
> Sent: Monday, December 19, 2005 4:08 AM
> To: Ken Winter
> Cc: 'PostgreSQL pg-sql list'
> Subject: Re: [SQL] Rule causes baffling error
> 
> Ken Winter wrote:
> > This rule is supposed to (1) cause an update directed to the view
> > "my_data_now" to be made to the underlying table "my_data", (2) reset
> the
> > "effective_date_and_time" of that row to 'now', (3) insert a record
> > containing the old values into "my_data", and (4) expire that "old"
> record
> > by setting its "expiration_date_and_time" to 'now'.
> 
> I think you want a trigger rather than a rule.
> 
> Rules rewrite the query structure, triggers let you deal with values on
> a row-by-row basis (for row-level triggers).
> 
> --
>    Richard Huxton
>    Archonet Ltd

Re: Rule causes baffling error

From

Richard Huxton

Date:

19 December 2005, 12:08:33

Ken Winter wrote:
> Richard ~
> 
> Thanks for your response.
> 
> Can a trigger be written on a *view*?  I can't find anything in the
> PostgreSQL docs that answers this question.

There's nothing for them to fire against even if you could attach the 
trigger. I suppose you could have a statement-level trigger in more 
recent versions, but for row-level triggers there aren't any rows in the 
view to be affected.

> I originally wrote these actions (described in my original message) as a
> trigger on my base table, but then realized I was getting in deeper and
> deeper trouble because (a) I was getting into cascading triggers that I
> didn't want and (b) I need to enable some queries to access the base table
> without triggering these actions.  That's why I set up the view, and then I
> assumed that the only way I could implement these actions was as rules.  

Hmm - the cascading should be straightforward enough to deal with. When 
you are updating check if NEW.expiration_date_and_time = now() and if so 
exit the trigger function (since there's nothing to do anyway).

The other thing you might want to consider is whether the "live" data 
should be in the same table as the "old" data. That will depend on how 
you want to use it - conceptually is it all one continuum or is the 
"old" data just for archive purposes.

Now, having got this feature working, why do you want to bypass it? Will 
it be a specific user, involve specific patterns of values or what?

--  Richard Huxton  Archonet Ltd

Re: Rule causes baffling error

From

Tom Lane

Date:

19 December 2005, 12:27:39

Richard Huxton <dev@archonet.com> writes:
> Ken Winter wrote:
>> Can a trigger be written on a *view*?

> There's nothing for them to fire against even if you could attach the 
> trigger.

Currently we reject CREATE TRIGGER on a view, but it occurred to me the
other day that that could be relaxed, at least for BEFORE triggers.
The system could feed the trigger with the synthetic view row, and the
trigger could update the view's underlying tables and then return NULL
to suppress any actual "operation" on the view proper.

To do this, instead of erroring out in the rewriter if a view has no
DO INSTEAD rule, we would have to error out down in the guts of
the executor if control got as far as trying to actually
insert/update/delete a tuple in a view.

The trickiest part of this is probably generating the "old" row for
UPDATE and DELETE cases.  I think you'd need to adjust the planner
so that it would generate all the "old" view columns, rather than
the current situation in which it generates just the "new" columns
for an UPDATE, or no columns at all (only the CTID) for a DELETE.
I don't see any fundamental reason why this couldn't be made to work
though.

Triggers would be better than rules for quite a few view-rewriting
scenarios, mainly because you'd avoid all the gotchas with double
evaluation and so on.  So it seems like it might be worth doing.
        regards, tom lane

Re: Rule causes baffling error

From

"Ken Winter"

Date:

19 December 2005, 14:20:08

Richard ~

Let me zoom out for a moment, for the bigger picture.

As you have inferred, what I'm trying to do is develop a history-preserving
table ("my_data" in the example that started this thread).  *Most* user
programs would see and manipulate this table as if it contained only the
current rows (marked by effective_date_and_time <= 'now' and
expiration_date_and_time = 'infinity').

When these programs do an INSERT, I need automatic actions that set the
expiration and date timestamps to 'now' and 'infinity'; when they do an
UPDATE, I need automatic actions that save the old data in a history record
and expire it as of 'now' and the new data in a record that's effective
'now' and expires at 'infinity'; when they do a DELETE, I need an automatic
action to expire the target record as of 'now' rather than actually deleting
it.

However, I also need certain maintenance programs, designed to enable
certain users to correct inaccurately entered data.  These need to be able
to "rewrite history" by doing actions against "my_data" without these
automatic actions occurring.  It may prove advisable to provide some
automatic actions for these programs too, but they definitely won't be the
actions described above.  If the above actions were implemented as triggers,
all the ways I could think of to conditionally disable them (and possibly
replace them with other actions) seemed architecturally very klunky.  That's
when I decided I needed the "my_data_now" view, and from that I inferred
(apparently correctly) that the actions would have to be implemented as
rewrite rules.

The cascading problem was solkable.  But the solution was a bit hard to
reach because the user-invoked UPDATE action triggered both an INSERT and an
UPDATE on the same table (and user DELETE triggered an UPDATE), and so one
had to take into account that all of these triggered actions would cause
their triggers to fire again.  Not a deal-killer, but the solution felt
brittle.

Yes, I did consider having a "live" table and a separate "history" table.
The killer of that idea was my inability to find a way to implement foreign
keys that could refer to both tables and that could follow a record when it
was moved from "live" to "history".  Much of the history I'm trying to
preserve is not in the "my_data" table; it's in related tables that refer to
it.  I presumably could do this by not declaring the FKs to PostgreSQL, and
implementing the necessary referential integrity with triggers, but - well,
in a word, yuck.

As it happens, I have found a rewrite of my UPDATE rule that works, so my
immediate need is past.  FYI, the old update rule was:

CREATE OR REPLACE RULE upd_my_data_now ASON UPDATE TO my_data_nowDO INSTEAD    (    /* Update current record, and make
iteffective now. */    UPDATE my_data        SET id = NEW.id,             user_name = NEW.user_name,
effective_date_and_time= CURRENT_TIMESTAMP        WHERE effective_date_and_time = CURRENT_TIMESTAMP            AND id =
OLD.id;   /* Insert a record containing the old values,     and expire it as of now. */    INSERT INTO my_data (
effective_date_and_time,       expiration_date_and_time,           id,         user_name)         VALUES (
OLD.effective_date_and_time,           CURRENT_TIMESTAMP,            OLD.id,             OLD.user_name)    ); 

And the one that works is:

CREATE OR REPLACE RULE upd_my_data_now ASON UPDATE TO my_data_nowDO INSTEAD    (    /* Expire the current record. */
UPDATEmy_data        SET expiration_date_and_time = CURRENT_TIMESTAMP        WHERE effective_date_and_time =
OLD.effective_date_and_time            AND id = OLD.id                       AND effective_date_and_time <=
CURRENT_TIMESTAMP                      AND expiration_date_and_time >= CURRENT_TIMESTAMP; 
/* Insert a record containing the new values,     effective as of now. */    INSERT INTO my_data (
effective_date_and_time,       id,         user_name)         VALUES (              CURRENT_TIMESTAMP,
NEW.id,            NEW.user_name)    ); 

The relevant change is that I'm now expiring the record with the old data
and inserting the one with the new data, rather than vice versa.  I still
don't know why the old rule didn't work and this one does, but hey,
whatever.  Another advantage of the new one is that I don't have to re-point
foreign keys that were already pointed to the record containing the old
data, because that record stays in place.

(The other change, adding the lines     AND effective_date_and_time <= CURRENT_TIMESTAMP     AND
expiration_date_and_time>= CURRENT_TIMESTAMP; 
to the UPDATE, was necessary to keep updates to the "my_data_now" from
updating the expired rows as well.)

Thanks for your help.  I hope this little essay is of some value to others.

~ Ken

> -----Original Message-----
> From: Richard Huxton [mailto:dev@archonet.com]
> Sent: Monday, December 19, 2005 11:05 AM
> To: Ken Winter
> Cc: 'PostgreSQL pg-sql list'
> Subject: Re: [SQL] Rule causes baffling error
>
> Ken Winter wrote:
> > Richard ~
> >
> > Thanks for your response.
> >
> > Can a trigger be written on a *view*?  I can't find anything in the
> > PostgreSQL docs that answers this question.
>
> There's nothing for them to fire against even if you could attach the
> trigger. I suppose you could have a statement-level trigger in more
> recent versions, but for row-level triggers there aren't any rows in the
> view to be affected.
>
> > I originally wrote these actions (described in my original message) as a
> > trigger on my base table, but then realized I was getting in deeper and
> > deeper trouble because (a) I was getting into cascading triggers that I
> > didn't want and (b) I need to enable some queries to access the base
> table
> > without triggering these actions.  That's why I set up the view, and
> then I
> > assumed that the only way I could implement these actions was as rules.
>
> Hmm - the cascading should be straightforward enough to deal with. When
> you are updating check if NEW.expiration_date_and_time = now() and if so
> exit the trigger function (since there's nothing to do anyway).
>
> The other thing you might want to consider is whether the "live" data
> should be in the same table as the "old" data. That will depend on how
> you want to use it - conceptually is it all one continuum or is the
> "old" data just for archive purposes.
>
> Now, having got this feature working, why do you want to bypass it? Will
> it be a specific user, involve specific patterns of values or what?
>
> --
>    Richard Huxton
>    Archonet Ltd

Re: Rule causes baffling error

From

Richard Huxton

Date:

20 December 2005, 05:15:45

Ken Winter wrote:
> Richard ~
> 
> Let me zoom out for a moment, for the bigger picture.
> 
> As you have inferred, what I'm trying to do is develop a history-preserving
> table ("my_data" in the example that started this thread).  *Most* user
> programs would see and manipulate this table as if it contained only the
> current rows (marked by effective_date_and_time <= 'now' and
> expiration_date_and_time = 'infinity').  
> 
> When these programs do an INSERT, I need automatic actions that set the
> expiration and date timestamps to 'now' and 'infinity'; when they do an
> UPDATE, I need automatic actions that save the old data in a history record
> and expire it as of 'now' and the new data in a record that's effective
> 'now' and expires at 'infinity'; when they do a DELETE, I need an automatic
> action to expire the target record as of 'now' rather than actually deleting
> it.  

Oh - while I think of it, be VERY VERY careful that your system clock 
doesn't get put back. I've done this sort of thing and been bitten by it.

> However, I also need certain maintenance programs, designed to enable
> certain users to correct inaccurately entered data.  These need to be able
> to "rewrite history" by doing actions against "my_data" without these
> automatic actions occurring.  It may prove advisable to provide some
> automatic actions for these programs too, but they definitely won't be the
> actions described above.  If the above actions were implemented as triggers,
> all the ways I could think of to conditionally disable them (and possibly
> replace them with other actions) seemed architecturally very klunky.  That's
> when I decided I needed the "my_data_now" view, and from that I inferred
> (apparently correctly) that the actions would have to be implemented as
> rewrite rules.

The "standard" approach in so far as there is one would be to have a 
first line IF CURRENT_USER = 'MAINTENANCE' THEN RETURN ... or perhaps a 
boolean stored in a system-settings table to turn them on or off in 
en-masse. In your case the user-test seems better.

> The cascading problem was solkable.  But the solution was a bit hard to
> reach because the user-invoked UPDATE action triggered both an INSERT and an
> UPDATE on the same table (and user DELETE triggered an UPDATE), and so one
> had to take into account that all of these triggered actions would cause
> their triggers to fire again.  Not a deal-killer, but the solution felt
> brittle.
> 
> Yes, I did consider having a "live" table and a separate "history" table.
> The killer of that idea was my inability to find a way to implement foreign
> keys that could refer to both tables and that could follow a record when it
> was moved from "live" to "history".  Much of the history I'm trying to
> preserve is not in the "my_data" table; it's in related tables that refer to
> it.  I presumably could do this by not declaring the FKs to PostgreSQL, and
> implementing the necessary referential integrity with triggers, but - well,
> in a word, yuck.

If you're going to do this with multiple tables you actually need (at 
least) three. For example, if you had different versions of e.g. 
documents being stored you would want: document - invariants: the id, perhaps document-type.            FKeys link to
this.           A row is only deleted from here if all live+history            is also deleted. document_live - the one
thatgets edited.                 1:1 relationship with document if still live document_hist - with timestamps. N:1 with
document

Have a google for Temporal Databases too - there's a lot of thinking 
been done about this.

> 
> As it happens, I have found a rewrite of my UPDATE rule that works, so my
> immediate need is past.  FYI, the old update rule was:
> 
[snip]

> The relevant change is that I'm now expiring the record with the old data
> and inserting the one with the new data, rather than vice versa.  I still
> don't know why the old rule didn't work and this one does, but hey,
> whatever.  Another advantage of the new one is that I don't have to re-point
> foreign keys that were already pointed to the record containing the old
> data, because that record stays in place.
> 
> (The other change, adding the lines
>       AND effective_date_and_time <= CURRENT_TIMESTAMP
>       AND expiration_date_and_time >= CURRENT_TIMESTAMP;
> to the UPDATE, was necessary to keep updates to the "my_data_now" from
> updating the expired rows as well.)

Make sure you test it with inserts/updates of multiple rows too.

--   Richard Huxton  Archonet Ltd

Still struggling with history tables

From

"Ken Winter"

Date:

17 January 2006, 23:29:18

Friends ~

I'm still trying to implement a solution to the requirement to keep a
complete history of data changes to a "person" table.  (See earlier
correspondence below.)  I'm trying for a variant of the architecture
suggested by Richard Huxton (also below).  In my variant, I have this
"identifier" table, carrying the id and invariant info about each person:

/*==============================================================*/
/* Table: person_i                                              */
/*==============================================================*/
create table person_i (
idi                  BIGSERIAL            not null,
date_of_birth        DATE                 null,
constraint PK_person_i_key_1 primary key (idi)
)
;

And then I have this "history" table, carrying the time-varying info on
which I want to keep a complete history:

/*==============================================================*/
/* Table: person_h                                              */
/*==============================================================*/
create table person_h (
idh                  INT8                 not null,
start_date           DATE                 not null default 'now()',
end_date             DATE                 null,
name                 VARCHAR(255)         null,
constraint PK_person_h_key_1 primary key (idh, start_date),
constraint fk_reference_6 foreign key (idh)  references person_i (idi)on delete restrict on update restrict
)
;

Triggers are in place on the "person_h" table so that when an app does an
update, the current h record is expired (with its old data) and a new record
(wuth the updated data)is inserted and made effective "now".  What I'm now
trying to build is this view:

/*==============================================================*/
/* View: person                                                 */
/*==============================================================*/
create view person as
select idi, date_of_birth, start_date, end_date, name
from person_i i, person_h h
where i.idi = h.idh;

I want to enable users (and apps) who need to deal only with current data to
be able to treat "person" as a real table (i.e. to write to it as well as
read from it).  Specifically, the requirements are that when a user does:

.    Insert - The system inserts a record into the i table and the first
record in the h table.
.    Select - The system returns attributes of i and h tables (not
duplicating the identifier columns).
.    Update - The system allows updating of i attributes
(update-in-place, not creating a new history record) and h attributes
(creating a new history record).
.    Delete - The system deletes the i record and all of its h records.

I'm stuck on how to implement the "insert" action, which I thought would be
simple.  The problem is this:  The i table id is of type BIGSERIAL, i.e.
sequence-assigned.  I've tried writing the following rule to get both the i
record and the first h record inserted:

CREATE RULE ru AS
ON INSERT TO person
DO INSTEAD (INSERT INTO person_i DEFAULT VALUES; INSERT INTO person_h (idh) VALUES (NEW.idi)
);

I thought this would grab the sequence-assigned value of person_i.idi to put
into person_h.idh (this trick works in trigger functions), but instead it
just assigns Null to person_h.idh, and the transaction fails with a "not
null" violation on person_h.idh.  And now that I look at the documentation
(http://www.postgresql.org/docs/7.4/static/rules-update.html), it says that
a column not assigned in the invoking query "is replaced by a null value
(for an INSERT)" in the NEW pseudo-record.  Bummer.  Triggers behave nicely,
but rules don't.

I'd be willing to do it with a trigger function instead, but I can't attach
a trigger to a view.

I considered doing it with a trigger function on the person_i table, but I
don't know how that could be made to cause an insert of the person_h table
record - and the assignment of h table values such as "name" from the app's
query.

Suggestions?

~ TIA
~ Ken

> -----Original Message-----
> From: Richard Huxton [mailto:dev@archonet.com]
> Sent: Tuesday, December 20, 2005 4:16 AM
> To: Ken Winter
> Cc: 'PostgreSQL pg-sql list'
> Subject: Re: [SQL] Rule causes baffling error
>
> Ken Winter wrote:
> > Richard ~
> >
> > Let me zoom out for a moment, for the bigger picture.
> >
> > As you have inferred, what I'm trying to do is develop a history-
> preserving
> > table ("my_data" in the example that started this thread).  *Most* user
> > programs would see and manipulate this table as if it contained only the
> > current rows (marked by effective_date_and_time <= 'now' and
> > expiration_date_and_time = 'infinity').
> >
> > When these programs do an INSERT, I need automatic actions that set the
> > expiration and date timestamps to 'now' and 'infinity'; when they do an
> > UPDATE, I need automatic actions that save the old data in a history
> record
> > and expire it as of 'now' and the new data in a record that's effective
> > 'now' and expires at 'infinity'; when they do a DELETE, I need an
> automatic
> > action to expire the target record as of 'now' rather than actually
> deleting
> > it.
...
>
> > However, I also need certain maintenance programs, designed to enable
> > certain users to correct inaccurately entered data.  These need to be
> able
> > to "rewrite history" by doing actions against "my_data" without these
> > automatic actions occurring.  It may prove advisable to provide some
> > automatic actions for these programs too, but they definitely won't be
> the
> > actions described above.  If the above actions were implemented as
> triggers,
> > all the ways I could think of to conditionally disable them (and
> possibly
> > replace them with other actions) seemed architecturally very klunky.
> That's
> > when I decided I needed the "my_data_now" view, and from that I inferred
> > (apparently correctly) that the actions would have to be implemented as
> > rewrite rules.
>
> > The cascading problem was solvable.  But the solution was a bit hard to
> > reach because the user-invoked UPDATE action triggered both an INSERT
> and an
> > UPDATE on the same table (and user DELETE triggered an UPDATE), and so
> one
> > had to take into account that all of these triggered actions would cause
> > their triggers to fire again.  Not a deal-killer, but the solution felt
> > brittle.
> >
> > Yes, I did consider having a "live" table and a separate "history"
> table.
> > The killer of that idea was my inability to find a way to implement
> foreign
> > keys that could refer to both tables and that could follow a record when
> it
> > was moved from "live" to "history".  Much of the history I'm trying to
> > preserve is not in the "my_data" table; it's in related tables that
> refer to
> > it.  I presumably could do this by not declaring the FKs to PostgreSQL,
> and
> > implementing the necessary referential integrity with triggers, but -
> well,
> > in a word, yuck.
>
> If you're going to do this with multiple tables you actually need (at
> least) three. For example, if you had different versions of e.g.
> documents being stored you would want:
>   document - invariants: the id, perhaps document-type.
>              FKeys link to this.
>              A row is only deleted from here if all live+history
>              is also deleted.
>   document_live - the one that gets edited.
>                   1:1 relationship with document if still live
>   document_hist - with timestamps. N:1 with document
>
> Have a google for Temporal Databases too - there's a lot of thinking
> been done about this.
>
> >
> > As it happens, I have found a rewrite of my UPDATE rule that works, so
> my
> > immediate need is past.  FYI, the old update rule was:
> >
> [snip]
>
> > The relevant change is that I'm now expiring the record with the old
> data
> > and inserting the one with the new data, rather than vice versa.  I
> still
> > don't know why the old rule didn't work and this one does, but hey,
> > whatever.  Another advantage of the new one is that I don't have to re-
> point
> > foreign keys that were already pointed to the record containing the old
> > data, because that record stays in place.
> >
> > (The other change, adding the lines
> >       AND effective_date_and_time <= CURRENT_TIMESTAMP
> >       AND expiration_date_and_time >= CURRENT_TIMESTAMP;
> > to the UPDATE, was necessary to keep updates to the "my_data_now" from
> > updating the expired rows as well.)
>
> Make sure you test it with inserts/updates of multiple rows too.
>
> --
>    Richard Huxton
>    Archonet Ltd

Re: Still struggling with history tables

From

Achilleus Mantzios

Date:

18 January 2006, 05:22:02

O Ken Winter έγραψε στις Jan 17, 2006 :

> Friends ~
> 
> I'm still trying to implement a solution to the requirement to keep a
> complete history of data changes to a "person" table.  (See earlier
> correspondence below.)  I'm trying for a variant of the architecture
> suggested by Richard Huxton (also below).  In my variant, I have this
> "identifier" table, carrying the id and invariant info about each person:
> 
> /*==============================================================*/
> /* Table: person_i                                              */
> /*==============================================================*/
> create table person_i (
> idi                  BIGSERIAL            not null,
> date_of_birth        DATE                 null,
> constraint PK_person_i_key_1 primary key (idi)
> )
> ;
> 
> And then I have this "history" table, carrying the time-varying info on
> which I want to keep a complete history:
> 
> /*==============================================================*/
> /* Table: person_h                                              */
> /*==============================================================*/
> create table person_h (
> idh                  INT8                 not null,
> start_date           DATE                 not null default 'now()',
> end_date             DATE                 null,
> name                 VARCHAR(255)         null,
> constraint PK_person_h_key_1 primary key (idh, start_date),
> constraint fk_reference_6 foreign key (idh)
>    references person_i (idi)
>  on delete restrict on update restrict
> )
> ;
> 
> Triggers are in place on the "person_h" table so that when an app does an
> update, the current h record is expired (with its old data) and a new record
> (wuth the updated data)is inserted and made effective "now".  What I'm now
> trying to build is this view:
> 
> /*==============================================================*/
> /* View: person                                                 */
> /*==============================================================*/
> create view person as
> select idi, date_of_birth, start_date, end_date, name
> from person_i i, person_h h
> where i.idi = h.idh;
> 
> I want to enable users (and apps) who need to deal only with current data to
> be able to treat "person" as a real table (i.e. to write to it as well as
> read from it).  Specifically, the requirements are that when a user does:
> 
> .    Insert - The system inserts a record into the i table and the first
> record in the h table.
> .    Select - The system returns attributes of i and h tables (not
> duplicating the identifier columns).
> .    Update - The system allows updating of i attributes
> (update-in-place, not creating a new history record) and h attributes
> (creating a new history record).
> .    Delete - The system deletes the i record and all of its h records.
> 
> I'm stuck on how to implement the "insert" action, which I thought would be
> simple.  The problem is this:  The i table id is of type BIGSERIAL, i.e.
> sequence-assigned.  I've tried writing the following rule to get both the i
> record and the first h record inserted:
> 
> CREATE RULE ru AS 
> ON INSERT TO person 
> DO INSTEAD (
>     INSERT INTO person_i DEFAULT VALUES; 
>     INSERT INTO person_h (idh) VALUES (NEW.idi)
> );

How about

INSERT INTO person_i (idi,dateofbirth) 
VALUES(nextval('public.person_i_idi_seq'::text),new.dateofbirth);
INSERT INTO person_h(idh) 
VALUES(currval('public.person_i_idi_seq'::text));

However, i have treated similar problems with two different approaches:

Generally it is very hard to distinguish between two kind of UPDATES:

a) UPDATEs that mean real data updates and they should be recorded
to the history system.
b) UPDATEs that are just false data entry, and they should mean
just plain correction UPDATES, with no recording.

This distinguishability is very important, otherwise someone would
define a way to store historic data of changes to the historic data
themselves, and so on.

1) Is the history data frequently needed? Are there a lot of apps
hitting these historic data? Do we need the freedom to query current live
data as well as past data in a uniform manner?

Then i just use one table, with endtimestamp is null meaning
this is a current (alive) record.
Then i write triggers to enforce interval wise integrity to the table,
(e.g. No records A,B exist with A<>B,A,B for the same person, so that 
(A.starttimestamp,coalesce(A.endtimestamp,now())) overlaps with 
(B.starttimestamp,coalesce(B.endtimestamp,now()))

This way i give people the ability to do what they want with the table.
The triggers do the job of enforcing integrity.

In this case what we mean as historic is
"what users define and input as historic".
Users are in charge here, not the DB.

2) If on the other hand, historic data are just a convinient way
of accessing history data, instead of going to find the backup of this 
past day
in the computer room,
then I keep one and only live table, and one trigger managed
history table.

The trigger as in your case does the "blind" job of creating
history records, and the users view only the real table.
Here the drawback is that even just erroneous data entry
creates historic data.

A variation of 2) is what Richard suggested.
I think when we are dealing with computer generated data,
solution 2) is best.
When we are dealing with human data, (when the human knows how to 
differentiate a real update from a wrong input) i think full freedom
to the user must be given with solution 1).

If you ask me, (altho i havent implemented that), i would do it
ALL from the application, and define 2 kind of updates operations:

UPDATE = Real Data Update, (creates historic data)
CORRECTION = Correction to either the live or historic data (creates no 
historic data)

If you write in a modern language (java) it is very easy to create
modules for these kind of things.

> 
> I thought this would grab the sequence-assigned value of person_i.idi to put
> into person_h.idh (this trick works in trigger functions), but instead it
> just assigns Null to person_h.idh, and the transaction fails with a "not
> null" violation on person_h.idh.  And now that I look at the documentation
> (http://www.postgresql.org/docs/7.4/static/rules-update.html), it says that
> a column not assigned in the invoking query "is replaced by a null value
> (for an INSERT)" in the NEW pseudo-record.  Bummer.  Triggers behave nicely,
> but rules don't.
> 
> I'd be willing to do it with a trigger function instead, but I can't attach
> a trigger to a view.
> 
> I considered doing it with a trigger function on the person_i table, but I
> don't know how that could be made to cause an insert of the person_h table
> record - and the assignment of h table values such as "name" from the app's
> query.
> 
> Suggestions?
> 
> ~ TIA 
> ~ Ken  
> 
> > -----Original Message-----
> > From: Richard Huxton [mailto:dev@archonet.com]
> > Sent: Tuesday, December 20, 2005 4:16 AM
> > To: Ken Winter
> > Cc: 'PostgreSQL pg-sql list'
> > Subject: Re: [SQL] Rule causes baffling error
> > 
> > Ken Winter wrote:
> > > Richard ~
> > >
> > > Let me zoom out for a moment, for the bigger picture.
> > >
> > > As you have inferred, what I'm trying to do is develop a history-
> > preserving
> > > table ("my_data" in the example that started this thread).  *Most* user
> > > programs would see and manipulate this table as if it contained only the
> > > current rows (marked by effective_date_and_time <= 'now' and
> > > expiration_date_and_time = 'infinity').
> > >
> > > When these programs do an INSERT, I need automatic actions that set the
> > > expiration and date timestamps to 'now' and 'infinity'; when they do an
> > > UPDATE, I need automatic actions that save the old data in a history
> > record
> > > and expire it as of 'now' and the new data in a record that's effective
> > > 'now' and expires at 'infinity'; when they do a DELETE, I need an
> > automatic
> > > action to expire the target record as of 'now' rather than actually
> > deleting
> > > it.
> ...
> > 
> > > However, I also need certain maintenance programs, designed to enable
> > > certain users to correct inaccurately entered data.  These need to be
> > able
> > > to "rewrite history" by doing actions against "my_data" without these
> > > automatic actions occurring.  It may prove advisable to provide some
> > > automatic actions for these programs too, but they definitely won't be
> > the
> > > actions described above.  If the above actions were implemented as
> > triggers,
> > > all the ways I could think of to conditionally disable them (and
> > possibly
> > > replace them with other actions) seemed architecturally very klunky.
> > That's
> > > when I decided I needed the "my_data_now" view, and from that I inferred
> > > (apparently correctly) that the actions would have to be implemented as
> > > rewrite rules.
> > 
> > > The cascading problem was solvable.  But the solution was a bit hard to
> > > reach because the user-invoked UPDATE action triggered both an INSERT
> > and an
> > > UPDATE on the same table (and user DELETE triggered an UPDATE), and so
> > one
> > > had to take into account that all of these triggered actions would cause
> > > their triggers to fire again.  Not a deal-killer, but the solution felt
> > > brittle.
> > >
> > > Yes, I did consider having a "live" table and a separate "history"
> > table.
> > > The killer of that idea was my inability to find a way to implement
> > foreign
> > > keys that could refer to both tables and that could follow a record when
> > it
> > > was moved from "live" to "history".  Much of the history I'm trying to
> > > preserve is not in the "my_data" table; it's in related tables that
> > refer to
> > > it.  I presumably could do this by not declaring the FKs to PostgreSQL,
> > and
> > > implementing the necessary referential integrity with triggers, but -
> > well,
> > > in a word, yuck.
> > 
> > If you're going to do this with multiple tables you actually need (at
> > least) three. For example, if you had different versions of e.g.
> > documents being stored you would want:
> >   document - invariants: the id, perhaps document-type.
> >              FKeys link to this.
> >              A row is only deleted from here if all live+history
> >              is also deleted.
> >   document_live - the one that gets edited.
> >                   1:1 relationship with document if still live
> >   document_hist - with timestamps. N:1 with document
> > 
> > Have a google for Temporal Databases too - there's a lot of thinking
> > been done about this.
> > 
> > >
> > > As it happens, I have found a rewrite of my UPDATE rule that works, so
> > my
> > > immediate need is past.  FYI, the old update rule was:
> > >
> > [snip]
> > 
> > > The relevant change is that I'm now expiring the record with the old
> > data
> > > and inserting the one with the new data, rather than vice versa.  I
> > still
> > > don't know why the old rule didn't work and this one does, but hey,
> > > whatever.  Another advantage of the new one is that I don't have to re-
> > point
> > > foreign keys that were already pointed to the record containing the old
> > > data, because that record stays in place.
> > >
> > > (The other change, adding the lines
> > >       AND effective_date_and_time <= CURRENT_TIMESTAMP
> > >       AND expiration_date_and_time >= CURRENT_TIMESTAMP;
> > > to the UPDATE, was necessary to keep updates to the "my_data_now" from
> > > updating the expired rows as well.)
> > 
> > Make sure you test it with inserts/updates of multiple rows too.
> > 
> > --
> >    Richard Huxton
> >    Archonet Ltd
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 

-- 
-Achilleus

Re: Still struggling with history tables

From

Michael Glaesemann

Date:

18 January 2006, 05:59:39

On Jan 18, 2006, at 19:23 , Achilleus Mantzios wrote:

> Generally it is very hard to distinguish between two kind of UPDATES:
>
> a) UPDATEs that mean real data updates and they should be recorded
> to the history system.
> b) UPDATEs that are just false data entry, and they should mean
> just plain correction UPDATES, with no recording.
>
> This distinguishability is very important, otherwise someone would
> define a way to store historic data of changes to the historic data
> themselves, and so on.

Just a quick note:

Sometimes the term "valid-time" is used to talk about the interval of  
data validity, and tables that include valid-time intervals are  
sometimes referred to as "state tables". If you're interested in  
tracking when corrections are made, this is referred to as  
"transaction-time". Correcting the is sometimes called a  
"nonsequenced" update, because it's not correcting the sequence of  
validity: it's just a correction.

For more information, you can check out "Developing Time-Oriented  
Database Applications in SQL" by Richard Snodgrass (available as a  
free PDF download from his website[1]), or, for more theoretical  
information, "Temporal Data and the Relational Model" by CJ Date,  
Hugh Darwen, and Nikos Lorentzos.[2]

Michael Glaesemann
grzm myrealbox com

[1](http://www.cs.arizona.edu/people/rts/tdbbook.pdf)
[2](http://www.amazon.com/gp/product/1558608559/)