Thread: Another perplexity with PG rules

Another perplexity with PG rules

From
"Ken Winter"
Date:
I'm stumped on the following problem.

Everything between the "-----------" rows should be executable.

Please advise.

~ TIA
~ Ken

-----------

-- Here's a table:

CREATE TABLE public.person_h
(
person_id bigint DEFAULT nextval('pop_seq'::text),
effective_date_and_time timestamptz DEFAULT ('now'::text)::timestamp(6) with
time zone,
expiration_date_and_time timestamptz DEFAULT 'infinity'::timestamp with time
zone,
first_name varchar(255),
middle_names varchar(255),
last_name_prefix varchar(255),
last_name varchar(255),
name_suffix varchar(255),
preferred_full_name varchar(255),
preferred_business_name varchar(255),
user_name varchar(255),
_action varchar(32) DEFAULT 'preserve'::character varying,
CONSTRAINT pk_person_h_identifier_2 PRIMARY KEY (person_id,
effective_date_and_time)
);
-- Indexes
CREATE UNIQUE INDEX personal_data_px ON person_h USING btree (person_id,
effective_date_and_time);

-- Here's a view of that table plus a few ALTERs on the view:

CREATE OR REPLACE VIEW person AS
    SELECT h.person_id AS person_id,
        h.effective_date_and_time AS effective_date_and_time,
        h.expiration_date_and_time AS expiration_date_and_time,
        h.first_name AS first_name,
        h.middle_names AS middle_names,
        h.last_name_prefix AS last_name_prefix,
        h.last_name AS last_name,
        h.name_suffix AS name_suffix,
        h.preferred_full_name AS preferred_full_name,
        h.preferred_business_name AS preferred_business_name,
        h.user_name AS user_name,
        h._action AS _action
    FROM person_h AS h
    WHERE h.effective_date_and_time <= CURRENT_TIMESTAMP
        AND h.expiration_date_and_time >= CURRENT_TIMESTAMP

ALTER TABLE person
    ALTER COLUMN person_id
    SET DEFAULT nextval('pop_seq'::text)
;
ALTER TABLE person
    ALTER COLUMN effective_date_and_time
    SET DEFAULT ('now'::text)::timestamp(6) with time zone
;
ALTER TABLE person
    ALTER COLUMN expiration_date_and_time
    SET DEFAULT 'infinity'::timestamp with time zone
;
ALTER TABLE person
    ALTER COLUMN _action
    SET DEFAULT 'preserve'::character varying

-- Here are a couple of rules on that view:

/*** Rule on_insert inserts the object's first history record into person_h.
***/
CREATE OR REPLACE RULE on_insert AS
    ON INSERT TO person
    DO INSTEAD (
        /* Insert the row into the H table.
        Effective and expiration dates take the defaults,
        unless query overrides them. */
        INSERT INTO person_h
            ( person_id,
            effective_date_and_time,
            first_name,
            middle_names,
            last_name_prefix,
            last_name,
            name_suffix,
            preferred_full_name,
            preferred_business_name,
            user_name,
            _action )
            VALUES ( nextval('pop_seq'::text),
            NEW.effective_date_and_time,
            NEW.first_name,
            NEW.middle_names,
            NEW.last_name_prefix,
            NEW.last_name,
            NEW.name_suffix,
            NEW.preferred_full_name,
            NEW.preferred_business_name,
            NEW.user_name,
            NEW._action )
        )
;

/*** Rule on_update_1_nothing meets the PostgreSQL requirement for one
unconditional UPDATE rule. ***/
CREATE OR REPLACE RULE on_update_1_nothing AS
    ON UPDATE TO person
    DO INSTEAD NOTHING
;

/*** Rule on_update_2_preserve_h inserts a new record with the old data into
history table person_h,
expires this record effective either now or at the effective time given in
the query,
and updates the current record as of the same time. ***/
CREATE OR REPLACE RULE on_update_2_preserve_h AS
    ON UPDATE TO person
    WHERE (
        (OLD.person_id <> NEW.person_id
            OR (OLD.person_id IS NULL AND NEW.person_id IS NOT NULL)
            OR (OLD.person_id IS NOT NULL AND NEW.person_id IS NULL ))
        OR (OLD.effective_date_and_time <> NEW.effective_date_and_time
            OR (OLD.effective_date_and_time IS NULL AND
NEW.effective_date_and_time IS NOT NULL)
            OR (OLD.effective_date_and_time IS NOT NULL AND
NEW.effective_date_and_time IS NULL ))
        OR (OLD.first_name <> NEW.first_name
            OR (OLD.first_name IS NULL AND NEW.first_name IS NOT NULL)
            OR (OLD.first_name IS NOT NULL AND NEW.first_name IS NULL ))
        OR (OLD.middle_names <> NEW.middle_names
            OR (OLD.middle_names IS NULL AND NEW.middle_names IS NOT NULL)
            OR (OLD.middle_names IS NOT NULL AND NEW.middle_names IS NULL ))
        OR (OLD.last_name_prefix <> NEW.last_name_prefix
            OR (OLD.last_name_prefix IS NULL AND NEW.last_name_prefix IS NOT
NULL)
            OR (OLD.last_name_prefix IS NOT NULL AND NEW.last_name_prefix IS
NULL ))
        OR (OLD.last_name <> NEW.last_name
            OR (OLD.last_name IS NULL AND NEW.last_name IS NOT NULL)
            OR (OLD.last_name IS NOT NULL AND NEW.last_name IS NULL ))
        OR (OLD.name_suffix <> NEW.name_suffix
            OR (OLD.name_suffix IS NULL AND NEW.name_suffix IS NOT NULL)
            OR (OLD.name_suffix IS NOT NULL AND NEW.name_suffix IS NULL ))
        OR (OLD.preferred_full_name <> NEW.preferred_full_name
            OR (OLD.preferred_full_name IS NULL AND NEW.preferred_full_name
IS NOT NULL)
            OR (OLD.preferred_full_name IS NOT NULL AND
NEW.preferred_full_name IS NULL ))
        OR (OLD.preferred_business_name <> NEW.preferred_business_name
            OR (OLD.preferred_business_name IS NULL AND
NEW.preferred_business_name IS NOT NULL)
            OR (OLD.preferred_business_name IS NOT NULL AND
NEW.preferred_business_name IS NULL ))
        OR (OLD.user_name <> NEW.user_name
            OR (OLD.user_name IS NULL AND NEW.user_name IS NOT NULL)
            OR (OLD.user_name IS NOT NULL AND NEW.user_name IS NULL ))
        OR (OLD._action <> NEW._action
            OR (OLD._action IS NULL AND NEW._action IS NOT NULL)
            OR (OLD._action IS NOT NULL AND NEW._action IS NULL )))
        AND (NEW._action = 'preserve' OR NEW._action = OLD._action OR
NEW._action IS NULL)
    DO
        (
        /* Update the current H record and make it effective
        as of either now (if no effective date
        was provided) or whenever the update query specifies.*/
        UPDATE person_h
            SET person_id = NEW.person_id,
            first_name = NEW.first_name,
            middle_names = NEW.middle_names,
            last_name_prefix = NEW.last_name_prefix,
            last_name = NEW.last_name,
            name_suffix = NEW.name_suffix,
            preferred_full_name = NEW.preferred_full_name,
            preferred_business_name = NEW.preferred_business_name,
            user_name = NEW.user_name,
            _action = NEW._action,
            effective_date_and_time =
                CASE
                    WHEN NEW.effective_date_and_time =
OLD.effective_date_and_time
                         THEN CURRENT_TIMESTAMP -- Query assigned no value
                    ELSE NEW.effective_date_and_time-- Query assigned a
value
                END
            WHERE person_id = OLD.person_id
                AND effective_date_and_time = OLD.effective_date_and_time
        ;
        /* Copy the old values to a new record.
        Expire it either now (if no effective date
        was provided) or whenever the update query specifies.*/
        INSERT INTO person_h (
            person_id,
            first_name,
            middle_names,
            last_name_prefix,
            last_name,
            name_suffix,
            preferred_full_name,
            preferred_business_name,
            user_name,
            _action,
            effective_date_and_time,
            expiration_date_and_time)
        VALUES (
            OLD.person_id,
            OLD.first_name,
            OLD.middle_names,
            OLD.last_name_prefix,
            OLD.last_name,
            OLD.name_suffix,
            OLD.preferred_full_name,
            OLD.preferred_business_name,
            OLD.user_name,
            OLD._action,
            OLD.effective_date_and_time,
            CASE
                WHEN NEW.effective_date_and_time =
OLD.effective_date_and_time
                     THEN CURRENT_TIMESTAMP-- Query assigned no value
                ELSE NEW.effective_date_and_time-- Query assigned a value
            END)
        ;
    )
;

-- And here are a couple of actions on the view:

INSERT INTO person (first_name, last_name) VALUES ('Lou', 'Foo');

UPDATE person SET first_name = 'Who' WHERE last_name like 'Foo%';

-----------

The problem is with 'on_update_2_preserve_h', the last rule listed above.

This rule is intended to preserve a complete history of all UPDATE actions
by updating the current record, inserting a copy of the old data into the
table, and managing the "effective_date_and_time" and
"expiration_date_and_time" columns to make the transaction effective as of a
query-specified timestamp (or the default: now()).

After the UPDATE action, the table person_h should contain two records:

* One with first_name = 'Who', effective_date_and_time = <the timestamp
specified in the query, otherwise now()>, and expiration_date_and_time =
'infinity'.  (This record represents the new data.)

* One with first_name = 'Lou', effective_date_and_time = <the
effective_date_and_time of the old record>, and expiration_date_and_time =
<the timestamp specified in the query, otherwise now()>.  (This record
preserves the old data.)

What actually happens is that the first action evoked by the rule
'on_update_2_preserve_h' ("UPDATE person_h ...") is executed just fine, but
the second ("INSERT INTO person_h ...") action does nothing; it doesn't even
evoke an error message.  The net of this is that the update occurs, but the
history is not preserved: the database contains only the first of the two
above-mentioned records.

After trying about a million things, I'm wondering about the meaning of
"OLD." as the actions in a rule are successively executed.  What I have done
assumes that:

(a) The statement (in
http://www.postgresql.org/docs/7.4/static/rules-update.html) that "Any
reference to OLD is replaced by a reference to the range-table entry that is
the result relation" means that (for example) "OLD.first_name" = 'Lou', for
EVERY action evoked by the rule - even for the action (INSERT) that is
executed second (or Nth).

(b) The "OLD." values that appear in the second (INSERT) action in the rule
are not changed by the execution of the first (UPDATE) rule.  For example,
even if the first action changes "first_name" to 'Who' in the current
database record, the value of "OLD.first_name" in the query remains "Lou".
(This probably is just a restatement of the first assumption.)

(c) Whatever the truth of the above assumptions, the second (INSERT) action
in the 'on_update_2_preserve_h' rule should insert SOMEthing.  It should
either insert a record, however mangled, or produce some error message.

So, again:  The problem is that the first embedded (UPDATE) rule has its
intended effect, but the second embedded (INSERT) rule does nothing.

Which of my assumptions is wrong?

How to make this whole thing do what is required?

~ TIA again
~ Ken again


Re: Another perplexity with PG rules

From
Tom Lane
Date:
"Ken Winter" <ken@sunward.org> writes:
> After trying about a million things, I'm wondering about the meaning of
> "OLD." as the actions in a rule are successively executed.  What I have done
> assumes that:
> ...
> (b) The "OLD." values that appear in the second (INSERT) action in the rule
> are not changed by the execution of the first (UPDATE) rule.

I believe this is mistaken.  OLD is effectively a macro for "the
existing row(s) satisfying the rule's WHERE clause".  You've got two
problems here --- one is that the UPDATE may have changed the data in
those rows, and the other is that the UPDATE may cause them to not
satisfy the WHERE clause anymore.

> (c) Whatever the truth of the above assumptions, the second (INSERT) action
> in the 'on_update_2_preserve_h' rule should insert SOMEthing.

See above.  If no rows remain satisfying WHERE, nothing will happen.

> How to make this whole thing do what is required?

I'd suggest seeing if you can't do the INSERT first then the UPDATE.
This may require rethinking which of the two resulting rows is the
"historical" one and which the "updated" one, but it could probably
be made to work.

Also, you might think about keeping the historical info in a separate
table (possibly it could be an inheritance child of the master table).
This would make it easier to distinguish the historical and current info
when you need to.

Lastly, I'd advise using triggers not rules wherever you possibly can.
In particular, generation of the historical-log records would be far
more reliable if implemented as an AFTER UPDATE trigger on the base
table.

(Over the years I've gotten less and less satisfied with Postgres' rules
feature --- it just seems way too hard to make it do what people want
reliably.  I'm afraid there's not much we can do to fix it without
creating an enormous compatibility problem unfortunately :-(.  But by
and large, triggers are a lot easier for people to wrap their brains
around, once they get over the notational hurdle of having to write a
trigger function.  I'd like to see us allow triggers on views, and then
maybe rules could fade into the sunset for any but the most abstruse
applications.)

            regards, tom lane

Re: Another perplexity with PG rules

From
"Ken Winter"
Date:
Tom ~

Thanks ever so much for - again - helping me get unstuck.  See comments and
results inserted below.

~ Ken

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Sunday, February 26, 2006 1:47 PM
> To: ken@sunward.org
> Cc: PostgreSQL pg-general List
> Subject: Re: [GENERAL] Another perplexity with PG rules
>
> "Ken Winter" <ken@sunward.org> writes:
> > After trying about a million things, I'm wondering about the meaning of
> > "OLD." as the actions in a rule are successively executed.  What I have
> done
> > assumes that:
> > ...
> > (b) The "OLD." values that appear in the second (INSERT) action in the
> rule
> > are not changed by the execution of the first (UPDATE) rule.
>
> I believe this is mistaken.  OLD is effectively a macro for "the
> existing row(s) satisfying the rule's WHERE clause".  You've got two
> problems here --- one is that the UPDATE may have changed the data in
> those rows, and the other is that the UPDATE may cause them to not
> satisfy the WHERE clause anymore.

I was afraid of this.  Your conclusions do seem to fit my results.
>
> > (c) Whatever the truth of the above assumptions, the second (INSERT)
> action
> > in the 'on_update_2_preserve_h' rule should insert SOMEthing.
>
> See above.  If no rows remain satisfying WHERE, nothing will happen.

Yep, that's what was happening.
>
> > How to make this whole thing do what is required?
>
> I'd suggest seeing if you can't do the INSERT first then the UPDATE.
> This may require rethinking which of the two resulting rows is the
> "historical" one and which the "updated" one, but it could probably
> be made to work.

Yes, I had already had it working with such a scheme.  It expired the
existing record, and then inserted a new record with the updated values.
However this scheme seemed to be causing troubles with other triggers on the
base tables.  That's why I was trying to recast it into a scheme that
updated the existing record and then inserted a new record containing the
"old" data.
>
> Also, you might think about keeping the historical info in a separate
> table (possibly it could be an inheritance child of the master table).
> This would make it easier to distinguish the historical and current info
> when you need to.

I've been striving mightily to avoid taking this path, because it threatens
to hopelessly complicate my foreign keys.
>
> Lastly, I'd advise using triggers not rules wherever you possibly can.
> In particular, generation of the historical-log records would be far
> more reliable if implemented as an AFTER UPDATE trigger on the base
> table.
>
This appears to be the WINNER!  I eliminated the INSERT action from my
UPDATE rule:

CREATE OR REPLACE RULE on_update_2_preserve_h AS
    ON UPDATE TO person
...
    DO
        (
        /* Update the current H record and make it effective
        as of either now (if no effective date
        was provided) or whenever the update query specifies.*/
        UPDATE person_h
            SET person_id = NEW.person_id,
            first_name = NEW.first_name,
            middle_names = NEW.middle_names,
            last_name_prefix = NEW.last_name_prefix,
            last_name = NEW.last_name,
            name_suffix = NEW.name_suffix,
            preferred_full_name = NEW.preferred_full_name,
            preferred_business_name = NEW.preferred_business_name,
            user_name = NEW.user_name,
            _action = NEW._action,
            effective_date_and_time =
                CASE
                    WHEN NEW.effective_date_and_time =
OLD.effective_date_and_time
                         THEN CURRENT_TIMESTAMP -- Query assigned no value
                    ELSE NEW.effective_date_and_time -- Query assigned value
                END
            WHERE person_id = OLD.person_id
                AND effective_date_and_time = OLD.effective_date_and_time
        ;
        /* Copy the old values to a new record.
        Expire it either now (if no effective date
        was provided) or whenever the update query specifies.*/
        INSERT INTO person_h (
            person_id,
            first_name,
            middle_names,
            last_name_prefix,
            last_name,
            name_suffix,
            preferred_full_name,
            preferred_business_name,
            user_name,
            _action,
            effective_date_and_time,
            expiration_date_and_time)
        VALUES (
            OLD.person_id,
            OLD.first_name,
            OLD.middle_names,
            OLD.last_name_prefix,
            OLD.last_name,
            OLD.name_suffix,
            OLD.preferred_full_name,
            OLD.preferred_business_name,
            OLD.user_name,
            OLD._action,
            OLD.effective_date_and_time,
            CASE
                WHEN NEW.effective_date_and_time =
OLD.effective_date_and_time
                     THEN CURRENT_TIMESTAMP-- Query assigned no value
                ELSE NEW.effective_date_and_time-- Query assigned a value
            END)
        ;
    )
;

And turned it instead into this AFTER UPDATE trigger function:

CREATE OR REPLACE FUNCTION public.history_for_person()
RETURNS trigger AS
'
BEGIN
    IF NEW._action = ''preserve'' THEN
        /* Copy the old values to a new record.
        Expire it either now (if no effective date
        was provided) or whenever the update query specifies.*/
        INSERT INTO person_h (
            person_id,
            first_name,
            middle_names,
            last_name_prefix,
            last_name,
            name_suffix,
            preferred_full_name,
            preferred_business_name,
            user_name,
            effective_date_and_time,
            expiration_date_and_time,
            _action )
        VALUES (
            OLD.person_id,
            OLD.first_name,
            OLD.middle_names,
            OLD.last_name_prefix,
            OLD.last_name,
            OLD.name_suffix,
            OLD.preferred_full_name,
            OLD.preferred_business_name,
            OLD.user_name,
            OLD.effective_date_and_time,
            NEW.effective_date_and_time,
            ''old'' )
        ;
    END IF;
    RETURN NULL;
END;
'
LANGUAGE plpgsql VOLATILE
;

I haven't fully tested this, but at worst - unlike the previous two-action
rule - it properly handles these action queries:

INSERT INTO person (first_name, last_name) VALUES ('Lou', 'Foo');

UPDATE person SET first_name = 'Who' WHERE last_name like 'Foo%';

That is, in response to the UPDATE query, it leaves me with one current
record with first name = 'Who' AND one expired record with first_name =
'Lou'.

My quick-and-dirty of why this works is that it eliminates the instability
of the OLD record when a rule contains multiple actions.  Instead, it gives
me the stability of OLD inside of a trigger function.  Of course, I still
have to deal with the tendency of a trigger function to execute no matter
what was the source of the action query that triggered it, and I have
resorted to the "_action" column as a sort of parameter.  But I intend to
encapsulate this so that the ordinary users of the "person" table don't have
to know about it.

> (Over the years I've gotten less and less satisfied with Postgres' rules
> feature --- it just seems way too hard to make it do what people want
> reliably.  I'm afraid there's not much we can do to fix it without
> creating an enormous compatibility problem unfortunately :-(.  But by
> and large, triggers are a lot easier for people to wrap their brains
> around, once they get over the notational hurdle of having to write a
> trigger function.  I'd like to see us allow triggers on views, and then
> maybe rules could fade into the sunset for any but the most abstruse
> applications.)

Amen, amen amen!  PostgreSQL's rewrite rules seemed a great idea at first
look and in the abstract, but I have wasted a few weeks now trying to get
them to do something that really isn't that complicated, which is to
implement encapsulated history-keeping.  If I could put triggers on views, I
would junk all my rules and do all my history-keeping with triggers.  An
alternate workaround that I could live with would be to make a base table
that behaves like an updatable view:  It has triggers that divert all action
queries to its underlying table(s), so the table never actually contains
anything.  But to have that fully work, I would have to be able to declare a
"SELECT trigger" on my table-imitating-a-view, so that SELECTs against that
table would be answered by data from the underlying table(s).

Anyway, I think the mixed rules-and-triggers solution works for now (if not,
watch this space for more please for help), and I'm grateful to you for
triggering (pun intended) the idea.

~ Ken