Thread: Another perplexity with PG rules
I'm stumped on the following problem. Everything between the "-----------" rows should be executable. Please advise. ~ TIA ~ Ken ----------- -- Here's a table: CREATE TABLE public.person_h ( person_id bigint DEFAULT nextval('pop_seq'::text), effective_date_and_time timestamptz DEFAULT ('now'::text)::timestamp(6) with time zone, expiration_date_and_time timestamptz DEFAULT 'infinity'::timestamp with time zone, first_name varchar(255), middle_names varchar(255), last_name_prefix varchar(255), last_name varchar(255), name_suffix varchar(255), preferred_full_name varchar(255), preferred_business_name varchar(255), user_name varchar(255), _action varchar(32) DEFAULT 'preserve'::character varying, CONSTRAINT pk_person_h_identifier_2 PRIMARY KEY (person_id, effective_date_and_time) ); -- Indexes CREATE UNIQUE INDEX personal_data_px ON person_h USING btree (person_id, effective_date_and_time); -- Here's a view of that table plus a few ALTERs on the view: CREATE OR REPLACE VIEW person AS SELECT h.person_id AS person_id, h.effective_date_and_time AS effective_date_and_time, h.expiration_date_and_time AS expiration_date_and_time, h.first_name AS first_name, h.middle_names AS middle_names, h.last_name_prefix AS last_name_prefix, h.last_name AS last_name, h.name_suffix AS name_suffix, h.preferred_full_name AS preferred_full_name, h.preferred_business_name AS preferred_business_name, h.user_name AS user_name, h._action AS _action FROM person_h AS h WHERE h.effective_date_and_time <= CURRENT_TIMESTAMP AND h.expiration_date_and_time >= CURRENT_TIMESTAMP ALTER TABLE person ALTER COLUMN person_id SET DEFAULT nextval('pop_seq'::text) ; ALTER TABLE person ALTER COLUMN effective_date_and_time SET DEFAULT ('now'::text)::timestamp(6) with time zone ; ALTER TABLE person ALTER COLUMN expiration_date_and_time SET DEFAULT 'infinity'::timestamp with time zone ; ALTER TABLE person ALTER COLUMN _action SET DEFAULT 'preserve'::character varying -- Here are a couple of rules on that view: /*** Rule on_insert inserts the object's first history record into person_h. ***/ CREATE OR REPLACE RULE on_insert AS ON INSERT TO person DO INSTEAD ( /* Insert the row into the H table. Effective and expiration dates take the defaults, unless query overrides them. */ INSERT INTO person_h ( person_id, effective_date_and_time, first_name, middle_names, last_name_prefix, last_name, name_suffix, preferred_full_name, preferred_business_name, user_name, _action ) VALUES ( nextval('pop_seq'::text), NEW.effective_date_and_time, NEW.first_name, NEW.middle_names, NEW.last_name_prefix, NEW.last_name, NEW.name_suffix, NEW.preferred_full_name, NEW.preferred_business_name, NEW.user_name, NEW._action ) ) ; /*** Rule on_update_1_nothing meets the PostgreSQL requirement for one unconditional UPDATE rule. ***/ CREATE OR REPLACE RULE on_update_1_nothing AS ON UPDATE TO person DO INSTEAD NOTHING ; /*** Rule on_update_2_preserve_h inserts a new record with the old data into history table person_h, expires this record effective either now or at the effective time given in the query, and updates the current record as of the same time. ***/ CREATE OR REPLACE RULE on_update_2_preserve_h AS ON UPDATE TO person WHERE ( (OLD.person_id <> NEW.person_id OR (OLD.person_id IS NULL AND NEW.person_id IS NOT NULL) OR (OLD.person_id IS NOT NULL AND NEW.person_id IS NULL )) OR (OLD.effective_date_and_time <> NEW.effective_date_and_time OR (OLD.effective_date_and_time IS NULL AND NEW.effective_date_and_time IS NOT NULL) OR (OLD.effective_date_and_time IS NOT NULL AND NEW.effective_date_and_time IS NULL )) OR (OLD.first_name <> NEW.first_name OR (OLD.first_name IS NULL AND NEW.first_name IS NOT NULL) OR (OLD.first_name IS NOT NULL AND NEW.first_name IS NULL )) OR (OLD.middle_names <> NEW.middle_names OR (OLD.middle_names IS NULL AND NEW.middle_names IS NOT NULL) OR (OLD.middle_names IS NOT NULL AND NEW.middle_names IS NULL )) OR (OLD.last_name_prefix <> NEW.last_name_prefix OR (OLD.last_name_prefix IS NULL AND NEW.last_name_prefix IS NOT NULL) OR (OLD.last_name_prefix IS NOT NULL AND NEW.last_name_prefix IS NULL )) OR (OLD.last_name <> NEW.last_name OR (OLD.last_name IS NULL AND NEW.last_name IS NOT NULL) OR (OLD.last_name IS NOT NULL AND NEW.last_name IS NULL )) OR (OLD.name_suffix <> NEW.name_suffix OR (OLD.name_suffix IS NULL AND NEW.name_suffix IS NOT NULL) OR (OLD.name_suffix IS NOT NULL AND NEW.name_suffix IS NULL )) OR (OLD.preferred_full_name <> NEW.preferred_full_name OR (OLD.preferred_full_name IS NULL AND NEW.preferred_full_name IS NOT NULL) OR (OLD.preferred_full_name IS NOT NULL AND NEW.preferred_full_name IS NULL )) OR (OLD.preferred_business_name <> NEW.preferred_business_name OR (OLD.preferred_business_name IS NULL AND NEW.preferred_business_name IS NOT NULL) OR (OLD.preferred_business_name IS NOT NULL AND NEW.preferred_business_name IS NULL )) OR (OLD.user_name <> NEW.user_name OR (OLD.user_name IS NULL AND NEW.user_name IS NOT NULL) OR (OLD.user_name IS NOT NULL AND NEW.user_name IS NULL )) OR (OLD._action <> NEW._action OR (OLD._action IS NULL AND NEW._action IS NOT NULL) OR (OLD._action IS NOT NULL AND NEW._action IS NULL ))) AND (NEW._action = 'preserve' OR NEW._action = OLD._action OR NEW._action IS NULL) DO ( /* Update the current H record and make it effective as of either now (if no effective date was provided) or whenever the update query specifies.*/ UPDATE person_h SET person_id = NEW.person_id, first_name = NEW.first_name, middle_names = NEW.middle_names, last_name_prefix = NEW.last_name_prefix, last_name = NEW.last_name, name_suffix = NEW.name_suffix, preferred_full_name = NEW.preferred_full_name, preferred_business_name = NEW.preferred_business_name, user_name = NEW.user_name, _action = NEW._action, effective_date_and_time = CASE WHEN NEW.effective_date_and_time = OLD.effective_date_and_time THEN CURRENT_TIMESTAMP -- Query assigned no value ELSE NEW.effective_date_and_time-- Query assigned a value END WHERE person_id = OLD.person_id AND effective_date_and_time = OLD.effective_date_and_time ; /* Copy the old values to a new record. Expire it either now (if no effective date was provided) or whenever the update query specifies.*/ INSERT INTO person_h ( person_id, first_name, middle_names, last_name_prefix, last_name, name_suffix, preferred_full_name, preferred_business_name, user_name, _action, effective_date_and_time, expiration_date_and_time) VALUES ( OLD.person_id, OLD.first_name, OLD.middle_names, OLD.last_name_prefix, OLD.last_name, OLD.name_suffix, OLD.preferred_full_name, OLD.preferred_business_name, OLD.user_name, OLD._action, OLD.effective_date_and_time, CASE WHEN NEW.effective_date_and_time = OLD.effective_date_and_time THEN CURRENT_TIMESTAMP-- Query assigned no value ELSE NEW.effective_date_and_time-- Query assigned a value END) ; ) ; -- And here are a couple of actions on the view: INSERT INTO person (first_name, last_name) VALUES ('Lou', 'Foo'); UPDATE person SET first_name = 'Who' WHERE last_name like 'Foo%'; ----------- The problem is with 'on_update_2_preserve_h', the last rule listed above. This rule is intended to preserve a complete history of all UPDATE actions by updating the current record, inserting a copy of the old data into the table, and managing the "effective_date_and_time" and "expiration_date_and_time" columns to make the transaction effective as of a query-specified timestamp (or the default: now()). After the UPDATE action, the table person_h should contain two records: * One with first_name = 'Who', effective_date_and_time = <the timestamp specified in the query, otherwise now()>, and expiration_date_and_time = 'infinity'. (This record represents the new data.) * One with first_name = 'Lou', effective_date_and_time = <the effective_date_and_time of the old record>, and expiration_date_and_time = <the timestamp specified in the query, otherwise now()>. (This record preserves the old data.) What actually happens is that the first action evoked by the rule 'on_update_2_preserve_h' ("UPDATE person_h ...") is executed just fine, but the second ("INSERT INTO person_h ...") action does nothing; it doesn't even evoke an error message. The net of this is that the update occurs, but the history is not preserved: the database contains only the first of the two above-mentioned records. After trying about a million things, I'm wondering about the meaning of "OLD." as the actions in a rule are successively executed. What I have done assumes that: (a) The statement (in http://www.postgresql.org/docs/7.4/static/rules-update.html) that "Any reference to OLD is replaced by a reference to the range-table entry that is the result relation" means that (for example) "OLD.first_name" = 'Lou', for EVERY action evoked by the rule - even for the action (INSERT) that is executed second (or Nth). (b) The "OLD." values that appear in the second (INSERT) action in the rule are not changed by the execution of the first (UPDATE) rule. For example, even if the first action changes "first_name" to 'Who' in the current database record, the value of "OLD.first_name" in the query remains "Lou". (This probably is just a restatement of the first assumption.) (c) Whatever the truth of the above assumptions, the second (INSERT) action in the 'on_update_2_preserve_h' rule should insert SOMEthing. It should either insert a record, however mangled, or produce some error message. So, again: The problem is that the first embedded (UPDATE) rule has its intended effect, but the second embedded (INSERT) rule does nothing. Which of my assumptions is wrong? How to make this whole thing do what is required? ~ TIA again ~ Ken again
"Ken Winter" <ken@sunward.org> writes: > After trying about a million things, I'm wondering about the meaning of > "OLD." as the actions in a rule are successively executed. What I have done > assumes that: > ... > (b) The "OLD." values that appear in the second (INSERT) action in the rule > are not changed by the execution of the first (UPDATE) rule. I believe this is mistaken. OLD is effectively a macro for "the existing row(s) satisfying the rule's WHERE clause". You've got two problems here --- one is that the UPDATE may have changed the data in those rows, and the other is that the UPDATE may cause them to not satisfy the WHERE clause anymore. > (c) Whatever the truth of the above assumptions, the second (INSERT) action > in the 'on_update_2_preserve_h' rule should insert SOMEthing. See above. If no rows remain satisfying WHERE, nothing will happen. > How to make this whole thing do what is required? I'd suggest seeing if you can't do the INSERT first then the UPDATE. This may require rethinking which of the two resulting rows is the "historical" one and which the "updated" one, but it could probably be made to work. Also, you might think about keeping the historical info in a separate table (possibly it could be an inheritance child of the master table). This would make it easier to distinguish the historical and current info when you need to. Lastly, I'd advise using triggers not rules wherever you possibly can. In particular, generation of the historical-log records would be far more reliable if implemented as an AFTER UPDATE trigger on the base table. (Over the years I've gotten less and less satisfied with Postgres' rules feature --- it just seems way too hard to make it do what people want reliably. I'm afraid there's not much we can do to fix it without creating an enormous compatibility problem unfortunately :-(. But by and large, triggers are a lot easier for people to wrap their brains around, once they get over the notational hurdle of having to write a trigger function. I'd like to see us allow triggers on views, and then maybe rules could fade into the sunset for any but the most abstruse applications.) regards, tom lane
Tom ~ Thanks ever so much for - again - helping me get unstuck. See comments and results inserted below. ~ Ken > -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Sunday, February 26, 2006 1:47 PM > To: ken@sunward.org > Cc: PostgreSQL pg-general List > Subject: Re: [GENERAL] Another perplexity with PG rules > > "Ken Winter" <ken@sunward.org> writes: > > After trying about a million things, I'm wondering about the meaning of > > "OLD." as the actions in a rule are successively executed. What I have > done > > assumes that: > > ... > > (b) The "OLD." values that appear in the second (INSERT) action in the > rule > > are not changed by the execution of the first (UPDATE) rule. > > I believe this is mistaken. OLD is effectively a macro for "the > existing row(s) satisfying the rule's WHERE clause". You've got two > problems here --- one is that the UPDATE may have changed the data in > those rows, and the other is that the UPDATE may cause them to not > satisfy the WHERE clause anymore. I was afraid of this. Your conclusions do seem to fit my results. > > > (c) Whatever the truth of the above assumptions, the second (INSERT) > action > > in the 'on_update_2_preserve_h' rule should insert SOMEthing. > > See above. If no rows remain satisfying WHERE, nothing will happen. Yep, that's what was happening. > > > How to make this whole thing do what is required? > > I'd suggest seeing if you can't do the INSERT first then the UPDATE. > This may require rethinking which of the two resulting rows is the > "historical" one and which the "updated" one, but it could probably > be made to work. Yes, I had already had it working with such a scheme. It expired the existing record, and then inserted a new record with the updated values. However this scheme seemed to be causing troubles with other triggers on the base tables. That's why I was trying to recast it into a scheme that updated the existing record and then inserted a new record containing the "old" data. > > Also, you might think about keeping the historical info in a separate > table (possibly it could be an inheritance child of the master table). > This would make it easier to distinguish the historical and current info > when you need to. I've been striving mightily to avoid taking this path, because it threatens to hopelessly complicate my foreign keys. > > Lastly, I'd advise using triggers not rules wherever you possibly can. > In particular, generation of the historical-log records would be far > more reliable if implemented as an AFTER UPDATE trigger on the base > table. > This appears to be the WINNER! I eliminated the INSERT action from my UPDATE rule: CREATE OR REPLACE RULE on_update_2_preserve_h AS ON UPDATE TO person ... DO ( /* Update the current H record and make it effective as of either now (if no effective date was provided) or whenever the update query specifies.*/ UPDATE person_h SET person_id = NEW.person_id, first_name = NEW.first_name, middle_names = NEW.middle_names, last_name_prefix = NEW.last_name_prefix, last_name = NEW.last_name, name_suffix = NEW.name_suffix, preferred_full_name = NEW.preferred_full_name, preferred_business_name = NEW.preferred_business_name, user_name = NEW.user_name, _action = NEW._action, effective_date_and_time = CASE WHEN NEW.effective_date_and_time = OLD.effective_date_and_time THEN CURRENT_TIMESTAMP -- Query assigned no value ELSE NEW.effective_date_and_time -- Query assigned value END WHERE person_id = OLD.person_id AND effective_date_and_time = OLD.effective_date_and_time ; /* Copy the old values to a new record. Expire it either now (if no effective date was provided) or whenever the update query specifies.*/ INSERT INTO person_h ( person_id, first_name, middle_names, last_name_prefix, last_name, name_suffix, preferred_full_name, preferred_business_name, user_name, _action, effective_date_and_time, expiration_date_and_time) VALUES ( OLD.person_id, OLD.first_name, OLD.middle_names, OLD.last_name_prefix, OLD.last_name, OLD.name_suffix, OLD.preferred_full_name, OLD.preferred_business_name, OLD.user_name, OLD._action, OLD.effective_date_and_time, CASE WHEN NEW.effective_date_and_time = OLD.effective_date_and_time THEN CURRENT_TIMESTAMP-- Query assigned no value ELSE NEW.effective_date_and_time-- Query assigned a value END) ; ) ; And turned it instead into this AFTER UPDATE trigger function: CREATE OR REPLACE FUNCTION public.history_for_person() RETURNS trigger AS ' BEGIN IF NEW._action = ''preserve'' THEN /* Copy the old values to a new record. Expire it either now (if no effective date was provided) or whenever the update query specifies.*/ INSERT INTO person_h ( person_id, first_name, middle_names, last_name_prefix, last_name, name_suffix, preferred_full_name, preferred_business_name, user_name, effective_date_and_time, expiration_date_and_time, _action ) VALUES ( OLD.person_id, OLD.first_name, OLD.middle_names, OLD.last_name_prefix, OLD.last_name, OLD.name_suffix, OLD.preferred_full_name, OLD.preferred_business_name, OLD.user_name, OLD.effective_date_and_time, NEW.effective_date_and_time, ''old'' ) ; END IF; RETURN NULL; END; ' LANGUAGE plpgsql VOLATILE ; I haven't fully tested this, but at worst - unlike the previous two-action rule - it properly handles these action queries: INSERT INTO person (first_name, last_name) VALUES ('Lou', 'Foo'); UPDATE person SET first_name = 'Who' WHERE last_name like 'Foo%'; That is, in response to the UPDATE query, it leaves me with one current record with first name = 'Who' AND one expired record with first_name = 'Lou'. My quick-and-dirty of why this works is that it eliminates the instability of the OLD record when a rule contains multiple actions. Instead, it gives me the stability of OLD inside of a trigger function. Of course, I still have to deal with the tendency of a trigger function to execute no matter what was the source of the action query that triggered it, and I have resorted to the "_action" column as a sort of parameter. But I intend to encapsulate this so that the ordinary users of the "person" table don't have to know about it. > (Over the years I've gotten less and less satisfied with Postgres' rules > feature --- it just seems way too hard to make it do what people want > reliably. I'm afraid there's not much we can do to fix it without > creating an enormous compatibility problem unfortunately :-(. But by > and large, triggers are a lot easier for people to wrap their brains > around, once they get over the notational hurdle of having to write a > trigger function. I'd like to see us allow triggers on views, and then > maybe rules could fade into the sunset for any but the most abstruse > applications.) Amen, amen amen! PostgreSQL's rewrite rules seemed a great idea at first look and in the abstract, but I have wasted a few weeks now trying to get them to do something that really isn't that complicated, which is to implement encapsulated history-keeping. If I could put triggers on views, I would junk all my rules and do all my history-keeping with triggers. An alternate workaround that I could live with would be to make a base table that behaves like an updatable view: It has triggers that divert all action queries to its underlying table(s), so the table never actually contains anything. But to have that fully work, I would have to be able to declare a "SELECT trigger" on my table-imitating-a-view, so that SELECTs against that table would be answered by data from the underlying table(s). Anyway, I think the mixed rules-and-triggers solution works for now (if not, watch this space for more please for help), and I'm grateful to you for triggering (pun intended) the idea. ~ Ken