How to Implement Versioned Rows in PostgreSQL? - Mailing list pgsql-general
From | Alan Gutierrez |
---|---|
Subject | How to Implement Versioned Rows in PostgreSQL? |
Date | |
Msg-id | avble7$1k20$1@news.hub.org Whole thread Raw |
Responses |
Re: How to Implement Versioned Rows in PostgreSQL?
|
List | pgsql-general |
My application is gathering personel and patient data for a hospice. It must keep track of all changes to patient stats, chart, and med sheet over time. Therefore, I would like to keep versions of the rows in many of my tables. This is my stab at a PostgreSQL implementation. I would greatly appreciate any input, criticisms, dire warnings, etc. I plan on adding a transaction table with a transaction_id column fed by a sequence. The transaction_id will indicate the order of creation as well as time of creation. -- A row for each of our customers who subscribe to the application. create table firm ( -- firm => hospice, clinic, hospital firm_id integer not null, name varchar(32), -- Just one example data column primary key (firm_id) ) \g -- A table to keep row of a transaction. create table transaction ( firm_id integer not null, transaction_id integer not null, -- Fed by sequence, one for each firm so -- we can part and merge databases by -- firm without collision! modified timestamp not null, modified_by person_id not null, primary key (firm_id, transaction_id), foreign key (firm_id) references firm ) \g -- Example versioned table. create table person_history ( -- Base for patient and employee firm_id integer not null, person_id integer not null, transaction_id integer not null, first_name varchar(32), -- Just two example data columns last_name varchar(32) not null, deleted boolean not null, primary key (firm_id, person_id, transaction_id) ) \g -- Show latest row view. create view person as select * from person_history where transaction_id = ( -- In explain this subselect appears to use index! select max(transaction_id) from person_history as ph1 where firm_id = ph1.firm_id and person_id = ph1.firm_id ) and deleted = 0 \g -- Time travel view. create view person_as_of as select * from person_history where transction_id = ( select max(transaction_id) from person_history as ph1 where firm_id = ph1.firm_id and person_id = ph1.firm_id and transaction_id <= ( select transaction_id from past_transaction limit 1 ) ) and deleted = 0 \g In my application I can travel in time thus: create temporary table past_transaction as select transaction_id from transaction where modified <= '2002/12/2 17:59:00' -- the minute I turned 31 order by firm_id desc, transaction_id desc, limit 1 \g -- If only I could pass the view a parameter! select * from person_as_of \g Thoughts: * I can write a query and have it travel back in time by setting one variable. Neeto. * No archive tables or such means no copying, new version is a simple insert. Good. * With expliain the sub selects appear to use the indexes with aggregates, and if not I can alsways sort descending limit 1. Good. * Even with complex joins on the latest view tables the query plans use the primary index for the sub select. Good. * There is little need for vacuuming, since no updates are made to the busy tables of the application. Does this matter? * Referenital integrity goes away from what I can see, since it won't understand the deleted column. Pity. Questions: * Is this viable or overly clever? * Should I have a boolean latest column on a versioned table? This would mean update and vacuum, but potentially a faster query. * Is there a penalty for long (how do you say?) concatenated keys in PostgreSQL? * Any reason why this won't work with the coming distrubuted PostgreSQL? * Where can I read about alternative implemenations for history/versioning? Nuances? Thank you all for any input whatsoever. Alan Gutierrez
pgsql-general by date: