Home > mailing lists

Re: table versioning approach (not auditing) - Mailing list pgsql-general

From	Jonathan Vanasco
Subject	Re: table versioning approach (not auditing)
Date	September 29, 2014 18:09:00
Msg-id	16895FA4-4F30-4D2C-A21F-BA85F34AE6C8@2xlp.com Whole thread Raw
In response to	table versioning approach (not auditing) (Abelard Hoffman <abelardhoffman@gmail.com>)
Responses	Re: table versioning approach (not auditing) (Nick Guenther <nguenthe@uwaterloo.ca>)
List	pgsql-general

Tree view

In the past, to accomplish the same thing I've done this:

- store the data in hstore/json. instead of storing snapshots, I store deltas. i've been using a second table though, because it's improved performance on reads and writes.

- use a "transaction" log. every write session gets logged into the transaction table (serial, timestamp, user_id). all updates to the recorded tables include the transaction's serial. then there is a "transactions" table, that is just "transaction_serial , object_id , object_action".

whenever I have needs for auditing or versioning, I can just query the transaction table for the records I want... then use that to grab the data out of hstore.

On Sep 28, 2014, at 10:00 PM, Abelard Hoffman wrote:

Hi. I need to maintain a record of all changes to certain tables so assist in viewing history and reverting changes when necessary (customer service makes an incorrect edit, etc.).

I have studied these two audit trigger examples:
https://wiki.postgresql.org/wiki/Audit_trigger
https://wiki.postgresql.org/wiki/Audit_trigger_91plus

I've also read about two other approaches to versioning:
1. maintain all versions in one table, with a flag to indicate which is the current version
2. have a separate versions table for each real table, and insert into the associated version table whenever an update or insert is done.

My current implementation is based on the wiki trigger examples, using a single table, and a json column to record the row changes (rather than hstore). What I like about that, in particular, is I can have a "global," chronological view of all versioned changes very easily.

But there are two types of queries I need to run.
1. Find all changes made by a specific user
2. Find all changes related to a specific record

#1 is simple to do. The versioning table has a user_id column of who made the change, so I can query on that.

#2 is more difficult. I may want to fetch all changes to a group of tables that are all related by foreign keys (e.g., find all changes to "user" record 849, along with any changes to their "articles," "photos," etc.). All of the data is in the json column, of course, but it seems like a pain to try and build a query on the json column that can fetch all those relationships (and if I mess it up, I probably won't generate any errors, since the json is so free-form).

So my question is, do you think using the json approach is wrong for this case? Does it seem better to have separate versioning tables associated with each real table? Or another approach?

Thanks

pgsql-general by date:

From: "Felix Kunde"
Date: 29 September 2014, 10:27:00
Subject: Re: table versioning approach (not auditing)

From: Nick Guenther
Date: 29 September 2014, 23:06:45
Subject: Re: table versioning approach (not auditing)

Re: table versioning approach (not auditing) - Mailing list pgsql-general

Previous

Next