Thread: get a list of table modifications in a day?

get a list of table modifications in a day?

From
Ottavio Campana
Date:
I need to generate a diff (or something similar) of a table, day by day.
What is the best way to tack insert/update/delete operations? I have two
ideas, and I'd like to hear your opinion:

1) pg_dump each day and run diff
2) modify some triggers we use and store the information in another table

I am not aware of any functionality offered by postgresql. Does it exists?

If not, which solution would you prefer?


Attachment

Re: get a list of table modifications in a day?

From
hubert depesz lubaczewski
Date:
On Thu, Sep 13, 2007 at 09:59:30AM +0200, Ottavio Campana wrote:
> 1) pg_dump each day and run diff

it will become increasingly painful as the table size increases.

> 2) modify some triggers we use and store the information in another table

this is the best choice. you can use table_log extension to avoid
writing your own triggers.

http://ads.wars-nicht.de/blog/archives/100-Log-Table-Changes-in-PostgreSQL-with-tablelog.html

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

Re: get a list of table modifications in a day?

From
"Asko Oja"
Date:
Hi

PgQ can be used this purpose. Idea is to have triggers on table that push events into queue and then on that queue you can do whatever suits you best. As we don't want to keep these logs online PgQ is most conenient as it efficiently removes them as soon as they are handled.

PgQ - table_dispatcher.py
Has url encoded events as data source and writes them into table on target database.
Used to partiton data. For example change log's that need to kept online only shortly can be written to daily tables and then dropped as they become irrelevant.
Also allows to select which columns have to be written into target database
Creates target tables according to configuration file as needed

PgQ - cube_dispatcher.py
Has url encoded events as data source and writes them into partitoned tables in target database. Logutriga is used to create events.
Used to provide batches of data for business intelligence and data cubes.
Only one instance of each record is stored. For example if record is created and then updated twice only latest version of record stays in that days table.
Does not support deletes (not that it is hard to support just we have no need for it).

PgQ - queue_archiver.py
Writes queue contents into file. Used for backing up queue contents for safety.

regards,
Asko

On 9/13/07, Ottavio Campana <ottavio@campana.vi.it> wrote:
I need to generate a diff (or something similar) of a table, day by day.
What is the best way to tack insert/update/delete operations? I have two
ideas, and I'd like to hear your opinion:

1) pg_dump each day and run diff
2) modify some triggers we use and store the information in another table

I am not aware of any functionality offered by postgresql. Does it exists?

If not, which solution would you prefer?



Re: get a list of table modifications in a day?

From
Gregory Stark
Date:
"Ottavio Campana" <ottavio@campana.vi.it> writes:

> I need to generate a diff (or something similar) of a table, day by day.
> What is the best way to tack insert/update/delete operations? I have two
> ideas, and I'd like to hear your opinion:
>
> 1) pg_dump each day and run diff

You can't use pg_dump directly as the rows are unordered. An update will
remove the old row in one place and put the new row possibly in a completely
different place. Some operations like CLUSTER or VACUUM FULL could move around
rows which doesn't matter to SQL but would show up in a diff.

You would have to COPY to a file a query which includes an ORDER BY.

> 2) modify some triggers we use and store the information in another table
>
> I am not aware of any functionality offered by postgresql. Does it exists?

Well alternatively you could do the same as 1) but do it in SQL. Something
like

CREATE TABLE copy_table AS (SELECT * FROM original_table);
... wait a day
SELECT * FROM copy_table EXCEPT SELECT * FROM original_table

It's not going to be fast though. Probably the triggers are the best option
really. They give you more information than a diff in that they tell you when
the change occurred, what user made the change, and if multiple changes to the
same record occurred you get a record of each.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: get a list of table modifications in a day?

From
Ottavio Campana
Date:
hubert depesz lubaczewski ha scritto:
> On Thu, Sep 13, 2007 at 09:59:30AM +0200, Ottavio Campana wrote:
>> 1) pg_dump each day and run diff
>
> it will become increasingly painful as the table size increases.
>
>> 2) modify some triggers we use and store the information in another table
>
> this is the best choice. you can use table_log extension to avoid
> writing your own triggers.
>
> http://ads.wars-nicht.de/blog/archives/100-Log-Table-Changes-in-PostgreSQL-with-tablelog.html

since I already use triggers on that table, can I use table_log?

I mean, can I have two triggers for the same event on the same table?



Attachment

Re: get a list of table modifications in a day?

From
"A. Kretschmer"
Date:
am  Thu, dem 13.09.2007, um 10:44:41 +0200 mailte Ottavio Campana folgendes:
> > http://ads.wars-nicht.de/blog/archives/100-Log-Table-Changes-in-PostgreSQL-with-tablelog.html
>
> since I already use triggers on that table, can I use table_log?
>
> I mean, can I have two triggers for the same event on the same table?

Yes.

Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net