Re: [Solved] Generic logging system for pre-hstore using plperl triggers - Mailing list pgsql-general

From Diego Augusto Molina
Subject Re: [Solved] Generic logging system for pre-hstore using plperl triggers
Date
Msg-id CAGOxLdHbRRUC1X=t_Dr6Gi5qbBCBL7ASY3O6diJdw_imAWPsBA@mail.gmail.com
In response to Re: [Solved] Generic logging system for pre-hstore using plperl triggers  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: [Solved] Generic logging system for pre-hstore using plperl triggers  (Diego Augusto Molina <diegoaugustomolina@gmail.com>)
List pgsql-general
2011/9/28, Merlin Moncure <mmoncure@gmail.com>:
>
> I disagree.  unnest() and array_agg() (or, even better, array()
> constructor syntax) are an absolute joy to work with and thinking in a
> more functional way, which is usually the key to making things run
> quickly.  Also both functions are trivial to emulate in userland for
> compatibility.  Arrays of composites IIRC only go back to 8.3 so  that
> would be a true stopper for any solution in that vein.
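
(As an aside, I guess the userland emulation you mention would look
roughly like this on a pre-8.4 server; an untested sketch, the real
built-ins appear in 8.4:)

-- poor man's unnest(): one row per array element
create or replace function unnest(anyarray) returns setof anyelement
language sql immutable as $$
  select $1[i]
    from generate_series(array_lower($1, 1), array_upper($1, 1)) as i;
$$;

-- poor man's array_agg(): the classic array_append-based aggregate
create aggregate array_agg (anyelement) (
  sfunc = array_append,
  stype = anyarray,
  initcond = '{}'
);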

Ok, tastes are tastes: I hate having to add two or three more levels
of subqueries. As for arrays of composites, that problem disappears if
we use no composites at all! Instead of one field holding an array of
a composite of three intrinsics, use three fields, each an array of an
intrinsic type. See your proposal:

>> create type audit_field_t as (field text, old_value text, new_value text);

Instead, in the audit table you may use:

..., field smallint[], before text[], after text[],...

Note the smallint for "field": I really do want to keep the reference
to the "field" table. That is for the two reasons I mentioned earlier
(to save space: 2 bytes for a "smallint" against the variable size of
a "text", and to keep track of the names being used as well). You can
also set up something like this if you prefer dimensions:

..., field smallint[], values text[][],...

Implying that the first dimension holds the "before" values and the
second one the "after" values. Either of these avoids composites and
makes the box a little wider and simpler. Going further, I would like
to keep the logging "on demand":

..., field smallint[], is_pk boolean[], { before text[], after text[]
| values text[][] },...

You know what the braces and pipe are for...
So, in the end, we have the entire "audet" table inside the "audit"
table, as a series of arrays. We get a really compact table with just
enough data to fully log the changes that fired the trigger. No less,
no more; a concrete sketch follows below.

At this point we know that querying this table will be much slower and
that rotation will have to be done more frequently. If we dump>restore
the table somewhere else, we can still split it back into the original
two tables, build indexes, cluster them, and query as desired. But
this can get complicated enough that maybe I should implement a
function to do it all. In any event, we lose some responsiveness
because of this, but a couple of extra minutes may not be a problem
for most cases. I'm just trying to summarize.
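
To make this concrete, here is a minimal sketch of what such a
combined table could look like. Only the names "audit", "audet",
"field", "is_pk", "before" and "after" come from the discussion above;
everything else (tstamp, db_user, operation, table_id and so on) is
purely illustrative and not taken from the attached code:

create table audit (
  id        bigserial primary key,
  tstamp    timestamptz not null default now(),  -- when the change happened
  db_user   text not null default current_user,  -- who made the change
  operation char(1) not null,                    -- 'I', 'U' or 'D'
  table_id  smallint not null,   -- reference to a catalog of audited tables
  field     smallint[] not null, -- references to the "field" table
  is_pk     boolean[] not null,  -- true where the field is part of the PK
  before    text[],              -- old values (null for inserts)
  after     text[]               -- new values (null for deletes)
);

And, as a sketch of the "function doing all this", something along
these lines could split the arrays back into a per-field detail table
after the dump>restore (generate_subscripts() needs 8.4; on older
servers generate_series(1, array_upper(field, 1)) does the same job):

create table audet as
select audit_id,
       field[i]  as field,
       is_pk[i]  as is_pk,
       before[i] as before,
       after[i]  as after
  from (select id as audit_id, field, is_pk, before, after,
               generate_subscripts(field, 1) as i  -- one row per element
          from audit) as s;

create index audet_field_idx on audet (field);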

As a rule of thumb, you may want to run a cron job every night or so
that checks whether 'select count(*) from audit' is bigger than X and,
if so, rotates the table (or simply rotate every X days/weeks/etc.).
The smaller X is, the better the responsiveness in some cases: if we
know the time interval we are interested in, we only have to
dump>restore those logs. In other cases this will not help much: if
you need to track a tuple back to the very beginning of time, you will
have a lot of dumping>restoring to do (and so on... remember to split
the table, build the indexes...). Still, rotation seems to be good
practice, and the cron job can also dump/restore the old table into
another server and then delete it, which saves a lot of space in the
production environment. A sketch of such a rotation follows below.
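
For illustration only (the threshold check is omitted, the date suffix
and the server/database names are made up, and I assume "id" was
declared bigserial so its sequence is audit_id_seq), the rotation the
cron job runs through psql could boil down to:

begin;
-- move the current contents aside and start logging into a fresh table
alter table audit rename to audit_20110928;
create table audit
  (like audit_20110928 including defaults including indexes);
-- keep the id sequence attached to the live table so the renamed one
-- can later be dropped without taking the sequence with it
alter sequence audit_id_seq owned by audit.id;
commit;

-- afterwards, from the shell:
--   pg_dump -t audit_20110928 proddb | psql -h archivehost archivedb
-- and once that succeeds:
--   drop table audit_20110928;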

> As for the rest of it, I'd be looking to try and come up with an all
> sql implementation.  Also you should give an honest comparison between
> what you've come up with vs. this:
> http://pgfoundry.org/projects/tablelog/.
>
> merlin
>

"All SQL implementation"? Didn't we agree that's not possible in
pg<=8.4? then what do you mean by that?

About the "tablelog" project: I didn't actually try it, but I read
its documentation and it does not seem appropriate for my case at all.
First of all, its purpose seems to be to log everything in a table so
that the table can later be restored as of any point in the past. My
purpose is to log in order to run analysis. Also, it needs one extra
table per logged table, with the same structure as the logged table
(without constraints) plus three, four or five control columns
(depending on usage; four or five recommended). I have a lot of tables
to log (hundreds!) with small changes to each of them; that means
doubling the number of tables for just a few changes. Speaking of
compactness... It also logs everything, not only the changed values.
It is written in C, so I assume it runs much, much faster (something
especially needed for highly transactional DBs). But it is not proven
to be binary safe (and I don't remember what that means). Bugs: none
known.

So, if you need to be able to restore your table as of any time, use
tablelog. If you need to run analysis on who did what, use my option.


Finally attaching the code!

Cheers.

--
Diego Augusto Molina
diegoaugustomolina@gmail.com

ES: Por favor, evite adjuntar documentos de Microsoft Office. Serán
desestimados.
EN: Please, avoid attaching Microsoft Office documents. They shall be discarded.
LINK: http://www.gnu.org/philosophy/no-word-attachments.html

