Thread: question on audit columns
Hello,
In postgres database , we have all the tables with audit columns like created_by_user, created_timestamp,updated_by_user, updated_timestamp. So we have these fields that were supposed to be populated by the time at which the insert/update operation happened on the database but not at the application level. So we are planning to populate the created_by_user, created_timestamp columns by setting a default value of "current_timestamp" and "current_user" for the two columns, but no such this is available to populate while we do the update of the row, so the only option seems to be through a trigger.
So wanted to check with the experts here ,considering the table will be DML heavy table (300M+ transactions will be inserted daily), Is is okay to have the trigger for this table for populating all the audit columns or should we keep default for created_by_user, created_timestamp and just trigger for the update related two audit column? Basically wanted to see, if the default value does the same thing as a trigger or it does something more optimally than trigger?
Regards
Yudhi
Hi,
In your scenario, triggers can add some overhead since they require extra processing after each update operation. Considering the size of your table and the high transaction volume, you need to observe that this might significantly affect performance.
On Wed, 4 Sept 2024 at 17:50, yudhi s <learnerdatabase99@gmail.com> wrote:
Hello,In postgres database , we have all the tables with audit columns like created_by_user, created_timestamp,updated_by_user, updated_timestamp. So we have these fields that were supposed to be populated by the time at which the insert/update operation happened on the database but not at the application level. So we are planning to populate the created_by_user, created_timestamp columns by setting a default value of "current_timestamp" and "current_user" for the two columns, but no such this is available to populate while we do the update of the row, so the only option seems to be through a trigger.So wanted to check with the experts here ,considering the table will be DML heavy table (300M+ transactions will be inserted daily), Is is okay to have the trigger for this table for populating all the audit columns or should we keep default for created_by_user, created_timestamp and just trigger for the update related two audit column? Basically wanted to see, if the default value does the same thing as a trigger or it does something more optimally than trigger?RegardsYudhi
On Wed, Sep 4, 2024 at 6:29 PM Muhammad Usman Khan <usman.k@bitnine.net> wrote:
Hi,
In your scenario, triggers can add some overhead since they require extra processing after each update operation. Considering the size of your table and the high transaction volume, you need to observe that this might significantly affect performance.On Wed, 4 Sept 2024 at 17:50, yudhi s <learnerdatabase99@gmail.com> wrote:Hello,In postgres database , we have all the tables with audit columns like created_by_user, created_timestamp,updated_by_user, updated_timestamp. So we have these fields that were supposed to be populated by the time at which the insert/update operation happened on the database but not at the application level. So we are planning to populate the created_by_user, created_timestamp columns by setting a default value of "current_timestamp" and "current_user" for the two columns, but no such this is available to populate while we do the update of the row, so the only option seems to be through a trigger.So wanted to check with the experts here ,considering the table will be DML heavy table (300M+ transactions will be inserted daily), Is is okay to have the trigger for this table for populating all the audit columns or should we keep default for created_by_user, created_timestamp and just trigger for the update related two audit column? Basically wanted to see, if the default value does the same thing as a trigger or it does something more optimally than trigger?RegardsYudhi
Thank you so much. So do you mean to say that , we should add default values for the create_timestamp and create_user_id as current_timestamp and current_user, but for update_user_id and update_timestamp , we can ask the application to update the values manually , whenever they are executing the update statement on the rows?
Yes this would be the better approach.
From: yudhi s <learnerdatabase99@gmail.com>
Sent: Wednesday, September 4, 2024 6:10:04 PM
To: Muhammad Usman Khan <usman.k@bitnine.net>
Cc: pgsql-general <pgsql-general@lists.postgresql.org>
Subject: Re: question on audit columns
Sent: Wednesday, September 4, 2024 6:10:04 PM
To: Muhammad Usman Khan <usman.k@bitnine.net>
Cc: pgsql-general <pgsql-general@lists.postgresql.org>
Subject: Re: question on audit columns
On Wed, Sep 4, 2024 at 6:29 PM Muhammad Usman Khan <usman.k@bitnine.net> wrote:
Hi,
In your scenario, triggers can add some overhead since they require extra processing after each update operation. Considering the size of your table and the high transaction volume, you need to observe that this might significantly affect performance.On Wed, 4 Sept 2024 at 17:50, yudhi s <learnerdatabase99@gmail.com> wrote:Hello,In postgres database , we have all the tables with audit columns like created_by_user, created_timestamp,updated_by_user, updated_timestamp. So we have these fields that were supposed to be populated by the time at which the insert/update operation happened on the database but not at the application level. So we are planning to populate the created_by_user, created_timestamp columns by setting a default value of "current_timestamp" and "current_user" for the two columns, but no such this is available to populate while we do the update of the row, so the only option seems to be through a trigger.So wanted to check with the experts here ,considering the table will be DML heavy table (300M+ transactions will be inserted daily), Is is okay to have the trigger for this table for populating all the audit columns or should we keep default for created_by_user, created_timestamp and just trigger for the update related two audit column? Basically wanted to see, if the default value does the same thing as a trigger or it does something more optimally than trigger?RegardsYudhi
Thank you so much. So do you mean to say that , we should add default values for the create_timestamp and create_user_id as current_timestamp and current_user, but for update_user_id and update_timestamp , we can ask the application to update the values manually , whenever they are executing the update statement on the rows?
On Wed, Sep 4, 2024 at 9:10 AM yudhi s <learnerdatabase99@gmail.com> wrote:
On Wed, Sep 4, 2024 at 6:29 PM Muhammad Usman Khan <usman.k@bitnine.net> wrote:Hi,
In your scenario, triggers can add some overhead since they require extra processing after each update operation. Considering the size of your table and the high transaction volume, you need to observe that this might significantly affect performance.
On Wed, 4 Sept 2024 at 17:50, yudhi s <learnerdatabase99@gmail.com> wrote:Hello,In postgres database , we have all the tables with audit columns like created_by_user, created_timestamp,updated_by_user, updated_timestamp. So we have these fields that were supposed to be populated by the time at which the insert/update operation happened on the database but not at the application level. So we are planning to populate the created_by_user, created_timestamp columns by setting a default value of "current_timestamp" and "current_user" for the two columns, but no such this is available to populate while we do the update of the row, so the only option seems to be through a trigger.So wanted to check with the experts here ,considering the table will be DML heavy table (300M+ transactions will be inserted daily), Is is okay to have the trigger for this table for populating all the audit columns or should we keep default for created_by_user, created_timestamp and just trigger for the update related two audit column? Basically wanted to see, if the default value does the same thing as a trigger or it does something more optimally than trigger?RegardsYudhiThank you so much. So do you mean to say that , we should add default values for the create_timestamp and create_user_id as current_timestamp and current_user,
That's the simplest way. But the application can overwrite those fields.
but for update_user_id and update_timestamp , we can ask the application to update the values manually , whenever they are executing the update statement on the rows?
If they're really strict, you might need INSERT and UPDATE triggers that call security defined functions which write into a separate table not accessible by the application. That table would have the application table's PK, created_by_user, created_timestamp, updated_by_user and updated_timestamp.
Would that table have a LOT of records? Sure.
Would it add overhead? Sure.
But the subsequently beefier hardware requirements and care in designing the physical schema (for example, audit tables in a separate tablespace and pg_wal/ on separate disk controllers, or a 10Gb SAN) are the price you pay for strict audit requirements.
Of course, if the audit requirements are minimal, then sure, "default values and the application" are Good Enough.
Death to America, and butter sauce.
Iraq lobster!
On 9/4/24 06:17, Khan Muhammad Usman wrote: > Yes this would be the better approach. 1) Except the overhead is now shifted to the application, which may or not be better. You are also moving the audit responsibility to the application and the application maintainers and making it application specific. If a new application/client starts hitting the database and it did not get the memo about the audit fields they won't be filled in. 2) I would recommend setting up a some realistic tests and see if the overhead of the update triggers would be a concern. -- Adrian Klaver adrian.klaver@aklaver.com
> > On 9/4/24 06:17, Khan Muhammad Usman wrote: > > Yes this would be the better approach. > > 1) Except the overhead is now shifted to the application, which may or > not be better. You are also moving the audit responsibility to the > application and the application maintainers and making it application > specific. If a new application/client starts hitting the database and it > did not get the memo about the audit fields they won't be filled in. > > 2) I would recommend setting up a some realistic tests and see if the > overhead of the update triggers would be a concern. > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com If it helps, we implemented a trigger based audit system 20 years ago. It both creates a separate inviolate audit table recordand updates the record being changed with a timestamp and a userid of last change. We've not regretted it and moderatehardware deals well with the overhead (500+ table database, 80+ concurrent users, 18 million audit records a month). -- Tim Clarke MSc, BSc (Hons), MBCS IT Director Direct: +44 (0)1376 504510 | Mobile: +44 (0)7887 563420 [https://i0.wp.com/www.manifest.co.uk/wp-content/uploads/2022/12/Minerva-Analytics-Logo-PORTRAIT.png] [https://i0.wp.com/www.manifest.co.uk/wp-content/uploads/2022/12/esg_finalist.png] Telephone: Witham: +44(0)1376 503500 | London: +44 (0)20 3009 0853 | Frankfurt: +49 (0)69 7191 6000 | Hong Kong: +852 58031687 | Toronto: +1 647 503 2848 Web: https://www.manifest.co.uk/ Watch our latest Minerva Briefings on BrightTALK<https://www.brighttalk.com/channel/18792/?utm_source=brighttalk-sharing&utm_medium=web&utm_campaign=linkshare> Minerva Analytics Ltd - A Solactive Company 9 Freebournes Court | Newland Street | Witham | Essex | CM8 2BL | United Kingdom ________________________________ Copyright: This e-mail may contain confidential or legally privileged information. If you are not the named addressee youmust not use or disclose such information, instead please report it to admin@minerva.info<mailto:admin@minerva.info> Legal: Minerva Analytics is the trading name of: Minerva Analytics Ltd: Registered in England Number 11260966 & The ManifestVoting Agency Ltd: Registered in England Number 2920820 Registered Office at above address. Please Click Here https://www.manifest.co.uk/legal/for further information.
On Wed, 4 Sep 2024 18:19:47 +0530 yudhi s <learnerdatabase99@gmail.com> wrote: Hi, > In postgres database , we have all the tables with audit columns like > created_by_user, created_timestamp,updated_by_user, updated_timestamp. So > we have these fields that were supposed to be populated by the time at > which the insert/update operation happened on the database but not at the > application level. So we are planning to populate the created_by_user, > created_timestamp columns by setting a default value of "current_timestamp" > and "current_user" for the two columns, but no such this is available to > populate while we do the update of the row, so the only option seems to be > through a trigger. > If you can live with the fact that updated_by_user and updated_timestamp get the same values as created_by_user and created_timestampwhen inserting the record, then you can do : vv=> create table audit (created_by_user text default current_user, created_timestamp timestamp default now(), updated_by_usertext default current_user, updated_timestamp timestamp default now(), data text); CREATE TABLE vv=> insert into audit (data) values ('abc'); INSERT 0 1 vv=> select * from audit; created_by_user | created_timestamp | updated_by_user | updated_timestamp | data -----------------+----------------------------+-----------------+----------------------------+------ vincent | 2024-09-05 19:17:53.446109 | vincent | 2024-09-05 19:17:53.446109 | abc (1 row) --as user postgres update audit set updated_by_user = DEFAULT, updated_timestamp = DEFAULT, data = 'def'; vv=> select * from audit; created_by_user | created_timestamp | updated_by_user | updated_timestamp | data -----------------+----------------------------+-----------------+---------------------------+------ vincent | 2024-09-05 19:17:53.446109 | postgres | 2024-09-05 19:24:01.19186 | def (1 row) -- Bien à vous, Vincent Veyron https://marica.fr/ Logiciel de suivi des contentieux juridiques, des sinistres d'assurance et des contrats
... and (3) the values are not updated on manual actions without triggers - which might or might not be desirable depending on the intention. Best, Johannes On 9/4/2024 16:36, Adrian Klaver wrote: > On 9/4/24 06:17, Khan Muhammad Usman wrote: >> Yes this would be the better approach. > > 1) Except the overhead is now shifted to the application, which may or > not be better. You are also moving the audit responsibility to the > application and the application maintainers and making it application > specific. If a new application/client starts hitting the database and > it did not get the memo about the audit fields they won't be filled in. > > 2) I would recommend setting up a some realistic tests and see if the > overhead of the update triggers would be a concern. > > >
As far as the application being able to change those fields itself, you can prevent that via column permissions, by leaving out the four audit columns and doing something like:
GRANT INSERT (email, widget_count), UPDATE (email, widget_count) ON TABLE foobar TO PUBLIC;
That way, inserts are guaranteed to use the default values of current_timestamp() and current_user. And a BEFORE UPDATE trigger ensures it changes the other two fields via the trigger function only.
Cheers,
Greg
P.S. Also check out https://www.pgaudit.org/ (PGAudit) as an alternative approach, which puts the information into your Postgres logs, rather than in the tables themselves.