Thread: Question on triggers and plpgsql
Hello, A question from a complete newbie on this area. I'm trying to implement a mechanism that would allow me to keep track of the last time each row of a table was modified. I have many applications modifying the data, and I would like to avoid having to modify each of those applications (with the risk of forgetting one of them). So, I figured a better approach would be a trigger that gets activated on update (before update, to be specific). Below is what I came up with, but being the very first time I do (or even read about) something with triggers or with plpgsql, I'd like to check if there are any obvious red flags, or if what I'm doing is hopelessly wrong. I added a column last_modified (timestamp data type), and create the following function: create function set_last_modified() returns trigger as ' begin new.last_modified = now(); return new; end; ' language plpgsql; (this is similar to an example from the PG documentation; I'm not sure the keyword "new" is the right thing to use in my case, but it would look like it's a standard way to refer to the "new row" that is about to replace the old one) Then, I created the trigger as follows: create trigger last_modified_on_update before update on table_name for each row execute procedure set_last_modified(); The thing seems to work -- I had to go in a shell as user postgres and execute the command: $ createlang -d dbname plpgsql (I'm not sure I understand why that is necessary, or what implications -- positive or negative -- it may have) Am I doing the right thing? Have I introduced some sort of catastrophe waiting to happen? Thanks for any guidance you may offer to this PL/PGSQL beginner! Carlos --
On Apr 7, 2005, at 5:45 PM, Carlos Moreno wrote: > The thing seems to work -- I had to go in a shell as user > postgres and execute the command: > > $ createlang -d dbname plpgsql > > (I'm not sure I understand why that is necessary, or > what implications -- positive or negative -- it may have) As a security measure, no pl language is available by default. What you did is correct. There is not much (any?) risk with pl/pgsql, so you can install it in template1 so it will be available in any new database you create. > > Am I doing the right thing? Have I introduced some sort > of catastrophe waiting to happen? I did not notice any problems. John DeSoi, Ph.D. http://pgedit.com/ Power Tools for PostgreSQL
On Apr 8, 2005, at 8:28 AM, John DeSoi wrote: > > On Apr 7, 2005, at 5:45 PM, Carlos Moreno wrote: > >> The thing seems to work -- I had to go in a shell as user >> postgres and execute the command: >> >> $ createlang -d dbname plpgsql >> >> (I'm not sure I understand why that is necessary, or >> what implications -- positive or negative -- it may have) > > As a security measure, no pl language is available by default. What > you did is correct. There is not much (any?) risk with pl/pgsql, so > you can install it in template1 so it will be available in any new > database you create. > >> >> Am I doing the right thing? Have I introduced some sort >> of catastrophe waiting to happen? > > I did not notice any problems. > Just one detail, but in the form of a question. In the original posting, I think the trigger was doing the logging for something happening on a table as a before insert or update--I may be wrong on that detail. I would think of doing such actions AFTER the update/insert. In the world of transaction-safe operations, is there ANY danger in doing the logging as a BEFORE trigger rather than an AFTER trigger? Thanks, Sean
On Apr 8, 2005, at 9:41 AM, Sean Davis wrote: > Just one detail, but in the form of a question. In the original > posting, I think the trigger was doing the logging for something > happening on a table as a before insert or update--I may be wrong on > that detail. I would think of doing such actions AFTER the > update/insert. In the world of transaction-safe operations, is there > ANY danger in doing the logging as a BEFORE trigger rather than an > AFTER trigger? Good point. I think both will work in this case and it would depend on the application if it makes a difference. You definitely want an AFTER trigger if you need to see the final state of the row before making changes. In this case the assignment of the column does not depend on any other factors so it would not seem to matter. But I agree from a semantics point of view, an AFTER trigger might be a little better for this. John DeSoi, Ph.D. http://pgedit.com/ Power Tools for PostgreSQL
Sean Davis <sdavis2@mail.nih.gov> writes: > Just one detail, but in the form of a question. In the original > posting, I think the trigger was doing the logging for something > happening on a table as a before insert or update--I may be wrong on > that detail. I would think of doing such actions AFTER the > update/insert. In the world of transaction-safe operations, is there > ANY danger in doing the logging as a BEFORE trigger rather than an > AFTER trigger? No, actually Carlos wanted to donew.last_modified = now(); so he *must* use a BEFORE trigger --- AFTER is too late to change the data that will be stored. Generalizing freely, I've seen three basic uses for triggers:1. Modify the data that will be stored.2. Check that data isvalid (eg, consistent with another table).3. Propagate updates in one place to other places. Clearly #1 must be done in BEFORE triggers. #2 and #3 could be done either way. They are often done in AFTER triggers because that way you *know* that any case-1 triggers have done their work and you are looking at the correct final state of the row. But you could do them in a BEFORE trigger if you were willing to assume that no later-fired trigger would make a change that invalidates your check or propagation. AFTER triggers are relatively expensive (since the triggering event state has to be saved and then recalled) so I could see making that tradeoff if performance is critical. AFAICS the only way that you could get into a can't-roll-back situation is if the trigger tries to propagate the update outside the database. For instance, the proverbial trigger to send mail: once sent you can't cancel it. But really this is dangerous even in an AFTER trigger --- the transaction could still be rolled back after the AFTER trigger fires. regards, tom lane
On Fri, Apr 08, 2005 at 10:36:26AM -0400, Tom Lane wrote: > AFAICS the only way that you could get into a can't-roll-back situation > is if the trigger tries to propagate the update outside the database. > For instance, the proverbial trigger to send mail: once sent you can't > cancel it. But really this is dangerous even in an AFTER trigger --- > the transaction could still be rolled back after the AFTER trigger > fires. People who know more about this will no doubt correct me, but isn't such a case crying out for LISTEN/NOTIFY instead? That is, your trigger puts the mail content into a table of mails to be sent, and wakes up the mail-sender client with the NOTIFY; the NOTIFY and the commit to the mail-it table only happen in that case if the transaction commits. And since mail is async anyway, the extra few seconds shouldn't make any difference, right? A -- Andrew Sullivan | ajs@crankycanuck.ca The plural of anecdote is not data. --Roger Brinner
Tom, Thanks for setting the record straight. It has been a while since I have written a trigger and I forgot that you can't modify the row in the AFTER trigger. Makes perfect sense. For the record, here is what the docs say: Typically, row before triggers are used for checking or modifying the data that will be inserted or updated. For example, a before trigger might be used to insert the current time into a timestamp column, or to check that two elements of the row are consistent. Row after triggers are most sensibly used to propagate the updates to other tables, or make consistency checks against other tables. The reason for this division of labor is that an after trigger can be certain it is seeing the final value of the row, while a before trigger cannot; there might be other before triggers firing after it. If you have no specific reason to make a trigger before or after, the before case is more efficient, since the information about the operation doesn't have to be saved until end of statement. It might be worth adding a sentence here that explicitly states modifications can only be made in the BEFORE trigger. I did not see that anywhere else in the document. On Apr 8, 2005, at 10:36 AM, Tom Lane wrote: > No, actually Carlos wanted to do > new.last_modified = now(); > so he *must* use a BEFORE trigger --- AFTER is too late to change the > data that will be stored. > > Generalizing freely, I've seen three basic uses for triggers: > 1. Modify the data that will be stored. > 2. Check that data is valid (eg, consistent with another table). > 3. Propagate updates in one place to other places. > Clearly #1 must be done in BEFORE triggers. #2 and #3 could be done > either way. They are often done in AFTER triggers because that way you > *know* that any case-1 triggers have done their work and you are > looking > at the correct final state of the row. But you could do them in a > BEFORE trigger if you were willing to assume that no later-fired > trigger > would make a change that invalidates your check or propagation. AFTER > triggers are relatively expensive (since the triggering event state has > to be saved and then recalled) so I could see making that tradeoff if > performance is critical. John DeSoi, Ph.D. http://pgedit.com/ Power Tools for PostgreSQL
Andrew Sullivan <ajs@crankycanuck.ca> writes: > On Fri, Apr 08, 2005 at 10:36:26AM -0400, Tom Lane wrote: >> AFAICS the only way that you could get into a can't-roll-back situation >> is if the trigger tries to propagate the update outside the database. >> For instance, the proverbial trigger to send mail: once sent you can't >> cancel it. But really this is dangerous even in an AFTER trigger --- >> the transaction could still be rolled back after the AFTER trigger >> fires. > People who know more about this will no doubt correct me, but isn't > such a case crying out for LISTEN/NOTIFY instead? That is, your > trigger puts the mail content into a table of mails to be sent, and > wakes up the mail-sender client with the NOTIFY; the NOTIFY and the > commit to the mail-it table only happen in that case if the > transaction commits. And since mail is async anyway, the extra few > seconds shouldn't make any difference, right? We do often recommend that, though it occurs to me that this just moves the failure case somewhere else. The hypothetical mail-sending process would presumably want to send mail and then delete the associated record from the table of pending mails ... so what if it fails after sending the mail and before committing the delete? What this does do for you is replace the risk of phantom emails (mail sent but corresponding action inside the database never committed) with the risk of duplicate emails (mail-sender sends you another one after it restarts). In most cases I think I'd prefer the latter. regards, tom lane
On Fri, Apr 08, 2005 at 11:35:47AM -0400, Tom Lane wrote: > What this does do for you is replace the risk of phantom emails (mail > sent but corresponding action inside the database never committed) > with the risk of duplicate emails (mail-sender sends you another one > after it restarts). In most cases I think I'd prefer the latter. Me too. Besides, you already have this risk with SMTP, because a message can be queued and accepted on the remote side when the local side goes away, so that the session is completed improperly. Depending on configuration and a bunch of painful start-up possibilities with the server, you might well get a duplicate copy of a mail transmitted later. (In the present age, given the remarkable quality of networks and mail servers everyone has, you almost never have this happen any more. But it's still strictly speaking possible.) A -- Andrew Sullivan | ajs@crankycanuck.ca The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun
Tom Lane wrote: > The hypothetical mail-sending process > would presumably want to send mail and then delete the associated record > from the table of pending mails ... so what if it fails after sending > the mail and before committing the delete? > > What this does do for you is replace the risk of phantom emails (mail > sent but corresponding action inside the database never committed) > with the risk of duplicate emails (mail-sender sends you another one > after it restarts). In most cases I think I'd prefer the latter. You have this possibility anyway. If a mailserver thinks it has failed to forward the message, it will resend. There is always a small window where the receiving mailserver might actually have received the message without the acknowledgement being logged by the sender. -- Richard Huxton Archonet Ltd
On Apr 8, 2005, at 10:59 AM, Andrew Sullivan wrote: > wakes up the mail-sender client with the NOTIFY; the NOTIFY and the > commit to the mail-it table only happen in that case if the > transaction commits. And since mail is async anyway, the extra few > seconds shouldn't make any difference, right? > I have a lot of processing that could benefit from this type of synchronization, except the fact that there's no Pg command to "wait until I get a notify message". You have to constantly poll to see if you got one, which negates a lot of the benefit of async notification to rarely run processes. Vivek Khera, Ph.D. +1-301-869-4449 x806
Vivek Khera <vivek@khera.org> writes: > I have a lot of processing that could benefit from this type of > synchronization, except the fact that there's no Pg command to "wait > until I get a notify message". This is a client library deficiency, not a problem with the backend or the protocol. In libpq it is actually possible to do it, but you have to select() or poll() on the socket for yourself, which is a tad ugly. OTOH, most apps that want to do that also want to wait on other sockets at the same time, so a cleaner-looking API wouldn't necessarily be any more useful. regards, tom lane
I think I sent my previous message to John only (sorry!) I just wanted to double check one detail that is not explicitly stated in the documentation for createlang. My question is: can I use createlang on a database that is currently active? That is, a database with plenty of tables that has been and is currently in use? My guess is that there should be no problem and no risk in doing that -- but being my first steps in PL, I wouldn't like to trust a beginner's intuition for a production system. Thanks, Carlos --
Carlos Moreno <moreno@mochima.com> writes: > My question is: can I use createlang on a database that is > currently active? That is, a database with plenty of tables > that has been and is currently in use? Sure. regards, tom lane