Thread: Question on triggers and plpgsql

Question on triggers and plpgsql

From
Carlos Moreno
Date:
Hello,

A question from a complete newbie on this area.

I'm trying to implement a mechanism that would allow me to
keep track of the last time each row of a table was modified.

I have many applications modifying the data, and I would
like to avoid having to modify each of those applications
(with the risk of forgetting one of them).

So, I figured a better approach would be a trigger that
gets activated on update (before update, to be specific).

Below is what I came up with, but being the very first time
I do (or even read about) something with triggers or with
plpgsql, I'd like to check if there are any obvious red
flags, or if what I'm doing is hopelessly wrong.

I added a column last_modified (timestamp data type), and
create the following function:

create function set_last_modified() returns trigger as '
begin
new.last_modified = now();
return new;
end;
' language plpgsql;

(this is similar to an example from the PG documentation;
I'm not sure the keyword "new" is the right thing to use
in my case, but it would look like it's a standard way to
refer to the "new row" that is about to replace the old
one)

Then, I created the trigger as follows:

create trigger last_modified_on_update
before update on table_name
for each row
execute procedure set_last_modified();


The thing seems to work -- I had to go in a shell as user
postgres and execute the command:

$ createlang -d dbname plpgsql

(I'm not sure I understand why that is necessary, or
what implications -- positive or negative -- it may have)

Am I doing the right thing?  Have I introduced some sort
of catastrophe waiting to happen?

Thanks for any guidance you may offer to this PL/PGSQL
beginner!

Carlos
--



Re: Question on triggers and plpgsql

From
John DeSoi
Date:
On Apr 7, 2005, at 5:45 PM, Carlos Moreno wrote:

> The thing seems to work -- I had to go in a shell as user
> postgres and execute the command:
>
> $ createlang -d dbname plpgsql
>
> (I'm not sure I understand why that is necessary, or
> what implications -- positive or negative -- it may have)

As a security measure, no pl language is available by default. What you 
did is correct. There is not much (any?) risk with pl/pgsql, so you can 
install it in template1 so it will be available in any new database you 
create.

>
> Am I doing the right thing?  Have I introduced some sort
> of catastrophe waiting to happen?

I did not notice any problems.


John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL



Re: Question on triggers and plpgsql

From
Sean Davis
Date:
On Apr 8, 2005, at 8:28 AM, John DeSoi wrote:

>
> On Apr 7, 2005, at 5:45 PM, Carlos Moreno wrote:
>
>> The thing seems to work -- I had to go in a shell as user
>> postgres and execute the command:
>>
>> $ createlang -d dbname plpgsql
>>
>> (I'm not sure I understand why that is necessary, or
>> what implications -- positive or negative -- it may have)
>
> As a security measure, no pl language is available by default. What 
> you did is correct. There is not much (any?) risk with pl/pgsql, so 
> you can install it in template1 so it will be available in any new 
> database you create.
>
>>
>> Am I doing the right thing?  Have I introduced some sort
>> of catastrophe waiting to happen?
>
> I did not notice any problems.
>

Just one detail, but in the form of a question. In the original 
posting, I think the trigger was doing the logging for something 
happening on a table as a before insert or update--I may be wrong on 
that detail.  I would think of doing such actions AFTER the 
update/insert.  In the world of transaction-safe operations, is there 
ANY danger in doing the logging as a BEFORE trigger rather than an 
AFTER trigger?

Thanks,
Sean



Re: Question on triggers and plpgsql

From
John DeSoi
Date:
On Apr 8, 2005, at 9:41 AM, Sean Davis wrote:

> Just one detail, but in the form of a question. In the original 
> posting, I think the trigger was doing the logging for something 
> happening on a table as a before insert or update--I may be wrong on 
> that detail.  I would think of doing such actions AFTER the 
> update/insert.  In the world of transaction-safe operations, is there 
> ANY danger in doing the logging as a BEFORE trigger rather than an 
> AFTER trigger?


Good point. I think both will work in this case and it would depend on 
the application if it makes a difference. You definitely want an AFTER 
trigger if you need to see the final state of the row before making 
changes. In this case the assignment of the column does not depend on 
any other factors so it would not seem to matter. But I agree from a 
semantics point of view, an AFTER trigger might be a little better for 
this.

John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL



Re: Question on triggers and plpgsql

From
Tom Lane
Date:
Sean Davis <sdavis2@mail.nih.gov> writes:
> Just one detail, but in the form of a question. In the original 
> posting, I think the trigger was doing the logging for something 
> happening on a table as a before insert or update--I may be wrong on 
> that detail.  I would think of doing such actions AFTER the 
> update/insert.  In the world of transaction-safe operations, is there 
> ANY danger in doing the logging as a BEFORE trigger rather than an 
> AFTER trigger?

No, actually Carlos wanted to donew.last_modified = now();
so he *must* use a BEFORE trigger --- AFTER is too late to change the
data that will be stored.

Generalizing freely, I've seen three basic uses for triggers:1. Modify the data that will be stored.2. Check that data
isvalid (eg, consistent with another table).3. Propagate updates in one place to other places.
 
Clearly #1 must be done in BEFORE triggers.  #2 and #3 could be done
either way.  They are often done in AFTER triggers because that way you
*know* that any case-1 triggers have done their work and you are looking
at the correct final state of the row.  But you could do them in a
BEFORE trigger if you were willing to assume that no later-fired trigger
would make a change that invalidates your check or propagation.  AFTER
triggers are relatively expensive (since the triggering event state has
to be saved and then recalled) so I could see making that tradeoff if
performance is critical.

AFAICS the only way that you could get into a can't-roll-back situation
is if the trigger tries to propagate the update outside the database.
For instance, the proverbial trigger to send mail: once sent you can't
cancel it.  But really this is dangerous even in an AFTER trigger ---
the transaction could still be rolled back after the AFTER trigger
fires.
        regards, tom lane


Re: Question on triggers and plpgsql

From
Andrew Sullivan
Date:
On Fri, Apr 08, 2005 at 10:36:26AM -0400, Tom Lane wrote:

> AFAICS the only way that you could get into a can't-roll-back situation
> is if the trigger tries to propagate the update outside the database.
> For instance, the proverbial trigger to send mail: once sent you can't
> cancel it.  But really this is dangerous even in an AFTER trigger ---
> the transaction could still be rolled back after the AFTER trigger
> fires.

People who know more about this will no doubt correct me, but isn't
such a case crying out for LISTEN/NOTIFY instead?  That is, your
trigger puts the mail content into a table of mails to be sent, and
wakes up the mail-sender client with the NOTIFY; the NOTIFY and the
commit to the mail-it table only happen in that case if the
transaction commits.  And since mail is async anyway, the extra few
seconds shouldn't make any difference, right?

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
The plural of anecdote is not data.    --Roger Brinner


Re: Question on triggers and plpgsql

From
John DeSoi
Date:
Tom,

Thanks for setting the record straight. It has been a while since I 
have written a trigger and I forgot that you can't modify the row in 
the AFTER trigger. Makes perfect sense.

For the record, here is what the docs say:

Typically, row before triggers are used for checking or modifying the 
data that will be inserted or updated. For example, a before trigger 
might be used to insert the current time into a timestamp column, or to 
check that two elements of the row are consistent. Row after triggers 
are most sensibly used to propagate the updates to other tables, or 
make consistency checks against other tables. The reason for this 
division of labor is that an after trigger can be certain it is seeing 
the final value of the row, while a before trigger cannot; there might 
be other before triggers firing after it. If you have no specific 
reason to make a trigger before or after, the before case is more 
efficient, since the information about the operation doesn't have to be 
saved until end of statement.

It might be worth adding a sentence here that explicitly states 
modifications can only be made in the BEFORE trigger. I did not see 
that anywhere else in the document.

On Apr 8, 2005, at 10:36 AM, Tom Lane wrote:

> No, actually Carlos wanted to do
>     new.last_modified = now();
> so he *must* use a BEFORE trigger --- AFTER is too late to change the
> data that will be stored.
>
> Generalizing freely, I've seen three basic uses for triggers:
>     1. Modify the data that will be stored.
>     2. Check that data is valid (eg, consistent with another table).
>     3. Propagate updates in one place to other places.
> Clearly #1 must be done in BEFORE triggers.  #2 and #3 could be done
> either way.  They are often done in AFTER triggers because that way you
> *know* that any case-1 triggers have done their work and you are 
> looking
> at the correct final state of the row.  But you could do them in a
> BEFORE trigger if you were willing to assume that no later-fired 
> trigger
> would make a change that invalidates your check or propagation.  AFTER
> triggers are relatively expensive (since the triggering event state has
> to be saved and then recalled) so I could see making that tradeoff if
> performance is critical.


John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL



Re: Question on triggers and plpgsql

From
Tom Lane
Date:
Andrew Sullivan <ajs@crankycanuck.ca> writes:
> On Fri, Apr 08, 2005 at 10:36:26AM -0400, Tom Lane wrote:
>> AFAICS the only way that you could get into a can't-roll-back situation
>> is if the trigger tries to propagate the update outside the database.
>> For instance, the proverbial trigger to send mail: once sent you can't
>> cancel it.  But really this is dangerous even in an AFTER trigger ---
>> the transaction could still be rolled back after the AFTER trigger
>> fires.

> People who know more about this will no doubt correct me, but isn't
> such a case crying out for LISTEN/NOTIFY instead?  That is, your
> trigger puts the mail content into a table of mails to be sent, and
> wakes up the mail-sender client with the NOTIFY; the NOTIFY and the
> commit to the mail-it table only happen in that case if the
> transaction commits.  And since mail is async anyway, the extra few
> seconds shouldn't make any difference, right?

We do often recommend that, though it occurs to me that this just moves
the failure case somewhere else.  The hypothetical mail-sending process
would presumably want to send mail and then delete the associated record
from the table of pending mails ... so what if it fails after sending
the mail and before committing the delete?

What this does do for you is replace the risk of phantom emails (mail
sent but corresponding action inside the database never committed)
with the risk of duplicate emails (mail-sender sends you another one
after it restarts).  In most cases I think I'd prefer the latter.
        regards, tom lane


Re: Question on triggers and plpgsql

From
Andrew Sullivan
Date:
On Fri, Apr 08, 2005 at 11:35:47AM -0400, Tom Lane wrote:
> What this does do for you is replace the risk of phantom emails (mail
> sent but corresponding action inside the database never committed)
> with the risk of duplicate emails (mail-sender sends you another one
> after it restarts).  In most cases I think I'd prefer the latter.

Me too.  Besides, you already have this risk with SMTP, because a
message can be queued and accepted on the remote side when the local
side goes away, so that the session is completed improperly. 
Depending on configuration and a bunch of painful start-up
possibilities with the server, you might well get a duplicate copy of
a mail transmitted later.  (In the present age, given the remarkable
quality of networks and mail servers everyone has, you almost never
have this happen any more.  But it's still strictly speaking
possible.)

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
The fact that technology doesn't work is no bar to success in the marketplace.    --Philip Greenspun


Re: Question on triggers and plpgsql

From
Richard Huxton
Date:
Tom Lane wrote:
>  The hypothetical mail-sending process
> would presumably want to send mail and then delete the associated record
> from the table of pending mails ... so what if it fails after sending
> the mail and before committing the delete?
> 
> What this does do for you is replace the risk of phantom emails (mail
> sent but corresponding action inside the database never committed)
> with the risk of duplicate emails (mail-sender sends you another one
> after it restarts).  In most cases I think I'd prefer the latter.

You have this possibility anyway. If a mailserver thinks it has failed 
to forward the message, it will resend. There is always a small window 
where the receiving mailserver might actually have received the message 
without the acknowledgement being logged by the sender.

--  Richard Huxton  Archonet Ltd


Re: Question on triggers and plpgsql

From
Vivek Khera
Date:
On Apr 8, 2005, at 10:59 AM, Andrew Sullivan wrote:

> wakes up the mail-sender client with the NOTIFY; the NOTIFY and the
> commit to the mail-it table only happen in that case if the
> transaction commits.  And since mail is async anyway, the extra few
> seconds shouldn't make any difference, right?
>

I have a lot of processing that could benefit from this type of 
synchronization, except the fact that there's no Pg command to "wait 
until I get a notify message".  You have to constantly poll to see if 
you got one, which negates a lot of the benefit of async notification 
to rarely run processes.

Vivek Khera, Ph.D.
+1-301-869-4449 x806


Re: Question on triggers and plpgsql

From
Tom Lane
Date:
Vivek Khera <vivek@khera.org> writes:
> I have a lot of processing that could benefit from this type of 
> synchronization, except the fact that there's no Pg command to "wait 
> until I get a notify message".

This is a client library deficiency, not a problem with the backend or
the protocol.  In libpq it is actually possible to do it, but you have
to select() or poll() on the socket for yourself, which is a tad ugly.

OTOH, most apps that want to do that also want to wait on other sockets
at the same time, so a cleaner-looking API wouldn't necessarily be any
more useful.
        regards, tom lane


Re: Question on triggers and plpgsql

From
Carlos Moreno
Date:
I think I sent my previous message to John only  (sorry!)


I just wanted to double check one detail that is not explicitly
stated in the documentation for createlang.

My question is:  can I use createlang on a database that is
currently active?  That is, a database with plenty of tables
that has been and is currently in use?

My guess is that there should be no problem and no risk in
doing that -- but being my first steps in PL, I wouldn't
like to trust a beginner's intuition for a production
system.

Thanks,

Carlos
--


Re: Question on triggers and plpgsql

From
Tom Lane
Date:
Carlos Moreno <moreno@mochima.com> writes:
> My question is:  can I use createlang on a database that is
> currently active?  That is, a database with plenty of tables
> that has been and is currently in use?

Sure.
        regards, tom lane