Thread: Coping with huge deferred-trigger lists

Coping with huge deferred-trigger lists

From
Tom Lane
Date:
I had a thought just now about how to deal with the TODO item about
coping with deferred trigger lists that are so long as to overrun
main memory.  This might be a bit harebrained, but I offer it for
consideration:

What we need to do, at the end of a transaction in which deferred
triggers were fired, is to find each tuple that was inserted or
updated in the current transaction in each table that has such
triggers.  Well, we know where those tuples are: to a first
approximation, they're all near the end of the table.  Perhaps instead
of storing each and every trigger-related tuple in memory, we only need
to store one value per affected table: the lowest CTID of any tuple
that we need to revisit for deferred-trigger purposes.  At the end of
the transaction, scan forward from that point to the end of the table,
looking for tuples that were inserted by the current xact.  Process each
one using the table's list of deferred triggers.
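
To make the proposal concrete, here is a minimal sketch in Python of the bookkeeping described above. It is a toy model, not PostgreSQL code: `HeapTuple`, `DeferredScanState`, and `fire_deferred` are hypothetical names, a table is modelled as a list already ordered by CTID, and a CTID is assumed to be a `(block, offset)` pair.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Ctid = Tuple[int, int]  # assumed layout: (block number, offset within block)


@dataclass
class HeapTuple:
    ctid: Ctid
    xmin: int             # xact that inserted this tuple version
    xmax: Optional[int]   # xact that deleted it, or None
    data: dict = field(default_factory=dict)


class DeferredScanState:
    """Per-table state: only the lowest CTID touched, not every tuple."""

    def __init__(self) -> None:
        self.lowest_ctid: Dict[str, Ctid] = {}

    def note_tuple(self, table: str, ctid: Ctid) -> None:
        # Called whenever a tuple subject to a deferred trigger is written.
        cur = self.lowest_ctid.get(table)
        if cur is None or ctid < cur:
            self.lowest_ctid[table] = ctid


def fire_deferred(
    state: DeferredScanState,
    heaps: Dict[str, List[HeapTuple]],
    triggers: Dict[str, List[Callable[[HeapTuple], None]]],
    current_xid: int,
) -> int:
    """At end of xact, scan each table forward from its lowest noted CTID."""
    fired = 0
    for table, start in state.lowest_ctid.items():
        for tup in heaps[table]:          # heap assumed ordered by CTID
            if tup.ctid < start:
                continue                  # before the first tuple we touched
            if tup.xmin != current_xid:
                continue                  # inserted by some other xact
            for trig in triggers.get(table, []):
                trig(tup)
                fired += 1
    return fired
```

The memory cost is one dictionary entry per affected table, regardless of how many tuples were written. The price is the forward scan, which also visits tuples written by other transactions in the same region of the table.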

Instead of a list of all tuples subject to deferred triggers, we now
need only a list of all tables subject to deferred triggers, which
should pose no problems for memory consumption.  It might be objected
that this means more disk activity --- but in an xact that hasn't
inserted very many tuples, most likely the disk blocks containing 'em
are still in memory and won't need a physical re-read.  Once we get to
inserting so many tuples that that's not true, this approach should
require less disk activity overall than the previous idea of writing
(and re-reading) a separate disk file for the tuple list.

I am not sure exactly what the "triggered data change violation" test
does or is good for, but if we want to keep it, I *think* that in these
terms we'd just need to signal error if we come across a tuple that was
both inserted and deleted by the current xact.  I'm a bit fuzzy on this
though.
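
A rough sketch of that check in Python, under the same tentative reading as above (a tuple both inserted and deleted by the current xact is an error). The names `TupleVersion`, `TriggeredDataChangeViolation`, and `check_end_of_xact` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


class TriggeredDataChangeViolation(Exception):
    """Raised when a tuple was both inserted and deleted by one xact."""


@dataclass
class TupleVersion:
    xmin: int             # xact that inserted this tuple version
    xmax: Optional[int]   # xact that deleted it, or None


def check_end_of_xact(versions: Iterable[TupleVersion], current_xid: int) -> None:
    # Signal an error for any version the current xact both created and removed.
    for v in versions:
        if v.xmin == current_xid and v.xmax == current_xid:
            raise TriggeredDataChangeViolation(
                f"tuple inserted and deleted by xact {current_xid}"
            )
```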

An interesting property of this approach is that if the set of triggers
for the table changes during the xact (which could only happen if this
same xact created or deleted triggers; no other xact can, since changing
triggers requires an exclusive lock on the table), the set of triggers
applied to a tuple is the set that exists at the end of the xact, not
the set that existed when the tuple was modified.  Offhand I think this
is a good change.

Comments?
        regards, tom lane


Re: Coping with huge deferred-trigger lists

From
Jan Wieck
Date:
Tom Lane wrote:
> I had a thought just now about how to deal with the TODO item about
> coping with deferred trigger lists that are so long as to overrun
> main memory.  This might be a bit harebrained, but I offer it for
> consideration:
>
> What we need to do, at the end of a transaction in which deferred
> triggers were fired, is to find each tuple that was inserted or
> updated in the current transaction in each table that has such
> triggers.  Well, we know where those tuples are: to a first
> approximation, they're all near the end of the table.  Perhaps instead
> of storing each and every trigger-related tuple in memory, we only need
> to store one value per affected table: the lowest CTID of any tuple
> that we need to revisit for deferred-trigger purposes.  At the end of
> the transaction, scan forward from that point to the end of the table,
> looking for tuples that were inserted by the current xact.  Process each
> one using the table's list of deferred triggers.
>
> Instead of a list of all tuples subject to deferred triggers, we now
> need only a list of all tables subject to deferred triggers, which
> should pose no problems for memory consumption.  It might be objected
> that this means more disk activity --- but in an xact that hasn't
> inserted very many tuples, most likely the disk blocks containing 'em
> are still in memory and won't need a physical re-read.  Once we get to
> inserting so many tuples that that's not true, this approach should
> require less disk activity overall than the previous idea of writing
> (and re-reading) a separate disk file for the tuple list.
>
> I am not sure exactly what the "triggered data change violation" test
> does or is good for, but if we want to keep it, I *think* that in these
> terms we'd just need to signal error if we come across a tuple that was
> both inserted and deleted by the current xact.  I'm a bit fuzzy on this
> though.
    The check came from my possibly wrong understanding of the SQL3
    specs. The idea I had is that the SUMMARY of all changes during a
    transaction counts. If you INSERT a row into a table and have
    immediate triggers invoked, a later DELETE cannot undo the
    triggers. So the question is: did this row ever exist?
 

>
> An interesting property of this approach is that if the set of triggers
> for the table changes during the xact (which could only happen if this
> same xact created or deleted triggers; no other xact can, since changing
> triggers requires an exclusive lock on the table), the set of triggers
> applied to a tuple is the set that exists at the end of the xact, not
> the set that existed when the tuple was modified.  Offhand I think this
> is a good change.
>
> Comments?
    Suppose you have two separate, named, deferred constraints on one
    table. Now, after a couple of INSERTs and UPDATEs, you SET one of
    them to IMMEDIATE and back to DEFERRED. This has to run the
    triggers for one and only one of the constraints now. If you don't
    worry about the need of running the checks again later, it's OK.
 
    The detail I'm wondering about most is how you'd know in an UPDATE
    case which two tuples (one deleted during this XACT and one
    inserted) are the two for OLD and NEW in the call to the trigger.
    Note that the referential action ON UPDATE SET NULL, for example,
    doesn't have to take place if the user didn't change the
    referenced key fields.
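
The pairing problem can be illustrated with a small Python sketch. It is a toy model that records only an inserting and a deleting xact per tuple version, with no link between versions; `Version`, `candidates`, `possible_pairings`, and `key_changed` are hypothetical names. When two rows are updated in one xact, the scan finds two dead versions and two new ones, and more than one OLD/NEW pairing fits what was found.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Version:
    ctid: Tuple[int, int]
    xmin: int             # xact that inserted this version
    xmax: Optional[int]   # xact that deleted it, or None
    key: int              # referenced key column


def candidates(heap: List[Version], xid: int):
    # OLD candidates: pre-existing versions deleted by this xact.
    old = [v for v in heap if v.xmax == xid and v.xmin != xid]
    # NEW candidates: live versions inserted by this xact.
    new = [v for v in heap if v.xmin == xid and v.xmax is None]
    return old, new


def possible_pairings(old: List[Version], new: List[Version]):
    # Every assignment of a distinct NEW version to each OLD version.
    return [list(zip(old, p)) for p in permutations(new, len(old))]


def key_changed(pairing) -> bool:
    # Would a referential action such as ON UPDATE SET NULL need to run?
    return any(o.key != n.key for o, n in pairing)


heap = [
    Version((0, 1), 90, 100, key=1),
    Version((0, 2), 90, 100, key=2),
    Version((5, 1), 100, None, key=1),
    Version((5, 2), 100, None, key=2),
]
old, new = candidates(heap, 100)
for pairing in possible_pairings(old, new):
    print([(o.key, n.key) for o, n in pairing], key_changed(pairing))
```

In this example one pairing says no key changed and the other says both did, so the scan alone cannot tell whether the referential action has to run.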
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #






Re: Coping with huge deferred-trigger lists

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
>     The  detail  I'm wondering about most is how you'd know in an
>     UPDATE case which two tuples (one deleted  during  this  XACT
>     and  one inserted) are the two for OLD and NEW in the call to
>     the trigger.

Ugh ... good point.  There's no back-link from the updated tuple to
its original on disk, is there?

Back to the drawing board ...
        regards, tom lane


Re: Coping with huge deferred-trigger lists

From
Jan Wieck
Date:
Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
> >     The  detail  I'm wondering about most is how you'd know in an
> >     UPDATE case which two tuples (one deleted  during  this  XACT
> >     and  one inserted) are the two for OLD and NEW in the call to
> >     the trigger.
>
> Ugh ... good point.  There's no back-link from the updated tuple to
> its original on disk, is there?
   AFAIK nothing other than the Oid. And that's IMHO a weak one.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


