Re: Why is the number of dead tuples causing the performance of deferred triggers to degrade so rapidly (exponentionally)? - Mailing list pgsql-performance

From Frank van Vugt
Subject Re: Why is the number of dead tuples causing the performance of deferred triggers to degrade so rapidly (exponentionally)?
Date
Msg-id 200408181124.56823.ftm.van.vugt@foxi.nl
Whole thread Raw
In response to Re: Why is the number of dead tuples causing the performance of deferred triggers to degrading so rapidly (exponentionally)?  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Why is the number of dead tuples causing the performance  (DeJuan Jackson <djackson@speedfc.com>)
List pgsql-performance
Hi Josh,

> > It seems in this case the time needed for a single deferred trigger
> > somehow depends on the number of dead tuples in the table

After further investigation I think I have a better grasp of what's going on.

The thing biting me here is indeed the 'delete from' on a table with a number
of dead rows, possibly made worse in some cases where not everything can be
handled in memory.

> I'm not clear on all of the work you're doing in the trigger.
> > NB. My real-world application 'collects' id's in need for deferred work
> I think you're doing a lot more than is wise to do in triggers.

I probably wasn't clear enough on this. I'm not creating types and/or
temporary tables or anything of that kind.

The ratio is probably explained better by this example:

- the database has knowledge on 'parts' and 'sets', the sets have a few fields
whose content depend on the parts, but the proper value for these fields can
only be determined by looking at all the parts of the particular set together
(i.e. it's not a plain 'part-count' that one could update by a trigger on the
part)

- during a transaction, a number of things will happen to various parts of
various sets, so I have after triggers on the parts that will insert the ids
of the sets that need an update into a set_update holding table; in turn,
this set_update table has a deferred trigger

- upon execution of the deferred triggers, I now know that all the work on the
parts is finished, so the deferred trigger initiates an update for the sets
whose ids are in the update table and it will delete these ids afterwards

Now, because multiple updates to parts of the same set will result in multiple
inserts in the update table, I want to avoid doing the set-update more that
once.

Obviously, it would be better to be able to 'cancel' the rest of the calls to
the deferred trigger after it has been executed for the first time, but that
doesn't seem possible.

Even better would be to use a 'for each statement' trigger on the set_update
holding table instead, but it is not possible to create a deferred 'for each
statement' trigger..... ;(

So, I seem to be a bit between a rock and a hard place here, I must use
deferred triggers in order to avoid a costly set update on each part update,
but in such a deferred trigger I cannot avoid doing the update multiple
times....(due to the growing cost of a 'delete from' in the trigger)

Mmm, it seems that by hacking pg_trigger I am able to create a for each
statement trigger that is 'deferrable initially deferred'.

This probably solves my problem, I will ask on 'general' whether this has any
unforseen side effects and whether or not a 'regular' deferrable for each
statement trigger is incorporated in v8.0.

Thanks for you reply!



--
Best,




Frank.


pgsql-performance by date:

Previous
From: Grega Bremec
Date:
Subject: Re: high load caused by I/O - a hint
Next
From: Gaetano Mendola
Date:
Subject: Re: high load caused by I/O - a hint