Greg Sabino Mullane wrote:
> A few things spring to mind:
>
> 1) Use a separate table, rather than storing things inside of
> dataset itself. This will reduce the activity on the dataset table.
A separate table just for that one column? Would that really help,
given that I'd have to add the foreign key dataset_id to the related
table? How does splitting activity across dataset and, say,
dataset_counts help things?
> 2) Do you really need bigint for the counts?
Probably not. Still, changing to INTEGER hasn't changed the performance
in any measurable way.
> 3) If you do want to do this, you'll need a different approach as
> Tom mentioned. One way to do this is to have a special method for
> bulk loading, that gets around the normal updates and requires that
> the user take responsibility for knowing when and how to call the
> alternate path. The basic scheme is this:
>
> 1. Disable the normal triggers
> 2. Enable special (perl) triggers that keep the count in memory
> 3. Do the bulk changes
> 4. Enable normal triggers, disable special perl one
> 5. Do other things as needed....
> 6. Commit the changes to the assoc_count field.
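For anyone following along, here's roughly how I read the scheme in SQL. All the names (assoc, dataset.assoc_count, the trigger names, the PL/Perl function) are my guesses based on this thread, not anything from an actual schema:

```sql
-- 1. Disable the normal per-row counting trigger:
ALTER TABLE assoc DISABLE TRIGGER assoc_count_trigger;

-- 2. Enable a cheap replacement that only tallies in memory.
--    A PL/Perl trigger can keep the running count in %_SHARED,
--    which persists for the life of the session:
CREATE OR REPLACE FUNCTION bulk_count_assoc() RETURNS trigger AS $$
    $_SHARED{count}++;
    return;   -- return value is ignored for AFTER triggers
$$ LANGUAGE plperl;

CREATE TRIGGER assoc_bulk_count
    AFTER INSERT ON assoc
    FOR EACH ROW EXECUTE PROCEDURE bulk_count_assoc();

-- 3. Do the bulk changes (COPY, INSERT ... SELECT, etc.)

-- 4. Swap the triggers back:
DROP TRIGGER assoc_bulk_count ON assoc;
ALTER TABLE assoc ENABLE TRIGGER assoc_count_trigger;

-- 5. Do other things as needed...

-- 6. Commit the tally to the real column via another PL/Perl
--    function that reads $_SHARED{count} and issues the UPDATE
--    against dataset.assoc_count.
```

The key point, as I understand it, is that steps 2–4 avoid one UPDATE of the dataset row per inserted assoc row; the count only touches the table once, in step 6.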
I gave this a shot, and my initial testing looks very promising. Using
your scheme, the performance looks to be just as good as without the
trigger.
I haven't done any kind of edge case testing to see if weird things
happen when multiple datasets are added simultaneously, or when someone
inserts an assoc record out-of-band while a bulk dataset load is
happening, but you've certainly got me well on my way to a workable
solution. Many thanks!
There's one part I don't get, though...
> Number 6 can be done anytime, as long as you are in the same session. The danger
> is in leaving the session without calling the final function. This can be
> solved with some deferred FK trickery, or by careful scripting of the events.
> All this doesn't completely remove the pain, but it may shift it around enough
> in useful ways for your app.
I'm not sure I understand the danger you're talking about here. Doesn't
putting the whole start_bulkload_assoc_count(), bulk insert, and
end_bulkload_assoc_count() process in a transaction save me from any of
these problems? Or is there some more subtle problem I'm not seeing?
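Concretely, what I'm picturing is just this (the COPY is only a stand-in for whatever the actual bulk insert is):

```sql
BEGIN;
SELECT start_bulkload_assoc_count();
COPY assoc FROM '/tmp/assoc.dat';   -- the bulk load itself
SELECT end_bulkload_assoc_count();  -- flushes the tally to assoc_count
COMMIT;
```

If the script dies anywhere in the middle, the whole thing rolls back, triggers and all, so I don't see where an inconsistent count could sneak in.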