Re: Performance problem with row count trigger - Mailing list pgsql-sql

From Tony Cebzanov
Subject Re: Performance problem with row count trigger
Date
Msg-id 49D653E8.7040008@andrew.cmu.edu
Whole thread Raw
In response to Re: Performance problem with row count trigger  ("Greg Sabino Mullane" <greg@turnstep.com>)
Responses Re: Performance problem with row count trigger
List pgsql-sql
Greg Sabino Mullane wrote:
> A few things spring to mind:
> 
> 1) Use a separate table, rather than storing things inside of
> dataset itself. This will reduce the activity on the dataset table.

A separate table just for that one column?  Would that really help,
given that I'd have to add the foreign key dataset_id to the related
table?  How does splitting activity across dataset and, say,
dataset_counts help things?

> 2) Do you really need bigint for the counts?

Probably not.  Still, changing to INTEGER hasn't changed the performance
in any measurable way.

> 3) If you do want to do this, you'll need a different approach as
> Tom mentioned. One way to do this is to have a special method for
> bulk loading, that gets around the normal updates and requires that
> the user take responsiblity for knowing when and how to call the
> alternate path. The basic scheme is this:
> 
> 1. Disable the normal triggers
> 2. Enable special (perl) triggers that keep the count in memory
> 3. Do the bulk changes
> 4. Enable normal triggers, disable special perl one
> 5. Do other things as needed....
> 6. Commit the changes to the assoc_count field.

I gave this a shot, and my initial testing looks very promising.  Using
your scheme, the performance looks to be just as good as without the
trigger.

I haven't done any kind of edge case testing to see if weird things
happen when multiple datasets are added simultaneously, or when someone
inserts an assoc record out-of-band while a bulk dataset load is
happening, but you've certainly got me well on my way to a workable
solution.  Many thanks!

There's one part I don't get, though...

> Number 6 can be done anytime, as long as you are in the same session. The danger
> is in leaving the session without calling the final function. This can be
> solved with some deferred FK trickery, or by careful scripting of the events.
> All this doesn't completely remove the pain, but it may shift it around enough
> in useful ways for your app.

I'm not sure I understand the danger you're talking about here.  Doesn't
putting the whole start_bulkload_assoc_count(), bulk insert, and
end_bulkload_assoc_count() process in a transaction save me from any of
these problems?  Or is there some more subtle problem I'm not seeing?


pgsql-sql by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: FUNCTION problem
Next
From: Peter Willis
Date:
Subject: Re: FUNCTION problem