Thread: Need advice about triggers

Need advice about triggers

From
"Mindaugas Riauba"
Date:
  Hello,

  I have small table (up to 10000 rows) and every row will be updated
once per minute. Table also has "before update on each row" trigger
written in plpgsql. But trigger 99.99% of the time will do nothing
to the database. It will just compare old and new values in the row
and those values almost always will be identical.

  Now I tried simple test and was able to do 10000 updates on 1000
rows table in ~30s. That's practically enough but I'd like to have
more room to slow down.
  Also best result I achieved by doing commit+vacuum every ~500
updates.

  How can I improve performance and will version 7.4 bring something
valuable for my task? Rewrite to some other scripting language is not
a problem. Trigger is simple enough.

  Postgres v7.3.4, shared_buffers=4096 max_fsm settings also bumped up
10 times.

  Thanks,

  Mindaugas


Re: Need advice about triggers

From
Richard Huxton
Date:
On Tuesday 09 September 2003 13:40, Mindaugas Riauba wrote:
>   Hello,
>
>   I have small table (up to 10000 rows) and every row will be updated
> once per minute. Table also has "before update on each row" trigger
> written in plpgsql. But trigger 99.99% of the time will do nothing
> to the database. It will just compare old and new values in the row
> and those values almost always will be identical.
>
>   Now I tried simple test and was able to do 10000 updates on 1000
> rows table in ~30s. That's practically enough but I'd like to have
> more room to slow down.
>   Also best result I achieved by doing commit+vacuum every ~500
> updates.
>
>   How can I improve performance and will version 7.4 bring something
> valuable for my task? Rewrite to some other scripting language is not
> a problem. Trigger is simple enough.

Well, try it without the trigger. If performance improves markedly, it might
be worth rewriting in C.

If not, you're probably saturating the disk I/O - using iostat/vmstat will let
you see what's happening. If it is your disks, you might see if moving the
WAL onto a separate drive would help, or check the archives for plenty of
discussion about raid setups.

>   Postgres v7.3.4, shared_buffers=4096 max_fsm settings also bumped up
> 10 times.

Well effective_cache_size is useful for reads, but won't help with writing.
You might want to look at wal_buffers and see if increasing that helps, but I
couldn't say for sure.
--
  Richard Huxton
  Archonet Ltd

Re: Need advice about triggers

From
Rod Taylor
Date:
>   How can I improve performance and will version 7.4 bring something
> valuable for my task? Rewrite to some other scripting language is not
> a problem. Trigger is simple enough.

Your best bet is to have additional clients connected to the database
requesting work. Approx NUMCPUs * 2 + 1 seems to be ideal. (+1 to ensure
there is something waiting when the others complete.  *2 to ensure that
you can have 50% reading from disk, 50% doing calculations)

You may simply want to put vacuum into a loop of it's own so it executes
~1 second after the previous run finished.  Work should still be going
on even though vacuum is running.

Attachment

Re: Need advice about triggers

From
"Mindaugas Riauba"
Date:
> >   How can I improve performance and will version 7.4 bring something
> > valuable for my task? Rewrite to some other scripting language is not
> > a problem. Trigger is simple enough.
>
> Well, try it without the trigger. If performance improves markedly, it
might
> be worth rewriting in C.

  Nope. Execution time is practically the same without trigger.

> If not, you're probably saturating the disk I/O - using iostat/vmstat will
let
> you see what's happening. If it is your disks, you might see if moving the
> WAL onto a separate drive would help, or check the archives for plenty of
> discussion about raid setups.

  Bottleneck in this case is CPU. postmaster process uses almost 100% of
CPU.

> >   Postgres v7.3.4, shared_buffers=4096 max_fsm settings also bumped up
> > 10 times.
> Well effective_cache_size is useful for reads, but won't help with
writing.
> You might want to look at wal_buffers and see if increasing that helps,
but I
> couldn't say for sure.

  Disk I/O should not be a problem in this case. vmstat shows ~300kb/s write
activity.

  Mindaugas


Re: Need advice about triggers

From
"scott.marlowe"
Date:
On Tue, 9 Sep 2003, Mindaugas Riauba wrote:

>
>   Hello,
>
>   I have small table (up to 10000 rows) and every row will be updated
> once per minute. Table also has "before update on each row" trigger
> written in plpgsql. But trigger 99.99% of the time will do nothing
> to the database. It will just compare old and new values in the row
> and those values almost always will be identical.

If the rows aren't going to actually change all that often, perhaps you
could program your trigger to just silently drop the update, i.e. only
change the rows that need updating and ignore the rest?  That should speed
things up.  Unless I'm misunderstanding your needs here.


Re: Need advice about triggers

From
Tom Lane
Date:
"Mindaugas Riauba" <mind@bi.lt> writes:
>> Well, try it without the trigger. If performance improves markedly, it
>> might be worth rewriting in C.

>   Nope. Execution time is practically the same without trigger.

>> If not, you're probably saturating the disk I/O -

>   Bottleneck in this case is CPU. postmaster process uses almost 100% of
> CPU.

That seems very odd.  Updates should be I/O intensive, not CPU
intensive.  I wouldn't have been surprised to hear of a plpgsql trigger
consuming lots of CPU, but without it, I'm not sure where the time is
going.  Can you show us an EXPLAIN ANALYZE result for a typical update
command?

            regards, tom lane

Re: Need advice about triggers

From
Richard Huxton
Date:
On Tuesday 09 September 2003 14:33, Mindaugas Riauba wrote:
> > Well, try it without the trigger. If performance improves markedly, it
> might
> > be worth rewriting in C.
>
>   Nope. Execution time is practically the same without trigger.

OK - no point in rewriting it then.

> > If not, you're probably saturating the disk I/O - using iostat/vmstat
> > will
>
> let
>
> > you see what's happening. If it is your disks, you might see if moving
> > the WAL onto a separate drive would help, or check the archives for
> > plenty of discussion about raid setups.
>
>   Bottleneck in this case is CPU. postmaster process uses almost 100% of
> CPU.

>   Disk I/O should not be a problem in this case. vmstat shows ~300kb/s
> write activity.

Hmm - I must admit I wasn't expecting that. Closest I can get on my test
machine here: AMD 400MHz / 256MB / IDE disk / other stuff running is about 20
secs.

I've attached the perl script I used - what sort of timings does it give you?

--
  Richard Huxton
  Archonet Ltd

Attachment

Re: Need advice about triggers

From
Hannu Krosing
Date:
Mindaugas Riauba kirjutas T, 09.09.2003 kell 15:40:
>   Hello,
>
>   I have small table (up to 10000 rows) and every row will be updated
> once per minute. Table also has "before update on each row" trigger
> written in plpgsql. But trigger 99.99% of the time will do nothing
> to the database. It will just compare old and new values in the row
> and those values almost always will be identical.
>
>   Now I tried simple test and was able to do 10000 updates on 1000
> rows table in ~30s. That's practically enough but I'd like to have
> more room to slow down.

Is it 10000 *rows* or 10000*1000 = 10 000 000 *rows* updated ?

When I run a simple update 10 times on 1000 rows (with no trigger, which
you claim to take about the same time) it took 0.25 sec.

> Also best result I achieved by doing commit+vacuum every ~500
> updates.

It seems like you are updating more than one row at each update ?

---------------
Hannu


Re: Need advice about triggers

From
"Mindaugas Riauba"
Date:
> >> Well, try it without the trigger. If performance improves markedly, it
> >> might be worth rewriting in C.
>
> >   Nope. Execution time is practically the same without trigger.
>
> >> If not, you're probably saturating the disk I/O -
>
> >   Bottleneck in this case is CPU. postmaster process uses almost 100% of
> > CPU.
>
> That seems very odd.  Updates should be I/O intensive, not CPU
> intensive.  I wouldn't have been surprised to hear of a plpgsql trigger
> consuming lots of CPU, but without it, I'm not sure where the time is
> going.  Can you show us an EXPLAIN ANALYZE result for a typical update
> command?

  Two EXPLAIN ANALYZE below. One is before another is after REINDEX. It
seems
that REINDEX before updates helps. Time went down to ~17s. Also CPU is not
at
100%. vmstat output is below (machine is 2xCPU so 40% load means 80% on one
CPU).

  So the solution would be REINDEX before updates and VACUUM at the same
time?
Without REINDEX performance slowly degrades.

  Mindaugas


router_db=# explain analyze update ifdata set ifspeed=256000,
ifreason='12121', iflastupdate=CURRENT_TIMESTAMP WHERE clientid='#0003904#';
                                                         QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------
 Index Scan using ifdata_clientid_key on ifdata  (cost=0.00..5.64 rows=1
width=116) (actual time=0.17..0.36 rows=1 loops=1)
   Index Cond: (clientid = '#0003904#'::character varying)
 Total runtime: 1.70 msec
(3 rows)

router_db=# reindex table ifdata;
REINDEX
router_db=# explain analyze update ifdata set ifspeed=256000,
ifreason='12121', iflastupdate=CURRENT_TIMESTAMP WHERE clientid='#0003904#';
                                                         QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------
 Index Scan using ifdata_clientid_key on ifdata  (cost=0.00..5.65 rows=1
width=116) (actual time=0.06..0.07 rows=1 loops=1)
   Index Cond: (clientid = '#0003904#'::character varying)
 Total runtime: 0.47 msec
(3 rows)

----------------------------------------------------------------------------
---

   procs                      memory    swap          io     system
cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy
id
 0  0  0   5048  20616 273556 1614692   0   0     4     3    2     0   0   1
3
 0  0  0   5048  20612 273556 1614692   0   0     0     0  109     8   0   0
100
 0  0  0   5048  20612 273556 1614692   0   0     0   168  144    20   0   0
100
 1  0  0   5048  19420 273556 1614612   0   0     0   192  123  4120  35   2
63
 0  1  1   5048  19420 273572 1614652   0   0     0   672  144  4139  32   2
66
 1  0  0   5048  19420 273580 1614660   0   0     0   360  125  4279  33  12
55
 1  0  0   5048  19420 273580 1614724   0   0     0   272  119  5887  41   2
57
 1  0  0   5048  19420 273580 1614716   0   0     0   488  124  4871  40   1
59


Re: Need advice about triggers

From
Hannu Krosing
Date:
Mindaugas Riauba kirjutas K, 10.09.2003 kell 13:21:

>
> router_db=# explain analyze update ifdata set ifspeed=256000,
> ifreason='12121', iflastupdate=CURRENT_TIMESTAMP WHERE clientid='#0003904#';
>                                                          QUERY PLAN
> ----------------------------------------------------------------------------
> ------------------------------------------------
>  Index Scan using ifdata_clientid_key on ifdata  (cost=0.00..5.64 rows=1
> width=116) (actual time=0.17..0.36 rows=1 loops=1)
>    Index Cond: (clientid = '#0003904#'::character varying)
>  Total runtime: 1.70 msec
> (3 rows)

could you try the same query on similar table, where clientid is int4 ?

is it faster ?

does the performance degrade at a slower rate?

---------------
Hannu