Re: Speedier count(*)

From: Tino Wildenhain
Subject: Re: Speedier count(*)
Date:
Msg-id: 1123763640.27613.18.camel@Andrea.peacock.de
In response to: Re: Speedier count(*)  (PFC)
Responses: Re: Speedier count(*)  (Dan Harris)
List: pgsql-performance

Tree view

Speedier count(*)  (Dan Harris, )
 Re: Speedier count(*)  ("Joshua D. Drake", )
 Re: Speedier count(*)  (Michael Fuhr, )
 Re: Speedier count(*)  (John A Meinel, )
 Re: Speedier count(*)  (Gavin Sherry, )
  Re: Speedier count(*)  (Mark Cotner, )
   Re: Speedier count(*)  (Tino Wildenhain, )
    Re: Speedier count(*)  (Gavin Sherry, )
     Re: Speedier count(*)  (Tino Wildenhain, )
      Re: Speedier count(*)  (PFC, )
       Re: Speedier count(*)  (Tino Wildenhain, )
        Re: Speedier count(*)  (Dan Harris, )

On Thursday, 11.08.2005 at 14:08 +0200, PFC wrote:
>
> > You could lock the count table to prevent the problem
> > where 2 competing transactions do an insert, read the
> > start value and add 1 to it and then write the result
> > - which is n+1 rather than n+2 - so you are off by one.
> > Think of the same when one transaction inserts 100
> > and the other 120. Then you could even be off by 100.
>
>     Niet.
>
>     If your trigger does UPDATE counts_cache SET cached_count =
> cached_count+N WHERE ...
>     Then all locking is taken care of by Postgres.
>     Of course if you use 2 queries then you have locking issues.

Yes, if you use just the single UPDATE statement you are right; that
does the locking I was talking about.

In either case I'd use an AFTER trigger rather than a BEFORE trigger,
to minimize the impact.
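
For illustration, here is a minimal sketch of that approach in PL/pgSQL.
The counts_cache name comes from this thread; the trigger function, the
trigger and the target table big_table are made-up names:

CREATE TABLE counts_cache (
    table_name   text PRIMARY KEY,
    cached_count bigint NOT NULL DEFAULT 0
);

CREATE FUNCTION maintain_cached_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE counts_cache SET cached_count = cached_count + 1
         WHERE table_name = TG_TABLE_NAME;
    ELSE  -- DELETE
        UPDATE counts_cache SET cached_count = cached_count - 1
         WHERE table_name = TG_TABLE_NAME;
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER ... FOR EACH ROW triggers
END;
$$ LANGUAGE plpgsql;

-- seed the counter once with the real count, then attach the trigger
INSERT INTO counts_cache SELECT 'big_table', count(*) FROM big_table;

CREATE TRIGGER big_table_count
    AFTER INSERT OR DELETE ON big_table
    FOR EACH ROW EXECUTE PROCEDURE maintain_cached_count();

After that, SELECT cached_count FROM counts_cache WHERE table_name =
'big_table' replaces the expensive count(*).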

>     However the UPDATE counts_cache has a problem, i.e. it locks this row FOR
> UPDATE for the whole transaction, and all transactions which want to
> update the same row must wait to see if the update commits or rolls back,
> so if you have one count cache row for the whole table you get MySQL-style
> scalability...
>
>     To preserve scalability you could, instead of UPDATE, INSERT the delta of
> rows inserted/deleted into a table (which has no concurrency issues) and
> compute the current count with the sum() of the deltas, then, with a cron job,
> consolidate the deltas and update the count_cache table so that the deltas
> table stays very small.

Yes, this is in fact a better approach to this problem.

(All this assumes you want an unqualified count(*), as the original
 poster did.)
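
A rough sketch of that delta variant, reusing counts_cache from above as the
consolidated base value; count_deltas, record_count_delta and big_table are
again made-up names, and this trigger would replace the UPDATE trigger rather
than run alongside it:

CREATE TABLE count_deltas (
    table_name text NOT NULL,
    delta      integer NOT NULL
);

CREATE FUNCTION record_count_delta() RETURNS trigger AS $$
BEGIN
    -- plain INSERTs do not block each other, so concurrent writers scale
    INSERT INTO count_deltas
        VALUES (TG_TABLE_NAME, CASE TG_OP WHEN 'INSERT' THEN 1 ELSE -1 END);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER big_table_delta
    AFTER INSERT OR DELETE ON big_table
    FOR EACH ROW EXECUTE PROCEDURE record_count_delta();

-- current count = last consolidated value + outstanding deltas
SELECT cached_count
       + COALESCE((SELECT sum(delta) FROM count_deltas
                    WHERE table_name = 'big_table'), 0) AS current_count
  FROM counts_cache
 WHERE table_name = 'big_table';

-- consolidation, run periodically from cron; the DELETE ... RETURNING is
-- folded into the UPDATE so no deltas are lost between the two steps
WITH folded AS (
    DELETE FROM count_deltas
     WHERE table_name = 'big_table'
     RETURNING delta
)
UPDATE counts_cache
   SET cached_count = cached_count
       + COALESCE((SELECT sum(delta) FROM folded), 0)
 WHERE table_name = 'big_table';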




