Re: CLUSTER and MVCC - Mailing list pgsql-hackers

From Csaba Nagy
Subject Re: CLUSTER and MVCC
Date
Msg-id 1173443129.9058.66.camel@coppola.muc.ecircle.de
Whole thread Raw
In response to CLUSTER and MVCC  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses Re: CLUSTER and MVCC  (Gregory Stark <stark@enterprisedb.com>)
Re: CLUSTER and MVCC  (Heikki Linnakangas <heikki@enterprisedb.com>)
List pgsql-hackers
On Fri, 2007-03-09 at 12:29, Heikki Linnakangas wrote:
> Csaba, you mentioned recently 
> (http://archives.postgresql.org/pgsql-hackers/2007-03/msg00027.php) that 
> you're actually using the MVCC-violation to clean up tables during a 
> backup. Can you tell us a bit more about that? Would you be upset if we 
> shut that backdoor?

My use case: a queue-like table (in fact a 'task' table) which is very
frequently inserted/updated/deleted. This table tends to be bloated in
the presence of any long running transaction... the only transactional
behavior we need from this table is to make sure that when we insert
something in this table in a transaction (possibly together with other
actions) and then commit/rollback, it commits/rolls back the insert.
CLUSTER's violation of MVCC does not affect this, as CLUSTER will not be
able to lock the table if another transaction inserted something in it
(the inserting transaction will have a lock on the table). Selections on
this table are not critical for us, it just doesn't matter which job
processor is getting which task and in what order... (actually it does
matter, but CLUSTER won't affect that either).

So what I do is execute CLUSTER once in 5 minutes on this table. This
works just fine, and keeps the table size small even if I have long
running transactions in progress. The DB backup is one of such
unavoidable long running transactions, and I use the table exclusion
switch to exclude this task table from the backup so it won't get locked
by it and let CLUSTER still do it's job (I had a rudimentary patch to do
this even before the feature was introduced to pg_dump). The table can
be dumped separately which is a brief operation, but I would have anyway
to clear it on a crash...

Now I could try and disable the CLUSTER cron job and see if i get
problems, as last it was disabled with postgres 7.4, maybe something
changed in between... but I can tell for sure that last time I enabled
it it really fixed our load on the DB server...

Wouldn't be possible to do it like Simon (IIRC) suggested, and add a
parameter to enable/disable the current behavior, and use the MVCC
behavior as default ?

Cheers,
Csaba.




pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: who gets paid for this
Next
From: Hannu Krosing
Date:
Subject: Re: CLUSTER and MVCC