Re: CLUSTER all tables - Mailing list pgsql-patches

From Alvaro Herrera
Subject Re: CLUSTER all tables
Date
Msg-id 20020902012531.00e60e41.alvherre@atentus.com
Whole thread Raw
In response to Re: CLUSTER all tables  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
En Sun, 01 Sep 2002 10:40:26 -0400
Tom Lane <tgl@sss.pgh.pa.us> escribió:

> You must acquire exclusive lock on a table before you even look to
> see if it has a clusterable index, I think.  Otherwise there's too
> much risk of the state changing underneath you.

I now acquire RowExclusiveLock on pg_index while I'm scanning it.  The
index is rechecked when the actual cluster is done.


> I think the only practical way to do CLUSTER ALL (or REINDEX ALL for
> that matter) is to make it work the way VACUUM does: run a separate
> transaction for each table to be processed.  In this way you can
> release the lock on each table as you finish with it, and avoid
> deadlock problems.

Well, I now open and close transactions for each table in a way that's
very similar to VACUUM (identical in some places); the table and index
are opened in this transaction with a previous existance checking, and
the indisclustered and owner status are re-checked also.


> If you study the VACUUM code you will also notice that it is prepared
> for tables to "go away" before it reaches them; CLUSTER ALL will have
> the same issue, along with the issue about clustering status changing.

Ok.  I read through the VACUUM code and while it's not that easy to
follow (with the analyze parts intermixed here and there), I think I got
the idea.

I attach the patch that does all this.  It passes the regression test,
which checks that it clusters the appropiate tables (i.e. not ones
without indisclustered or that belong to another user).  I also made
some tests to check the concurrency issues (what happens if I drop a
table or if I assign it to another user).  I tried to test that when I
cluster on another index it does the right thing, but now I'm falling
asleep.

What I cannot test is that the locking mode is right, for example for
pg_index (it sounds reasonable to me, but then I'm only a beginner in
this kind of issues).

Please have a look at it...

--
Alvaro Herrera (<alvherre[a]atentus.com>)
www.google.com: interfaz de linea de comando para la web.

Attachment

pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: update to contrib/dblink
Next
From: Bruce Momjian
Date:
Subject: Re: update to contrib/dblink