Thread: CLUSTER all tables

CLUSTER all tables

From
Alvaro Herrera
Date:
Hello,

This is a first attempt at CLUSTER ALL.  It works in every way I've
tested, but maybe I'm missing something.

What it does:

- if CLUSTER is called with no arguments, cluster all indexes that have
  indisclustered set (in the current database).  There's no "ALL"
  argument: that's just pollution IMHO.

- Gets a list of such indexes (checking ownership of each) and passes
  them one by one to the standard cluster routine (modified a little so
  it accepts OIDs of table and index, not names).

- Not much else...  (no regression test nor documentation yet).

I don't know if I will be able to do the REINDEX all thing now that beta
is almost here.

Please review; I may be doing something stupid.

--
Alvaro Herrera (<alvherre[a]atentus.com>)
"Aprender sin pensar es inutil; pensar sin aprender, peligroso" (Confucio)

Attachment

Re: CLUSTER all tables

From
Alvaro Herrera
Date:
En Fri, 30 Aug 2002 19:02:47 -0400 (CLT)
Alvaro Herrera <alvherre@atentus.com> escribió:

> What it does:
>
> - if CLUSTER is called with no arguments, cluster all indexes that
> have  indisclustered set (in the current database).  There's no "ALL"
>   argument: that's just pollution IMHO.
>
> - Gets a list of such indexes (checking ownership of each) and passes
>   them one by one to the standard cluster routine (modified a little
>   so it accepts OIDs of table and index, not names).

The same as before, but I do include documentation patch (feel free to
rewrite or suggest improvements) and regression test.


Now I'm thinking about concurrency: suppose table A and B have
indisclustered set on indexes ind_a and ind_b respectively.  The
user fires a CLUSTER without arguments; the backend will begin
clustering table A on ind_a.

Now, while this is going on, the user fires a CLUSTER on table B on
index ind_b_2, on a separate session.  This table is shorter than table
A and finishes first.

When the first cluster finishes clustering table A, it will start
clustering table B on ind_b.  This is because the cluster-all creates a
list of the tables to be clustered, and _then_ it clusters them one by
one.  So the info saved about table B is old and overrides the new
cluster that the user has done on another session.

The question is: is this a situation worth to protect against? and what
is the best way to do it?  I can see two ways:

1. allow only one cluster operation at the same time, with some kind of
  lock (can the lightweight lock manager be used for this?)

2. generate the list of tables as it goes.  This requires keeping
  pg_index open (with AccessShareLock) for a potentially long time (I
  don't need to tell you that cluster can be *slow*).  Is this
  acceptable?

Is there another?

--
Alvaro Herrera (<alvherre[a]atentus.com>)
"Granting software the freedom to evolve guarantees only different
results, not better ones." (Zygo Blaxell)

Re: CLUSTER all tables

From
Alvaro Herrera
Date:
I wrote:

> > What it does:
> >
> > - if CLUSTER is called with no arguments, cluster all indexes that
> > have  indisclustered set (in the current database).  There's no "ALL"
> >   argument: that's just pollution IMHO.
> >
> > - Gets a list of such indexes (checking ownership of each) and passes
> >   them one by one to the standard cluster routine (modified a little
> >   so it accepts OIDs of table and index, not names).
>
> The same as before, but I do include documentation patch (feel free to
> rewrite or suggest improvements) and regression test.

--
Alvaro Herrera (<alvherre[a]atentus.com>)
"Para tener mas hay que desear menos"

Attachment

Re: CLUSTER all tables

From
Tom Lane
Date:
Alvaro Herrera <alvherre@atentus.com> writes:
> Now I'm thinking about concurrency: suppose table A and B have
> indisclustered set on indexes ind_a and ind_b respectively.  The
> user fires a CLUSTER without arguments; the backend will begin
> clustering table A on ind_a.

> Now, while this is going on, the user fires a CLUSTER on table B on
> index ind_b_2, on a separate session.  This table is shorter than table
> A and finishes first.

> When the first cluster finishes clustering table A, it will start
> clustering table B on ind_b. ...
> So the info saved about table B is old and overrides the new
> cluster that the user has done on another session.

You must acquire exclusive lock on a table before you even look to
see if it has a clusterable index, I think.  Otherwise there's too
much risk of the state changing underneath you.

The bigger problem with implementing CLUSTER ALL this way is that it's
going to try to get exclusive lock on a large number of tables, which
is going to lead to very high risk of deadlock --- even if other
transactions are not doing CLUSTERs, but only ordinary table accesses.

I think the only practical way to do CLUSTER ALL (or REINDEX ALL for
that matter) is to make it work the way VACUUM does: run a separate
transaction for each table to be processed.  In this way you can
release the lock on each table as you finish with it, and avoid
deadlock problems.

If you study the VACUUM code you will also notice that it is prepared
for tables to "go away" before it reaches them; CLUSTER ALL will have
the same issue, along with the issue about clustering status changing.

So what you need is something like:

* make preliminary list of things to cluster

* END starting transaction

* for each item in list:

    START new transaction

    Attempt to open and exclusive-lock target table

    If successful, *and* index still is the clustered index, CLUSTER

    END transaction

* START closing transaction so that we return with an open transaction


            regards, tom lane

Re: CLUSTER all tables

From
Alvaro Herrera
Date:
En Sun, 01 Sep 2002 10:40:26 -0400
Tom Lane <tgl@sss.pgh.pa.us> escribió:

> You must acquire exclusive lock on a table before you even look to
> see if it has a clusterable index, I think.  Otherwise there's too
> much risk of the state changing underneath you.

I now acquire RowExclusiveLock on pg_index while I'm scanning it.  The
index is rechecked when the actual cluster is done.


> I think the only practical way to do CLUSTER ALL (or REINDEX ALL for
> that matter) is to make it work the way VACUUM does: run a separate
> transaction for each table to be processed.  In this way you can
> release the lock on each table as you finish with it, and avoid
> deadlock problems.

Well, I now open and close transactions for each table in a way that's
very similar to VACUUM (identical in some places); the table and index
are opened in this transaction with a previous existance checking, and
the indisclustered and owner status are re-checked also.


> If you study the VACUUM code you will also notice that it is prepared
> for tables to "go away" before it reaches them; CLUSTER ALL will have
> the same issue, along with the issue about clustering status changing.

Ok.  I read through the VACUUM code and while it's not that easy to
follow (with the analyze parts intermixed here and there), I think I got
the idea.

I attach the patch that does all this.  It passes the regression test,
which checks that it clusters the appropiate tables (i.e. not ones
without indisclustered or that belong to another user).  I also made
some tests to check the concurrency issues (what happens if I drop a
table or if I assign it to another user).  I tried to test that when I
cluster on another index it does the right thing, but now I'm falling
asleep.

What I cannot test is that the locking mode is right, for example for
pg_index (it sounds reasonable to me, but then I'm only a beginner in
this kind of issues).

Please have a look at it...

--
Alvaro Herrera (<alvherre[a]atentus.com>)
www.google.com: interfaz de linea de comando para la web.

Attachment