Thread: CLUSTER all tables
Hello, This is a first attempt at CLUSTER ALL. It works in every way I've tested, but maybe I'm missing something. What it does: - if CLUSTER is called with no arguments, cluster all indexes that have indisclustered set (in the current database). There's no "ALL" argument: that's just pollution IMHO. - Gets a list of such indexes (checking ownership of each) and passes them one by one to the standard cluster routine (modified a little so it accepts OIDs of table and index, not names). - Not much else... (no regression test nor documentation yet). I don't know if I will be able to do the REINDEX all thing now that beta is almost here. Please review; I may be doing something stupid. -- Alvaro Herrera (<alvherre[a]atentus.com>) "Aprender sin pensar es inutil; pensar sin aprender, peligroso" (Confucio)
Attachment
En Fri, 30 Aug 2002 19:02:47 -0400 (CLT) Alvaro Herrera <alvherre@atentus.com> escribió: > What it does: > > - if CLUSTER is called with no arguments, cluster all indexes that > have indisclustered set (in the current database). There's no "ALL" > argument: that's just pollution IMHO. > > - Gets a list of such indexes (checking ownership of each) and passes > them one by one to the standard cluster routine (modified a little > so it accepts OIDs of table and index, not names). The same as before, but I do include documentation patch (feel free to rewrite or suggest improvements) and regression test. Now I'm thinking about concurrency: suppose table A and B have indisclustered set on indexes ind_a and ind_b respectively. The user fires a CLUSTER without arguments; the backend will begin clustering table A on ind_a. Now, while this is going on, the user fires a CLUSTER on table B on index ind_b_2, on a separate session. This table is shorter than table A and finishes first. When the first cluster finishes clustering table A, it will start clustering table B on ind_b. This is because the cluster-all creates a list of the tables to be clustered, and _then_ it clusters them one by one. So the info saved about table B is old and overrides the new cluster that the user has done on another session. The question is: is this a situation worth to protect against? and what is the best way to do it? I can see two ways: 1. allow only one cluster operation at the same time, with some kind of lock (can the lightweight lock manager be used for this?) 2. generate the list of tables as it goes. This requires keeping pg_index open (with AccessShareLock) for a potentially long time (I don't need to tell you that cluster can be *slow*). Is this acceptable? Is there another? -- Alvaro Herrera (<alvherre[a]atentus.com>) "Granting software the freedom to evolve guarantees only different results, not better ones." (Zygo Blaxell)
I wrote: > > What it does: > > > > - if CLUSTER is called with no arguments, cluster all indexes that > > have indisclustered set (in the current database). There's no "ALL" > > argument: that's just pollution IMHO. > > > > - Gets a list of such indexes (checking ownership of each) and passes > > them one by one to the standard cluster routine (modified a little > > so it accepts OIDs of table and index, not names). > > The same as before, but I do include documentation patch (feel free to > rewrite or suggest improvements) and regression test. -- Alvaro Herrera (<alvherre[a]atentus.com>) "Para tener mas hay que desear menos"
Attachment
Alvaro Herrera <alvherre@atentus.com> writes: > Now I'm thinking about concurrency: suppose table A and B have > indisclustered set on indexes ind_a and ind_b respectively. The > user fires a CLUSTER without arguments; the backend will begin > clustering table A on ind_a. > Now, while this is going on, the user fires a CLUSTER on table B on > index ind_b_2, on a separate session. This table is shorter than table > A and finishes first. > When the first cluster finishes clustering table A, it will start > clustering table B on ind_b. ... > So the info saved about table B is old and overrides the new > cluster that the user has done on another session. You must acquire exclusive lock on a table before you even look to see if it has a clusterable index, I think. Otherwise there's too much risk of the state changing underneath you. The bigger problem with implementing CLUSTER ALL this way is that it's going to try to get exclusive lock on a large number of tables, which is going to lead to very high risk of deadlock --- even if other transactions are not doing CLUSTERs, but only ordinary table accesses. I think the only practical way to do CLUSTER ALL (or REINDEX ALL for that matter) is to make it work the way VACUUM does: run a separate transaction for each table to be processed. In this way you can release the lock on each table as you finish with it, and avoid deadlock problems. If you study the VACUUM code you will also notice that it is prepared for tables to "go away" before it reaches them; CLUSTER ALL will have the same issue, along with the issue about clustering status changing. So what you need is something like: * make preliminary list of things to cluster * END starting transaction * for each item in list: START new transaction Attempt to open and exclusive-lock target table If successful, *and* index still is the clustered index, CLUSTER END transaction * START closing transaction so that we return with an open transaction regards, tom lane
En Sun, 01 Sep 2002 10:40:26 -0400 Tom Lane <tgl@sss.pgh.pa.us> escribió: > You must acquire exclusive lock on a table before you even look to > see if it has a clusterable index, I think. Otherwise there's too > much risk of the state changing underneath you. I now acquire RowExclusiveLock on pg_index while I'm scanning it. The index is rechecked when the actual cluster is done. > I think the only practical way to do CLUSTER ALL (or REINDEX ALL for > that matter) is to make it work the way VACUUM does: run a separate > transaction for each table to be processed. In this way you can > release the lock on each table as you finish with it, and avoid > deadlock problems. Well, I now open and close transactions for each table in a way that's very similar to VACUUM (identical in some places); the table and index are opened in this transaction with a previous existance checking, and the indisclustered and owner status are re-checked also. > If you study the VACUUM code you will also notice that it is prepared > for tables to "go away" before it reaches them; CLUSTER ALL will have > the same issue, along with the issue about clustering status changing. Ok. I read through the VACUUM code and while it's not that easy to follow (with the analyze parts intermixed here and there), I think I got the idea. I attach the patch that does all this. It passes the regression test, which checks that it clusters the appropiate tables (i.e. not ones without indisclustered or that belong to another user). I also made some tests to check the concurrency issues (what happens if I drop a table or if I assign it to another user). I tried to test that when I cluster on another index it does the right thing, but now I'm falling asleep. What I cannot test is that the locking mode is right, for example for pg_index (it sounds reasonable to me, but then I'm only a beginner in this kind of issues). Please have a look at it... -- Alvaro Herrera (<alvherre[a]atentus.com>) www.google.com: interfaz de linea de comando para la web.