Re: ALTER tbl rewrite loses CLUSTER ON index - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: ALTER tbl rewrite loses CLUSTER ON index
Date
Msg-id 20200402061421.GA112468@paquier.xyz
Whole thread Raw
In response to Re: ALTER tbl rewrite loses CLUSTER ON index  (Michael Paquier <michael@paquier.xyz>)
Responses Re: ALTER tbl rewrite loses CLUSTER ON index  (Justin Pryzby <pryzby@telsasoft.com>)
Re: ALTER tbl rewrite loses CLUSTER ON index  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Wed, Mar 18, 2020 at 11:48:37AM +0900, Michael Paquier wrote:
> Anyway, Tom, Alvaro, are you planning to look at what is proposed on
> this thread?  I don't want to step on your toes if that's the case and
> it seems to me that the approach taken by the patch is sound, using as
> basic fix the addition of an AT_ClusterOn sub-command to the list of
> commands to execute when rebuilding the table, ensuring that any
> follow-up CLUSTER command will use the correct index.

Hearing nothing, I have been looking at the patches sent upthread, and
did some modifications as per the attached for 0001.  The logic looked
fine to me and it is unchanged as you are taking care of normal
indexes as well as constraint indexes.  Please note that I have
tweaked some comments, and removed what was on top of
RememberConstraintForRebuilding() as that was just a duplicate.
Regarding the tests, I was annoyed by the fact that the logic was not
manipulating two indexes at the same time on the relation rewritten
with a normal and a constraint index, and cross-checking both at the
same time to see which one is clustered after each rewrite is good for
consistency.

Now, regarding patch 0002, note that you have a problem for this part:
-            tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
-            if (!HeapTupleIsValid(tuple))    /* probably can't happen */
-            {
-                relation_close(OldHeap, AccessExclusiveLock);
-                pgstat_progress_end_command();
-                return;
-            }
-            indexForm = (Form_pg_index) GETSTRUCT(tuple);
-            if (!indexForm->indisclustered)
+            if (!get_index_isclustered(indexOid))
             {
-                ReleaseSysCache(tuple);
                 relation_close(OldHeap, AccessExclusiveLock);
                 pgstat_progress_end_command();
                 return;
             }
-            ReleaseSysCache(tuple);

On an invalid tuple for pg_index, the new code would issue an error,
while the old code would just return.  And it seems to me that this
can lead to problems because the parent relation is processed and
locked at the beginning of cluster_rel(), *after* we know the index
OID to work on.  The refactoring is fine for the other two areas
though, so I think that there is still value to apply
get_index_isclustered() within mark_index_clustered() and cluster(),
and I would suggest to keep 0002 to that.

Justin, what do you think?
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Next
From: Michael Paquier
Date:
Subject: Re: potential stuck lock in SaveSlotToPath()