Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements - Mailing list pgsql-hackers

From Josh Kupershmidt
Subject Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date
Msg-id CAK3UJREa7qT_=QUx7Fc9D=Z2V-BAti8EXFR=9kB7yfN-bao32Q@mail.gmail.com
Whole thread
In response to Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Mihail Nikalayeu <mihailnikalayeu@gmail.com>)
List pgsql-hackers


On Tue, Apr 7, 2026 at 7:20 PM Mihail Nikalayeu <mihailnikalayeu@gmail.com> wrote:
Hello, Josh!

Your review looks a bit LLM-generated, but anyway - thanks for review! :)
Especially because at least one point seems to be valid.

> We're leaving behind two invalid indexes now that the user has to figure
> out how to drop in case of an error - that seems like it could be confusing for the user.
> Could we have some better way (error handler, background worker) try to perform this cleanup automatically?
> If not, we should at least tell the user clearly in the error message that both
> invalid indexes are left behind (i.e. "idx" and "idx_ccaux" in the example)

Commit 0005 adds automatic dropping of auxiliary indexes when the
original index is reindexed or dropped. Also, documentation reflects
the ccaux index (similar to ccnew).

Well, we auto-drop the aux index if the user figures out that they should drop the main index first.
 
> Docs are inconsistent or confusing about whether there's one or two indexes left behind in case of error
> - e.g. "command will fail but leave behind *an* invalid index and its associated auxiliary index"
> somewhat implying there is only one invalid index, and somehow the auxiliary index is valid?

Auxiliary index is never marked as valid; I'm not sure we need to
highlight it here. Or do you have an idea how to rephrase?

For example in this doc change hunk:

@@ -664,12 +665,19 @@ postgres=# \d tab
  col    | integer |           |          |
 Indexes:
     "idx" btree (col) INVALID
+    "idx_ccaux" stir (col) INVALID
 </programlisting>
 
-    The recommended recovery
-    method in such cases is to drop the index and try again to perform
-    <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is
-    to rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>).
+    The recommended recovery method in such cases is to drop the index with
+    <command>DROP INDEX</command>. The auxiliary index (suffixed with
+    <literal>_ccaux</literal>) will be automatically dropped when the main
+    index is dropped. After dropping the indexes, you can try again to perform
+    <command>CREATE INDEX CONCURRENTLY</command>. (Another possibility is to
+    rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>,
+    which will also handle cleanup of any invalid auxiliary indexes.)
+    If the only invalid index is one suffixed <literal>_ccaux</literal>,
+    the recommended recovery method is just <literal>DROP INDEX</literal>
+    for that index.
    </para>

The output we're showing the user from psql is two INVALID indexes, and we're keeping the original doc suggestion on the first line that "The recommended recovery method in such cases is to drop the index with DROP INDEX". The next sentence clarifies a bit that there's an auxiliary index that "will be automatically dropped". But now it's on the user to figure out which index is which, and drop the right one.


> Similarly, when the doc mentions e.g. "drop the index" - it's not necessarily clear which index
> we're talking about when there are two invalid indexes left behind that the user will see with `\d`

In one commit it says: "method in such cases is to drop these indexes
and try again to perform".
After 0005 "The auxiliary index (suffixed with
<literal>_ccaux</literal>) will be automatically dropped when the main
index is dropped".
It seems clear to me, but feel free to provide your variant.

>  * It would be nice to guard against users trying arbitrary CREATE INDEX ... USING stir(...) with a clear error

It will fail with "Building STIR indexes is not supported".

Sorry, you are right, this is handled with a good error.
 

> One of the testcases (line 2478 of patch 0004) does `DELETE FROM concur_reindex_tab4 WHERE c1 = 1;`
> but the table `concur_reindex_tab4` looks like it has been dropped a few lines above that?

Hm, yep, I'll fix it.

> The StirPageGetFreeSpace macro from patch 0002 reads `StirPageGetMaxOffset(page)`
> which seems like it could cause an unsafe read of opaque->maxoff if used on the metapage

But it was never used for the metapage.

Yes, but I think it'd be better not to leave a possible foot-gun around for the next developer. Even just adding an AssertMacro like:

#define StirPageGetMaxOffset(page) (AssertMacro(!StirPageIsMeta(page)), StirPageGetOpaque(page)->maxoff)

There are some other asserts for some of the trickier bookkeeping that happens in this patch that I think would help check the code, and make it easier to understand as well. E.g. adding an assertion check at the end of StirPageAddItem(), and inside stirbulkdelete() (I tried calling it around L499 of stir.c, just before 'while (itup < itupEnd)'). 

/*
 * Validate that maxoff and pd_lower are consistent on a STIR data page.
 *
 * On a freshly initialized empty page, pd_lower is SizeOfPageHeaderData
 * (set by PageInit).  After the first insert, pd_lower is computed from
 * PageGetContents which uses MAXALIGN(SizeOfPageHeaderData).
 */
static inline void
StirPageValidate(Page page)
{
Assert(!StirPageIsMeta(page));
Assert(StirPageGetOpaque(page)->maxoff == 0
  ? ((PageHeader) page)->pd_lower == SizeOfPageHeaderData
  : ((PageHeader) page)->pd_lower ==
    MAXALIGN(SizeOfPageHeaderData) +
    StirPageGetOpaque(page)->maxoff * sizeof(StirTuple));
}
 

> A comment explains "No predicate evaluation is needed here" , i.e. we are skipping predicate
> evaluation in the validation scan step, assuming that the
> auxiliary index contains only qualifying TIDs. Is this really bulletproof for e.g. partial indexes which may
> no longer satisfy the predicate at the time of the validation scan due to conflicting HOT updates?

Conflicting HOT updates are not possible because the catalog contains
the new index definition from the start of the process.
Or do you mean a different scenario?

Sorry for the false alarm, I believe you are right - I had to double check RelationGetIndexAttrBitmap(), but I believe this is safe based on the hotblockingattrs bitmap. 

Overall, this is a nice improvement for CIC/RC that I think should help particularly on large, busy systems.

Thanks,
Josh
 

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Add missing period to DETAIL messages
Next
From: David Rowley
Date:
Subject: Re: StringInfo fixes, v19 edition. Plus a few oddities