Re: INVALID index while concurrent indexing in progress? - Mailing list pgsql-docs

From Laurenz Albe
Subject Re: INVALID index while concurrent indexing in progress?
Date
Msg-id c10cb5991b0214abe5e768ab5a87490d6aeca1d1.camel@cybertec.at
Whole thread Raw
In response to Re: INVALID index while concurrent indexing in progress?  (Rajakavitha Kodhandapani <krajakavitha@gmail.com>)
Responses Re: INVALID index while concurrent indexing in progress?  (Lauren Fliksteen <dancernerd32@gmail.com>)
List pgsql-docs
On Thu, 2022-05-19 at 19:02 +0530, Rajakavitha Kodhandapani wrote:
> > I think the INVALID index can use further explanation, in particular, from
> > my experience it seems like when building an index concurrently, the index
> > gets inserted and labeled invalid while the index is being built, and then
> > the label gets removed if it finishes successfully or gets left on the index
> > if there is a failure while building the index.  It is my current
> > understanding, after experimenting, that INVALID means 'incomplete', whether
> > that's because it's in progress or because it was unable to be completed,
> > but prior to my experiment my understanding was that INVALID indicated
> > failure.
> > 
> > This was especially confusing when we were adding an index to a very large
> > table because we assumed the INVALID index indicated failure when we
> > couldn't find any other sign of progress or failure.
>
> This is my first attempt at contributing to the documentation of PostgreSQL.
> Here's the patch. Please let me know if any other changes need to be made.

Thank you!  Please send patches as plain text and use bottom posting.

> \cf3 @@ -665,11 +665,14 @@\cf5  Indexes:\
>     <para>\
>      Another caveat when building a unique index concurrently is that the\
>      uniqueness constraint is already being enforced against other transactions\
> \cf6 -    when the second table scan begins.  This means that constraint violations\cf5 \
> \cf4 +    when the second table scan begins. This means that constraint violations\cf5 \
>      could be reported in other queries prior to the index becoming available\
> \cf6 -    for use, or even in cases where the index build eventually fails.  Also,\cf5 \
> \cf6 -    if a failure does occur in the second scan, the <quote>invalid</quote> index\cf5 \
> \cf6 -    continues to enforce its uniqueness constraint afterwards.\cf5 \
> \cf4 +    for use, or even in cases where the index build eventually fails. The index\cf5 \
> \cf4 +    is inserted and labeled <quote>invalid</quote> while the index is being built,\cf5 \
> \cf4 +    and then the label is removed if the index builds successfully. If the index does\cf5 \
> \cf4 +    not build successfully, then the label <quote>invalid</quote> remains. Also, if a\cf5 \
> }

I don't think that this information should be added to a paragraph that
focuses on uniqueness checks in concurrent index builds.

Actually, most of the information is already there.  To quote from the page:

  If a problem arises while scanning the table, such as a deadlock or a uniqueness violation
  in a unique index, the CREATE INDEX command will fail but leave behind an “invalid” index.

How about the following patch to emphasize the role of "invalid":

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index d3102a87d9..fee2c61e5e 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -622,7 +622,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
    </para>
 
    <para>
-    In a concurrent index build, the index is actually entered into
+    In a concurrent index build, the index is actually entered as <quote>invalid</quote> index into
     the system catalogs in one transaction, then two table scans occur in
     two more transactions.  Before each table scan, the index build must
     wait for existing transactions that have modified the table to terminate.
@@ -631,7 +631,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
     scan to terminate, including transactions used by any phase of concurrent
     index builds on other tables, if the indexes involved are partial or have
     columns that are not simple column references.
-    Then finally the index can be marked ready for use,
+    Then finally the index can be marked <quote>valid</quote> and ready for use,
     and the <command>CREATE INDEX</command> command terminates.
     Even then, however, the index may not be immediately usable for queries:
     in the worst case, it cannot be used as long as transactions exist that

Yours,
Laurenz Albe



pgsql-docs by date:

Previous
From: Rajakavitha Kodhandapani
Date:
Subject: Re: INVALID index while concurrent indexing in progress?
Next
From: Lauren Fliksteen
Date:
Subject: Re: INVALID index while concurrent indexing in progress?