Thread: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)")

The following bug has been logged on the website:

Bug reference:      18499
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 17beta1
Operating system:   Ubuntu 22.04
Description:

The following script:
cat << EOF | psql &
CREATE TABLE t(i int, p point);
CREATE INDEX spgist_point_idx ON t USING spgist(p);

INSERT INTO t (i, p)
        SELECT g, point(g*10+1, g*10+1) FROM generate_series(1, 100000) g;
SELECT pg_sleep(1);
INSERT INTO t (i, p)
        SELECT g, point(g*10+1, g*10+1) FROM generate_series(1, 100000) g;
EOF

cat << EOF | psql
SELECT pg_sleep(1);
REINDEX INDEX CONCURRENTLY spgist_point_idx;
EOF

triggers an assertion failure with the following stack trace:
TRAP: failed Assert("TransactionIdIsValid(state->myXid)"), File:
"spgutils.c", Line: 1078, PID: 2975757

(gdb) bt
...
#5  0x00005569aefe0727 in ExceptionalCondition
(conditionName=conditionName@entry=0x5569af0562a0
"TransactionIdIsValid(state->myXid)", fileName=fileName@entry=0x5569af05636c
"spgutils.c", lineNumber=lineNumber@entry=1078)
    at assert.c:66
#6  0x00005569aeb3c344 in spgFormDeadTuple
(state=state@entry=0x7ffc3a771760, tupstate=tupstate@entry=1,
blkno=blkno@entry=0, offnum=offnum@entry=1) at spgutils.c:1078
#7  0x00005569aeb3385b in spgPageIndexMultiDelete (...) at
spgdoinsert.c:166
#8  0x00005569aeb34d90 in doPickSplit (...) at spgdoinsert.c:1177
#9  0x00005569aeb35a57 in spgdoinsert (...) at spgdoinsert.c:2139
#10 0x00005569aeb3661f in spginsert (...) at spginsert.c:206
#11 0x00005569aeb11a6d in index_insert (...) at indexam.c:230
#12 0x00005569aeb035f5 in heapam_index_validate_scan (...) at
heapam_handler.c:1963
#13 0x00005569aeb9589b in table_index_validate_scan (...) at
../../../src/include/access/tableam.h:1855
#14 validate_index (...) at index.c:3392
#15 0x00005569aec4dbeb in ReindexRelationConcurrently (...) at
indexcmds.c:4036
#16 0x00005569aec4ec05 in ReindexIndex (...) at indexcmds.c:2814
#17 0x00005569aec4efd9 in ExecReindex (...) at indexcmds.c:2743

(gdb) frame 6
(gdb) p state->myXid
$3 = 0

Without asserts enabled, REINDEX executed with no error.

Reproduced on REL_12_STABLE .. master.


PG Bug reporting form <noreply@postgresql.org> writes:
> The following script:
> ...
> triggers an assertion failure with the following stack trace:
> TRAP: failed Assert("TransactionIdIsValid(state->myXid)"), File:
> "spgutils.c", Line: 1078, PID: 2975757

Thanks for the report!  The problem evidently starts in
initSpGistState, which does

    state->myXid = GetTopTransactionIdIfAny();

Ordinarily this would produce a valid myXid, since if we're creating
an index or inserting a tuple into an existing index, there would
already be an XID.  But REINDEX CONCURRENTLY reaches this code in a
new transaction that hasn't assigned an XID and probably never will.
So if we have to make a REDIRECT tuple, we have trouble.

> Without asserts enabled, REINDEX executed with no error.

It's still broken though, because creating a REDIRECT tuple with
an invalid XID is not OK; it'll cause trouble later on.

I experimented with

-   state->myXid = GetTopTransactionIdIfAny();
+   state->myXid = GetTopTransactionId();

but that causes the parallel-vacuum tests to fall over with

ERROR:  cannot assign XIDs during a parallel operation

Apparently, we never make REDIRECT tuples during VACUUM, or we'd
have seen this error reported before.  We could maybe make this
code read something like "if not VACUUM then assign an XID", but
that might still fail in parallelized concurrent REINDEX.

In any case, it feels like a bad idea that initSpGistState doesn't
always make myXid valid: that seems like it's assuming too much
about when we can create REDIRECT tuples.

What seems like it should work is

     state->myXid = GetTopTransactionIdIfAny();
+    if (!TransactionIdIsValid(state->myXid))
+        state->myXid = ReadNextTransactionId();

on the grounds that if we lack an XID, then the next-to-be-assigned
XID should be a safe expiry horizon for any REDIRECT we might need.

There are some comments to be updated, and in HEAD I'd be inclined to
rename "myXid" to something that doesn't imply that it's necessarily
our own XID.  Maybe "redirectXid", since it's not used for anything
but that.

            regards, tom lane



I wrote:
> In any case, it feels like a bad idea that initSpGistState doesn't
> always make myXid valid: that seems like it's assuming too much
> about when we can create REDIRECT tuples.
> What seems like it should work is
>      state->myXid = GetTopTransactionIdIfAny();
> +    if (!TransactionIdIsValid(state->myXid))
> +        state->myXid = ReadNextTransactionId();
> on the grounds that if we lack an XID, then the next-to-be-assigned
> XID should be a safe expiry horizon for any REDIRECT we might need.

I spent a good deal more time thinking about this, and concluded that
that idea doesn't help.  The only case where this is problematic is
redirect tuples inserted during REINDEX CONCURRENTLY.  As far as I can
see, using ReadNextTransactionId adds no safety because that XID could
be consumed and then fall below the global xmin horizon before REINDEX
finishes.  (If REINDEX CONCURRENTLY held onto a single snapshot for
the duration, that snapshot's xmin would prevent this; but it doesn't,
so there could be times when there's no snapshot anywhere.)

Nonetheless, it seems like it'd mostly be okay to insert invalid XID
in redirects generated by REINDEX CONCURRENTLY.  The reason that that
seems safe is that R.C. doesn't allow any scans to use the index at
all till it's done; therefore there could never be a scan "in flight"
to such a redirect.  We do allow insertions to happen concurrently
with R.C., but those should never be in flight to a redirect at all,
as explained in src/backend/access/spgist/README.  And furthermore,
R.C. also locks out VACUUM.  This means that it should be safe for
VACUUM to clean up a redirect inserted by R.C. immediately when it
sees one.

The one problem is that VACUUM's test

        GlobalVisTestIsRemovableXid(vistest, dt->xid))

would fail on invalid xid, mainly because FullXidRelativeTo doesn't
work on that.  (It'd get an assertion failure, or produce the wrong
value without assertions.)  We can fix that by adding an explicit
test for invalid xid, which we really need to do anyway because of
the possibility that existing indexes contain such redirects.

So I conclude that we basically just need to remove the failing
assertion in spgFormDeadTuple and instead add a guard for invalid
xid in vacuumRedirectAndPlaceholder.  The attached draft patch
also adds some comments and performs the threatened rename of
myXid to redirectXid.  (I wouldn't do that renaming in back
branches, only HEAD.)

Thoughts?

            regards, tom lane

diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 3f793125f7..76b80146ff 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -358,8 +358,19 @@ initSpGistState(SpGistState *state, Relation index)
     /* Make workspace for constructing dead tuples */
     state->deadTupleStorage = palloc0(SGDTSIZE);

-    /* Set XID to use in redirection tuples */
-    state->myXid = GetTopTransactionIdIfAny();
+    /*
+     * Set horizon XID to use in redirection tuples.  Use our own XID if we
+     * have one, else use InvalidTransactionId.  The latter case can happen in
+     * VACUUM or REINDEX CONCURRENTLY, and in neither case would it be okay to
+     * force an XID to be assigned.  VACUUM won't create any redirection
+     * tuples anyway, but REINDEX CONCURRENTLY can.  Fortunately, REINDEX
+     * CONCURRENTLY doesn't mark the index valid until the end, so there could
+     * never be any concurrent scans "in flight" to a redirection tuple it has
+     * inserted.  And it locks out VACUUM until the end, too.  So it's okay
+     * for VACUUM to immediately expire a redirection tuple that contains an
+     * invalid xid.
+     */
+    state->redirectXid = GetTopTransactionIdIfAny();

     /* Assume we're not in an index build (spgbuild will override) */
     state->isBuild = false;
@@ -1075,8 +1086,7 @@ spgFormDeadTuple(SpGistState *state, int tupstate,
     if (tupstate == SPGIST_REDIRECT)
     {
         ItemPointerSet(&tuple->pointer, blkno, offnum);
-        Assert(TransactionIdIsValid(state->myXid));
-        tuple->xid = state->myXid;
+        tuple->xid = state->redirectXid;
     }
     else
     {
diff --git a/src/backend/access/spgist/spgvacuum.c b/src/backend/access/spgist/spgvacuum.c
index d2e1624924..0da069fd4d 100644
--- a/src/backend/access/spgist/spgvacuum.c
+++ b/src/backend/access/spgist/spgvacuum.c
@@ -188,7 +188,9 @@ vacuumLeafPage(spgBulkDeleteState *bds, Relation index, Buffer buffer,

             /*
              * Add target TID to pending list if the redirection could have
-             * happened since VACUUM started.
+             * happened since VACUUM started.  (If xid is invalid, assume it
+             * must have happened before VACUUM started, since REINDEX
+             * CONCURRENTLY locks out VACUUM.)
              *
              * Note: we could make a tighter test by seeing if the xid is
              * "running" according to the active snapshot; but snapmgr.c
@@ -523,8 +525,17 @@ vacuumRedirectAndPlaceholder(Relation index, Relation heaprel, Buffer buffer)

         dt = (SpGistDeadTuple) PageGetItem(page, PageGetItemId(page, i));

+        /*
+         * We can convert a REDIRECT to a PLACEHOLDER if there could no longer
+         * be any index scans "in flight" to it.  Such an index scan would
+         * have to be in a transaction whose snapshot sees the REDIRECT's XID
+         * as still running, so comparing the XID against global xmin is a
+         * conservatively safe test.  If the XID is invalid, it must have been
+         * inserted by REINDEX CONCURRENTLY, so we can zap it immediately.
+         */
         if (dt->tupstate == SPGIST_REDIRECT &&
-            GlobalVisTestIsRemovableXid(vistest, dt->xid))
+            (!TransactionIdIsValid(dt->xid) ||
+             GlobalVisTestIsRemovableXid(vistest, dt->xid)))
         {
             dt->tupstate = SPGIST_PLACEHOLDER;
             Assert(opaque->nRedirection > 0);
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index 11d006998e..4e31d33e70 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -36,7 +36,7 @@ fillFakeState(SpGistState *state, spgxlogState stateSrc)
 {
     memset(state, 0, sizeof(*state));

-    state->myXid = stateSrc.myXid;
+    state->redirectXid = stateSrc.redirectXid;
     state->isBuild = stateSrc.isBuild;
     state->deadTupleStorage = palloc0(SGDTSIZE);
 }
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
index 2e9c757b30..e7cbe10a89 100644
--- a/src/include/access/spgist_private.h
+++ b/src/include/access/spgist_private.h
@@ -157,7 +157,7 @@ typedef struct SpGistState

     char       *deadTupleStorage;    /* workspace for spgFormDeadTuple */

-    TransactionId myXid;        /* XID to use when creating a redirect tuple */
+    TransactionId redirectXid;    /* XID to use when creating a redirect tuple */
     bool        isBuild;        /* true if doing index build */
 } SpGistState;

@@ -421,7 +421,8 @@ typedef struct SpGistLeafTupleData
  * field, to satisfy some Asserts that we make when replacing a leaf tuple
  * with a dead tuple.
  * We don't use t_info, but it's needed to align the pointer field.
- * pointer and xid are only valid when tupstate = REDIRECT.
+ * pointer and xid are only valid when tupstate = REDIRECT, and in some
+ * cases xid can be InvalidTransactionId even then; see initSpGistState.
  */
 typedef struct SpGistDeadTupleData
 {
@@ -464,7 +465,7 @@ typedef SpGistDeadTupleData *SpGistDeadTuple;

 #define STORE_STATE(s, d)  \
     do { \
-        (d).myXid = (s)->myXid; \
+        (d).redirectXid = (s)->redirectXid; \
         (d).isBuild = (s)->isBuild; \
     } while(0)

diff --git a/src/include/access/spgxlog.h b/src/include/access/spgxlog.h
index a08f0dc8d5..16cc735001 100644
--- a/src/include/access/spgxlog.h
+++ b/src/include/access/spgxlog.h
@@ -35,7 +35,7 @@
  */
 typedef struct spgxlogState
 {
-    TransactionId myXid;
+    TransactionId redirectXid;
     bool        isBuild;
 } spgxlogState;


I wrote:
> So I conclude that we basically just need to remove the failing
> assertion in spgFormDeadTuple and instead add a guard for invalid
> xid in vacuumRedirectAndPlaceholder.

BTW, a different line of attack could be to not generate redirects
at all during REINDEX CONCURRENTLY: on the basis of this argument,
we don't need them.  So that would look roughly similar to the tests
that skip making redirects when isBuild is true, and it'd allow
keeping the assertion in spgFormDeadTuple, and it'd save some
usually-trifling amount of work in the next VACUUM.  However, I'm
not sure there's a nice way for spginsert() to know whether it's
being invoked in REINDEX CONCURRENTLY or a normal INSERT/UPDATE
query.  Can we trust indexInfo->ii_Concurrent for that?

            regards, tom lane



On Sun, Jun 16, 2024 at 06:52:52PM -0400, Tom Lane wrote:
> BTW, a different line of attack could be to not generate redirects
> at all during REINDEX CONCURRENTLY: on the basis of this argument,
> we don't need them.  So that would look roughly similar to the tests
> that skip making redirects when isBuild is true, and it'd allow
> keeping the assertion in spgFormDeadTuple, and it'd save some
> usually-trifling amount of work in the next VACUUM.  However, I'm
> not sure there's a nice way for spginsert() to know whether it's
> being invoked in REINDEX CONCURRENTLY or a normal INSERT/UPDATE
> query.  Can we trust indexInfo->ii_Concurrent for that?

I am not sure to understand the redirection part for spgist, but
except if I am missing something, we already rely on ii_Concurrent for
other index AMs like BRIN paths to check if we are dealing with a
concurrent build path or not.  index_concurrently_build() is used
by both CIC and REINDEX CONCURRENTLY, where the flag is set after a
BuildIndexInfo().
--
Michael

Attachment
Michael Paquier <michael@paquier.xyz> writes:
> On Sun, Jun 16, 2024 at 06:52:52PM -0400, Tom Lane wrote:
>> ... not sure there's a nice way for spginsert() to know whether it's
>> being invoked in REINDEX CONCURRENTLY or a normal INSERT/UPDATE
>> query.  Can we trust indexInfo->ii_Concurrent for that?

> I am not sure to understand the redirection part for spgist, but
> except if I am missing something, we already rely on ii_Concurrent for
> other index AMs like BRIN paths to check if we are dealing with a
> concurrent build path or not.  index_concurrently_build() is used
> by both CIC and REINDEX CONCURRENTLY, where the flag is set after a
> BuildIndexInfo().

Right.  I was thinking that CIC wouldn't reach spginsert(), rather
spgbuild(), but it does feel a bit rickety.  A separate flag would
be better.

On the whole I'm inclined to stick with the patch as I have it.
Maybe somebody would like to investigate this idea as an improvement
for v18 or later.

            regards, tom lane



On Sun, Jun 16, 2024 at 07:30:46PM -0400, Tom Lane wrote:
> Right.  I was thinking that CIC wouldn't reach spginsert(), rather
> spgbuild(), but it does feel a bit rickety.  A separate flag would
> be better.

Why does it feel rickety for you?  We already rely on this flag so it
does not seem like a huge problem to rely on it again for the sake of
this index AM.
--
Michael

Attachment