Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)") - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)")
Date
Msg-id 3030961.1718570047@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)")  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)")
List pgsql-bugs
I wrote:
> In any case, it feels like a bad idea that initSpGistState doesn't
> always make myXid valid: that seems like it's assuming too much
> about when we can create REDIRECT tuples.
> What seems like it should work is
>      state->myXid = GetTopTransactionIdIfAny();
> +    if (!TransactionIdIsValid(state->myXid))
> +        state->myXid = ReadNextTransactionId();
> on the grounds that if we lack an XID, then the next-to-be-assigned
> XID should be a safe expiry horizon for any REDIRECT we might need.

I spent a good deal more time thinking about this, and concluded that
that idea doesn't help.  The only case where this is problematic is
redirect tuples inserted during REINDEX CONCURRENTLY.  As far as I can
see, using ReadNextTransactionId adds no safety because that XID could
be consumed and then fall below the global xmin horizon before REINDEX
finishes.  (If REINDEX CONCURRENTLY held onto a single snapshot for
the duration, that snapshot's xmin would prevent this; but it doesn't,
so there could be times when there's no snapshot anywhere.)

Nonetheless, it seems like it'd mostly be okay to insert invalid XID
in redirects generated by REINDEX CONCURRENTLY.  The reason that that
seems safe is that R.C. doesn't allow any scans to use the index at
all till it's done; therefore there could never be a scan "in flight"
to such a redirect.  We do allow insertions to happen concurrently
with R.C., but those should never be in flight to a redirect at all,
as explained in src/backend/access/spgist/README.  And furthermore,
R.C. also locks out VACUUM.  This means that it should be safe for
VACUUM to clean up a redirect inserted by R.C. immediately when it
sees one.

The one problem is that VACUUM's test

        GlobalVisTestIsRemovableXid(vistest, dt->xid))

would fail on invalid xid, mainly because FullXidRelativeTo doesn't
work on that.  (It'd get an assertion failure, or produce the wrong
value without assertions.)  We can fix that by adding an explicit
test for invalid xid, which we really need to do anyway because of
the possibility that existing indexes contain such redirects.

So I conclude that we basically just need to remove the failing
assertion in spgFormDeadTuple and instead add a guard for invalid
xid in vacuumRedirectAndPlaceholder.  The attached draft patch
also adds some comments and performs the threatened rename of
myXid to redirectXid.  (I wouldn't do that renaming in back
branches, only HEAD.)

Thoughts?

            regards, tom lane

diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 3f793125f7..76b80146ff 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -358,8 +358,19 @@ initSpGistState(SpGistState *state, Relation index)
     /* Make workspace for constructing dead tuples */
     state->deadTupleStorage = palloc0(SGDTSIZE);

-    /* Set XID to use in redirection tuples */
-    state->myXid = GetTopTransactionIdIfAny();
+    /*
+     * Set horizon XID to use in redirection tuples.  Use our own XID if we
+     * have one, else use InvalidTransactionId.  The latter case can happen in
+     * VACUUM or REINDEX CONCURRENTLY, and in neither case would it be okay to
+     * force an XID to be assigned.  VACUUM won't create any redirection
+     * tuples anyway, but REINDEX CONCURRENTLY can.  Fortunately, REINDEX
+     * CONCURRENTLY doesn't mark the index valid until the end, so there could
+     * never be any concurrent scans "in flight" to a redirection tuple it has
+     * inserted.  And it locks out VACUUM until the end, too.  So it's okay
+     * for VACUUM to immediately expire a redirection tuple that contains an
+     * invalid xid.
+     */
+    state->redirectXid = GetTopTransactionIdIfAny();

     /* Assume we're not in an index build (spgbuild will override) */
     state->isBuild = false;
@@ -1075,8 +1086,7 @@ spgFormDeadTuple(SpGistState *state, int tupstate,
     if (tupstate == SPGIST_REDIRECT)
     {
         ItemPointerSet(&tuple->pointer, blkno, offnum);
-        Assert(TransactionIdIsValid(state->myXid));
-        tuple->xid = state->myXid;
+        tuple->xid = state->redirectXid;
     }
     else
     {
diff --git a/src/backend/access/spgist/spgvacuum.c b/src/backend/access/spgist/spgvacuum.c
index d2e1624924..0da069fd4d 100644
--- a/src/backend/access/spgist/spgvacuum.c
+++ b/src/backend/access/spgist/spgvacuum.c
@@ -188,7 +188,9 @@ vacuumLeafPage(spgBulkDeleteState *bds, Relation index, Buffer buffer,

             /*
              * Add target TID to pending list if the redirection could have
-             * happened since VACUUM started.
+             * happened since VACUUM started.  (If xid is invalid, assume it
+             * must have happened before VACUUM started, since REINDEX
+             * CONCURRENTLY locks out VACUUM.)
              *
              * Note: we could make a tighter test by seeing if the xid is
              * "running" according to the active snapshot; but snapmgr.c
@@ -523,8 +525,17 @@ vacuumRedirectAndPlaceholder(Relation index, Relation heaprel, Buffer buffer)

         dt = (SpGistDeadTuple) PageGetItem(page, PageGetItemId(page, i));

+        /*
+         * We can convert a REDIRECT to a PLACEHOLDER if there could no longer
+         * be any index scans "in flight" to it.  Such an index scan would
+         * have to be in a transaction whose snapshot sees the REDIRECT's XID
+         * as still running, so comparing the XID against global xmin is a
+         * conservatively safe test.  If the XID is invalid, it must have been
+         * inserted by REINDEX CONCURRENTLY, so we can zap it immediately.
+         */
         if (dt->tupstate == SPGIST_REDIRECT &&
-            GlobalVisTestIsRemovableXid(vistest, dt->xid))
+            (!TransactionIdIsValid(dt->xid) ||
+             GlobalVisTestIsRemovableXid(vistest, dt->xid)))
         {
             dt->tupstate = SPGIST_PLACEHOLDER;
             Assert(opaque->nRedirection > 0);
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index 11d006998e..4e31d33e70 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -36,7 +36,7 @@ fillFakeState(SpGistState *state, spgxlogState stateSrc)
 {
     memset(state, 0, sizeof(*state));

-    state->myXid = stateSrc.myXid;
+    state->redirectXid = stateSrc.redirectXid;
     state->isBuild = stateSrc.isBuild;
     state->deadTupleStorage = palloc0(SGDTSIZE);
 }
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
index 2e9c757b30..e7cbe10a89 100644
--- a/src/include/access/spgist_private.h
+++ b/src/include/access/spgist_private.h
@@ -157,7 +157,7 @@ typedef struct SpGistState

     char       *deadTupleStorage;    /* workspace for spgFormDeadTuple */

-    TransactionId myXid;        /* XID to use when creating a redirect tuple */
+    TransactionId redirectXid;    /* XID to use when creating a redirect tuple */
     bool        isBuild;        /* true if doing index build */
 } SpGistState;

@@ -421,7 +421,8 @@ typedef struct SpGistLeafTupleData
  * field, to satisfy some Asserts that we make when replacing a leaf tuple
  * with a dead tuple.
  * We don't use t_info, but it's needed to align the pointer field.
- * pointer and xid are only valid when tupstate = REDIRECT.
+ * pointer and xid are only valid when tupstate = REDIRECT, and in some
+ * cases xid can be InvalidTransactionId even then; see initSpGistState.
  */
 typedef struct SpGistDeadTupleData
 {
@@ -464,7 +465,7 @@ typedef SpGistDeadTupleData *SpGistDeadTuple;

 #define STORE_STATE(s, d)  \
     do { \
-        (d).myXid = (s)->myXid; \
+        (d).redirectXid = (s)->redirectXid; \
         (d).isBuild = (s)->isBuild; \
     } while(0)

diff --git a/src/include/access/spgxlog.h b/src/include/access/spgxlog.h
index a08f0dc8d5..16cc735001 100644
--- a/src/include/access/spgxlog.h
+++ b/src/include/access/spgxlog.h
@@ -35,7 +35,7 @@
  */
 typedef struct spgxlogState
 {
-    TransactionId myXid;
+    TransactionId redirectXid;
     bool        isBuild;
 } spgxlogState;


pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #18512: Backend memory leak when using command parameters after generating notifications
Next
From: Tom Lane
Date:
Subject: Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)")