Re: BUG #18815: Logical replication worker Segmentation fault - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: BUG #18815: Logical replication worker Segmentation fault |
Date | |
Msg-id | 1761873.1739911114@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: BUG #18815: Logical replication worker Segmentation fault
|
List | pgsql-hackers |
[ redirecting to -hackers for patch discussion ] Here's a completed set of patches for bug #18815 [1] (which, briefly, is that the logrep worker opens/closes indexes multiple times and thereby breaks brininsertcleanup). 0001 adds a subscriber-side BRIN index to 013_partition.pl, which replicates the crash Sergey reported. I've got mixed feelings about whether to actually commit this: it's certainly useful for testing this fix, but without context it looks like a pretty random thing to do. It could be justified perhaps on the grounds of testing aminsertcleanup callbacks within replication, since BRIN is the only core index type that has such a callback. 0002 fixes the crash by hardening brininsertcleanup against multiple calls. I think that this is patching a symptom not the root cause, but it still seems like good defensive programming. 0003 adds Asserts to ExecOpenIndices and ExecCloseIndices to check that they're not called more than once per ResultRelInfo, and thereby exposes what I consider the root cause: apply_handle_tuple_routing opens the indexes twice and then closes them twice. This somewhat accidentally leaves us with zero refcounts and zero locks on the indexes, so in the absence of aminsertcleanup callbacks the only real reason to complain is the duplicative construction of the IndexInfo structures. But the double call of brininsertcleanup breaks the existing coding of that function, and it might well break third-party aminsertcleanup callbacks if there are any. So I think we should institute a policy that this coding pattern is forbidden. And finally, 0004 fixes worker.c to not do that. This turned out simpler than I thought, because I was misled by believing that the ExecOpenIndices/ExecCloseIndices calls in apply_handle_tuple_routing itself do something useful. They don't, and should be nuked outright. I don't intend 0003 for back-patch, but the rest of this has to go back as far as the bug can be observed, which I didn't test yet. To be clear, 0004 would fix the issue even without 0002, but I still think we should back-patch both. regards, tom lane [1] https://www.postgresql.org/message-id/flat/18815-2a0407cc7f40b327%40postgresql.org From 70071453e273e676267a58b6ad31a0886b0fec13 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Tue, 18 Feb 2025 12:18:05 -0500 Subject: [PATCH v1 1/4] Demonstrate crash in brininsertcleanup during logical replication. A cross-partition update crashes if the subscriber's table has a BRIN index. This is easily demonstrated by adding such an index to 013_partition.pl. I'm not sure if we actually want to include this in the final patchset, so it's a separate patch for now. Bug: #18815 Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com> Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org Backpatch-through: TBD --- src/test/subscription/t/013_partition.pl | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl index 14a3beae6e..c1f3ef781b 100644 --- a/src/test/subscription/t/013_partition.pl +++ b/src/test/subscription/t/013_partition.pl @@ -49,6 +49,9 @@ $node_publisher->safe_psql('postgres', $node_subscriber1->safe_psql('postgres', "CREATE TABLE tab1 (c text, a int PRIMARY KEY, b text) PARTITION BY LIST (a)" ); +$node_subscriber1->safe_psql('postgres', + "CREATE INDEX ON tab1 USING brin (c)" +); $node_subscriber1->safe_psql('postgres', "CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)" ); -- 2.43.5 From badeb15ca1261a3681603ab7fe0a3898a640fdfc Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Tue, 18 Feb 2025 12:25:49 -0500 Subject: [PATCH v1 2/4] Harden brininsertcleanup against repeat calls. Leaving an empty BrinInsertState struct linked in ii_AmCache confuses both brininsertcleanup itself and brininsert, were that to be called again. We should fully remove it, instead. This stops the crash reported in bug #18815. I think it's fixing the symptom rather than the root cause, but nonetheless there's little argument for brininsertcleanup to be leaving an obviously broken data structure behind. Bug: #18815 Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com> Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org Backpatch-through: TBD --- src/backend/access/brin/brin.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c index 4265687afa..60320440fc 100644 --- a/src/backend/access/brin/brin.c +++ b/src/backend/access/brin/brin.c @@ -511,16 +511,18 @@ brininsertcleanup(Relation index, IndexInfo *indexInfo) BrinInsertState *bistate = (BrinInsertState *) indexInfo->ii_AmCache; /* bail out if cache not initialized */ - if (indexInfo->ii_AmCache == NULL) + if (bistate == NULL) return; + /* do this first to avoid dangling pointer if we fail partway through */ + indexInfo->ii_AmCache = NULL; + /* * Clean up the revmap. Note that the brinDesc has already been cleaned up * as part of its own memory context. */ brinRevmapTerminate(bistate->bis_rmAccess); - bistate->bis_rmAccess = NULL; - bistate->bis_desc = NULL; + pfree(bistate); } /* -- 2.43.5 From 484115dc9598cc496040ef0058239f8f61fea457 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Tue, 18 Feb 2025 12:47:57 -0500 Subject: [PATCH v1 3/4] Assert that ExecOpenIndices and ExecCloseIndices are not repeated. These functions should be called at most once per ResultRelInfo; it's wasteful to do otherwise, and certainly the pattern of opening twice and then closing twice is a bad idea. Moreover, index_insert_cleanup functions might not be prepared to be called twice, as the just-hardened code in BRIN demonstrates. Sadly, logical replication's apply_handle_tuple_routing() does exactly that, meaning that either hunk of this patch is sufficient to make it crash. While other patches in this series are back-patch candidates, this one should only be applied to HEAD, as perhaps there are extensions doing the now-forbidden coding pattern. We shouldn't break them in minor releases. Bug: #18815 Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com> Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org --- src/backend/executor/execIndexing.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c index 7c87f012c3..742f3f8c08 100644 --- a/src/backend/executor/execIndexing.c +++ b/src/backend/executor/execIndexing.c @@ -181,6 +181,9 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative) if (len == 0) return; + /* This Assert will fail if ExecOpenIndices is called twice */ + Assert(resultRelInfo->ri_IndexRelationDescs == NULL); + /* * allocate space for result arrays */ @@ -246,19 +249,23 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo) for (i = 0; i < numIndices; i++) { - if (indexDescs[i] == NULL) - continue; /* shouldn't happen? */ + /* This Assert will fail if ExecCloseIndices is called twice */ + Assert(indexDescs[i] != NULL); /* Give the index a chance to do some post-insert cleanup */ index_insert_cleanup(indexDescs[i], indexInfos[i]); /* Drop lock acquired by ExecOpenIndices */ index_close(indexDescs[i], RowExclusiveLock); + + /* Mark the index as closed */ + indexDescs[i] = NULL; } /* - * XXX should free indexInfo array here too? Currently we assume that - * such stuff will be cleaned up automatically in FreeExecutorState. + * We don't attempt to free the IndexInfo data structures or the arrays, + * instead assuming that such stuff will be cleaned up automatically in + * FreeExecutorState. */ } -- 2.43.5 From 71a286cd08c840f438bf4ec71f59d1ac4615662e Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Tue, 18 Feb 2025 15:09:19 -0500 Subject: [PATCH v1 4/4] Avoid duplicate ExecOpenIndices/ExecCloseIndices in logrep worker. During a cross-partition update, apply_handle_tuple_routing calls ExecFindPartition which does ExecOpenIndices (and expects that ExecCleanupTupleRouting will close the indexes again). Therefore, we must avoid re-opening/closing the table's indexes when the update code path calls apply_handle_insert_internal or apply_handle_delete_internal. That leaves us with only one caller of each function that needs indexes to be opened/closed, so we can just move those calls to the callers. Also, remove the entirely-duplicative open/close calls within apply_handle_tuple_routing itself. Bug: #18815 Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com> Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org Backpatch-through: TBD --- src/backend/replication/logical/worker.c | 30 +++++++++++++----------- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index f09ab41c60..a79c80e656 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -2454,8 +2454,13 @@ apply_handle_insert(StringInfo s) apply_handle_tuple_routing(edata, remoteslot, NULL, CMD_INSERT); else - apply_handle_insert_internal(edata, edata->targetRelInfo, - remoteslot); + { + ResultRelInfo *relinfo = edata->targetRelInfo; + + ExecOpenIndices(relinfo, true); + apply_handle_insert_internal(edata, relinfo, remoteslot); + ExecCloseIndices(relinfo); + } finish_edata(edata); @@ -2482,16 +2487,13 @@ apply_handle_insert_internal(ApplyExecutionData *edata, { EState *estate = edata->estate; - /* We must open indexes here. */ - ExecOpenIndices(relinfo, true); + /* Caller will not have done this bit. */ + Assert(relinfo->ri_onConflictArbiterIndexes == NIL); InitConflictIndexes(relinfo); /* Do the insert. */ TargetPrivilegesCheck(relinfo->ri_RelationDesc, ACL_INSERT); ExecSimpleRelationInsert(relinfo, estate, remoteslot); - - /* Cleanup. */ - ExecCloseIndices(relinfo); } /* @@ -2816,8 +2818,14 @@ apply_handle_delete(StringInfo s) apply_handle_tuple_routing(edata, remoteslot, NULL, CMD_DELETE); else - apply_handle_delete_internal(edata, edata->targetRelInfo, + { + ResultRelInfo *relinfo = edata->targetRelInfo; + + ExecOpenIndices(relinfo, false); + apply_handle_delete_internal(edata, relinfo, remoteslot, rel->localindexoid); + ExecCloseIndices(relinfo); + } finish_edata(edata); @@ -2851,7 +2859,6 @@ apply_handle_delete_internal(ApplyExecutionData *edata, bool found; EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL); - ExecOpenIndices(relinfo, false); found = FindReplTupleInLocalRel(edata, localrel, remoterel, localindexoid, remoteslot, &localslot); @@ -2892,7 +2899,6 @@ apply_handle_delete_internal(ApplyExecutionData *edata, } /* Cleanup. */ - ExecCloseIndices(relinfo); EvalPlanQualEnd(&epqstate); } @@ -3131,7 +3137,6 @@ apply_handle_tuple_routing(ApplyExecutionData *edata, * work already done above to find the local tuple in the * partition. */ - ExecOpenIndices(partrelinfo, true); InitConflictIndexes(partrelinfo); EvalPlanQualSetSlot(&epqstate, remoteslot_part); @@ -3181,8 +3186,6 @@ apply_handle_tuple_routing(ApplyExecutionData *edata, get_namespace_name(RelationGetNamespace(partrel_new)), RelationGetRelationName(partrel_new)); - ExecOpenIndices(partrelinfo, false); - /* DELETE old tuple found in the old partition. */ EvalPlanQualSetSlot(&epqstate, localslot); TargetPrivilegesCheck(partrelinfo->ri_RelationDesc, ACL_DELETE); @@ -3217,7 +3220,6 @@ apply_handle_tuple_routing(ApplyExecutionData *edata, remoteslot_part); } - ExecCloseIndices(partrelinfo); EvalPlanQualEnd(&epqstate); } break; -- 2.43.5
pgsql-hackers by date: