Re: BUG #18815: Logical replication worker Segmentation fault - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #18815: Logical replication worker Segmentation fault
Date
Msg-id 1147221.1739843510@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #18815: Logical replication worker Segmentation fault  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
I wrote:
> The problem seems to be that apply_handle_insert_internal does
> ExecOpenIndices and then ExecCloseIndices, and then
> ExecCleanupTupleRouting does ExecCloseIndices again, which nicely
> explains why brininsertcleanup blows up if you happen to have a BRIN
> index involved.  What it doesn't explain is how come we don't see
> other symptoms from the duplicate index_close calls, regardless of
> index type.

Hah, I see it: this bug is specific to the apply_handle_tuple_routing
code path.  The ResultRelInfo we're dealing with is made by
ExecFindPartition, which does ExecOpenIndices on it.  Then
apply_handle_tuple_routing calls apply_handle_insert_internal, which
does ExecOpenIndices *again* on the same ResultRelInfo.  That gets a
second refcount and second lock on the index(es), and leaks all the
IndexInfo data structures made by the first call.  When done,
apply_handle_insert_internal does ExecCloseIndices, releasing one
refcount and lock.  Then, back in apply_handle_tuple_routing, we do
ExecCloseIndices again.  So the refcounts and locks balance out, and
it very accidentally fails to crash, so long as nobody is expecting
the contents of the IndexInfos to match up.

The "Move the tuple into the new partition" path in
apply_handle_tuple_routing's UPDATE case is even squirrelier.  That
does another ExecFindPartition and then apply_handle_insert_internal,
but there's no ExecCloseIndices call balancing the ExecFindPartition,
meaning it looks like it's leaking refcount and lock.  Experimentation
shows that it's not, because the matching ExecCloseIndices is
eventually done by the ExecCleanupTupleRouting call in finish_edata
after returning to the outer apply_handle_update.

So if you ask me, this is impossibly convoluted and in need of a
significant rewrite.  It's not okay to be opening/closing the
ResultRelInfo's indexes twice (not least because "speculative"
is different in the two open calls).  And it's not okay to have
such radically different paths for cleaning up the index-opening
done by ExecFindPartition.  It's even more not okay that there's
not one word of commentary about this complexity, strongly suggesting
that the original author didn't quite understand how it worked either.

            regards, tom lane



pgsql-bugs by date:

Previous
From: Chris BSomething
Date:
Subject: Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE
Next
From: Tom Lane
Date:
Subject: Re: BUG #18594: CASE WHEN ELSE failing to return the expected output when the same colum is used in WHEN and ELSE