Home > mailing lists

Re: ResourceOwner refactoring - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: ResourceOwner refactoring
Date	June 3, 2024 22:49:52
Msg-id	7d9fa5cf-07e3-4691-b40a-802048a0b998@iki.fi Whole thread Raw
In response to	Re: ResourceOwner refactoring (Alexander Lakhin <exclusion@gmail.com>)
Responses	Re: ResourceOwner refactoring
List	pgsql-hackers

Tree view

Thanks for the testing!

On 01/06/2024 11:00, Alexander Lakhin wrote:
> Hello Heikki,
> 
> I've stumbled upon yet another anomaly introduced with b8bff07da.
> 
> Please try the following script:
> SELECT 'init' FROM pg_create_logical_replication_slot('slot',
>     'test_decoding');
> CREATE TABLE tbl(t text);
> BEGIN;
> TRUNCATE table tbl;
> 
> SELECT data FROM pg_logical_slot_get_changes('slot', NULL, NULL,
>    'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
> 
> On b8bff07da (and the current HEAD), it ends up with:
> ERROR:  ResourceOwnerEnlarge called after release started
> 
> But on the previous commit, I get no error:
>    data
> ------
> (0 rows)

This is another case of the issue I tried to fix in commit b8bff07da 
with this:

> --- b/src/backend/access/transam/xact.c
> +++ a/src/backend/access/transam/xact.c
> @@ -5279,7 +5279,20 @@
>  
>          AtEOSubXact_RelationCshould be ache(false, s->subTransactionId,
>                                    s->parent->subTransactionId);
> +
> +
> +        /*
> +         * AtEOSubXact_Inval sometimes needs to temporarily bump the refcount
> +         * on the relcache entries that it processes.  We cannot use the
> +         * subtransaction's resource owner anymore, because we've already
> +         * started releasing it.  But we can use the parent resource owner.
> +         */
> +        CurrentResourceOwner = s->parent->curTransactionOwner;
> +
>          AtEOSubXact_Inval(false);
> +
> +        CurrentResourceOwner = s->curTransactionOwner;
> +
>          ResourceOwnerRelease(s->curTransactionOwner,
>                               RESOURCE_RELEASE_LOCKS,
>                               false, false);

AtEOSubXact_Inval calls LocalExecuteInvalidationMessage(), which 
ultimately calls RelationFlushRelation(). Your test case reaches 
ReorderBufferExecuteInvalidations(), which also calls 
LocalExecuteInvalidationMessage(), in an already aborted subtransaction.

Looking closer at relcache.c, I think we need to make 
RelationFlushRelation() safe to call without a valid resource owner. 
There are already IsTransactionState() checks in 
RelationClearRelation(), and a comment noting that it does not touch 
CurrentResourceOwner. So we should make sure RelationFlushRelation() 
doesn't touch it either, and revert the above change to 
AbortTransaction(). I didn't feel very good about it in the first place, 
and now I see what the proper fix is.

A straightforward fix is to modify RelationFlushRelation() so that if 
!IsTransactionState(), it just marks the entry as invalid instead of 
calling RelationClearRelation(). That's what RelationClearRelation() 
would do anyway, if it didn't hit the assertion first.

RelationClearRelation() is complicated. Depending on the circumstances, 
it removes the relcache entry, rebuilds it, marks it as invalid, or 
performs a partial reload of a nailed relation. So before going for the 
straightforward fix, I'm going to see if I can refactor 
RelationClearRelation() to be less complicated.

-- 
Heikki Linnakangas
Neon (https://neon.tech)

pgsql-hackers by date:

From: Alexander Korotkov
Date: 03 June 2024, 22:47:43
Subject: Re: POC: GROUP BY optimization

From: Michael Paquier
Date: 03 June 2024, 22:58:27
Subject: Re: Test to dump and restore objects left behind by regression

Re: ResourceOwner refactoring - Mailing list pgsql-hackers

Previous

Next