Re: ResourceOwner refactoring - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: ResourceOwner refactoring
Date
Msg-id 9a3245db-ca03-4a47-5706-73404cfe5108@iki.fi
Whole thread Raw
In response to Re: ResourceOwner refactoring  (Andres Freund <andres@anarazel.de>)
Responses Re: ResourceOwner refactoring
Re: ResourceOwner refactoring
List pgsql-hackers
On 01/11/2022 02:15, Andres Freund wrote:
> On 2022-10-31 10:51:36 +0100, Heikki Linnakangas wrote:
>> These are functions where quite a lot of things happen between the
>> ResourceOwnerEnlarge and ResourceOwnerRemember calls. It's important that
>> there are no unrelated ResourceOwnerRemember() calls in the code in
>> between, otherwise the entry reserved by the ResourceOwnerEnlarge() call
>> might be used up by the intervening ResourceOwnerRemember() and not be
>> available at the intended ResourceOwnerRemember() call anymore. The longer
>> the code path between them is, the harder it is to verify that.
> 
> This seems to work towards a future where only one kind of resource can be
> reserved ahead of time. That doesn't strike me as great.

True. It is enough for all the current callers AFAICS, though. I don't 
remember wanting to reserve multiple resources at the same time. 
Usually, the distance between the Enlarge and Remember calls is very 
short. If it's not, it's error-prone anyway, so we should try to keep 
the distance short.

While we're at it, who says that it's enough to reserve space for only 
one resource of a kind either? For example, what if you want to pin two 
buffers?

If we really need to support that, I propose something like this:

/*
  * Reserve a slot in the resource owner.
  *
  * This is separate from actually inserting an entry because if we run out
  * of memory, it's critical to do so *before* acquiring the resource.
  */
ResOwnerReservation *
ResourceOwnerReserve(ResourceOwner owner)

/*
  * Remember that an object is owner by a ResourceOwner
  *
  * 'reservation' is a slot in the resource owner that was pre-reserved
  * by ResOwnerReservation.
  */
void
ResOwnerRemember(ResOwnerReservaton *reservation, Datum value, 
ResourceOwnerFuncs *kind)


>> Instead of having a separate array/hash for each resource kind, use a
>> single array and hash to hold all kinds of resources. This makes it
>> possible to introduce new resource "kinds" without having to modify the
>> ResourceOwnerData struct. In particular, this makes it possible for
>> extensions to register custom resource kinds.
> 
> As a goal I like this.
> 
> However, I'm not quite sold on the implementation. Two main worries:
> 
> 1) As far as I can tell, the way ResourceOwnerReleaseAll() now works seems to
>     assume that within a phase the reset order does not matter. I don't think
>     that's a good assumption. I e.g. have a patch to replace InProgressBuf with
>     resowner handling, and in-progress IO errors should be processed before
>     before pins are released.

Hmm. Currently, you're not supposed to hold any resources at commit. You 
get warnings about resource leaks if a resource owner is not empty on 
ResourceOwnerReleaseAll(). On abort, does the order matter? I'm not 
familiar with your InProgressBuf patch, but I guess you could handle the 
in-progress IO errors in ReleaseBuffer().

If we do need to worry about release order, perhaps add a "priority" or 
"phase" to each resource kind, and release them in priority order. We 
already have before- and after-locks as phases, but we could generalize 
that.

However, I feel that trying to enforce a particular order moves the 
goalposts. If we need that, let's add it as a separate patch later.

> 2) There's quite a few resource types where we actually don't need an entry in
>     an array, because we can instead add a dlist_node to the resource -
>     avoiding memory overhead and making removal cheap. I have a few pending
>     patches that use that approach, and this doesn't really provide a path for
>     that anymore.

Is that materially better than using the array? The fast path with an 
array is very fast. If it is better, perhaps we should bite the bullet 
and require a dlist node and use that mechanism for all resource types?

> I did try out the benchmark from
> https://postgr.es/m/20221029200025.w7bvlgvamjfo6z44%40awork3.anarazel.de and
> the patches performed well, slightly better than my approach of allocating
> some initial memory for each resarray.

Thank you, glad to hear that!

- Heikki




pgsql-hackers by date:

Previous
From: "Anton A. Melnikov"
Date:
Subject: Re: [PATCH] Backport perl tests for pg_upgrade from 322becb60
Next
From: Peter Eisentraut
Date:
Subject: Re: real/float example for testlibpq3