ResourceOwner refactoring - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject ResourceOwner refactoring
Date
Msg-id cbfabeb0-cd3c-e951-a572-19b365ed314d@iki.fi
Whole thread Raw
Responses Re: ResourceOwner refactoring  (Michael Paquier <michael@paquier.xyz>)
Re: ResourceOwner refactoring  (Craig Ringer <craig.ringer@enterprisedb.com>)
List pgsql-hackers
Two recent patches that I have reviewed have reminded me that the 
ResourceOwner interface is a bit clunky. There are two issues:

1. Performance. It's fast enough in most use, but when I was testing 
Kyotaro's catcache patches [1], the Resowner functions to track catcache 
references nevertheless showed up, accounting for about 20% of the CPU 
time used by catcache lookups.

2. It's difficult for extensions to use. There is a callback mechanism 
for extensions, but it's much less convenient to use than the built-in 
functions. The pgcrypto code uses the callbacks currently, and Michael's 
patch [2] would move that support for tracking OpenSSL contexts to the 
core, which makes it a lot more convenient for pgcrypto. Wouldn't it be 
nice if extensions could have the same ergonomics as built-in code, if 
they need to track external resources?

Attached patch refactors the ResourceOwner internals to do that.

The current code in HEAD has a separate array for each "kind" of object 
that can be tracked. The number of different kinds of objects has grown 
over the years, currently there is support for tracking buffers, files, 
catcache, relcache and plancache references, tupledescs, snapshots, DSM 
segments and LLVM JIT contexts. And locks, which are a bit special.

In the patch, I'm using the same array to hold all kinds of objects, and 
carry another field with each tracked object, to tell what kind of an 
object it is. An extension can define a new object kind, by defining 
Release and PrintLeakWarning callback functions for it. The code in 
resowner.c is now much more generic, as it doesn't need to know about 
all the different object kinds anymore (with a couple of exceptions). In 
the new scheme, there is one small fixed-size array to hold a few most 
recently-added references, that is linear-searched, and older entries 
are moved to a hash table.

I haven't done thorough performance testing of this, but with some quick 
testing with Kyotaro's "catcachebench" to stress-test syscache lookups, 
this performs a little better. In that test, all the activity happens in 
the small array portion, but I believe that's true for most use cases.

Thoughts? Can anyone suggest test scenarios to verify the performance of 
this?

[1] 
https://www.postgresql.org/message-id/20201106.172958.1103727352904717607.horikyota.ntt%40gmail.com

[2] https://www.postgresql.org/message-id/20201113031429.GB1631@paquier.xyz

- Heikki

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: logical streaming of xacts via test_decoding is broken
Next
From: torikoshia
Date:
Subject: Re: Is it useful to record whether plans are generic or custom?