Re: generate syscache info automatically - Mailing list pgsql-hackers

From John Naylor
Subject Re: generate syscache info automatically
Msg-id CAFBsxsHMaZ9yR7p=JpzdC2ynKwkDb1PrEr=Q5M__6cmhbbt2iQ@mail.gmail.com
In response to generate syscache info automatically  (Peter Eisentraut <peter@eisentraut.org>)
List pgsql-hackers
On Wed, May 31, 2023 at 4:58 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> I want to report on my on-the-plane-to-PGCon project.
>
> The idea was mentioned in [0].  genbki.pl already knows everything about
> system catalog indexes.  If we add a "please also make a syscache for
> this one" flag to the catalog metadata, we can have genbki.pl produce
> the tables in syscache.c and syscache.h automatically.
>
> Aside from avoiding the cumbersome editing of those tables, I think this
> layout is also conceptually cleaner, as you can more easily see which
> system catalog indexes have syscaches and maybe ask questions about why
> or why not.

When this has come up before, one objection was that index declarations shouldn't know about cache names and bucket sizes [1]. That said, the second paragraph quoted above makes a reasonable case for putting them there. I believe one alternative idea was for a script to read the enum, which would look something like this:

#define DECLARE_SYSCACHE(cacheid,indexname,numbuckets) cacheid

enum SysCacheIdentifier
{
    DECLARE_SYSCACHE(AGGFNOID, pg_aggregate_fnoid_index, 16) = 0,
    ...
};

...which would then look up the other info in the usual way from Catalog.pm.
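
To make the macro trick concrete, here is a minimal compilable sketch. It assumes the macro simply discards its second and third arguments, so the compiler sees an ordinary enum while a script can still scrape the index name and bucket count from the source text. The AMNAME and AMOID entries and the syscache_id_is_valid() helper are only for illustration and are not taken from any patch:

```c
/*
 * Sketch only: the index name and bucket count are metadata for a
 * script to read. The macro drops them, so the index identifiers
 * never need to be declared for this to compile.
 */
#define DECLARE_SYSCACHE(cacheid, indexname, numbuckets) cacheid

enum SysCacheIdentifier
{
    DECLARE_SYSCACHE(AGGFNOID, pg_aggregate_fnoid_index, 16) = 0,
    DECLARE_SYSCACHE(AMNAME, pg_am_name_index, 4),   /* illustrative entry */
    DECLARE_SYSCACHE(AMOID, pg_am_oid_index, 4)      /* illustrative entry */
};

/* Hypothetical helper: true if id falls within the enum range above. */
static int
syscache_id_is_valid(int id)
{
    return id >= AGGFNOID && id <= AMOID;
}
```

After preprocessing, the enum body is just "AGGFNOID = 0, AMNAME, AMOID", so existing callers that use the enum values would not need to change.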

> As a possible follow-up, I have also started work on generating the
> ObjectProperty structure in objectaddress.c.  One of the things you need
> for that is making genbki.pl aware of the syscache information.  There
> is some more work to be done there, but it's looking promising.

I haven't studied this, but it seems interesting.

One other possible improvement: syscache.c has a long list of #include directives, one for each catalog with a cache, so adding a cache still takes a bit of manual work, and the current list is cumbersome to maintain. Perhaps it's worth having the script emit them as well?

I also wonder if at some point it will make sense to split off one or more separate scripts for things that are unrelated to the bootstrap data. genbki.pl is getting pretty large, and there are additional things that could be done with syscaches, e.g. inlined eq/hash functions for cache lookup [2].

[1] https://www.postgresql.org/message-id/12460.1570734874@sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/20210831205906.4wk3s4lvgzkdaqpi%40alap3.anarazel.de

--
John Naylor
EDB: http://www.enterprisedb.com
