Re: generate syscache info automatically - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: generate syscache info automatically
Date
Msg-id 9aea7c02-ca6f-4cf0-0e5a-bcc0fd959b7c@eisentraut.org
Whole thread Raw
In response to Re: generate syscache info automatically  (Peter Eisentraut <peter@eisentraut.org>)
Responses Re: generate syscache info automatically
Re: generate syscache info automatically
List pgsql-hackers
I have committed 0002 and 0003, and also a small bug fix in the 
ObjectProperty entry for "transforms".

I have also gotten the automatic generation of the ObjectProperty lookup 
table working (with some warts).

Attached is an updated patch set.

One win here is that the ObjectProperty lookup now uses a hash function 
instead of a linear search.  So hopefully the performance is better (to 
be confirmed) and it could now be used for things like [0].

[0]: 
https://www.postgresql.org/message-id/flat/12f1b1d2-f8cf-c4a2-72ec-441bd79546cb@enterprisedb.com

There was some discussion about whether the catalog files are the right 
place to keep syscache information.  Personally, I would find that more 
convenient.  (Note that we also recently moved the definitions of 
indexes and toast tables from files with the whole list into the various 
catalog files.)  But we can also keep it somewhere else.  The important 
thing is that Catalog.pm can find it somewhere in a structured form.

To finish up the ObjectProperty generation, we also need to store some 
more data, namely the OBJECT_* mappings.  We probably do not want to 
store those in the catalog headers; that looks like a layering violation 
to me.

So, there is some discussion to be had about where to put all this 
across various use cases.


On 24.08.23 16:03, Peter Eisentraut wrote:
> On 03.07.23 07:45, Peter Eisentraut wrote:
>> Here is an updated patch set that adjusts for the recently introduced 
>> named captures.  I will address the other issues later.  I think the 
>> first few patches in the series can be considered nonetheless.
> 
> I have committed the 0001 patch, which was really a (code comment) bug fix.
> 
> I think the patches 0002 and 0003 should be uncontroversial, so I'd like 
> to commit them if no one objects.
> 
> The remaining patches are still WIP.
> 
> 
>> On 15.06.23 09:45, John Naylor wrote:
>>> On Wed, May 31, 2023 at 4:58 AM Peter Eisentraut 
>>> <peter@eisentraut.org <mailto:peter@eisentraut.org>> wrote:
>>>  >
>>>  > I want to report on my on-the-plane-to-PGCon project.
>>>  >
>>>  > The idea was mentioned in [0]. genbki.pl <http://genbki.pl> 
>>> already knows everything about
>>>  > system catalog indexes.  If we add a "please also make a syscache for
>>>  > this one" flag to the catalog metadata, we can have genbki.pl 
>>> <http://genbki.pl> produce
>>>  > the tables in syscache.c and syscache.h automatically.
>>>  >
>>>  > Aside from avoiding the cumbersome editing of those tables, I 
>>> think this
>>>  > layout is also conceptually cleaner, as you can more easily see which
>>>  > system catalog indexes have syscaches and maybe ask questions 
>>> about why
>>>  > or why not.
>>>
>>> When this has come up before, one objection was that index 
>>> declarations shouldn't know about cache names and bucket sizes [1]. 
>>> The second paragraph above makes a reasonable case for that, however. 
>>> I believe one alternative idea was for a script to read the enum, 
>>> which would look something like this:
>>>
>>> #define DECLARE_SYSCACHE(cacheid,indexname,numbuckets) cacheid
>>>
>>> enum SysCacheIdentifier
>>> {
>>> DECLARE_SYSCACHE(AGGFNOID, pg_aggregate_fnoid_index, 16) = 0,
>>> ...
>>> };
>>>
>>> ...which would then look up the other info in the usual way from 
>>> Catalog.pm.
>>>
>>>  > As a possible follow-up, I have also started work on generating the
>>>  > ObjectProperty structure in objectaddress.c.  One of the things 
>>> you need
>>>  > for that is making genbki.pl <http://genbki.pl> aware of the 
>>> syscache information.  There
>>>  > is some more work to be done there, but it's looking promising.
>>>
>>> I haven't studied this, but it seems interesting.
>>>
>>> One other possible improvement: syscache.c has a bunch of #include's, 
>>> one for each catalog with a cache, so there's still a bit of manual 
>>> work in adding a cache, and the current #include list is a bit 
>>> cumbersome. Perhaps it's worth it to have the script emit them as well?
>>>
>>> I also wonder if at some point it will make sense to split off a 
>>> separate script(s) for some things that are unrelated to the 
>>> bootstrap data. genbki.pl <http://genbki.pl> is getting pretty large, 
>>> and there are additional things that could be done with syscaches, 
>>> e.g. inlined eq/hash functions for cache lookup [2].
>>>
>>> [1] 
>>> https://www.postgresql.org/message-id/12460.1570734874@sss.pgh.pa.us 
>>> <https://www.postgresql.org/message-id/12460.1570734874@sss.pgh.pa.us>
>>> [2] 
>>> https://www.postgresql.org/message-id/20210831205906.4wk3s4lvgzkdaqpi%40alap3.anarazel.de
<https://www.postgresql.org/message-id/20210831205906.4wk3s4lvgzkdaqpi%40alap3.anarazel.de>
>>>
>>> -- 
>>> John Naylor
>>> EDB: http://www.enterprisedb.com <http://www.enterprisedb.com>
> 
> 
>

Attachment

pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: Commitfest 2023-09 starts soon
Next
From: Aleksander Alekseev
Date:
Subject: Re: Commitfest 2023-09 starts soon