Re: generate syscache info automatically - Mailing list pgsql-hackers
From | Peter Eisentraut |
---|---|
Subject | Re: generate syscache info automatically |
Date | |
Msg-id | 9aea7c02-ca6f-4cf0-0e5a-bcc0fd959b7c@eisentraut.org Whole thread Raw |
In response to | Re: generate syscache info automatically (Peter Eisentraut <peter@eisentraut.org>) |
Responses |
Re: generate syscache info automatically
Re: generate syscache info automatically |
List | pgsql-hackers |
I have committed 0002 and 0003, and also a small bug fix in the ObjectProperty entry for "transforms". I have also gotten the automatic generation of the ObjectProperty lookup table working (with some warts). Attached is an updated patch set. One win here is that the ObjectProperty lookup now uses a hash function instead of a linear search. So hopefully the performance is better (to be confirmed) and it could now be used for things like [0]. [0]: https://www.postgresql.org/message-id/flat/12f1b1d2-f8cf-c4a2-72ec-441bd79546cb@enterprisedb.com There was some discussion about whether the catalog files are the right place to keep syscache information. Personally, I would find that more convenient. (Note that we also recently moved the definitions of indexes and toast tables from files with the whole list into the various catalog files.) But we can also keep it somewhere else. The important thing is that Catalog.pm can find it somewhere in a structured form. To finish up the ObjectProperty generation, we also need to store some more data, namely the OBJECT_* mappings. We probably do not want to store those in the catalog headers; that looks like a layering violation to me. So, there is some discussion to be had about where to put all this across various use cases. On 24.08.23 16:03, Peter Eisentraut wrote: > On 03.07.23 07:45, Peter Eisentraut wrote: >> Here is an updated patch set that adjusts for the recently introduced >> named captures. I will address the other issues later. I think the >> first few patches in the series can be considered nonetheless. > > I have committed the 0001 patch, which was really a (code comment) bug fix. > > I think the patches 0002 and 0003 should be uncontroversial, so I'd like > to commit them if no one objects. > > The remaining patches are still WIP. > > >> On 15.06.23 09:45, John Naylor wrote: >>> On Wed, May 31, 2023 at 4:58 AM Peter Eisentraut >>> <peter@eisentraut.org <mailto:peter@eisentraut.org>> wrote: >>> > >>> > I want to report on my on-the-plane-to-PGCon project. >>> > >>> > The idea was mentioned in [0]. genbki.pl <http://genbki.pl> >>> already knows everything about >>> > system catalog indexes. If we add a "please also make a syscache for >>> > this one" flag to the catalog metadata, we can have genbki.pl >>> <http://genbki.pl> produce >>> > the tables in syscache.c and syscache.h automatically. >>> > >>> > Aside from avoiding the cumbersome editing of those tables, I >>> think this >>> > layout is also conceptually cleaner, as you can more easily see which >>> > system catalog indexes have syscaches and maybe ask questions >>> about why >>> > or why not. >>> >>> When this has come up before, one objection was that index >>> declarations shouldn't know about cache names and bucket sizes [1]. >>> The second paragraph above makes a reasonable case for that, however. >>> I believe one alternative idea was for a script to read the enum, >>> which would look something like this: >>> >>> #define DECLARE_SYSCACHE(cacheid,indexname,numbuckets) cacheid >>> >>> enum SysCacheIdentifier >>> { >>> DECLARE_SYSCACHE(AGGFNOID, pg_aggregate_fnoid_index, 16) = 0, >>> ... >>> }; >>> >>> ...which would then look up the other info in the usual way from >>> Catalog.pm. >>> >>> > As a possible follow-up, I have also started work on generating the >>> > ObjectProperty structure in objectaddress.c. One of the things >>> you need >>> > for that is making genbki.pl <http://genbki.pl> aware of the >>> syscache information. There >>> > is some more work to be done there, but it's looking promising. >>> >>> I haven't studied this, but it seems interesting. >>> >>> One other possible improvement: syscache.c has a bunch of #include's, >>> one for each catalog with a cache, so there's still a bit of manual >>> work in adding a cache, and the current #include list is a bit >>> cumbersome. Perhaps it's worth it to have the script emit them as well? >>> >>> I also wonder if at some point it will make sense to split off a >>> separate script(s) for some things that are unrelated to the >>> bootstrap data. genbki.pl <http://genbki.pl> is getting pretty large, >>> and there are additional things that could be done with syscaches, >>> e.g. inlined eq/hash functions for cache lookup [2]. >>> >>> [1] >>> https://www.postgresql.org/message-id/12460.1570734874@sss.pgh.pa.us >>> <https://www.postgresql.org/message-id/12460.1570734874@sss.pgh.pa.us> >>> [2] >>> https://www.postgresql.org/message-id/20210831205906.4wk3s4lvgzkdaqpi%40alap3.anarazel.de <https://www.postgresql.org/message-id/20210831205906.4wk3s4lvgzkdaqpi%40alap3.anarazel.de> >>> >>> -- >>> John Naylor >>> EDB: http://www.enterprisedb.com <http://www.enterprisedb.com> > > >
Attachment
pgsql-hackers by date: