Re: generate syscache info automatically - Mailing list pgsql-hackers

From John Naylor
Subject Re: generate syscache info automatically
Date
Msg-id CAFBsxsEY5tYGpJ3OPEtFd_cW-=DP5WXG3tsCo+f+mAoV+WU8hw@mail.gmail.com
Whole thread Raw
In response to Re: generate syscache info automatically  (Peter Eisentraut <peter@eisentraut.org>)
Responses Re: generate syscache info automatically
List pgsql-hackers

On Thu, Aug 31, 2023 at 6:23 PM Peter Eisentraut <peter@eisentraut.org> wrote:

> Attached is an updated patch set.

> There was some discussion about whether the catalog files are the right
> place to keep syscache information.  Personally, I would find that more
> convenient.  (Note that we also recently moved the definitions of
> indexes and toast tables from files with the whole list into the various
> catalog files.)  But we can also keep it somewhere else.  The important
> thing is that Catalog.pm can find it somewhere in a structured form.

I don't have anything further to say on how to fit it together, but I'll go ahead share some other comments from a quick read of v3-0003:

+ # XXX hardcoded exceptions
+ # extension doesn't belong to extnamespace
+ $attnum_namespace = undef if $catname eq 'pg_extension';
+ # pg_database owner is spelled datdba
+ $attnum_owner = "Anum_pg_database_datdba" if $catname eq 'pg_database';

These exceptions seem like true exceptions...

+ # XXX?
+ $name_syscache = "SUBSCRIPTIONNAME" if $catname eq 'pg_subscription';
+ # XXX?
+ $is_nsp_name_unique = 1 if $catname eq 'pg_collation';
+ $is_nsp_name_unique = 1 if $catname eq 'pg_opclass';
+ $is_nsp_name_unique = 1 if $catname eq 'pg_opfamily';
+ $is_nsp_name_unique = 1 if $catname eq 'pg_subscription';

... but as the XXX conveys, these indicate a failure to do the right thing. Trying to derive this info from the index declarations is currently fragile. E.g. making one small change to this regex:

-               if ($index->{index_decl} =~ /\(\w+name name_ops(, \w+namespace oid_ops)?\)/)
+               if ($index->{index_decl} =~ /\w+name name_ops(, \w+namespace oid_ops)?\)/)

...causes "is_nsp_name_unique" to flip from false to true, and/or sets "name_catcache_id" where it wasn't before, for several entries. It's not quite clear what the "rule" is intended to be, or whether it's robust going forward.

That being the case, perhaps some effort should also be made to make it easy to verify the output matches HEAD (or rather, v3-0001). This is now difficult because of the C99 ".foo = bar" syntax, plus the alphabetical ordering.

+foreach my $catname (@catnames)
+{
+ print $objectproperty_info_fh qq{#include "catalog/${catname}_d.h"\n};
+}

Assuming we have a better way to know which catalogs need object properties, is there a todo to only #include those?

> To finish up the ObjectProperty generation, we also need to store some
> more data, namely the OBJECT_* mappings.  We probably do not want to
> store those in the catalog headers; that looks like a layering violation
> to me.

Possibly, but it's also convenient and, one could argue, more straightforward than storing syscache id and num_buckets in the index info.

One last thing: I did try to make the hash function use uint16 for the key (to reduce loop iterations on nul bytes), and that didn't help, so we are left with the ideas I mentioned earlier.

(not changing CF status, because nothing specific is really required to change at the moment, some things up in the air)

--
John Naylor
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: pg_upgrade and logical replication
Next
From: Nazir Bilal Yavuz
Date:
Subject: Re: Build the docs if there are changes in docs and don't run other tasks if the changes are only in docs