Re: generate syscache info automatically - Mailing list pgsql-hackers
From | John Naylor |
---|---|
Subject | Re: generate syscache info automatically |
Date | |
Msg-id | CAFBsxsEY5tYGpJ3OPEtFd_cW-=DP5WXG3tsCo+f+mAoV+WU8hw@mail.gmail.com Whole thread Raw |
In response to | Re: generate syscache info automatically (Peter Eisentraut <peter@eisentraut.org>) |
Responses |
Re: generate syscache info automatically
|
List | pgsql-hackers |
On Thu, Aug 31, 2023 at 6:23 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> Attached is an updated patch set.
> There was some discussion about whether the catalog files are the right
> place to keep syscache information. Personally, I would find that more
> convenient. (Note that we also recently moved the definitions of
> indexes and toast tables from files with the whole list into the various
> catalog files.) But we can also keep it somewhere else. The important
> thing is that Catalog.pm can find it somewhere in a structured form.
I don't have anything further to say on how to fit it together, but I'll go ahead share some other comments from a quick read of v3-0003:
+ # XXX hardcoded exceptions
+ # extension doesn't belong to extnamespace
+ $attnum_namespace = undef if $catname eq 'pg_extension';
+ # pg_database owner is spelled datdba
+ $attnum_owner = "Anum_pg_database_datdba" if $catname eq 'pg_database';
These exceptions seem like true exceptions...
+ # XXX?
+ $name_syscache = "SUBSCRIPTIONNAME" if $catname eq 'pg_subscription';
+ # XXX?
+ $is_nsp_name_unique = 1 if $catname eq 'pg_collation';
+ $is_nsp_name_unique = 1 if $catname eq 'pg_opclass';
+ $is_nsp_name_unique = 1 if $catname eq 'pg_opfamily';
+ $is_nsp_name_unique = 1 if $catname eq 'pg_subscription';
... but as the XXX conveys, these indicate a failure to do the right thing. Trying to derive this info from the index declarations is currently fragile. E.g. making one small change to this regex:
- if ($index->{index_decl} =~ /\(\w+name name_ops(, \w+namespace oid_ops)?\)/)
+ if ($index->{index_decl} =~ /\w+name name_ops(, \w+namespace oid_ops)?\)/)
...causes "is_nsp_name_unique" to flip from false to true, and/or sets "name_catcache_id" where it wasn't before, for several entries. It's not quite clear what the "rule" is intended to be, or whether it's robust going forward.
That being the case, perhaps some effort should also be made to make it easy to verify the output matches HEAD (or rather, v3-0001). This is now difficult because of the C99 ".foo = bar" syntax, plus the alphabetical ordering.
+foreach my $catname (@catnames)
+{
+ print $objectproperty_info_fh qq{#include "catalog/${catname}_d.h"\n};
+}
Assuming we have a better way to know which catalogs need object properties, is there a todo to only #include those?
> To finish up the ObjectProperty generation, we also need to store some
> more data, namely the OBJECT_* mappings. We probably do not want to
> store those in the catalog headers; that looks like a layering violation
> to me.
Possibly, but it's also convenient and, one could argue, more straightforward than storing syscache id and num_buckets in the index info.
One last thing: I did try to make the hash function use uint16 for the key (to reduce loop iterations on nul bytes), and that didn't help, so we are left with the ideas I mentioned earlier.
(not changing CF status, because nothing specific is really required to change at the moment, some things up in the air)
> There was some discussion about whether the catalog files are the right
> place to keep syscache information. Personally, I would find that more
> convenient. (Note that we also recently moved the definitions of
> indexes and toast tables from files with the whole list into the various
> catalog files.) But we can also keep it somewhere else. The important
> thing is that Catalog.pm can find it somewhere in a structured form.
I don't have anything further to say on how to fit it together, but I'll go ahead share some other comments from a quick read of v3-0003:
+ # XXX hardcoded exceptions
+ # extension doesn't belong to extnamespace
+ $attnum_namespace = undef if $catname eq 'pg_extension';
+ # pg_database owner is spelled datdba
+ $attnum_owner = "Anum_pg_database_datdba" if $catname eq 'pg_database';
These exceptions seem like true exceptions...
+ # XXX?
+ $name_syscache = "SUBSCRIPTIONNAME" if $catname eq 'pg_subscription';
+ # XXX?
+ $is_nsp_name_unique = 1 if $catname eq 'pg_collation';
+ $is_nsp_name_unique = 1 if $catname eq 'pg_opclass';
+ $is_nsp_name_unique = 1 if $catname eq 'pg_opfamily';
+ $is_nsp_name_unique = 1 if $catname eq 'pg_subscription';
... but as the XXX conveys, these indicate a failure to do the right thing. Trying to derive this info from the index declarations is currently fragile. E.g. making one small change to this regex:
- if ($index->{index_decl} =~ /\(\w+name name_ops(, \w+namespace oid_ops)?\)/)
+ if ($index->{index_decl} =~ /\w+name name_ops(, \w+namespace oid_ops)?\)/)
...causes "is_nsp_name_unique" to flip from false to true, and/or sets "name_catcache_id" where it wasn't before, for several entries. It's not quite clear what the "rule" is intended to be, or whether it's robust going forward.
That being the case, perhaps some effort should also be made to make it easy to verify the output matches HEAD (or rather, v3-0001). This is now difficult because of the C99 ".foo = bar" syntax, plus the alphabetical ordering.
+foreach my $catname (@catnames)
+{
+ print $objectproperty_info_fh qq{#include "catalog/${catname}_d.h"\n};
+}
Assuming we have a better way to know which catalogs need object properties, is there a todo to only #include those?
> To finish up the ObjectProperty generation, we also need to store some
> more data, namely the OBJECT_* mappings. We probably do not want to
> store those in the catalog headers; that looks like a layering violation
> to me.
Possibly, but it's also convenient and, one could argue, more straightforward than storing syscache id and num_buckets in the index info.
One last thing: I did try to make the hash function use uint16 for the key (to reduce loop iterations on nul bytes), and that didn't help, so we are left with the ideas I mentioned earlier.
(not changing CF status, because nothing specific is really required to change at the moment, some things up in the air)
pgsql-hackers by date: