Thread: evtfoid and evtowner missing in findoidjoins/README

evtfoid and evtowner missing in findoidjoins/README

From
"Joel Jacobson"
Date:
The below references are already properly documented in


but missing in src/tools/findoidjoins/README.

Join pg_catalog.pg_event_trigger.evtowner => pg_catalog.pg_authid.oid
Join pg_catalog.pg_event_trigger.evtfoid => pg_catalog.pg_proc.oid

I'm not sure what the process of updating the README is,
the git log seems to indicate this is usually part of the release cycle,
so perhaps it's OK this file is out-of-sync between releases?

But if so, that won't explain these two, since they have been around for ages.

/Joel

Re: evtfoid and evtowner missing in findoidjoins/README

From
Tom Lane
Date:
"Joel Jacobson" <joel@compiler.org> writes:
> The below references are already properly documented in
>    https://www.postgresql.org/docs/current/catalog-pg-event-trigger.html
> but missing in src/tools/findoidjoins/README.
> Join pg_catalog.pg_event_trigger.evtowner => pg_catalog.pg_authid.oid
> Join pg_catalog.pg_event_trigger.evtfoid => pg_catalog.pg_proc.oid

Yup, no surprise given the way findoidjoins works: it could only
discover those relationships if pg_event_trigger had some entries in
the end state of the regression database.  Of course it doesn't, and
I'd be against leaving a live event trigger in place in that DB.
(I suspect there are other similar gaps in the coverage.)

I kind of wonder whether findoidjoins has outlived its purpose and
we should just remove it (along with the oidjoins test script).
IMO it was intended to find mistakes in the initial catalog data,
but given the current policy that the .dat files shall not contain
numeric OID references, that type of mistake is impossible anymore.
Certainly, it's been so long since that test script has caught
anything that it doesn't seem worth the annual-or-so maintenance
effort to update it.

A different line of thought would be to try to teach findoidjoins
to scrape info about catalog references out of catalogs.sgml, and
use that instead of or in addition to its current methods.  That
seems like a fair amount of work though, and again I can't get
excited that it'd be worth the trouble.

Also, I recall mutterings on -hackers about adding foreign-key
entries to pg_constraint to document the catalog reference
relationships.  (In my possibly-faulty recollection, the idea
was that these'd only be documentation and would lack enforcement
triggers; but perhaps we'd allow the planner to trust them for
purposes of optimizing multi-catalog queries.)  If we had those,
we could make findoidjoins use them instead of trawling the data,
or maybe throw away findoidjoins per se and let the oidjoins.sql
script read the FK entries to find out what to check dynamically.

            regards, tom lane



Re: evtfoid and evtowner missing in findoidjoins/README

From
"Joel Jacobson"
Date:
On Sun, Jan 17, 2021, at 18:16, Tom Lane wrote:
> I kind of wonder whether findoidjoins has outlived its purpose and
> we should just remove it (along with the oidjoins test script).

A bit of background:
I'm working on an extension where I need SQL access to this reference information.
My extension is successfully automatically helping me to find problems in extension update-scripts,
where an update from a version to a version would give a different result than directly installing the to-version from scratch.

Currently, I'm parsing findoidjoins/README and importing the "from column" and "to column"
to a lookup-table, which is used by my extension.

If findoidjoins is removed, I would be happy as long as this reference information
continues to be provided in some other simple machine-readable way,
like a CSV-file somewhere in the repo, or even better: making this information
available from SQL via a new lookup-table in pg_catalog.

I can see how parsing catalogs.sgml would be doable,
but proper SGML parsing is quite complex since it's a recursive language,
and can't be reliably parsed with e.g. simple regexes.

So I think adding this as a lookup table in pg_catalog is the best solution,
since extension writers could then use this information in various ways.

The information is theoretically already available via catalogs.sgml,
but a) it's not easy to parse, and b) it's not available from SQL.

Are there any other hackers who ever wished they would have had SQL
access to these catalog references?

If desired by enough others, perhaps something along these lines could work?

CREATE TABLE pg_catalog.pg_references (
colfrom text,
colto text,
UNIQUE (colfrom)
);

Where "colfrom" would be e.g. "pg_catalog.pg_class.relfilenode"
and "colto" would be "pg_catalog.pg_class.oid" for that example.

Not sure about the column names "colfrom"/"colto" though,
since the abbreviation for columns seems to be "att" in the pg_catalog context.

/Joel

Re: evtfoid and evtowner missing in findoidjoins/README

From
Tom Lane
Date:
"Joel Jacobson" <joel@compiler.org> writes:
> A bit of background:
> I'm working on an extension where I need SQL access to this reference
> information.  My extension is successfully automatically helping me to
> find problems in extension update-scripts, where an update from a
> version to a version would give a different result than directly
> installing the to-version from scratch.

> Currently, I'm parsing findoidjoins/README and importing the "from
> column" and "to column" to a lookup-table, which is used by my
> extension.

Hmm.  That README was certainly never intended to be used that way ;-)

> So I think adding this as a lookup table in pg_catalog is the best solution,
> since extension writers could then use this information in various ways.

I'm definitely -1 on adding a catalog for that.  But it seems like the
idea of not-really-enforced FK entries in pg_constraint would serve your
purposes just as well (and it'd be better from the standpoint of getting
the planner to be aware of these relationships).

> The information is theoretically already available via catalogs.sgml,
> but a) it's not easy to parse, and b) it's not available from SQL.

Well, SGML is actually plenty easy to parse as long as you've got xml
tooling at hand.  We'd never want to introduce such a dependency in the
normal build process, but making something like findoidjoins depend on
such tools seems within reason.  I recall having whipped up some one-off
Perl scripts of that sort when I was doing that massive rewrite of the
func.sgml tables last year.  I didn't save them though, and in any case
I'm the world's worst Perl programmer, so it'd be better for someone
with more Perl-fu to take point if we decide to go that route.

            regards, tom lane



Re: evtfoid and evtowner missing in findoidjoins/README

From
"Joel Jacobson"
Date:
On Sun, Jan 17, 2021, at 21:32, Tom Lane wrote:
>Well, SGML is actually plenty easy to parse as long as you've got xml
>tooling at hand.  We'd never want to introduce such a dependency in the
>normal build process, but making something like findoidjoins depend on
>such tools seems within reason.  I recall having whipped up some one-off
>Perl scripts of that sort when I was doing that massive rewrite of the
>func.sgml tables last year.  I didn't save them though, and in any case
>I'm the world's worst Perl programmer, so it'd be better for someone
>with more Perl-fu to take point if we decide to go that route.

I went a head and implemented the parser, it was indeed easy.

Patch attached.

    Add catalog_oidjoins.pl -- parses catalog references out of catalogs.sgml

    Since doc/src/sgml/catalogs.sgml is the master where catalog references
    are to be documented, if someone needs machine-readable access to
    such information, it should be extracted from this document.

    The added script catalog_oidjoins.pl parses the SGML and extract
    the fields necessary to produce the same output as generated
    by findoidjoins, which has historically been copy/pasted to the README.

    This is to allow for easy comparison, to verify the correctness
    of catalog_oidjoins.pl, and if necessary, to update the README
    based on the information in catalogs.sgml.

    Helper-files:

    diff_oidjoins.sh
    Runs ./catalog_oidjoins.pl and compares its output
    with STDIN. Shows a diff of the result, witout any context.

    test_oidjoins.sh
    Runs ./diff_oidjoins.sh for both the README
    and the output from ./findoidjoins regression.

    bogus_oidjoins.txt
    List of bogus diff entires to ignore,
    based on documentation in README.

After having run make installcheck in src/test/regress,
the test_oidjoins.sh can be run:

$ ./test_oidjoins.sh
README:
+ Join pg_catalog.pg_attrdef.adnum => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_class.relrewrite => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_constraint.confkey []=> pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_constraint.conkey []=> pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_db_role_setting.setrole => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_default_acl.defaclnamespace => pg_catalog.pg_namespace.oid
+ Join pg_catalog.pg_default_acl.defaclrole => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_event_trigger.evtfoid => pg_catalog.pg_proc.oid
+ Join pg_catalog.pg_event_trigger.evtowner => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_extension.extconfig []=> pg_catalog.pg_class.oid
+ Join pg_catalog.pg_foreign_data_wrapper.fdwhandler => pg_catalog.pg_proc.oid
+ Join pg_catalog.pg_foreign_data_wrapper.fdwvalidator => pg_catalog.pg_proc.oid
+ Join pg_catalog.pg_foreign_table.ftrelid => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_foreign_table.ftserver => pg_catalog.pg_foreign_server.oid
+ Join pg_catalog.pg_index.indkey => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_partitioned_table.partattrs => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_policy.polroles []=> pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_publication.pubowner => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_publication_rel.prpubid => pg_catalog.pg_publication.oid
+ Join pg_catalog.pg_publication_rel.prrelid => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_range.rngmultitypid => pg_catalog.pg_type.oid
+ Join pg_catalog.pg_seclabel.classoid => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_shseclabel.classoid => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_statistic.staattnum => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_statistic_ext.stxkeys => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_subscription.subdbid => pg_catalog.pg_database.oid
+ Join pg_catalog.pg_subscription.subowner => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_subscription_rel.srrelid => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_subscription_rel.srsubid => pg_catalog.pg_subscription.oid
+ Join pg_catalog.pg_trigger.tgattr => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_type.typsubscript => pg_catalog.pg_proc.oid
+ Join pg_catalog.pg_user_mapping.umserver => pg_catalog.pg_foreign_server.oid
+ Join pg_catalog.pg_user_mapping.umuser => pg_catalog.pg_authid.oid
findoidjoins:
+ Join pg_catalog.pg_attrdef.adnum => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_class.relrewrite => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_constraint.confkey []=> pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_constraint.conkey []=> pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_db_role_setting.setrole => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_default_acl.defaclnamespace => pg_catalog.pg_namespace.oid
+ Join pg_catalog.pg_default_acl.defaclrole => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_event_trigger.evtfoid => pg_catalog.pg_proc.oid
+ Join pg_catalog.pg_event_trigger.evtowner => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_extension.extconfig []=> pg_catalog.pg_class.oid
+ Join pg_catalog.pg_foreign_data_wrapper.fdwhandler => pg_catalog.pg_proc.oid
+ Join pg_catalog.pg_foreign_data_wrapper.fdwvalidator => pg_catalog.pg_proc.oid
+ Join pg_catalog.pg_foreign_table.ftrelid => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_foreign_table.ftserver => pg_catalog.pg_foreign_server.oid
+ Join pg_catalog.pg_index.indkey => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_partitioned_table.partattrs => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_policy.polroles []=> pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_publication.pubowner => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_publication_rel.prpubid => pg_catalog.pg_publication.oid
+ Join pg_catalog.pg_publication_rel.prrelid => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_seclabel.classoid => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_statistic_ext.stxkeys => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_subscription.subdbid => pg_catalog.pg_database.oid
+ Join pg_catalog.pg_subscription.subowner => pg_catalog.pg_authid.oid
+ Join pg_catalog.pg_subscription_rel.srrelid => pg_catalog.pg_class.oid
+ Join pg_catalog.pg_subscription_rel.srsubid => pg_catalog.pg_subscription.oid
+ Join pg_catalog.pg_trigger.tgattr => pg_catalog.pg_attribute.attnum
+ Join pg_catalog.pg_user_mapping.umserver => pg_catalog.pg_foreign_server.oid
+ Join pg_catalog.pg_user_mapping.umuser => pg_catalog.pg_authid.oid
diff of diffs:
21d20
< + Join pg_catalog.pg_range.rngmultitypid => pg_catalog.pg_type.oid
23,24d21
< + Join pg_catalog.pg_shseclabel.classoid => pg_catalog.pg_class.oid
< + Join pg_catalog.pg_statistic.staattnum => pg_catalog.pg_attribute.attnum
31d27
< + Join pg_catalog.pg_type.typsubscript => pg_catalog.pg_proc.oid



Attachment