Re: pg_dump vs. TRANSFORMs - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: pg_dump vs. TRANSFORMs
Date
Msg-id 20160505132851.GM10850@tamriel.snowman.net
Whole thread Raw
In response to Re: pg_dump vs. TRANSFORMs  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> > On 5/4/16 2:39 PM, Stephen Frost wrote:
> >> These checks are looking for the functions used by the transform in the
> >> list of functions that pg_dump has loaded, but in 9.5, we don't load any
> >> of the function in pg_catalog, and even with my patches, we only dump
> >> the functions in pg_catalog that have an ACL which has been changed from
> >> the default.
>
> > This issue is not specific to transforms.  For example, if you create a
> > user-defined cast using a built-in function, the cast won't be dumped.
> > Obviously, this is a problem, but it needs a more general solution.
>
> Actually, I believe it will be dumped.  selectDumpableCast believes it
> should dump casts with OID >= FirstNormalObjectId.  That's a kluge no
> doubt, but reasonably effective; looks like we've been doing that since
> 9.0.

No, it isn't dumped.  An interesting example test case is this:

-----
create function pg_catalog.toint(varchar) returns integer strict immutable language sql as 'select cast($1 as
integer);';

create cast (varchar as integer) with function toint(varchar);
-----

Note that neither the function nor the cast is dumped, and this was with
9.5.  This is because:

getFuncs() intentionally skips all functions in pg_catalog (unless they
are members of extensions...).  dumpCast() attempts to find the function
referenced by the cast in the set of functions which getFuncs()
collected, and simply gives up and doesn't dump the cast if it can't
find the function.

For my 2c, this coding pattern:

--------------
funcInfo = findFuncByOid(cast->castfunc);
if (funcInfo == NULL)return;
--------------

in pg_dump is really bad.  Thankfully, it looks like that's only
happening in dumpCast() and dumpTransform().  We used to have a similar
issue, it looks like, with extensions, which was fixed in 89c29c0.  It
looks like we need to either stop excluding the functions in pg_catalog
during getFuncs(), or add in more exceptions to the "don't collect info
about functions in pg_catalog" rule.

I'm also inclined to think that we should fix dumping of non-initdb
functions which have been created in pg_catalog.  I'm not sure how far
to take that though, as we have a similar issue for most object types
when it comes to pg_catalog.  Perhaps selectDumpableObject() should be
considering FirstNormalObjectId and constraining what component we dump
for from-initdb objects to only ACL, and pg_catalog, as a namespace,
should be set to dump all components (in HEAD, and just set to not dump
anything for from-initdb objects in back-branches, but everything for
user-defined objects).

> pg_dump appears not to have a special-case selectDumpableTransform
> routine.  Instead it falls back on the generic selectDumpableObject
> for transforms.  That's a bad idea because the only useful bit of
> knowledge selectDumpableObject has is to look at the containing
> namespace ... and transforms don't have one, just as casts don't.
>
> My recommendation is to clone that cast logic for transforms.

We should do this also, if we don't change selectDumpableObject, but
we should do it because we might have from-initdb transforms one day
and the current logic would end up selecting those transforms to dump
out (though dumpTransform would end up skipping them currently because
they'd be referring to functions that we didn't get information about,
but hopefully we're going to fix that).

Thanks!

Stephen

pgsql-hackers by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: what to revert
Next
From: Stephen Frost
Date:
Subject: Re: pg_dump vs. TRANSFORMs