Thread: generic pseudotype IO functions?

generic pseudotype IO functions?

From
Andres Freund
Date:
Hi,

Does anybody have an opinion about introducing generic pseudotype IO
functions? Pseudotype.c/pg_proc.h are slowly growing a number of pretty
useless/redundant copy&pasted functions... Most for cases that are
pretty damn unlikely to be hit by users not knowing what they do.

What about adding a pseudotype_in/out that just error out with a generic
message?

We could start trying to guess the type they are operating on using
get_fn_expr_rettype/get_fn_expr_argtype but that'd require modifying
several callers not providing that info to work reliably?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: generic pseudotype IO functions?

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> Does anybody have an opinion about introducing generic pseudotype IO
> functions?

Yes: -1.

> Pseudotype.c/pg_proc.h are slowly growing a number of pretty
> useless/redundant copy&pasted functions... Most for cases that are
> pretty damn unlikely to be hit by users not knowing what they do.

That's hardly the largest cost associated with inventing a new pseudotype.
Nor are there lots of new pseudotypes coming down the pike, anyway.

> What about adding a pseudotype_in/out that just error out with a generic
> message?

This will break some of the function sanity checks in opr_sanity,
I believe.  Yeah, we could lobotomize that, but I don't see any benefit.
        regards, tom lane



Re: generic pseudotype IO functions?

From
Andres Freund
Date:
On 2014-01-06 10:29:06 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > Does anybody have an opinion about introducing generic pseudotype IO
> > functions?
> 
> Yes: -1.

Ok, fine with me.

> > Pseudotype.c/pg_proc.h are slowly growing a number of pretty
> > useless/redundant copy&pasted functions... Most for cases that are
> > pretty damn unlikely to be hit by users not knowing what they do.
> 
> That's hardly the largest cost associated with inventing a new pseudotype.
> Nor are there lots of new pseudotypes coming down the pike, anyway.

Robert suggested modelling the lookup of changeset extraction output
callbacks after fdw's FdwRoutine, that's why I am wondering about
it. I noticed while reviewing that I so far had borrowed fdw's C
routines which didn't seem like such a nice thing to do...

> > What about adding a pseudotype_in/out that just error out with a generic
> > message?
> 
> This will break some of the function sanity checks in opr_sanity,
> I believe.  Yeah, we could lobotomize that, but I don't see any benefit.

Yes. But there's precedent in refcursor using text's routines...

(it's in type_sanity, but whatever)

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: generic pseudotype IO functions?

From
Peter Eisentraut
Date:
On 1/6/14, 10:29 AM, Tom Lane wrote:
>> Pseudotype.c/pg_proc.h are slowly growing a number of pretty
>> useless/redundant copy&pasted functions... Most for cases that are
>> pretty damn unlikely to be hit by users not knowing what they do.
> 
> That's hardly the largest cost associated with inventing a new pseudotype.
> Nor are there lots of new pseudotypes coming down the pike, anyway.

If someone wants to do the work, what's the harm in reducing some code
redundancy?

>> What about adding a pseudotype_in/out that just error out with a generic
>> message?
> 
> This will break some of the function sanity checks in opr_sanity,

Then the tests can be changed.





Re: generic pseudotype IO functions?

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On 1/6/14, 10:29 AM, Tom Lane wrote:
>> This will break some of the function sanity checks in opr_sanity,

> Then the tests can be changed.

That will weaken their ability to detect actual mistakes, no?

If there were a large benefit to merging the pseudotype I/O functions,
I'd think this would be acceptable; but merging them seems of mighty
marginal value.
        regards, tom lane



Re: generic pseudotype IO functions?

From
Andres Freund
Date:
On 2014-01-06 11:28:29 -0500, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > On 1/6/14, 10:29 AM, Tom Lane wrote:
> >> This will break some of the function sanity checks in opr_sanity,
> 
> > Then the tests can be changed.
> 
> That will weaken their ability to detect actual mistakes, no?

FWIW, I am perfectly fine with duplicating the functions for now - I
just thought that that might not be the best way but I didn't (and still
don't) have a strong opinion. That's why I didn't supply a patch ;)

> If there were a large benefit to merging the pseudotype I/O functions,
> I'd think this would be acceptable; but merging them seems of mighty
> marginal value.

I think I am less concerned about pseudotypes.c than about bloating
pg_proc.h even further and about the annoyance of editing it - but I
guess that should rather be fixed by storing it in a more sensible
format at some point...

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: generic pseudotype IO functions?

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> I think I am less concerned about pseudotypes.c than about bloating
> pg_proc.h even further and about the annoyance of editing it - but I
> guess that should rather be fixed by storing it in a more sensible
> format at some point...

Yeah, getting rid of a dozen pseudotype I/O functions is hardly going
to reduce the PITA factor of editing pg_proc.h.  It's interesting to
think about moving all those DATA() macros into some more-maintainable
format --- I'm not sure what it should be exactly, but I think something
that can insert plausible defaults for omitted columns would be a big help
for pg_proc and maybe some of the other catalogs too.
        regards, tom lane



Re: generic pseudotype IO functions?

From
Andres Freund
Date:
On 2014-01-06 11:56:28 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > I think I am less concerned about pseudotypes.c than about bloating
> > pg_proc.h even further and about the annoyance of editing it - but I
> > guess that should rather be fixed by storing it in a more sensible
> > format at some point...
> 
> Yeah, getting rid of a dozen pseudotype I/O functions is hardly going
> to reduce the PITA factor of editing pg_proc.h.  It's interesting to
> think about moving all those DATA() macros into some more-maintainable
> format --- I'm not sure what it should be exactly, but I think something
> that can insert plausible defaults for omitted columns would be a big help
> for pg_proc and maybe some of the other catalogs too.

Alvaro previously suggested storing pg_proc in json. Not sure I like it,
but it'd sure be better than the current format if we derive unspecified
values from other columns (e.g. prorows = 0 iff proretset).

I think we also should auto-assign the oids for pg_proc (and some other
tables) rows if we go there. Afaics there's really not much reason to
keep them stable and it's by far the most frequent conflict I have seen
with keeping patches up2date.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: generic pseudotype IO functions?

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> I think we also should auto-assign the oids for pg_proc (and some other
> tables) rows if we go there.

-1 ... you've evidently not built any opclasses lately.

Yeah, we could probably improve the bootstrap infrastructure enough to not
need literal OIDs in so many places in the initial catalog contents, but
it'd be lots of trouble.  It definitely is *not* something to buy into at
the same time as redoing the creation of the .bki file.
        regards, tom lane



Re: generic pseudotype IO functions?

From
Andres Freund
Date:
On 2014-01-07 10:04:33 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > I think we also should auto-assign the oids for pg_proc (and some other
> > tables) rows if we go there.
> 
> -1 ... you've evidently not built any opclasses lately.

True. Not sure if I ever built one, but when playing around.

To the point that I am not seing the problem right now. I am not
proposing to get rid of statically assigned oids in pg_type et al.. The
references to procs mostly seem to be typed 'regproc' so there aren't
many references to function's oids. There's a few procs that are
involved in bootstraping, those likely would need to keep a fixed oid,
but that doesn't sound problematic to me.

> Yeah, we could probably improve the bootstrap infrastructure enough to not
> need literal OIDs in so many places in the initial catalog contents, but
> it'd be lots of trouble.  It definitely is *not* something to buy into at
> the same time as redoing the creation of the .bki file.

Not sure. Any such infrastructure should have support for filling in
defaults for things that don't need to be specified - imo generating an
oid for that sounds like it would fit in there. Doesn't - and possibly
shouldn't - be the same commit though.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: generic pseudotype IO functions?

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> To the point that I am not seing the problem right now. I am not
> proposing to get rid of statically assigned oids in pg_type et al.. The
> references to procs mostly seem to be typed 'regproc' so there aren't
> many references to function's oids.

*Some* of them are typed regproc.  Many are not, because there's need to
refer to overloaded function names.

A minimum change before we could even consider abandoning auto assignment
of pg_proc entries would be to make regprocedure_in work in bootstrap
mode, so that we could use regprocedure as a catalog column type.  And
regoperator_in, too, if you wanted to auto-assign operator OIDs.  And
you'd need some solution for generating fmgroids.h, as well as the
handmade #define's for selected operator OIDs that appear now in
pg_operator.h.  There are probably more issues, but those are the ones
that occur to me after thirty seconds' thought.

Even after all that, we can *not* go over to auto-assignment of pg_type
OIDs, because there is way too much stuff that assumes those OIDs are
stable (starting with on-disk storage of arrays).  I'm not entirely
sure that it's okay to have function OIDs varying across releases, either;
who's to say that there's not client code looking at fmgroids.h, or
otherwise hard-wiring the numbers it uses for fastpath function calls?

On the whole I think that this'd be a lot more work, and a lot more
potential for breakage, than it's really worth.
        regards, tom lane



Re: generic pseudotype IO functions?

From
Andres Freund
Date:
On 2014-01-07 11:08:22 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > To the point that I am not seing the problem right now. I am not
> > proposing to get rid of statically assigned oids in pg_type et al.. The
> > references to procs mostly seem to be typed 'regproc' so there aren't
> > many references to function's oids.
> 
> *Some* of them are typed regproc.  Many are not, because there's need to
> refer to overloaded function names.

So, for the archives, this seems to be:
* pg_cast
* pg_aggregate
* pg_event_trigger
* pg_foreign_data_wrapper
* pg_language
* pg_proc (duh...)

It strikes me that several of the tables using reproc could very well
stand to use regprocedure instead.

> A minimum change before we could even consider abandoning auto assignment
> of pg_proc entries would be to make regprocedure_in work in bootstrap
> mode, so that we could use regprocedure as a catalog column type.

Yes.

> And regoperator_in, too, if you wanted to auto-assign operator OIDs.

I personally find that much less interesting because there are so much
fewer operators in comparison to procs, so conflicts are rarer.

> And
> you'd need some solution for generating fmgroids.h, as well as the
> handmade #define's for selected operator OIDs that appear now in
> pg_operator.h.

If we're able to generate the .bki file, this seems like a solveable problem.

> There are probably more issues, but those are the ones
> that occur to me after thirty seconds' thought.

Yes.

> Even after all that, we can *not* go over to auto-assignment of pg_type
> OIDs, because there is way too much stuff that assumes those OIDs are
> stable (starting with on-disk storage of arrays).

Yes, I think that's totally out of question.

> I'm not entirely
> sure that it's okay to have function OIDs varying across releases, either;
> who's to say that there's not client code looking at fmgroids.h, or
> otherwise hard-wiring the numbers it uses for fastpath function calls?

I am not particularly worried about that. That'd mean somebody built a
solution specifically for calling builtin functions since all user
created functions will be dynamic. And that that client is using a
fmgroids.h from another release than the one he's working with - a
recipe for trouble in such a solution independent of this imo.

> On the whole I think that this'd be a lot more work, and a lot more
> potential for breakage, than it's really worth.

That might be true. It's probably the most frequent cause for conflicts
in patches people submit tho...

Anyway, pg_proc's contents would need to be generated before this
anyway...

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: generic pseudotype IO functions?

From
Peter Eisentraut
Date:
On Mon, 2014-01-06 at 17:36 +0100, Andres Freund wrote:
> FWIW, I am perfectly fine with duplicating the functions for now - I
> just thought that that might not be the best way but I didn't (and
> still
> don't) have a strong opinion.

Could we just put 0 in for the functions' OID and have code elsewhere
that errors "there is no input function for this type"?





Re: generic pseudotype IO functions?

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On Mon, 2014-01-06 at 17:36 +0100, Andres Freund wrote:
>> FWIW, I am perfectly fine with duplicating the functions for now - I
>> just thought that that might not be the best way but I didn't (and
>> still don't) have a strong opinion.

> Could we just put 0 in for the functions' OID and have code elsewhere
> that errors "there is no input function for this type"?

That doesn't seem like much of an improvement to me: that would be
taking a catalog corruption condition and blessing it as a legitimate
state of affairs, thereby reducing our ability to detect problems.

One instance where it would create issues is that I'm pretty sure
pg_dump would get confused by such a type.  Admittedly, pg_dump will
never try to dump the built-in pseudotypes, but do we really want them
handled so differently from user-definable types?
        regards, tom lane