Thread: generic pseudotype IO functions?
Hi, Does anybody have an opinion about introducing generic pseudotype IO functions? Pseudotype.c/pg_proc.h are slowly growing a number of pretty useless/redundant copy&pasted functions... Most for cases that are pretty damn unlikely to be hit by users not knowing what they do. What about adding a pseudotype_in/out that just error out with a generic message? We could start trying to guess the type they are operating on using get_fn_expr_rettype/get_fn_expr_argtype but that'd require modifying several callers not providing that info to work reliably? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > Does anybody have an opinion about introducing generic pseudotype IO > functions? Yes: -1. > Pseudotype.c/pg_proc.h are slowly growing a number of pretty > useless/redundant copy&pasted functions... Most for cases that are > pretty damn unlikely to be hit by users not knowing what they do. That's hardly the largest cost associated with inventing a new pseudotype. Nor are there lots of new pseudotypes coming down the pike, anyway. > What about adding a pseudotype_in/out that just error out with a generic > message? This will break some of the function sanity checks in opr_sanity, I believe. Yeah, we could lobotomize that, but I don't see any benefit. regards, tom lane
On 2014-01-06 10:29:06 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > Does anybody have an opinion about introducing generic pseudotype IO > > functions? > > Yes: -1. Ok, fine with me. > > Pseudotype.c/pg_proc.h are slowly growing a number of pretty > > useless/redundant copy&pasted functions... Most for cases that are > > pretty damn unlikely to be hit by users not knowing what they do. > > That's hardly the largest cost associated with inventing a new pseudotype. > Nor are there lots of new pseudotypes coming down the pike, anyway. Robert suggested modelling the lookup of changeset extraction output callbacks after fdw's FdwRoutine, that's why I am wondering about it. I noticed while reviewing that I so far had borrowed fdw's C routines which didn't seem like such a nice thing to do... > > What about adding a pseudotype_in/out that just error out with a generic > > message? > > This will break some of the function sanity checks in opr_sanity, > I believe. Yeah, we could lobotomize that, but I don't see any benefit. Yes. But there's precedent in refcursor using text's routines... (it's in type_sanity, but whatever) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 1/6/14, 10:29 AM, Tom Lane wrote: >> Pseudotype.c/pg_proc.h are slowly growing a number of pretty >> useless/redundant copy&pasted functions... Most for cases that are >> pretty damn unlikely to be hit by users not knowing what they do. > > That's hardly the largest cost associated with inventing a new pseudotype. > Nor are there lots of new pseudotypes coming down the pike, anyway. If someone wants to do the work, what's the harm in reducing some code redundancy? >> What about adding a pseudotype_in/out that just error out with a generic >> message? > > This will break some of the function sanity checks in opr_sanity, Then the tests can be changed.
Peter Eisentraut <peter_e@gmx.net> writes: > On 1/6/14, 10:29 AM, Tom Lane wrote: >> This will break some of the function sanity checks in opr_sanity, > Then the tests can be changed. That will weaken their ability to detect actual mistakes, no? If there were a large benefit to merging the pseudotype I/O functions, I'd think this would be acceptable; but merging them seems of mighty marginal value. regards, tom lane
On 2014-01-06 11:28:29 -0500, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > On 1/6/14, 10:29 AM, Tom Lane wrote: > >> This will break some of the function sanity checks in opr_sanity, > > > Then the tests can be changed. > > That will weaken their ability to detect actual mistakes, no? FWIW, I am perfectly fine with duplicating the functions for now - I just thought that that might not be the best way but I didn't (and still don't) have a strong opinion. That's why I didn't supply a patch ;) > If there were a large benefit to merging the pseudotype I/O functions, > I'd think this would be acceptable; but merging them seems of mighty > marginal value. I think I am less concerned about pseudotypes.c than about bloating pg_proc.h even further and about the annoyance of editing it - but I guess that should rather be fixed by storing it in a more sensible format at some point... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > I think I am less concerned about pseudotypes.c than about bloating > pg_proc.h even further and about the annoyance of editing it - but I > guess that should rather be fixed by storing it in a more sensible > format at some point... Yeah, getting rid of a dozen pseudotype I/O functions is hardly going to reduce the PITA factor of editing pg_proc.h. It's interesting to think about moving all those DATA() macros into some more-maintainable format --- I'm not sure what it should be exactly, but I think something that can insert plausible defaults for omitted columns would be a big help for pg_proc and maybe some of the other catalogs too. regards, tom lane
On 2014-01-06 11:56:28 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > I think I am less concerned about pseudotypes.c than about bloating > > pg_proc.h even further and about the annoyance of editing it - but I > > guess that should rather be fixed by storing it in a more sensible > > format at some point... > > Yeah, getting rid of a dozen pseudotype I/O functions is hardly going > to reduce the PITA factor of editing pg_proc.h. It's interesting to > think about moving all those DATA() macros into some more-maintainable > format --- I'm not sure what it should be exactly, but I think something > that can insert plausible defaults for omitted columns would be a big help > for pg_proc and maybe some of the other catalogs too. Alvaro previously suggested storing pg_proc in json. Not sure I like it, but it'd sure be better than the current format if we derive unspecified values from other columns (e.g. prorows = 0 iff proretset). I think we also should auto-assign the oids for pg_proc (and some other tables) rows if we go there. Afaics there's really not much reason to keep them stable and it's by far the most frequent conflict I have seen with keeping patches up2date. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > I think we also should auto-assign the oids for pg_proc (and some other > tables) rows if we go there. -1 ... you've evidently not built any opclasses lately. Yeah, we could probably improve the bootstrap infrastructure enough to not need literal OIDs in so many places in the initial catalog contents, but it'd be lots of trouble. It definitely is *not* something to buy into at the same time as redoing the creation of the .bki file. regards, tom lane
On 2014-01-07 10:04:33 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > I think we also should auto-assign the oids for pg_proc (and some other > > tables) rows if we go there. > > -1 ... you've evidently not built any opclasses lately. True. Not sure if I ever built one, but when playing around. To the point that I am not seing the problem right now. I am not proposing to get rid of statically assigned oids in pg_type et al.. The references to procs mostly seem to be typed 'regproc' so there aren't many references to function's oids. There's a few procs that are involved in bootstraping, those likely would need to keep a fixed oid, but that doesn't sound problematic to me. > Yeah, we could probably improve the bootstrap infrastructure enough to not > need literal OIDs in so many places in the initial catalog contents, but > it'd be lots of trouble. It definitely is *not* something to buy into at > the same time as redoing the creation of the .bki file. Not sure. Any such infrastructure should have support for filling in defaults for things that don't need to be specified - imo generating an oid for that sounds like it would fit in there. Doesn't - and possibly shouldn't - be the same commit though. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > To the point that I am not seing the problem right now. I am not > proposing to get rid of statically assigned oids in pg_type et al.. The > references to procs mostly seem to be typed 'regproc' so there aren't > many references to function's oids. *Some* of them are typed regproc. Many are not, because there's need to refer to overloaded function names. A minimum change before we could even consider abandoning auto assignment of pg_proc entries would be to make regprocedure_in work in bootstrap mode, so that we could use regprocedure as a catalog column type. And regoperator_in, too, if you wanted to auto-assign operator OIDs. And you'd need some solution for generating fmgroids.h, as well as the handmade #define's for selected operator OIDs that appear now in pg_operator.h. There are probably more issues, but those are the ones that occur to me after thirty seconds' thought. Even after all that, we can *not* go over to auto-assignment of pg_type OIDs, because there is way too much stuff that assumes those OIDs are stable (starting with on-disk storage of arrays). I'm not entirely sure that it's okay to have function OIDs varying across releases, either; who's to say that there's not client code looking at fmgroids.h, or otherwise hard-wiring the numbers it uses for fastpath function calls? On the whole I think that this'd be a lot more work, and a lot more potential for breakage, than it's really worth. regards, tom lane
On 2014-01-07 11:08:22 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > To the point that I am not seing the problem right now. I am not > > proposing to get rid of statically assigned oids in pg_type et al.. The > > references to procs mostly seem to be typed 'regproc' so there aren't > > many references to function's oids. > > *Some* of them are typed regproc. Many are not, because there's need to > refer to overloaded function names. So, for the archives, this seems to be: * pg_cast * pg_aggregate * pg_event_trigger * pg_foreign_data_wrapper * pg_language * pg_proc (duh...) It strikes me that several of the tables using reproc could very well stand to use regprocedure instead. > A minimum change before we could even consider abandoning auto assignment > of pg_proc entries would be to make regprocedure_in work in bootstrap > mode, so that we could use regprocedure as a catalog column type. Yes. > And regoperator_in, too, if you wanted to auto-assign operator OIDs. I personally find that much less interesting because there are so much fewer operators in comparison to procs, so conflicts are rarer. > And > you'd need some solution for generating fmgroids.h, as well as the > handmade #define's for selected operator OIDs that appear now in > pg_operator.h. If we're able to generate the .bki file, this seems like a solveable problem. > There are probably more issues, but those are the ones > that occur to me after thirty seconds' thought. Yes. > Even after all that, we can *not* go over to auto-assignment of pg_type > OIDs, because there is way too much stuff that assumes those OIDs are > stable (starting with on-disk storage of arrays). Yes, I think that's totally out of question. > I'm not entirely > sure that it's okay to have function OIDs varying across releases, either; > who's to say that there's not client code looking at fmgroids.h, or > otherwise hard-wiring the numbers it uses for fastpath function calls? I am not particularly worried about that. That'd mean somebody built a solution specifically for calling builtin functions since all user created functions will be dynamic. And that that client is using a fmgroids.h from another release than the one he's working with - a recipe for trouble in such a solution independent of this imo. > On the whole I think that this'd be a lot more work, and a lot more > potential for breakage, than it's really worth. That might be true. It's probably the most frequent cause for conflicts in patches people submit tho... Anyway, pg_proc's contents would need to be generated before this anyway... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, 2014-01-06 at 17:36 +0100, Andres Freund wrote: > FWIW, I am perfectly fine with duplicating the functions for now - I > just thought that that might not be the best way but I didn't (and > still > don't) have a strong opinion. Could we just put 0 in for the functions' OID and have code elsewhere that errors "there is no input function for this type"?
Peter Eisentraut <peter_e@gmx.net> writes: > On Mon, 2014-01-06 at 17:36 +0100, Andres Freund wrote: >> FWIW, I am perfectly fine with duplicating the functions for now - I >> just thought that that might not be the best way but I didn't (and >> still don't) have a strong opinion. > Could we just put 0 in for the functions' OID and have code elsewhere > that errors "there is no input function for this type"? That doesn't seem like much of an improvement to me: that would be taking a catalog corruption condition and blessing it as a legitimate state of affairs, thereby reducing our ability to detect problems. One instance where it would create issues is that I'm pretty sure pg_dump would get confused by such a type. Admittedly, pg_dump will never try to dump the built-in pseudotypes, but do we really want them handled so differently from user-definable types? regards, tom lane