Thread: Proposal: make "opaque" obsolete
Okay, I'm tired of hearing about this, and I've decided I can make the problem go away with a day or so's concentrated work. Here is the plan: Create several "pseudo types" (typtype 'p' in pg_type). We already have one pseudo-type (RECORD), plus the logic in heap.c to reject any attempt to create a table column that's of a pseudo-type. I think we need the following pseudotypes: cstring -- represents a null-terminated C stringanytype -- for count(*), nullvalue, nonnullvalue *ONLY*anyarraytype -- for array_eq, array_dimsvoid -- result type for functions with no useful resulttuple -- result type for BEFORE triggersinternal -- catchall for non-SQL internal data structures I am not by any means wedded to the above type names; does anyone have suggestions for better names? (In particular, I am wondering if "tuple" and "internal" would be better named "pg_tuple" and "pg_internal". We might also consider making a type specifically named "trigger" and using that to declare trigger functions, rather than "tuple".) I am also thinking of creating a pseudotype for "opaque" itself, so that we can get rid of the kluge of using type OID 0 in places where a valid type is expected. We cannot remove "opaque" completely (yet) because of backwards compatibility with existing user I/O functions and triggers; but we can see to it that no built-in or contrib function is declared with "opaque". About I/O behavior: the pg_type entries for these pseudo-types will have to have typinput and typoutput functions. In general these I/O routines must just throw errors. Otherwise you could break the intended type safety by supplying user-written constants. For instance, the present definition of RECORD is wrong (sorry Joe) because it uses oidin and oidout; so I could write "SELECT foo('42'::record)" and thereby crash a function expecting RECORD. Not that there will be any such function, but the analogous case with, say, INTERNAL would be bad news. An exception is that void_out should succeed and just return an empty string; this allows functions-returning-void to be called by SELECT and behave reasonably. Less obviously, void_in should succeed (and return nothing interesting, probably just a zero datum; it can ignore its input). This allows plpgsql functions to be defined to return VOID. I am also considering allowing cstring_out to succeed (and, of course, just return a copy of what it's given). That would allow explicit invocation of an output function to work, viz "SELECT cash_out('42'::money)" would actually do what the user expects. This is not really necessary though. We can't allow cstring_in to succeed, although that looks like a no-brainer, because the system isn't ready to support CSTRING Datums in general contexts. Trigger functions will now be expected to take no arguments and return either tuple (or trigger if we call it that) or opaque. It would also be sensible to allow VOID in the case of AFTER triggers, but I'm inclined not to do so: I think it's better that a trigger function be declared in a way that makes it clear it's supposed to be a trigger. If CREATE TRIGGER accepts functions returning void then I think you lose some useful error checking. Should we throw a NOTICE stating that opaque is deprecated if a trigger is declared with opaque? Or should we wait a release or two for that? Similarly, the preferred signature for I/O functions now uses cstring and the function's actual datatype, rather than OPAQUE (and again we could consider throwing a NOTICE). One of the original concerns about this was how to handle the circularity problem for user-defined I/O functions. If we don't do anything special, then it will still work as long as people define the input function first:create function foo_in(cstring) returns foo as' ... ';-- you'll get a NOTICE here about "type foo is not yet defined"create function foo_out(foo) returns cstring as'...';-- you'll get another NOTICE here about "foo is only a shell"create type foo (input = foo_in, output = foo_out, ...); Are the notices annoying enough to be a problem? Is there a way around them? Currently, most of the PL languages explicitly check for type opaque as a function argument or result type, and reject it. This should be generalized to "reject any pseudo-type except those explicitly supported" (which will probably be nothing except VOID). Comments? regards, tom lane
> anyarraytype -- for array_eq, array_dims Will this allow generic array iterator functions in the future?
Rod Taylor <rbt@zort.ca> writes: >> anyarraytype -- for array_eq, array_dims > Will this allow generic array iterator functions in the future? Hm. Not directly; there's still the issue of how to tell what element type the array is. array_dims doesn't care, and I think that we have some kluge for array_eq, but in general it'd be a problem for generic functions. I had been thinking that as long as we are going to break datafile compatibility (due to Manfred's tuple-header changes) this would be a good time to try to clean up the representation of arrays. It's bothered me for a long while that the array code is not doing alignment correctly --- it seems not to matter for any standard type, but arrays of, say, interval are not aligned the way pg_type says they should be. The reason I bring this up is that if we are changing the internal representation of arrays, we could add type OID and perhaps typmod to the array header, thus making an array value interpretable without any outside info. Then you could actually do something interesting with a function taking anyarraytype. regards, tom lane
On Tue, 20 Aug 2002, Tom Lane wrote: In general I think it sounds good, so I'm only responding to places where I want to say something specific. > I am not by any means wedded to the above type names; does anyone have > suggestions for better names? (In particular, I am wondering if "tuple" > and "internal" would be better named "pg_tuple" and "pg_internal". We > might also consider making a type specifically named "trigger" and using > that to declare trigger functions, rather than "tuple".) I like something with trigger better, makes it very obvious that it's a trigger function. > An exception is that void_out should succeed and just return an empty string; > this allows functions-returning-void to be called by SELECT and behave > reasonably. Less obviously, void_in should succeed (and return nothing > interesting, probably just a zero datum; it can ignore its input). This > allows plpgsql functions to be defined to return VOID. Does this require additional work to the plpgsql grammar? The natural way to return from such a function (return;) doesn't seem like it'd work without some changes. In any case I don't think this would be necessary for 7.3. > I am also considering allowing cstring_out to succeed (and, of course, > just return a copy of what it's given). That would allow explicit invocation > of an output function to work, viz "SELECT cash_out('42'::money)" would > actually do what the user expects. This is not really necessary though. I like the idea of cstring_out working, but I wonder if we should stop users from calling the I/O functions directly anyway even if they were made to be safe. > Should we throw a NOTICE stating that opaque is deprecated if a trigger > is declared with opaque? Or should we wait a release or two for that? > Similarly, the preferred signature for I/O functions now uses cstring > and the function's actual datatype, rather than OPAQUE (and again we > could consider throwing a NOTICE). I think we should throw the notices right away, although this makes me wonder in general about upgrade path. Are we ever planning to make that an error, and if so, how are we going to handle functions that are coming from previous versions where it was okay? It's relatively easy to do fix ones that are currently used as trigger functions or type i/o functions, but what about ones that aren't being used at dump time? Do we even need to do anything? > One of the original concerns about this was how to handle the circularity > problem for user-defined I/O functions. If we don't do anything special, > then it will still work as long as people define the input function first: > create function foo_in(cstring) returns foo as ' ... '; > -- you'll get a NOTICE here about "type foo is not yet defined" > create function foo_out(foo) returns cstring as '...'; > -- you'll get another NOTICE here about "foo is only a shell" > create type foo (input = foo_in, output = foo_out, ...); > Are the notices annoying enough to be a problem? Is there a way around > them? I personally don't think it's a big deal, although I'm alot less annoyed by notices than alot of people.
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes: > On Tue, 20 Aug 2002, Tom Lane wrote: >> Less obviously, void_in should succeed (and return nothing >> interesting, probably just a zero datum; it can ignore its input). This >> allows plpgsql functions to be defined to return VOID. > Does this require additional work to the plpgsql grammar? I suspect you'd need to say "return 0" (or return anything-at-all, pretty much) to make it fly with the current plpgsql sources. This is a tad ugly but I think we can live with it until someone wants to fix it. If we have type void then for sure people will want to use it for plpgsql functions; there are plenty of cases where you run a plpgsql function just for side-effects. > I think we should throw the notices right away, although this makes me > wonder in general about upgrade path. Are we ever planning to make that > an error, and if so, how are we going to handle functions that are coming > from previous versions where it was okay? We can't make it an error until sufficiently far down the road that we don't care about forward compatibility from 7.2-or-before dump files. That'll be a long while, probably. Throwing a notice right away is okay with me personally, but I wanted to see what other people thought... regards, tom lane
Tom Lane wrote: > > I think we should throw the notices right away, although this makes me > > wonder in general about upgrade path. Are we ever planning to make that > > an error, and if so, how are we going to handle functions that are coming > > from previous versions where it was okay? > > We can't make it an error until sufficiently far down the road that we > don't care about forward compatibility from 7.2-or-before dump files. > That'll be a long while, probably. > > Throwing a notice right away is okay with me personally, but I wanted to > see what other people thought... NOTICE seems good. We will have all this mentioned in the release notes too. In fact, we can point to the release notes in the NOTICE if desired. I like the array storage format change too. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> Trigger functions will now be expected to take no arguments and return > either tuple (or trigger if we call it that) or opaque. It would also be > sensible to allow VOID in the case of AFTER triggers, but I'm inclined not > to do so: I think it's better that a trigger function be declared in a way > that makes it clear it's supposed to be a trigger. If CREATE > TRIGGER accepts > functions returning void then I think you lose some useful error checking. > > Should we throw a NOTICE stating that opaque is deprecated if a trigger > is declared with opaque? Or should we wait a release or two for that? I think a NOTICE at creation time is fine. > Comments? Sounds really good. Chris
Tom Lane wrote: > About I/O behavior: the pg_type entries for these pseudo-types will have to > have typinput and typoutput functions. In general these I/O routines must > just throw errors. Otherwise you could break the intended type safety by > supplying user-written constants. For instance, the present definition of > RECORD is wrong (sorry Joe) because it uses oidin and oidout; so I could > write "SELECT foo('42'::record)" and thereby crash a function expecting > RECORD. Not that there will be any such function, but the analogous case > with, say, INTERNAL would be bad news. Sorry I've been unable to be very involved today. Anything you want me to do here? > An exception is that void_out should succeed and just return an empty string; > this allows functions-returning-void to be called by SELECT and behave > reasonably. Less obviously, void_in should succeed (and return nothing > interesting, probably just a zero datum; it can ignore its input). This > allows plpgsql functions to be defined to return VOID. This will be useful if/when we want to implement "CALL stored_proc;" > Should we throw a NOTICE stating that opaque is deprecated if a trigger > is declared with opaque? Or should we wait a release or two for that? I'd throw the NOTICE now. Joe
Tom Lane wrote: > The reason I bring this up is that if we are changing the internal > representation of arrays, we could add type OID and perhaps typmod to > the array header, thus making an array value interpretable without any > outside info. Then you could actually do something interesting with > a function taking anyarraytype. This sounds very cool. I'd vote for that. Joe
Joe Conway <mail@joeconway.com> writes: > Tom Lane wrote: >> ... Then you could actually do something interesting with >> a function taking anyarraytype. > This sounds very cool. I'd vote for that. Um, am I hearing a volunteer to make it happen? I have other problems I need to deal with ... regards, tom lane
Tom Lane wrote: >>>... Then you could actually do something interesting with >>>a function taking anyarraytype. > >>This sounds very cool. I'd vote for that. > > Um, am I hearing a volunteer to make it happen? I have other problems > I need to deal with ... > Hmmm. I guess I should be careful what I wish for -- and plan to take some time-off from my "day job" between now and September 1st ;-) I'll give it a shot, but a crude gameplan/general guidance to get me started would be much appreciated. Joe
On Wed, 21 Aug 2002, Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > > Tom Lane wrote: > >> ... Then you could actually do something interesting with > >> a function taking anyarraytype. > > > This sounds very cool. I'd vote for that. > > Um, am I hearing a volunteer to make it happen? I have other problems > I need to deal with ... Tom, I saw something in the other thread suggesting that you might be working on this. Is that so? If not I have had a little poke around the cash type but I'm no where near up to speed on the internals. Your proposal is that cstring etc. get entries like record on pg_type? That presumably means we'd need in and out functions defined for these, which in the case of cstring would just be copying the input to output? (As you can see I may not be the best person to work on this if it is to be available for the beta) -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes: > I saw something in the other thread suggesting that you might be working on > this. Is that so? I am working on the main proposal to create pseudo-types. I was hoping to offload the bit about changing array representation, though. regards, tom lane
Joe Conway <mail@joeconway.com> writes: > I'll give it a shot, but a crude gameplan/general guidance to get me > started would be much appreciated. AFAIK, the only code you should need to touch is insrc/include/utils/array.hsrc/backend/utils/adt/arrayfuncs.csrc/backend/utils/adt/arrayutils.c Look around for other references to ArrayType, but I don't *think* there is anything else that processes arrays directly; everything else should be calling construct_array() or deconstruct_array(). Adding an element-type OID field to the array header (ArrayType struct) ought to be a pretty straightforward exercise. We need to debate whether to store a typmod as well; I'm not sure that the space for it would be justified. The other thing that was bothering me was that the code is sloppy about alignment: although the overall start of the data area is correctly MAXALIGN'd, the offsets of individual data items in the array aren't necessarily made to be multiples of the element data type's alignment spec. This would probably result in core dumps for datatypes whose size is not a multiple of their alignment requirement. (The only standard one is INTERVAL. For reasons I've never quite figured out, an array of INTERVAL doesn't provoke core dumps; seems like it should, at least on machines where doubles actually require 8-byte alignment.) There are a dozen or two places in arrayfuncs.c that understand the alignment conventions for array elements, and they all need to be changed. The att_align macro in src/include/access/tupmacs.h is probably the thing to be using; look at the code in heaptuple.c to see how we position field values inside a tuple, and do likewise. regards, tom lane
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > >>I'll give it a shot, but a crude gameplan/general guidance to get me >>started would be much appreciated. > > > AFAIK, the only code you should need to touch is in > src/include/utils/array.h > src/backend/utils/adt/arrayfuncs.c > src/backend/utils/adt/arrayutils.c > > Look around for other references to ArrayType, but I don't *think* there > is anything else that processes arrays directly; everything else > should be calling construct_array() or deconstruct_array(). OK. I should have a bit more time today and tonight. I'll see what I can get done with this. Thanks, Joe