Thread: Unsigned int functions
Hi,

I finally seem to have my unsigned int2/int4 types working correctly, but will wait until 7.1 is out of the door, and test a bit more, before resubmitting.

A question though: I've put in functions (as copied from the int2/int4 implementation) that implement operators for differently typed arguments, e.g. uint2*uint4. This saves the type conversions, but adds to the number of functions in the system. When sorting out the constant problems, I realised that (uint2,uint4) combinations will probably be very rarely used, while (int4,uint4) combinations will be much more common, i.e. when there are constants involved.

Question is: should I add these functions? Are we looking at too much bloat, i.e. should I replace the (uint2,uint4) combinations with (int4,uint2) and (int4,uint4)? Lots of combinations are possible, but I do not have a good feel for the trade-offs.

I only wanted unsigned ints, so that we could develop and test stuff on postgres before moving it onto Tandem. So please let me know what you think the correct trade-offs are and I will implement it and resubmit the patch.

Cheers,

Adriaan
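To give the bloat question a rough scale, here is a small counting sketch. The type list and operator list are assumptions chosen for illustration only; they are not taken from the patch, and the real number of functions depends on which operators the patch actually defines.

```python
from itertools import product

# Hypothetical lists, for illustration only (not taken from the patch).
TYPES = ["int2", "int4", "uint2", "uint4"]
OPERATORS = ["+", "-", "*", "/", "=", "<>", "<", "<=", ">", ">="]


def same_type_functions(types, operators):
    """One function per operator per type, e.g. uint4 * uint4."""
    return [(op, t, t) for op in operators for t in types]


def all_pair_functions(types, operators):
    """One function per operator per ordered pair of argument types."""
    return [(op, a, b) for op in operators for a, b in product(types, repeat=2)]


same = same_type_functions(TYPES, OPERATORS)
full = all_pair_functions(TYPES, OPERATORS)

print(len(same), "functions with same-type arguments only")
print(len(full), "functions when every cross-type pair is covered")
```

Under these assumptions the same-type set grows linearly with the number of types, while full cross-type coverage grows with its square.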
Adriaan Joubert <a.joubert@albourne.com> writes:
> Question is: should I add these functions? Are we looking at too much
> bloat, i.e. should I replace the (uint2,uint4) combinations with
> (int4,uint2) and (int4,uint4)? Lots of combinations are possible, but I
> do not have a good feel for the trade-offs.

My guess is that we ought to avoid bloating the system with cross-datatype functions. I know there are some already for int2*int4 and so forth, but I'd like to see those go away in favor of a smarter type promotion scheme --- ie, the parser should be able to figure out that it ought to do
    int2_var * uint4_var
as
    uint4_mul(uint4(int2_var), uint4_var)
A cross-datatype function ought to exist only if it can usefully do something different from an implicit promotion.

Aside from bloating the system, providing a plethora of functions also tends to confuse the ambiguous-function-call resolution mechanism. See discussion of a few days ago wherein the parser could resolve an ambiguous situation involving varchar, but could not resolve the same situation with text, because there are too many possibilities for coercion of text to something else.

regards, tom lane
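The promotion idea can be sketched in a few lines. This is a toy model and not the PostgreSQL parser: the promotion order, the `cast` range check, and the `resolve` helper are all assumptions made for illustration. Only same-type multiply functions are registered, and a mixed call is resolved by promoting both arguments to a common type first.

```python
# Value ranges of the four integer types discussed in the thread.
RANGES = {
    "int2": (-2**15, 2**15 - 1),
    "int4": (-2**31, 2**31 - 1),
    "uint2": (0, 2**16 - 1),
    "uint4": (0, 2**32 - 1),
}

# Hypothetical promotion order: the operand type that appears later wins.
PROMOTION_ORDER = ["int2", "uint2", "int4", "uint4"]


def cast(value, target):
    """Convert to the target type, rejecting values outside its range."""
    low, high = RANGES[target]
    if not low <= value <= high:
        raise OverflowError(f"{value} out of range for {target}")
    return value


def common_type(left, right):
    """Pick the operand type that sits later in the promotion order."""
    return max(left, right, key=PROMOTION_ORDER.index)


def make_mul(type_name):
    """Build the same-type multiply function, e.g. uint4_mul."""
    def mul(x, y):
        return cast(x * y, type_name)
    mul.__name__ = f"{type_name}_mul"
    return mul


# Only same-type functions are registered; there are no cross-type entries.
FUNCTIONS = {("*", t, t): make_mul(t) for t in RANGES}


def resolve(op, left_type, right_type):
    """Resolve a mixed-type call by promoting both arguments first."""
    target = common_type(left_type, right_type)
    func = FUNCTIONS[(op, target, target)]

    def call(x, y):
        return func(cast(x, target), cast(y, target))

    return func.__name__, call


name, call = resolve("*", "int2", "uint4")
print(name)             # uint4_mul
print(call(3, 100000))  # 300000
```

In this sketch a negative int2 value cannot be promoted to uint4 and raises `OverflowError`. How a real promotion scheme should treat that case is a design decision the thread does not settle.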
> Adriaan Joubert <a.joubert@albourne.com> writes:
> > Question is: should I add these functions? Are we looking at too much
> > bloat, i.e. should I replace the (uint2,uint4) combinations with
> > (int4,uint2) and (int4,uint4)? Lots of combinations are possible, but I
> > do not have a good feel for the trade-offs.
>
> My guess is that we ought to avoid bloating the system with
> cross-datatype functions. I know there are some already for int2*int4
> and so forth, but I'd like to see those go away in favor of a smarter
> type promotion scheme --- ie, the parser should be able to figure out
> that it ought to do int2_var * uint4_var as
> uint4_mul(uint4(int2_var), uint4_var)
> A cross-datatype function ought to exist only if it can usefully do
> something different from an implicit promotion.

A larger question is whether unsigned types really add much to the system vs. the bloat. We already have unsigned int4 as oid. Also, unsigned doubles the space of the type, but if a value doesn't fit in 32k, what are the odds it will fit in 64k? I am not sure unsigned optimizations for space really are significant in SQL.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
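The "32k" and "64k" figures are the upper bounds of the signed and unsigned 2-byte types. A short calculation makes them concrete; the helper names are illustrative only.

```python
def signed_range(bits):
    """Smallest and largest value of a two's-complement integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits):
    """Smallest and largest value of an unsigned integer."""
    return 0, 2 ** bits - 1


for name, bits in [("int2/uint2", 16), ("int4/uint4", 32)]:
    print(name, "signed:", signed_range(bits), "unsigned:", unsigned_range(bits))
```

For the same storage size, the unsigned type roughly doubles the largest representable value and gives up the negative half of the range.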
Bruce Momjian wrote:
>
> > Adriaan Joubert <a.joubert@albourne.com> writes:
> > > Question is: should I add these functions? Are we looking at too much
> > > bloat, i.e. should I replace the (uint2,uint4) combinations with
> > > (int4,uint2) and (int4,uint4)? Lots of combinations are possible, but I
> > > do not have a good feel for the trade-offs.
> >
> > My guess is that we ought to avoid bloating the system with
> > cross-datatype functions. I know there are some already for int2*int4
> > and so forth, but I'd like to see those go away in favor of a smarter
> > type promotion scheme --- ie, the parser should be able to figure out
> > that it ought to do int2_var * uint4_var as
> > uint4_mul(uint4(int2_var), uint4_var)
> > A cross-datatype function ought to exist only if it can usefully do
> > something different from an implicit promotion.
>
> A larger question is whether unsigned types really add much to the
> system vs. the bloat. We already have unsigned int4 as oid. Also,
> unsigned doubles the space of the type, but if a value doesn't fit in
> 32k, what are the odds it will fit in 64k. I am not sure unsigned
> optimizations for space really are significant in SQL.

A fair question. As I said, I only implemented them to simplify porting applications between database systems. Personally I think it is good to support types that make porting easier.

On the other hand the arguments about bloat are strong. It seems to me that all cross-datatype functions should be removed, to reduce the number of functions for the unsigned data types to a minimum. Would this be a reasonable compromise?

If general opinion is that unsigned types should not be part of postgres, I'll have to look at turning them into a contrib type. Please let me know.

Cheers,

Adriaan
> If general opinion is that unsigned types should not be part of
> postgres, I'll have to look at turning them into a contrib type. Please
> let me know.

Providing them as a contrib/ package will allow you to provide the *full* complement of cross-type conversion and operator functions without worrying about bloat.

Tom Lane has shown how to use entry points on shared libraries to start us thinking about how to provide better package integration, which should allow us to reduce the distinction between contrib/ and standard features. This would be a great package to develop these additional package support features for 7.2 (hint hint ;)

- Thomas