Thread: Unsigned int functions

Unsigned int functions

From
Adriaan Joubert
Date:
Hi,
I finally seem to have my unsigned int2/int4 types working correctly,
but will wait until 7.1 is out of the door, and test a bit more, before
resubmitting.

A question though: 

I've put in functions (as copied from the int2/int4 implementation) that
implement operators for differently typed arguments, e.g. uint2*uint4.
This saves the type conversions, but adds to the number of functions in
the system.

When sorting out the constant problems, I realised that (uint2,uint4)
combinations will probably be very rarely used, while (int4,uint4)
combinations will be much more common, i.e. when there are constants
involved. 

Question is: should I add these functions? Are we looking at too much
bloat, i.e. should I replace the (uint2,uint4) combinations with
(int4,uint2) and (int4,uint4)? Lots of combinations are possible, but I
do not have a good feel for the trade-offs. 
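
To put a rough number on the bloat being asked about, here is a minimal counting sketch. It assumes four integer types (int2, int4, uint2, uint4) and four arithmetic operators. Those counts and both function names are illustrative assumptions, not figures taken from the patch.

```c
/* Hypothetical helper: number of cross-type functions needed when every
 * ordered pair of distinct types gets its own function per operator. */
unsigned cross_type_function_count(unsigned n_types, unsigned n_operators)
{
    return n_types * (n_types - 1) * n_operators;
}

/* Hypothetical helper: number of same-type functions, one per type per
 * operator. */
unsigned same_type_function_count(unsigned n_types, unsigned n_operators)
{
    return n_types * n_operators;
}
```

Under these assumptions, cross_type_function_count(4, 4) gives 48 cross-type functions against 16 same-type ones, so covering every combination would quadruple the function count.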

I only wanted unsigned ints, so that we could develop and test stuff on
postgres before moving it onto Tandem. So please let me know what you
think the correct trade-offs are and I will implement it and resubmit
the patch.

Cheers,

Adriaan


Re: Unsigned int functions

From
Tom Lane
Date:
Adriaan Joubert <a.joubert@albourne.com> writes:
> Question is: should I add these functions? Are we looking at too much
> bloat, i.e. should I replace the (uint2,uint4) combinations with
> (int4,uint2) and (int4,uint4)? Lots of combinations are possible, but I
> do not have a good feel for the trade-offs. 

My guess is that we ought to avoid bloating the system with
cross-datatype functions.  I know there are some already for int2*int4
and so forth, but I'd like to see those go away in favor of a smarter
type promotion scheme --- ie, the parser should be able to figure out
that it ought to do int2_var * uint4_var as
    uint4_mul(uint4(int2_var), uint4_var)
A cross-datatype function ought to exist only if it can usefully do
something different from an implicit promotion.
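
The promotion idea can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the typedefs stand in for the proposed types, int2_to_uint4 and int2_times_uint4 are hypothetical names, and rejecting negative values and wrapping on overflow are choices made for this sketch, not behaviour taken from the patch or the parser.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the uint4 and int2 types under discussion. */
typedef uint32_t uint4;
typedef int16_t  int2;

/* Promote int2 to uint4. Negative values have no uint4 representation,
 * so this sketch reports failure instead of wrapping around. */
bool int2_to_uint4(int2 value, uint4 *result)
{
    if (value < 0)
        return false;
    *result = (uint4) value;
    return true;
}

/* The single same-type multiply. Overflow wraps modulo 2^32 here; a real
 * implementation would have to decide how to report it. */
uint4 uint4_mul(uint4 a, uint4 b)
{
    return a * b;
}

/* What int2_var * uint4_var would reduce to under promotion: convert
 * first, then call the same-type function. No dedicated cross-type
 * multiply is needed. */
bool int2_times_uint4(int2 a, uint4 b, uint4 *result)
{
    uint4 promoted;

    if (!int2_to_uint4(a, &promoted))
        return false;
    *result = uint4_mul(promoted, b);
    return true;
}
```

With one conversion function per type pair and one multiply per type, the cross-type case costs no additional operator function.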

Aside from bloating the system, providing a plethora of functions also
tends to confuse the ambiguous-function-call resolution mechanism.
See discussion of a few days ago wherein the parser could resolve an
ambiguous situation involving varchar, but could not resolve the same
situation with text, because there are too many possibilities for
coercion of text to something else.
        regards, tom lane


Re: Unsigned int functions

From
Bruce Momjian
Date:
> Adriaan Joubert <a.joubert@albourne.com> writes:
> > Question is: should I add these functions? Are we looking at too much
> > bloat, i.e. should I replace the (uint2,uint4) combinations with
> > (int4,uint2) and (int4,uint4)? Lots of combinations are possible, but I
> > do not have a good feel for the trade-offs. 
> 
> My guess is that we ought to avoid bloating the system with
> cross-datatype functions.  I know there are some already for int2*int4
> and so forth, but I'd like to see those go away in favor of a smarter
> type promotion scheme --- ie, the parser should be able to figure out
> that it ought to do int2_var * uint4_var as
>     uint4_mul(uint4(int2_var), uint4_var)
> A cross-datatype function ought to exist only if it can usefully do
> something different from an implicit promotion.

A larger question is whether unsigned types really add much to the
system vs. the bloat.  We already have unsigned int4 as oid.  Also,
unsigned doubles the space of the type, but if a value doesn't fit in
32k, what are the odds it will fit in 64k.  I am not sure unsigned
optimizations for space really are significant in SQL.
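
For reference, the "32k" and "64k" limits are the maxima of a signed and an unsigned 2-byte integer. The sketch below spells them out using the standard <stdint.h> constants. The function names are hypothetical, and needs_unsigned_int2 merely identifies the band of values that only an unsigned 2-byte type would hold.

```c
#include <stdint.h>

/* Largest value of a signed 2-byte integer (the "32k" limit). */
int32_t int2_max(void)
{
    return INT16_MAX;
}

/* Largest value of an unsigned 2-byte integer (the "64k" limit). */
int32_t uint2_max(void)
{
    return UINT16_MAX;
}

/* Nonzero if value fits in an unsigned 2-byte integer but not in a
 * signed one, i.e. the only range where unsigned saves space. */
int needs_unsigned_int2(int32_t value)
{
    return value > INT16_MAX && value <= UINT16_MAX;
}
```

Only values from 32768 through 65535 fall in that band; anything larger needs a 4-byte type either way.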

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: Unsigned int functions

From
Adriaan Joubert
Date:
Bruce Momjian wrote:
> 
> > Adriaan Joubert <a.joubert@albourne.com> writes:
> > > Question is: should I add these functions? Are we looking at too much
> > > bloat, i.e. should I replace the (uint2,uint4) combinations with
> > > (int4,uint2) and (int4,uint4)? Lots of combinations are possible, but I
> > > do not have a good feel for the trade-offs.
> >
> > My guess is that we ought to avoid bloating the system with
> > cross-datatype functions.  I know there are some already for int2*int4
> > and so forth, but I'd like to see those go away in favor of a smarter
> > type promotion scheme --- ie, the parser should be able to figure out
> > that it ought to do int2_var * uint4_var as
> >       uint4_mul(uint4(int2_var), uint4_var)
> > A cross-datatype function ought to exist only if it can usefully do
> > something different from an implicit promotion.
> 
> A larger question is whether unsigned types really add much to the
> system vs. the bloat.  We already have unsigned int4 as oid.  Also,
> unsigned doubles the space of the type, but if a value doesn't fit in
> 32k, what are the odds it will fit in 64k.  I am not sure unsigned
> optimizations for space really are significant in SQL.

A fair question. As I said, I only implemented them to simplify porting
applications between database systems. Personally I think it is good to
support types that make porting easier.

On the other hand the arguments about bloat are strong. It seems to me
that all cross-datatype functions should be removed, to reduce the
number of functions for the unsigned data types to a minimum. 

Would this be a reasonable compromise? 

If general opinion is that unsigned types should not be part of
postgres, I'll have to look at turning them into a contrib type. Please
let me know.

Cheers,

Adriaan


Re: Unsigned int functions

From
Thomas Lockhart
Date:
> If general opinion is that unsigned types should not be part of
> postgres, I'll have to look at turning them into a contrib type. Please
> let me know.

Providing them as a contrib/ package will allow you to provide the
*full* complement of cross-type conversion and operator functions
without worrying about bloat. Tom Lane has shown how to use entry points
on shared libraries to start us thinking about how to provide better
package integration, which should allow us to reduce the distinction
between contrib/ and standard features.

This would be a great package to develop these additional package
support features for 7.2 (hint hint ;)
                      - Thomas