Re: Reducing data type space usage - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: Reducing data type space usage
Date
Msg-id 450B2F43.1080403@markdilger.com
In response to Reducing data type space usage  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Reducing data type space usage  (Mark Dilger <pgsql@markdilger.com>)
List pgsql-hackers
Gregory Stark wrote:
> 
<snip>
> 
> Case 2) Solving this is quite difficult without introducing major performance
>    problems or security holes. The one approach we have that's practical right
>    now is introducing special data types such as the oft-mentioned "char" data
>    type. "char" doesn't have quite the right semantics to use as a transparent
>    substitute for CHAR but we could define a CHAR(1) with exactly the right
>    semantics and substitute it transparently in parser/analyze.c (btw having
>    two files named analyze.c is pretty annoying). We could do the same with
>    NUMERIC(a,b) for sufficiently small values of a and b with something like
>    D'Arcy's CASH data type (which uses an integer internally).

Didn't we discuss a problem with using CHAR(n), specifically that the 
number of bytes required to store n characters is variable (in a 
multibyte encoding such as UTF-8)?  I had suggested making an ascii1 
type, an ascii2 type, etc.  Someone else seemed to be saying that these 
should be called bytea1, bytea2, or perhaps, with parentheses, 
bytea(1), bytea(2).  The point is that each holds a fixed number of 
bytes.
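
To make the variable-width concern concrete, here is a minimal sketch in Python (not PostgreSQL code). The helper name encoded_length is hypothetical and exists only for this illustration; it assumes the database encoding behaves like Python's codec of the same name. It shows that a single character can need anywhere from 1 to 4 bytes in UTF-8, while a single-byte encoding such as Latin-1 always needs exactly 1.

```python
# Illustration only: how many bytes does one character need?
# `encoded_length` is a hypothetical helper, not a PostgreSQL function.

def encoded_length(text: str, encoding: str = "utf-8") -> int:
    """Return the number of bytes needed to store `text` in `encoding`."""
    return len(text.encode(encoding))


# Each sample is exactly one character (one code point).
samples = {
    "LATIN SMALL LETTER A": "a",
    "LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "EURO SIGN": "\u20ac",
    "MUSICAL SYMBOL G CLEF": "\U0001d11e",
}

for name, ch in samples.items():
    # len(ch) counts characters; encoded_length counts storage bytes.
    print(f"{name}: {len(ch)} char, {encoded_length(ch)} byte(s) in UTF-8")
```

Under these assumptions, a CHAR(1) column cannot be a fixed one-byte field unless the encoding is single-byte, which is the motivation for a type defined by byte count rather than character count.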

>    The problem with defining lots of data types is that the number of casts
>    and cross-data-type comparisons grows quadratically as the number of data
>    types grows. In theory we would save space by defining a CHAR(n) for
>    whatever size n the user needs but I can't really see anything other than
>    CHAR(1) being worthwhile. Similarly a 4-byte NUMERIC substitute like CASH
>    (with full NUMERIC semantics though) and maybe a 2-byte and 8-byte
>    substitute might be reasonable but anything else would be pointless.

Wouldn't a 4-byte numeric be a "float4" and an 8-byte numeric be a 
"float8"?  I'm not sure I see the difference.  As for a 2-byte floating 
point number, I like the idea and will look for an IEEE specification 
for how the bits are arranged, if any such spec exists.
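
As a reference point for the bit arrangement, the sketch below decodes a 16-bit pattern using the layout standardized as binary16 in IEEE 754-2008: 1 sign bit, 5 exponent bits (bias 15), and 10 fraction bits. The function name decode_half is hypothetical, and this is only an illustration in Python, not a proposal for how PostgreSQL would store such a type.

```python
# Illustration only: decode an IEEE 754 binary16 ("half precision") bit pattern.
# `decode_half` is a hypothetical name chosen for this sketch.

def decode_half(bits: int) -> float:
    """Interpret the low 16 bits of `bits` as a binary16 value."""
    sign = -1.0 if (bits >> 15) & 0x1 else 1.0
    exponent = (bits >> 10) & 0x1F   # 5-bit biased exponent
    fraction = bits & 0x3FF          # 10-bit fraction

    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, fixed scale of 2**-24.
        return sign * fraction * 2.0 ** -24
    if exponent == 0x1F:
        # All-ones exponent encodes infinity (fraction 0) or NaN.
        return sign * float("inf") if fraction == 0 else float("nan")
    # Normal number: implicit leading 1, exponent bias of 15.
    return sign * (1.0 + fraction / 1024.0) * 2.0 ** (exponent - 15)


for pattern in (0x3C00, 0xC000, 0x7BFF, 0x0001):
    print(f"0x{pattern:04X} -> {decode_half(pattern)!r}")
```

With only 10 fraction bits, such a type holds roughly three decimal digits of precision and tops out at 65504, which bears on whether it could stand in for a small NUMERIC.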

mark

