Re: Reducing data type space usage - Mailing list pgsql-hackers

From	Mark Dilger
Subject	Re: Reducing data type space usage
Date	September 15, 2006 22:55:11
Msg-id	450B2F43.1080403@markdilger.com
In response to	Reducing data type space usage (Gregory Stark <stark@enterprisedb.com>)
Responses	Re: Reducing data type space usage (Mark Dilger <pgsql@markdilger.com>)
List	pgsql-hackers

Gregory Stark wrote:
> 
<snip>
> 
> Case 2) Solving this is quite difficult without introducing major performance
>    problems or security holes. The one approach we have that's practical right
>    now is introducing special data types such as the oft-mentioned "char" data
>    type. "char" doesn't have quite the right semantics to use as a transparent
>    substitute for CHAR but we could define a CHAR(1) with exactly the right
>    semantics and substitute it transparently in parser/analyze.c (btw having
>    two files named analyze.c is pretty annoying). We could do the same with
>    NUMERIC(a,b) for sufficiently small values of a and b with something like
>    D'Arcy's CASH data type (which uses an integer internally).

Didn't we discuss a problem with using CHAR(n), specifically that the
number of bytes required to store n characters is variable?  I had
suggested making an ascii1 type, ascii2 type, etc.  Someone else seemed
to be saying those should be called bytea1, bytea2, or perhaps, with
parentheses, bytea(1), bytea(2).  The point is that each holds a fixed
number of bytes.
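
To illustrate the variable-width point, here is a minimal Python sketch
(not PostgreSQL code) that measures how many bytes a single character
occupies in UTF-8.  The helper name utf8_len and the sample characters
are made up for the example, and it assumes a UTF-8 database encoding.

```python
# Hypothetical helper: bytes needed to store one character in UTF-8.
def utf8_len(ch):
    return len(ch.encode("utf-8"))

# Four single characters, written as escapes so the source stays ASCII.
samples = {
    "LATIN SMALL LETTER A": "a",
    "LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "EURO SIGN": "\u20ac",
    "MUSICAL SYMBOL G CLEF": "\U0001d11e",
}

for name, ch in samples.items():
    # Each sample is one character, yet the byte count runs from 1 to 4.
    print(name, utf8_len(ch))
```

So a CHAR(1) column cannot assume one byte per value under such an
encoding, whereas a type defined in bytes (ascii1, bytea(1)) can.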

>    The problem with defining lots of data types is that the number of casts
>    and cross-data-type comparisons grows quadratically as the number of data
>    types grows. In theory we would save space by defining a CHAR(n) for
>    whatever size n the user needs but I can't really see anything other than
>    CHAR(1) being worthwhile. Similarly a 4-byte NUMERIC substitute like CASH
>    (with full NUMERIC semantics though) and maybe a 2-byte and 8-byte
>    substitute might be reasonable but anything else would be pointless.

Wouldn't a 4-byte numeric be a "float4" and an 8-byte numeric be a
"float8"?  I'm not sure I see the difference.  As for a 2-byte floating
point number, I like the idea and will look for an IEEE specification
for how the bits are arranged, if any such IEEE spec exists.
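
A small sketch of where a float4 and an integer-backed type like CASH
might differ, plus one possible 2-byte layout.  This is Python, not
backend code, and the helper names as_float4 and half_fields are made up
for the example.  It assumes the 16-bit layout (1 sign bit, 5 exponent
bits, 10 fraction bits) that was later standardized as binary16 in
IEEE 754-2008, which Python's struct module exposes as the "e" format.

```python
import struct
from decimal import Decimal

def as_float4(x):
    """Round a Python float (binary64) to the nearest 4-byte binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def half_fields(x):
    """Return (sign, exponent, fraction) of x encoded as a 2-byte binary16."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    return bits >> 15, (bits >> 10) & 0x1F, bits & 0x3FF

# A binary float stores the nearest binary fraction to 0.10, not 0.10 itself.
print(Decimal(as_float4(0.10)) == Decimal("0.10"))  # False

# An integer-backed decimal (assumed here: amounts kept as whole cents)
# stores 0.10 exactly as the integer 10.
cents = 10 + 20
print(cents == 30)  # True

# Bit fields of 1.0 in binary16: sign 0, biased exponent 15, fraction 0.
print(half_fields(1.0))  # (0, 15, 0)
```

If that is right, the difference is exactness: float4 and float8 are
binary approximations, while a NUMERIC substitute would have to keep
exact decimal semantics.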

mark