Thread: CHAR(N) storage requirement

CHAR(N) storage requirement

From
David Garamond
Date:
Is it the 4+N (aka. same as VARCHAR(n)) or is it N? Sorry, it was 100%
not clear for me after reading the docs, though the docs imply the
first: "The storage requirement for data of these types is 4 bytes plus
the actual string, and in case of character plus the padding."

As a comparison, MySQL seems to do storage saving for fixed-length
character (it doesn't store the length of the string).

--
dave

Re: CHAR(N) storage requirement

From
Peter Eisentraut
Date:
David Garamond wrote:
> Is it the 4+N (aka. same as VARCHAR(n)) or is it N? Sorry, it was
> 100% not clear for me after reading the docs, though the docs imply
> the first: "The storage requirement for data of these types is 4
> bytes plus the actual string, and in case of character plus the
> padding."

Storing varchar(n) takes 4 bytes plus as many bytes as are required to
store the actual string.  This may be more or less then "n".

Storing char(n) takes 4 bytes plus as many bytes are are required to
store the actual string, plus n - length(value) bytes for padding
spaces.  This is at least "n" bytes.


Re: CHAR(N) storage requirement

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Storing char(n) takes 4 bytes plus as many bytes are are required to
> store the actual string, plus n - length(value) bytes for padding
> spaces.  This is at least "n" bytes.

Peter omitted one critical point that I think David hasn't absorbed
yet: char(N) measures N in characters, not bytes.  When using a
multibyte encoding, N characters may require more than N bytes.
Only in single-byte encodings can you make any simple statements
about the number of bytes occupied by char(N).  This is why the docs
are a bit vague.

            regards, tom lane

Re: CHAR(N) storage requirement

From
David Garamond
Date:
Peter Eisentraut wrote:
>>Is it the 4+N (aka. same as VARCHAR(n)) or is it N? Sorry, it was
>>100% not clear for me after reading the docs, though the docs imply
>>the first: "The storage requirement for data of these types is 4
>>bytes plus the actual string, and in case of character plus the
>>padding."
>
> Storing varchar(n) takes 4 bytes plus as many bytes as are required to
> store the actual string.  This may be more or less then "n".
>
> Storing char(n) takes 4 bytes plus as many bytes are are required to
> store the actual string, plus n - length(value) bytes for padding
> spaces.  This is at least "n" bytes.

I see. Then there is no real benefits at all in using CHAR(N) in
PostgreSQL, is that right? I wonder why there is no storage saving done
by PostgreSQL for CHAR compared to VARCHAR (since the string length of
CHAR field will be the same for all rows)... Or perhaps I can use array
or CREATE TYPE to achieve this (that is, minimizing storage to only "n"
bytes to store CHAR(n)). Assuming no multibyte/Unicode, only US-ASCII
values.

--
dave


Re: CHAR(N) storage requirement

From
Bruce Momjian
Date:
David Garamond wrote:
> Peter Eisentraut wrote:
> >>Is it the 4+N (aka. same as VARCHAR(n)) or is it N? Sorry, it was
> >>100% not clear for me after reading the docs, though the docs imply
> >>the first: "The storage requirement for data of these types is 4
> >>bytes plus the actual string, and in case of character plus the
> >>padding."
> >
> > Storing varchar(n) takes 4 bytes plus as many bytes as are required to
> > store the actual string.  This may be more or less then "n".
> >
> > Storing char(n) takes 4 bytes plus as many bytes are are required to
> > store the actual string, plus n - length(value) bytes for padding
> > spaces.  This is at least "n" bytes.
>
> I see. Then there is no real benefits at all in using CHAR(N) in
> PostgreSQL, is that right? I wonder why there is no storage saving done
> by PostgreSQL for CHAR compared to VARCHAR (since the string length of
> CHAR field will be the same for all rows)... Or perhaps I can use array
> or CREATE TYPE to achieve this (that is, minimizing storage to only "n"
> bytes to store CHAR(n)). Assuming no multibyte/Unicode, only US-ASCII
> values.

The reason we store the length on disk for CHAR() is that the routines
to handle variable-length data types are used by all variable-length
data types, not just CHAR(), and there isn't an easy way to pass that
information around.  The only way we could do it would be to pull the
row from disk, add needed lengths so they can be passed around to the
inernal routines, then strip them when writing --- it didn't seem worth
it.


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073