Unclear CHARACTER specification - Mailing list pgsql-docs

From Lyle
Subject Unclear CHARACTER specification
Date
Msg-id 50E05787.2050506@cosmicperl.com
Whole thread Raw
List pgsql-docs
Hi,
   Here:
http://www.postgresql.org/docs/9.2/interactive/datatype-character.html

There isn't really clear guidance on ranges and limits. All we have to
go on is this paragraph:

=start
The storage requirement for a short string (up to 126 bytes) is 1 byte
plus the actual string, which includes the space padding in the case of
character. Longer strings have 4 bytes of overhead instead of 1. Long
strings are compressed by the system automatically, so the physical
requirement on disk might be less. Very long values are also stored in
background tables so that they do not interfere with rapid access to
shorter column values. In any case, the longest possible character
string that can be stored is about 1 GB. (The maximum value that will be
allowed for n in the data type declaration is less than that. It
wouldn't be useful to change this because with multibyte character
encodings the number of characters and bytes can be quite different. If
you desire to store long strings with no specific upper limit, use text
or character varying without a length specifier, rather than making up
an arbitrary length limit.)
=end

 From this I can deduce that if I want to keep the character string as a
short string CHAR(126) should achieve that, for single byte character
encodings at least.

Then it says about long and very long strings, but doesn't give any
figures. Nor does it say whether it decides a string is long or very
long before or after the compression. There must be byte length figures
associated with this, please could we have them?

Why is the longest possible string about 1GB? Is that compressed or
uncompressed? Why can't we have an exact figure? What compression is
used? If the amount allowed for n in the data type definition is less,
what is it?

I can't imagine the PostgreSQL code has something like
if ( length( string ) + RAND() > 1GB ) {
error();
}
Seeing how short, long, and very long strings have been mentioned, along
with compression, why not complete this description with specific figures?

Is there other documentation available that covers this?

Sorry for all the questions, but I really thing all of this should be
covered already. Even if it's platform specific, a table with platforms
and figures would do the job nicely.


Lyle



pgsql-docs by date:

Previous
From: Tom Lane
Date:
Subject: Re: pattern matching indexing
Next
From: Magnus Hagander
Date:
Subject: Re: pg_resetxlog -l