Re: Fixed length data types issue - Mailing list pgsql-hackers

From	Gregory Stark
Subject	Re: Fixed length data types issue
Date	September 7, 2006 09:27:19
Msg-id	873bb3dfqi.fsf@enterprisedb.com
In response to	Re: Fixed length data types issue (Andrew - Supernews <andrew+nonews@supernews.com>)
Responses	Re: Fixed length data types issue
List	pgsql-hackers

Andrew - Supernews <andrew+nonews@supernews.com> writes:

> Are you sure? Perhaps you are assuming that a char(1) field can be made
> to be fixed-length; this is not the case (consider utf-8 for example).

Well, that could still be fixed length; it would just be a longer fixed length.
(In theory it would have to be 6 bytes long, which I suppose would open up the
argument that if you're usually storing 7-bit ASCII then a varlena would
usually be shorter.)
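
To make the size argument concrete, here is a rough Python sketch (not
PostgreSQL code). It assumes a 4-byte varlena length header, which is an
assumption not stated in this message, and it takes the 6-byte worst case
quoted above at face value. The helper names are hypothetical.

```python
# Illustrative only: compares a fixed-width char(1) slot with a varlena value.
# Assumptions (not from the original message): a 4-byte varlena length header,
# and a fixed slot sized for the 6-byte worst case quoted in the message.

VARLENA_HEADER_BYTES = 4   # assumed header size
FIXED_SLOT_BYTES = 6       # worst-case width quoted in the message


def utf8_len(ch: str) -> int:
    """Number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


def varlena_size(ch: str) -> int:
    """Assumed header plus UTF-8 payload for a single-character value."""
    return VARLENA_HEADER_BYTES + utf8_len(ch)


# One ASCII letter, one accented Latin letter, one currency symbol.
for ch in ["A", "\u00e9", "\u20ac"]:
    print(repr(ch), utf8_len(ch), varlena_size(ch), FIXED_SLOT_BYTES)
```

Under these assumptions a 7-bit ASCII character costs 5 bytes as a varlena
against 6 bytes in the fixed slot, which is the trade-off the parenthetical
above points at.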

In any case I think the intersection of columns for which you care about i18n
and columns that you're storing according to an old-fashioned fixed column
layout is pretty much nil. And not just because the layout hasn't been updated
to modern standards either. If you look again at the columns in my example
you'll see none of them are appropriate targets for i18n anyway. They're all
codes and even numbers.

In other words, if you're actually storing localized text then you almost
certainly will be using a text or varchar and probably won't even have a
maximum size. The use case for CHAR(n) is when you have fixed length,
statically defined strings that are always the same length. It doesn't make
sense to store these in UTF-8.

Currently Postgres has a limitation that you can only have one encoding per
database and one locale per cluster. Personally I'm of the opinion that the
only correct choice for that is "C" and all localization should be handled in
the client and with pg_strxfrm. Putting the whole database into non-C locales
guarantees that the columns that should not be localized will have broken
semantics and there's no way to work around things in the other direction.
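
The split described above, byte-order comparison in the database and
locale-aware ordering in the client, can be sketched in Python. This is only
an illustration: pg_strxfrm is not shown, the standard library's
locale.strxfrm stands in for it, and the function names are hypothetical.

```python
import locale


def c_locale_sort(values):
    """Order strings by their encoded bytes, as comparison in the "C" locale does."""
    return sorted(values, key=lambda s: s.encode("utf-8"))


def client_sort(values, key=locale.strxfrm):
    """Order strings on the client using a collation key.

    The default key follows whatever LC_COLLATE the client process has
    selected with locale.setlocale(); any other key function can be passed.
    """
    return sorted(values, key=key)


codes = ["b2", "A1", "a1", "B2"]

# Byte order: every uppercase ASCII letter sorts before every lowercase one.
print(c_locale_sort(codes))

# str.casefold is a simple stand-in for a real locale-aware collation key.
print(client_sort(codes, key=str.casefold))
```

The point of the sketch is that the byte-order result is the same on every
machine, while the client-side result depends entirely on the key the client
chooses.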

Perhaps, given the current situation, what we should have are cvarchar and
cchar data types that are like varchar and char but guaranteed to always be
interpreted in the C locale with ASCII encoding.
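
As a rough illustration of what such a type would guarantee, here is a Python
sketch of a hypothetical cchar(n) value: ASCII only, exactly n bytes, compared
byte by byte. Nothing like this exists in PostgreSQL; the function name and
the space-padding behaviour are assumptions modelled on char(n).

```python
def cchar(value: str, n: int) -> bytes:
    """Hypothetical cchar(n): exactly n bytes of ASCII, space-padded."""
    # str.encode("ascii") raises UnicodeEncodeError on any non-ASCII input.
    raw = value.encode("ascii")
    if len(raw) > n:
        raise ValueError(f"value too long for cchar({n})")
    # Pad on the right with spaces, as char(n) does (assumed behaviour).
    return raw.ljust(n, b" ")


us = cchar("US", 3)
print(us, len(us))

# Python compares bytes objects byte by byte, like a "C" locale comparison.
print(cchar("AB", 3) < cchar("ab", 3))
```

Because every value is one byte per character, the stored width is exactly n
and the comparison never depends on the database locale.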

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com
