Re: char or int vs varchar - Mailing list pgsql-novice

From Mike Oxford
Subject Re: char or int vs varchar
Date
Msg-id 08F5636B48FE8F4D856E209D0E6C06379F9504@hq01.corp.xfire.com
Whole thread Raw
In response to char or int vs varchar  ("Keith Worthington" <keithw@narrowpathinc.com>)
List pgsql-novice
> -----Original Message-----
> From: pgsql-novice-owner@postgresql.org [mailto:pgsql-novice-
> owner@postgresql.org] On Behalf Of Keith Worthington
> Sent: Friday, April 22, 2005 1:21 PM
> To: PostgreSQL Novice
> Subject: [NOVICE] char or int vs varchar


> I need to store a (at the moment) tristate variable.  Now this could be
> done
> as a char "L", "S", "T" as an int2 1, 2, 3 or as a varchar(8) "loose",
> "standard", "tight" among possibly other ways.

Indexing is fastest on integer types.  Char types are probably stored
underneath as integers (guessing.)  VARCHARs are variable allocation
and when you change the string length it may have to reallocate.

Generally integers are fastest, then single characters then CHAR[x]
types and then VARCHARS.

String compares in your queries are an issue, as well.
"... where x = 'loose'" could be an issue if you query it a lot.

CHAR[1] would work, or you can use a bit field (true,false,not-set) or
if you're creative, 2 bits (01,11,10,11) and you don't use one set.  Check
out the "bytea" types, but it's got a 4 byte header so you'll end up using
33 bits. :P

Personally, I'd use an "char" (note quotes - not a char(1) ) or an unsigned
smallint (in that order) so I don't have to worry about a sign-flag, I have
extended room for more flags and it indexes quickly.

> Finally there is the smallint at 2 bytes.  Almost the smallest in terms of
> disk usage I wonder how it will affect performance since in most cases I
> would
> probably have to use a CASE statement to convert to something a human
> would
> intuitively understand.

Using "full strings of varchar(8)" is probably your slowest and least
efficient route...but the most readable.

I don't think Postgres does enum() fields.  That would probably be ideal
in this case, since you could use human-readable strings and it'd
store/index based on the numeric offset.  Oh well.

G'luck.

-Mike / Szii




pgsql-novice by date:

Previous
From: "ruel.cima"
Date:
Subject: SSL use
Next
From: brew@theMode.com
Date:
Subject: Re: CHECK vs BEFORE trigger