Re: 1 char in the world - Mailing list pgsql-performance

From Hannu Krosing
Subject Re: 1 char in the world
Date
Msg-id 1043842700.5008.11.camel@huli
Whole thread Raw
In response to 1 char in the world  (Matt Mello <alien@spaceship.com>)
Responses Re: 1 char in the world
List pgsql-performance
On Wed, 2003-01-29 at 06:22, Matt Mello wrote:
> TEXT vs "char" ... vs BOOLEAN
>
> I am porting from Informix to PG.  In doing so, I had to pick some data
> types for fields, and began wondering about the performance of char/text
> fields with one character.  For example, I have a field which has one of
> the following values/states: {'A', 'D', 'F', 'U'}.  Since CHAR(n),
> VARCHAR, and TEXT are all supposed to have the same performance
> according to the docs, it seems that they will all perform the same.
> For this reason, I did not squabble over which one of these to use.
> However, since "char" is implemented differently, I thought I would
> compare it to one of the others.  I chose to pit TEXT against "char".
>
> Test query = explain analyze select count(*) from table where onechar='D';
> Table size = 512 wide [mostly TEXT] * 400000 rows
> Performance averages:
>   "char"  44ms
>   TEXT    63ms
>
> This seems somewhat reasonable, and makes me want to use "char" for my
> single-char field.  Does everyone else find this to be reasonable?  Is
> this pretty much the behavior I can expect on extraordinarily large
> tables, too?

The actual compares will likely stay faster for char than for text.

OTOH the actual storage of one-char datatype should not play so
significant role for very large tables, even if this is the only field
in that table, as most of the overhead will be in other places - storage
overhead in page/tuple headers, performance in retrieving the
pages/tuples and cache lookups, etc.

Also, for very big tables you will most likely want to restrict selects
on other criteria than a 4-valued field, so that indexes could be used
in retrieving data.

>  And, should I worry about things like the backend
> developers removing "char" as a type later?
>
> --
>
> This naturally led me to another question.  How do TEXT, "char", and
> BOOLEAN compare for storing t/f values.  The test results I saw were
> surprising.
>
> Test query=
>   "char"/TEXT: explain analyze select count(*) from table where bool='Y';

You could also try just

select count(*) from table where bool;

>   boolean:    explain analyze select count(*) from table where bool=true;
> Table size (see above)
> Performance averages:
>   TEXT    24ms
>   BOOLEAN 28ms
> "char"   17ms
>
> Why does boolean rate closer to TEXT than "char"?  I would think that
> BOOLEANs would actually be stored like "char"s to prevent using the
> extra 4 bytes with TEXT types.
>
> Based on these results, I will probably store my booleans as "char"
> instead of boolean.  I don't use stored procedures with my application
> server, so I should never need my booleans to be the BOOLEAN type.  I
> can convert faster in my own code.
>
> --
>
> NOTE: the above tests all had the same relative data in the different
> fields (what was in TEXT could be found in "char", etc.) and were all
> indexed equally.

Did you repeat the texts enough times to be sure that you get reliable
results ?

>
> Thanks!
--
Hannu Krosing <hannu@tm.ee>

pgsql-performance by date:

Previous
From: Justin Clift
Date:
Subject: Re: 1 char in the world
Next
From: Tom Lane
Date:
Subject: Re: 1 char in the world