Re: 1 char in the world - Mailing list pgsql-performance
From | Hannu Krosing |
---|---|
Subject | Re: 1 char in the world |
Date | |
Msg-id | 1043842700.5008.11.camel@huli Whole thread Raw |
In response to | 1 char in the world (Matt Mello <alien@spaceship.com>) |
Responses |
Re: 1 char in the world
|
List | pgsql-performance |
On Wed, 2003-01-29 at 06:22, Matt Mello wrote: > TEXT vs "char" ... vs BOOLEAN > > I am porting from Informix to PG. In doing so, I had to pick some data > types for fields, and began wondering about the performance of char/text > fields with one character. For example, I have a field which has one of > the following values/states: {'A', 'D', 'F', 'U'}. Since CHAR(n), > VARCHAR, and TEXT are all supposed to have the same performance > according to the docs, it seems that they will all perform the same. > For this reason, I did not squabble over which one of these to use. > However, since "char" is implemented differently, I thought I would > compare it to one of the others. I chose to pit TEXT against "char". > > Test query = explain analyze select count(*) from table where onechar='D'; > Table size = 512 wide [mostly TEXT] * 400000 rows > Performance averages: > "char" 44ms > TEXT 63ms > > This seems somewhat reasonable, and makes me want to use "char" for my > single-char field. Does everyone else find this to be reasonable? Is > this pretty much the behavior I can expect on extraordinarily large > tables, too? The actual compares will likely stay faster for char than for text. OTOH the actual storage of one-char datatype should not play so significant role for very large tables, even if this is the only field in that table, as most of the overhead will be in other places - storage overhead in page/tuple headers, performance in retrieving the pages/tuples and cache lookups, etc. Also, for very big tables you will most likely want to restrict selects on other criteria than a 4-valued field, so that indexes could be used in retrieving data. > And, should I worry about things like the backend > developers removing "char" as a type later? > > -- > > This naturally led me to another question. How do TEXT, "char", and > BOOLEAN compare for storing t/f values. The test results I saw were > surprising. > > Test query= > "char"/TEXT: explain analyze select count(*) from table where bool='Y'; You could also try just select count(*) from table where bool; > boolean: explain analyze select count(*) from table where bool=true; > Table size (see above) > Performance averages: > TEXT 24ms > BOOLEAN 28ms > "char" 17ms > > Why does boolean rate closer to TEXT than "char"? I would think that > BOOLEANs would actually be stored like "char"s to prevent using the > extra 4 bytes with TEXT types. > > Based on these results, I will probably store my booleans as "char" > instead of boolean. I don't use stored procedures with my application > server, so I should never need my booleans to be the BOOLEAN type. I > can convert faster in my own code. > > -- > > NOTE: the above tests all had the same relative data in the different > fields (what was in TEXT could be found in "char", etc.) and were all > indexed equally. Did you repeat the texts enough times to be sure that you get reliable results ? > > Thanks! -- Hannu Krosing <hannu@tm.ee>
pgsql-performance by date: