I'm running PG 8.0.3 on WinXP and I'm coming across some performance issues related to text columns. Basically, it appears as though PG is storing the text data inline with the rest of the row data as queries that don't touch the text column are slower when there is data in the text column than when there isn't. According to section 8.3 of the doc:
"Long values are also stored in background tables so they do not interfere with rapid access to the shorter column values."
So, how long does a value have to be to be considered "long"?
If necessary, here is some more specific information about what I'm doing:
1) I create a new table and use 'COPY FROM' to populate it. When the data in the text column is limited to a max of 60 characters, this part takes 2-3 minutes less than when the data is at its full size. The table will be populated with ~750k rows. Here's an example of the table I create (no, I didn't name the fields "vc_field1", "vc_field2", etc ;) ):
2) I populate i_field1 and i_field2 from lookup tables. This step takes about 7 mins longer with the full text data than with the limited data.
update my_table_import set i_field1 = f.i_field1, i_field2 = u.i_field2 from lookup1 as f, lookup2 as u where vc_field2 = f.vc_field2 and vc_field1 = u.vc_field1;
3) I then create an index on this table and run a couple of queries on it. Each of these queries takes about 10 minutes longer with the full text data then without it. Here's the index that I create and an example of one of the queries that I run:
create index idx_my_table_import_i1_i2 on my_table_import (i_field1, i_field2); analyze my_table_import;
select i_field1, i_field2, max(ts_field1) as ts_field1, count(*) as dup_count from my_table_import where i_field1 between 0 and 9999 group by i_field1, i_field2