Re: Performance impact of record sizes - Mailing list pgsql-admin

From Tom Lane
Subject Re: Performance impact of record sizes
Date
Msg-id 6353.1025826585@sss.pgh.pa.us
Whole thread Raw
In response to Re: Performance impact of record sizes  (John Moore <postgres@tinyvital.com>)
Responses Re: Performance impact of record sizes
Re: Performance impact of record sizes
Re: Performance impact of record sizes
Re: Performance impact of record sizes
List pgsql-admin
John Moore <postgres@tinyvital.com> writes:
> So I *suspect* I want to keep the data in the physical row, rather than
> using TEXT and having it stored separately from the record.

You seem to be reading something into the TEXT type that's not there;
perhaps you are carrying over associations from some other DBMS?
Here's how it works in Postgres:

The *only* performance difference between TEXT and VARCHAR(n) is that
for a VARCHAR(n) column, on insert or update there's an extra function
call applied to the new value to verify that its length is within N.
TEXT makes no such check; so TEXT has a small performance advantage if
there are lots of updates.  On the SELECT side there's no difference.

Both TEXT and VARCHAR(n) data is stored using TOAST (automatic
compression or out-of-line storage) if the row gets too big.  There
is no difference between the two datatypes on this score.

For the particular application you describe I think TOAST storage will
be ideal, since the overhead of compression or out-of-line storage is
only paid on the few rows where it's needed.

What you really ought to be asking yourself is whether (for this
application) a hard limit on the field width makes sense at all.
IMHO you should only use VARCHAR(n) if you can make a defensible
argument for the particular value of N you are using.  If you can't
explain why you are setting the field width limit, then you probably
do not need a specific limit at all, and should be using TEXT.

> A related question: is it more expensive to use varchar than fixed char
> fields?

There is no savings from using CHAR(n) --- most of the time it's a
substantial loss, because of the extra I/O costs associated with the
space used by all those padding blanks.  I can think of very very few
applications where CHAR(n) is really a sensible choice over VARCHAR(n).
US state postal codes (CHAR(2)) are an example, but beyond that the
valid use cases are mighty thin on the ground.

Bruce, it occurs to me that the "Performance Tips" section of the user's
guide ought to have an explanation of the performance implications of
TOAST.  We have some material in the FAQ but there's precious little
mention in the manual.  Thoughts?

            regards, tom lane



pgsql-admin by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Performance impact of record sizes
Next
From: Bruce Momjian
Date:
Subject: Re: Performance impact of record sizes