Thread: A modest proposal for a FAQ addition

A modest proposal for a FAQ addition

From
Tom Lane
Date:
Q: Why do I get strange results with a CHAR(n) field?

A. Don't use CHAR(n).  VARCHAR(n) has the behavior you are probably
expecting; on top of which it's more compact and usually faster.


I suppose the above needs some fleshing out, but man am I getting tired
of explaining about significant vs non-significant trailing blanks.
        regards, tom lane


Re: A modest proposal for a FAQ addition

From
Bruce Momjian
Date:
Tom Lane wrote:
> Q: Why do I get strange results with a CHAR(n) field?
>
> A. Don't use CHAR(n).  VARCHAR(n) has the behavior you are probably
> expecting; on top of which it's more compact and usually faster.
>
>
> I suppose the above needs some fleshing out, but man am I getting tired
> of explaining about significant vs non-significant trailing blanks.

OK, good point.  I was mentioning CHAR() in the FAQ entry first, while
it should have been mentioned later.  I also added a specific mention of
the trailing spaces issue.  Patch attached.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
*** FAQ.html    Sun Jan 12 00:15:28 2003
--- /bjm/FAQ.html    Sun Jan 12 00:15:25 2003
***************
*** 1052,1066 ****
      stored out-of-line by <SMALL>TOAST</SMALL>, so the space on disk
      might also be less than expected.</P>

!     <SMALL>VARCHAR(n)</SMALL> is best when storing variable-length
!     strings but it limits how long a string can be. <SMALL>TEXT</SMALL>
!     is for strings of unlimited length, maximum 1 gigabyte.
!     <P><SMALL>CHAR(n)</SMALL> is for storing strings that are all the
!     same length. <SMALL>CHAR(n)</SMALL> stores trailing spaces, while
!     <SMALL>VARCHAR(n)</SMALL> trims them. <SMALL>BYTEA</SMALL> is for
!     storing binary data, particularly values that include
!     <SMALL>NULL</SMALL> bytes. These types have similar performance
!     characteristics.</P>

      <H4><A name="4.15.1">4.15.1</A>) How do I create a
      serial/auto-incrementing field?</H4>
--- 1052,1064 ----
      stored out-of-line by <SMALL>TOAST</SMALL>, so the space on disk
      might also be less than expected.</P>

!     <P><SMALL>CHAR(n)</SMALL> is best when storing strings that are
!     usually the same length. <SMALL>VARCHAR(n)</SMALL> is best when
!     storing variable-length strings but it limits how long a string can
!     be. <SMALL>TEXT</SMALL> is for strings of unlimited length, maximum
!     1 gigabyte. <SMALL>BYTEA</SMALL> is for storing binary data,
!     particularly values that include <SMALL>NULL</SMALL> bytes. These
!     types have similar performance characteristics.</P>

      <H4><A name="4.15.1">4.15.1</A>) How do I create a
      serial/auto-incrementing field?</H4>

Re: A modest proposal for a FAQ addition

From
Ian Barwick
Date:
On Sunday 12 January 2003 06:17, Bruce Momjian wrote:
> Tom Lane wrote:
> > Q: Why do I get strange results with a CHAR(n) field?
> >
> > A. Don't use CHAR(n).  VARCHAR(n) has the behavior you are probably
> > expecting; on top of which it's more compact and usually faster.
> >
> >
> > I suppose the above needs some fleshing out, but man am I getting tired
> > of explaining about significant vs non-significant trailing blanks.
>
> OK, good point.  I was mentioning CHAR() in the FAQ entry first, while
> it should have been mentioned later.  I also added a specific mention of
> the trailing spaces issue.  Patch attached.

Err, from the patch:

"CHAR(n) stores trailing spaces, while VARCHAR(n) trims them."

Surely this should read something like:

"CHAR(n) automatically pads strings with trailing blanks to the defined
column length."


Ian Barwick
barwick@gmx.net



Re: A modest proposal for a FAQ addition

From
Bruce Momjian
Date:
Ian Barwick wrote:
> On Sunday 12 January 2003 06:17, Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Q: Why do I get strange results with a CHAR(n) field?
> > >
> > > A. Don't use CHAR(n).  VARCHAR(n) has the behavior you are probably
> > > expecting; on top of which it's more compact and usually faster.
> > >
> > >
> > > I suppose the above needs some fleshing out, but man am I getting tired
> > > of explaining about significant vs non-significant trailing blanks.
> >
> > OK, good point.  I was mentioning CHAR() in the FAQ entry first, while
> > it should have been mentioned later.  I also added a specific mention of
> > the trailing spaces issue.  Patch attached.
> 
> Err, from the patch:
> 
> "CHAR(n) stores trailing spaces, while VARCHAR(n) trims them."
> 
> Surely this should read something like:
> 
> "CHAR(n) automatically pads strings with trailing blanks to the defined
> column length."

OK, new text is:
   <SMALL>VARCHAR(n)</SMALL> is best when storing variable-length   strings but it limits how long a string can be.
<SMALL>TEXT</SMALL>  is for strings of unlimited length, maximum 1 gigabyte.   <P><SMALL>CHAR(n)</SMALL> is for storing
stringsthat are all the   same length. <SMALL>CHAR(n)</SMALL> pads with blanks to the specified   length, while
<SMALL>VARCHAR(n)</SMALL>only stores the characters   supplied.  <SMALL>BYTEA</SMALL> is for storing binary data,
particularlyvalues that include <SMALL>NULL</SMALL> bytes. These   types have similar performance characteristics.</P>
 


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: A modest proposal for a FAQ addition

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> OK, new text is:

I think Tom specifically wanted the notion "don't use CHAR(n), it has
unusual behavior" to appear prominently in the FAQ.  The current text
simply rehashes the documentation.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: A modest proposal for a FAQ addition

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > OK, new text is:
> 
> I think Tom specifically wanted the notion "don't use CHAR(n), it has
> unusual behavior" to appear prominently in the FAQ.  The current text
> simply rehashes the documentation.

I can't say "don't use CHAR(n)" because there are valid reasons to use
it.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: A modest proposal for a FAQ addition

From
Ian Barwick
Date:
On Sunday 12 January 2003 17:55, Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > Bruce Momjian writes:
> > > OK, new text is:
> >
> > I think Tom specifically wanted the notion "don't use CHAR(n), it has
> > unusual behavior" to appear prominently in the FAQ.  The current text
> > simply rehashes the documentation.
>
> I can't say "don't use CHAR(n)" because there are valid reasons to use
> it.

I think what Tom is saying is "always use VARCHAR(n) unless you know
for sure CHAR(n) is what you want, because if you slept through that part of
the SQL course CHAR(n) is not what you might think."

How about something like:

"4.14.1 Why do operations on CHAR(n) columns produce strange results?

Data inserted into a CHAR(n) column will be automatically padded with blanks
to the specified column length. This makes some operations, particularly
comparisions, appear to return unexpected results. For example, if you
insert the string 'hello' (5 characters) into a column defined as CHAR(8) it
will become 'hello   ' (8 characters) and simple comparisions with the
original 'hello' will fail.

Always define columns with VARCHAR(n) unless you have specific reasons for
using CHAR(n)."


Ian Barwick
barwick@gmx.net



Re: A modest proposal for a FAQ addition

From
Tom Lane
Date:
Ian Barwick <barwick@gmx.net> writes:
> On Sunday 12 January 2003 17:55, Bruce Momjian wrote:
>> I can't say "don't use CHAR(n)" because there are valid reasons to use
>> it.

> I think what Tom is saying is "always use VARCHAR(n) unless you know
> for sure CHAR(n) is what you want, because if you slept through that part of 
> the SQL course CHAR(n) is not what you might think."

Yes.  It is not clear from either the FAQ or the documentation that
CHAR() should not be one's default choice for a character field.
        regards, tom lane


Re: A modest proposal for a FAQ addition

From
Bruce Momjian
Date:
Tom Lane wrote:
> Ian Barwick <barwick@gmx.net> writes:
> > On Sunday 12 January 2003 17:55, Bruce Momjian wrote:
> >> I can't say "don't use CHAR(n)" because there are valid reasons to use
> >> it.
> 
> > I think what Tom is saying is "always use VARCHAR(n) unless you know
> > for sure CHAR(n) is what you want, because if you slept through that part of 
> > the SQL course CHAR(n) is not what you might think."
> 
> Yes.  It is not clear from either the FAQ or the documentation that
> CHAR() should not be one's default choice for a character field.

I think part of our problem is that we say CHAR() first, then VARCHAR(),
in the docs and the FAQ.  This of course suggests to look at CHAR()
first, then VARCHAR(), which is wrong.  I have fixed the FAQ, and now
the SGML docs.  I think this will help.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073