Thread: varchar(n) VS text

varchar(n) VS text

From
"Pierre Thibaudeau"
Date:
Having read http://www.postgresql.org/docs/8.2/interactive/datatype-character.html
I am puzzling over this issue:

1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type?

2) For instance, if I know that a character-type column will never
contain more than 300 characters, would I be better to define it as
"varchar(300)" or as "text"?

3) What if, in practice that same column usually contains strings of
no more than 10 characters (although I can't guarantee that it
wouldn't explode up to 300 in the one-in-a-million case)?

4) On the other hand, what if a column systematically contains strings
of 5 or fewer characters.  Is it better to define it as "varchar(5)"
or as "text"?

From my reading of the dataype documentation, the ONLY reason I can
think of for using "varchar(n)" would be in order to add an extra
data-type constraint to the column.

Re: varchar(n) VS text

From
Ben
Date:
On Mon, 25 Jun 2007, Pierre Thibaudeau wrote:

> From my reading of the dataype documentation, the ONLY reason I can
> think of for using "varchar(n)" would be in order to add an extra
> data-type constraint to the column.

That's my understanding as well. I can think of a few reasons to use
char(n) over text if you know exactly how many characters you will always
have, but char isn't what you asked about.

Re: varchar(n) VS text

From
Tom Lane
Date:
"Pierre Thibaudeau" <pierdeux@gmail.com> writes:
> I am puzzling over this issue:

> 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type?

In words of one syllable: no.

Not unless you have an application requirement for a specific maximum
length limit (eg, your client code will crash if fed a string longer
than 256 bytes, or there's a genuine data-validity constraint that you
can enforce this way).

Or if you want to have schema-level portability to some other DB that
understands varchar(N) but not text.  (varchar(N) is SQL-standard,
while text isn't, so I'm sure there are some such out there.)

> From my reading of the dataype documentation, the ONLY reason I can
> think of for using "varchar(n)" would be in order to add an extra
> data-type constraint to the column.

That is *exactly* what it does.  No more and no less.  There's no
performance advantage, in fact you can expect to lose a few cycles
to the constraint check.

            regards, tom lane

Re: varchar(n) VS text

From
Paul Lambert
Date:
Tom Lane wrote:
> "Pierre Thibaudeau" <pierdeux@gmail.com> writes:
>> I am puzzling over this issue:
>
>> 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type?
>
> In words of one syllable: no.
>
> Not unless you have an application requirement for a specific maximum
> length limit (eg, your client code will crash if fed a string longer
> than 256 bytes, or there's a genuine data-validity constraint that you
> can enforce this way).
>
> Or if you want to have schema-level portability to some other DB that
> understands varchar(N) but not text.  (varchar(N) is SQL-standard,
> while text isn't, so I'm sure there are some such out there.)
>
>> From my reading of the dataype documentation, the ONLY reason I can
>> think of for using "varchar(n)" would be in order to add an extra
>> data-type constraint to the column.
>
> That is *exactly* what it does.  No more and no less.  There's no
> performance advantage, in fact you can expect to lose a few cycles
> to the constraint check.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org/
>
>

Is there any disk space advantages to using varchar over text? Or will a
text field only ever use up as much data as it needs.

I have a database where pretty much all text-type fields are created as
varchars - I inherited this db from an MS SQL server and left them as
varchar when I converted the database over to PG. My thoughts were text
being a non-constrained data type may use up more disk space than a
varchar and if I know there will never be more than 3 characters in the
field for example, I could save some space by only creating a 3 length
field.

In my case, any field length restrictions are governed by the
application so I don't really need the constraint built into the back
end. If there is a slight performance disadvantage to using varchar and
no real disk space saving - and I have in some cases 40 or 50 of these
fields in a table - then would it be better for me to convert these
fields to text?.

Not to mention that I run into a problem occasionally where inputting a
string that contains an apostraphe - PG behaves differently if it is a
varchar to if it is a text type and my app occasionally fails.

I.e.
insert into tester (test_varchar) values ('abc''test');
I get the following:
ERROR: array value must start with "{" or dimension information
SQL state: 22P02

If I use the same command but inserting into a text-type field.
insert into tester (test_text) values ('abc''test');
It works fine.

But that's beside the point - my question is should I convert everything
to text fields and, if so, is there any easy way of writting a script to
change all varchar fields to text?

--
Paul Lambert
Database Administrator
AutoLedgers


Re: varchar(n) VS text

From
Michael Glaesemann
Date:
On Jun 27, 2007, at 19:38 , Paul Lambert wrote:

> Is there any disk space advantages to using varchar over text?

No.

> Or will a text field only ever use up as much data as it needs.

Yes.

 From http://www.postgresql.org/docs/8.2/interactive/datatype-
character.html

> The storage requirement for data of these types is 4 bytes plus the
> actual string, and in case of character plus the padding. Long
> strings are compressed by the system automatically, so the physical
> requirement on disk may be less. Long values are also stored in
> background tables so they do not interfere with rapid access to the
> shorter column values. In any case, the longest possible character
> string that can be stored is about 1 GB. (The maximum value that
> will be allowed for n in the data type declaration is less than
> that. It wouldn't be very useful to change this because with
> multibyte character encodings the number of characters and bytes
> can be quite different anyway. If you desire to store long strings
> with no specific upper limit, use text or character varying without
> a length specifier, rather than making up an arbitrary length limit.)
>
>     Tip: There are no performance differences between these three
> types, apart from the increased storage size when using the blank-
> padded type. While character(n) has performance advantages in some
> other database systems, it has no such advantages in PostgreSQL. In
> most situations text or character varying should be used instead.


>  then would it be better for me to convert these fields to text?.

Probably not. See above.

> Not to mention that I run into a problem occasionally where
> inputting a string that contains an apostraphe - PG behaves
> differently if it is a varchar to if it is a text type and my app
> occasionally fails.
>
> I.e.
> insert into tester (test_varchar) values ('abc''test');
> I get the following:
> ERROR: array value must start with "{" or dimension information
> SQL state: 22P02

Works for me:

test=# select version();

version
------------------------------------------------------------------------
----------------------------------------------------------------------
PostgreSQL 8.2.4 on powerpc-apple-darwin8.9.0, compiled by GCC
powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc.
build 5367)
(1 row)

test=# create table tester (test_varchar varchar primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"tester_pkey" for table "tester"
CREATE TABLE
test=# insert into tester (test_varchar) values ('abc''test');
INSERT 0 1
test=# select * from tester;
test_varchar
--------------
abc'test
(1 row)

> But that's beside the point - my question is should I convert
> everything to text fields and, if so, is there any easy way of
> writting a script to change all varchar fields to text?

It's probably not worth the effort, but if you're interested you
could query the system catalogs for varchar columns and write a
script that would update them for you.

Michael Glaesemann
grzm seespotcode net



Re: varchar(n) VS text

From
Alvaro Herrera
Date:
Paul Lambert wrote:

> Is there any disk space advantages to using varchar over text?  Or will a
> text field only ever use up as much data as it needs.

1. no 2. yes.

--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Hackers share the surgeon's secret pleasure in poking about in gross innards,
the teenager's secret pleasure in popping zits."                 (Paul Graham)

Re: varchar(n) VS text

From
Paul Lambert
Date:
Michael Glaesemann wrote:
> Works for me:
>
> test=# select version();
>                                                                    version
>
----------------------------------------------------------------------------------------------------------------------------------------------

>
> PostgreSQL 8.2.4 on powerpc-apple-darwin8.9.0, compiled by GCC
> powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build
> 5367)
> (1 row)
>
> test=# create table tester (test_varchar varchar primary key);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
> "tester_pkey" for table "tester"
> CREATE TABLE
> test=# insert into tester (test_varchar) values ('abc''test');
> INSERT 0 1
> test=# select * from tester;
> test_varchar
> --------------
> abc'test
> (1 row)
>
> Michael Glaesemann
> grzm seespotcode net

Looks like my bad - I created the table initially through pgAdminIII and
  it appears I selected the wrong character varying from the dropdown list.

CREATE TABLE tester
(
   test_varchar character varying[],
   test_text text
)

If I change it to character varying(20) it works fine.

Apologies for that.

Thanks for the other info though.

--
Paul Lambert
Database Administrator
AutoLedgers

Re: varchar(n) VS text

From
Alvaro Herrera
Date:
Paul Lambert wrote:

> Looks like my bad - I created the table initially through pgAdminIII and
>  it appears I selected the wrong character varying from the dropdown list.
>
> CREATE TABLE tester
> (
>   test_varchar character varying[],
>   test_text text
> )
>
> If I change it to character varying(20) it works fine.

Yeah, what you chose is an array of varchar.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: varchar(n) VS text

From
Kev
Date:
On Jun 26, 12:16 am, t...@sss.pgh.pa.us (Tom Lane) wrote:
> "Pierre Thibaudeau" <pierd...@gmail.com> writes:
> > I am puzzling over this issue:
> > 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type?
>
> In words of one syllable: no.

If you have any input from untrusted sources (like in a web app)
wouldn't it be easier to attack the server if you had a text field,
like by sending a couple TB of data over that pgsql then needs to
store, on a server that doesn't have that amount of space?

Er...I guess the web server would then be the cap?  Or whatever other
tiers you had between the client and pgsql?

Kev


Re: varchar(n) VS text

From
Tom Lane
Date:
Kev <kevinjamesfield@gmail.com> writes:
> On Jun 26, 12:16 am, t...@sss.pgh.pa.us (Tom Lane) wrote:
>> "Pierre Thibaudeau" <pierd...@gmail.com> writes:
>>> I am puzzling over this issue:
>>> 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type?
>>
>> In words of one syllable: no.

> If you have any input from untrusted sources (like in a web app)
> wouldn't it be easier to attack the server if you had a text field,
> like by sending a couple TB of data over that pgsql then needs to
> store, on a server that doesn't have that amount of space?

Well, the hard limit on a text field (or any other field) is 1Gb,
so it'd not be quite as easy as that, even assuming that the webapp
doesn't fall over first.

            regards, tom lane