Thread: varchar(n) VS text
Having read http://www.postgresql.org/docs/8.2/interactive/datatype-character.html I am puzzling over this issue: 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type? 2) For instance, if I know that a character-type column will never contain more than 300 characters, would I be better to define it as "varchar(300)" or as "text"? 3) What if, in practice that same column usually contains strings of no more than 10 characters (although I can't guarantee that it wouldn't explode up to 300 in the one-in-a-million case)? 4) On the other hand, what if a column systematically contains strings of 5 or fewer characters. Is it better to define it as "varchar(5)" or as "text"? From my reading of the dataype documentation, the ONLY reason I can think of for using "varchar(n)" would be in order to add an extra data-type constraint to the column.
On Mon, 25 Jun 2007, Pierre Thibaudeau wrote: > From my reading of the dataype documentation, the ONLY reason I can > think of for using "varchar(n)" would be in order to add an extra > data-type constraint to the column. That's my understanding as well. I can think of a few reasons to use char(n) over text if you know exactly how many characters you will always have, but char isn't what you asked about.
"Pierre Thibaudeau" <pierdeux@gmail.com> writes: > I am puzzling over this issue: > 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type? In words of one syllable: no. Not unless you have an application requirement for a specific maximum length limit (eg, your client code will crash if fed a string longer than 256 bytes, or there's a genuine data-validity constraint that you can enforce this way). Or if you want to have schema-level portability to some other DB that understands varchar(N) but not text. (varchar(N) is SQL-standard, while text isn't, so I'm sure there are some such out there.) > From my reading of the dataype documentation, the ONLY reason I can > think of for using "varchar(n)" would be in order to add an extra > data-type constraint to the column. That is *exactly* what it does. No more and no less. There's no performance advantage, in fact you can expect to lose a few cycles to the constraint check. regards, tom lane
Tom Lane wrote: > "Pierre Thibaudeau" <pierdeux@gmail.com> writes: >> I am puzzling over this issue: > >> 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type? > > In words of one syllable: no. > > Not unless you have an application requirement for a specific maximum > length limit (eg, your client code will crash if fed a string longer > than 256 bytes, or there's a genuine data-validity constraint that you > can enforce this way). > > Or if you want to have schema-level portability to some other DB that > understands varchar(N) but not text. (varchar(N) is SQL-standard, > while text isn't, so I'm sure there are some such out there.) > >> From my reading of the dataype documentation, the ONLY reason I can >> think of for using "varchar(n)" would be in order to add an extra >> data-type constraint to the column. > > That is *exactly* what it does. No more and no less. There's no > performance advantage, in fact you can expect to lose a few cycles > to the constraint check. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org/ > > Is there any disk space advantages to using varchar over text? Or will a text field only ever use up as much data as it needs. I have a database where pretty much all text-type fields are created as varchars - I inherited this db from an MS SQL server and left them as varchar when I converted the database over to PG. My thoughts were text being a non-constrained data type may use up more disk space than a varchar and if I know there will never be more than 3 characters in the field for example, I could save some space by only creating a 3 length field. In my case, any field length restrictions are governed by the application so I don't really need the constraint built into the back end. If there is a slight performance disadvantage to using varchar and no real disk space saving - and I have in some cases 40 or 50 of these fields in a table - then would it be better for me to convert these fields to text?. Not to mention that I run into a problem occasionally where inputting a string that contains an apostraphe - PG behaves differently if it is a varchar to if it is a text type and my app occasionally fails. I.e. insert into tester (test_varchar) values ('abc''test'); I get the following: ERROR: array value must start with "{" or dimension information SQL state: 22P02 If I use the same command but inserting into a text-type field. insert into tester (test_text) values ('abc''test'); It works fine. But that's beside the point - my question is should I convert everything to text fields and, if so, is there any easy way of writting a script to change all varchar fields to text? -- Paul Lambert Database Administrator AutoLedgers
On Jun 27, 2007, at 19:38 , Paul Lambert wrote: > Is there any disk space advantages to using varchar over text? No. > Or will a text field only ever use up as much data as it needs. Yes. From http://www.postgresql.org/docs/8.2/interactive/datatype- character.html > The storage requirement for data of these types is 4 bytes plus the > actual string, and in case of character plus the padding. Long > strings are compressed by the system automatically, so the physical > requirement on disk may be less. Long values are also stored in > background tables so they do not interfere with rapid access to the > shorter column values. In any case, the longest possible character > string that can be stored is about 1 GB. (The maximum value that > will be allowed for n in the data type declaration is less than > that. It wouldn't be very useful to change this because with > multibyte character encodings the number of characters and bytes > can be quite different anyway. If you desire to store long strings > with no specific upper limit, use text or character varying without > a length specifier, rather than making up an arbitrary length limit.) > > Tip: There are no performance differences between these three > types, apart from the increased storage size when using the blank- > padded type. While character(n) has performance advantages in some > other database systems, it has no such advantages in PostgreSQL. In > most situations text or character varying should be used instead. > then would it be better for me to convert these fields to text?. Probably not. See above. > Not to mention that I run into a problem occasionally where > inputting a string that contains an apostraphe - PG behaves > differently if it is a varchar to if it is a text type and my app > occasionally fails. > > I.e. > insert into tester (test_varchar) values ('abc''test'); > I get the following: > ERROR: array value must start with "{" or dimension information > SQL state: 22P02 Works for me: test=# select version(); version ------------------------------------------------------------------------ ---------------------------------------------------------------------- PostgreSQL 8.2.4 on powerpc-apple-darwin8.9.0, compiled by GCC powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build 5367) (1 row) test=# create table tester (test_varchar varchar primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "tester_pkey" for table "tester" CREATE TABLE test=# insert into tester (test_varchar) values ('abc''test'); INSERT 0 1 test=# select * from tester; test_varchar -------------- abc'test (1 row) > But that's beside the point - my question is should I convert > everything to text fields and, if so, is there any easy way of > writting a script to change all varchar fields to text? It's probably not worth the effort, but if you're interested you could query the system catalogs for varchar columns and write a script that would update them for you. Michael Glaesemann grzm seespotcode net
Paul Lambert wrote: > Is there any disk space advantages to using varchar over text? Or will a > text field only ever use up as much data as it needs. 1. no 2. yes. -- Alvaro Herrera Valdivia, Chile ICBM: S 39º 49' 18.1", W 73º 13' 56.4" "Hackers share the surgeon's secret pleasure in poking about in gross innards, the teenager's secret pleasure in popping zits." (Paul Graham)
Michael Glaesemann wrote: > Works for me: > > test=# select version(); > version > ---------------------------------------------------------------------------------------------------------------------------------------------- > > PostgreSQL 8.2.4 on powerpc-apple-darwin8.9.0, compiled by GCC > powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build > 5367) > (1 row) > > test=# create table tester (test_varchar varchar primary key); > NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index > "tester_pkey" for table "tester" > CREATE TABLE > test=# insert into tester (test_varchar) values ('abc''test'); > INSERT 0 1 > test=# select * from tester; > test_varchar > -------------- > abc'test > (1 row) > > Michael Glaesemann > grzm seespotcode net Looks like my bad - I created the table initially through pgAdminIII and it appears I selected the wrong character varying from the dropdown list. CREATE TABLE tester ( test_varchar character varying[], test_text text ) If I change it to character varying(20) it works fine. Apologies for that. Thanks for the other info though. -- Paul Lambert Database Administrator AutoLedgers
Paul Lambert wrote: > Looks like my bad - I created the table initially through pgAdminIII and > it appears I selected the wrong character varying from the dropdown list. > > CREATE TABLE tester > ( > test_varchar character varying[], > test_text text > ) > > If I change it to character varying(20) it works fine. Yeah, what you chose is an array of varchar. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Jun 26, 12:16 am, t...@sss.pgh.pa.us (Tom Lane) wrote: > "Pierre Thibaudeau" <pierd...@gmail.com> writes: > > I am puzzling over this issue: > > 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type? > > In words of one syllable: no. If you have any input from untrusted sources (like in a web app) wouldn't it be easier to attack the server if you had a text field, like by sending a couple TB of data over that pgsql then needs to store, on a server that doesn't have that amount of space? Er...I guess the web server would then be the cap? Or whatever other tiers you had between the client and pgsql? Kev
Kev <kevinjamesfield@gmail.com> writes: > On Jun 26, 12:16 am, t...@sss.pgh.pa.us (Tom Lane) wrote: >> "Pierre Thibaudeau" <pierd...@gmail.com> writes: >>> I am puzzling over this issue: >>> 1) Is there ever ANY reason to prefer "varchar(n)" to "text" as a column type? >> >> In words of one syllable: no. > If you have any input from untrusted sources (like in a web app) > wouldn't it be easier to attack the server if you had a text field, > like by sending a couple TB of data over that pgsql then needs to > store, on a server that doesn't have that amount of space? Well, the hard limit on a text field (or any other field) is 1Gb, so it'd not be quite as easy as that, even assuming that the webapp doesn't fall over first. regards, tom lane