Thread: CHAR or VARCHAR
two questions. When should I use one, and when the other? Which is the limit on CHAR(n) and VARCHAR(n)? Saludos... :-) -- System Administration: It's a dirty job, but someone told me I had to do it. ----------------------------------------------------------------- Martín Marqués email: martin@math.unl.edu.ar Santa Fe - Argentina http://math.unl.edu.ar/~martin/ Administrador de sistemas en math.unl.edu.ar -----------------------------------------------------------------
On 21 Mar 2001, at 18:58, Martin A. Marques wrote: > two questions. > When should I use one, and when the other? > Which is the limit on CHAR(n) and VARCHAR(n)? Okay, here is my more "let's get this thing working" as opposed to "after dedicated study of the matter" opinion (which I hope some dedicated studier might correct or confirm or extend) (and, ps, I do 99% of dev on mysql): I live under the assumption that indexes on CHAR will be "faster" but that CHAR may vary well consume more disk space as I believe it pads data to fit the size you created for it. This page does not help much: http://www.postgresql.org/docs/user/datatype1066.htm This page is interesting: http://www.postgresql.org/docs/user/sql-createindex.htm > Tip: Indexes are primarily used to enhance database performance. But > inappropriate use will result in slower performance. I rarely index a table if there are many inserts/updates. So char vs. varchar .... Peter --------------------------- "Reality is that which, when you stop believing in it, doesn't go away". -- Philip K. Dick
Martin, > two questions. > When should I use one, and when the other? IMHO, CHAR serves three purposes: 1. It is a good data type for fixed-length abbreviations that will never be more or less than the number of characters intended. US state codes, for example, which are always exactly two characters (Unless you count Puerto Rico and the Virgin Islands!) 2. SOmetimes you have a field which is *almost* always the same exact length, and needs to be used extensively in JOIN operations. In that case, the 2-byte savings of the CHAR data-type over VARCHAR may make a difference. 3. Converting several VARCHAR columns to CHAR and then concatinating them (||) will give you a quick-and-easy fixed-width multi-column format for list boxes and the like. Otherwise, always use VARCHAR. You won't need to worry about trailing spaces or fields being equal to a bunch of blanks (' '). > Which is the limit on CHAR(n) and VARCHAR(n)? Limit? There's a limit? Ask Stephan. I'm pretty sure it varies by distribution, but is somewhere in the thousands. -Josh Berkus -- ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 565-7293 for law firms, small businesses fax 621-2533 and non-profit organizations. San Francisco
On Wed, 21 Mar 2001, Martin A. Marques wrote: > two questions. > When should I use one, and when the other? Almost always varchar() unless you know that your input is of constant size. > Which is the limit on CHAR(n) and VARCHAR(n)? In 7.0.x and earlier, there's an row limit for all values in a row which defaulted to 8k (just below technically i believe) but could be raised at compile time to 32k. All fields in the row had to fit in that 8-32k. So you'd have to figure the limit out based on the other fields. In 7.1(currently in beta), long values can be stored outside that so you have more of a number of columns limit rather than a per value limit. Technically, it's probably still not a great idea to be sticking megs in there for performance reasons. [I seem to remember at least on older versions there was a further limit on things that were indexed, but I don't remember the issues with that.]
El Mié 21 Mar 2001 22:21, Stephan Szabo escribió: > On Wed, 21 Mar 2001, Martin A. Marques wrote: > > two questions. > > When should I use one, and when the other? > > Almost always varchar() unless you know that your > input is of constant size. How does performace look when doing lots of searches on a VARCHAR column with respect of a CHAR column? That is my main concern. I mean, what if I have to do lots of queries with LIKEs on that column? > > Which is the limit on CHAR(n) and VARCHAR(n)? > > In 7.0.x and earlier, there's an row limit > for all values in a row which defaulted to 8k (just > below technically i believe) but could be raised > at compile time to 32k. > All fields in the row had to fit in that 8-32k. > So you'd have to figure the limit out based on the > other fields. > > In 7.1(currently in beta), long values can be stored > outside that so you have more of a number of columns > limit rather than a per value limit. Technically, > it's probably still not a great idea to be sticking > megs in there for performance reasons. I can't believe that there is no limit. You mean that I put this: VARCHAR(1000000000000000000000000000000000) ? Saludos... :-) -- System Administration: It's a dirty job, but someone told me I had to do it. ----------------------------------------------------------------- Martín Marqués email: martin@math.unl.edu.ar Santa Fe - Argentina http://math.unl.edu.ar/~martin/ Administrador de sistemas en math.unl.edu.ar -----------------------------------------------------------------
On Thu, Mar 22, 2001 at 08:05:19AM -0300, Martin A. Marques wrote: > > > > In 7.1(currently in beta), long values can be stored > > outside that so you have more of a number of columns > > limit rather than a per value limit. Technically, > > it's probably still not a great idea to be sticking > > megs in there for performance reasons. > > I can't believe that there is no limit. You mean that I put this: > > VARCHAR(1000000000000000000000000000000000) ...length for type 'varchar' cannot exceed 10485760.But "no limit" means 'text'. Karel -- Karel Zak <zakkr@zf.jcu.cz>http://home.zf.jcu.cz/~zakkr/C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
"Martin A. Marques" <martin@math.unl.edu.ar> writes: > How does performace look when doing lots of searches on a VARCHAR > column with respect of a CHAR column? That is my main concern. There is *no* performance advantage of CHAR(n) over VARCHAR(n). If anything, there is a performance lossage due to extra disk I/O (because all those padding blanks take space, and time to read). My advice is to use CHAR(n) when that semantically describes your data (ie, truly fixed-width data, like US postal codes), or VARCHAR(n) when that semantically describes your data (ie, variable-width with a hard upper bound), or TEXT when that semantically describes your data (ie, variable width with no specific upper bound). Worrying about performance differences is a waste of time, because there aren't any. regards, tom lane
On Thu, 22 Mar 2001, Karel Zak wrote: > On Thu, Mar 22, 2001 at 08:05:19AM -0300, Martin A. Marques wrote: > > > > > > In 7.1(currently in beta), long values can be stored > > > outside that so you have more of a number of columns > > > limit rather than a per value limit. Technically, > > > it's probably still not a great idea to be sticking > > > megs in there for performance reasons. > > > > I can't believe that there is no limit. You mean that I put this: > > > > VARCHAR(1000000000000000000000000000000000) > > ...length for type 'varchar' cannot exceed 10485760. > > But "no limit" means 'text'. True, but in a practical sense, even 10M or so is probably pushing your luck unless you've got alot of ram.
On 22 Mar 2001, at 10:05, Tom Lane wrote: > There is *no* performance advantage of CHAR(n) over VARCHAR(n). > If anything, there is a performance lossage due to extra disk I/O > (because all those padding blanks take space, and time to read). > > My advice is to use CHAR(n) when that semantically describes your data > (ie, truly fixed-width data, like US postal codes), or VARCHAR(n) when > that semantically describes your data (ie, variable-width with a hard > upper bound), or TEXT when that semantically describes your data (ie, > variable width with no specific upper bound). Worrying about > performance differences is a waste of time, because there aren't any. I wonder if this question of char/varchar is postgresql specific or rdbms in general. In any case I did some looking around the web and in the end I think the best advice is what Tom says above. Here are some pertinent things I found after entering (Index Performance Char vs Varchar) in google. http://www.swynk.com/friends/mcgehee/developers_tuning_tutorial. asp > If the text data in a column varies greatly in length, use a VARCHAR > data type instead of a CHAR data type. Although the VARCHAR data type > has slightly more overhead than the CHAR data type, the amount of > space saved by using VARCHAR over CHAR on variable length columns can > reduce I/O, improving overall SQL Server performance. Forgot URL: > Several other people at the session who were familiar with the > performance effects of using char vs. varchar confirmed my advice. One > person said his team was charged with deploying an application that > used SQL Server. After deploying the application, the team found that > it performed terribly. Upon inspecting the database, team members > discovered that all the fields were varchar. They changed the fields > to char, and the application now performs fine. Forgot URL: > Here's the advice from IBM in from the DB2 Admin guide SC96-9003) > Copyright IBM Corp. 1982, 1999 " Choosing CHAR or VARCHAR: VARCHAR > saves DASD space, but costs a 2-byte > overhead for each value and the additional processing required for > varying-length records. Thus, CHAR is preferable to VARCHAR, > unless the space saved by the use of VARCHAR is significant. The > savings are not significant if the maximum length is small or the > lengths of the values do not have a significant variation. In > general, do not define a column as VARCHAR(n) unless n is at least > 18. (Consider, also, using data compression if your main > concern is DASD savings. See "Compressing data in a table space or > partition" in topic 2.6.2 for more information.) > > If you use VARCHAR, do not specify a maximum length that is greater > than necessary. Although VARCHAR saves space in a table space, it > does not save space in an index, because index records are padded with > blanks to the maximum length. Note particularly the restrictions on > columns of strings longer than 255 bytes; for example, they cannot be > indexed. These restrictions are listed in Chapter 3 of DB2 SQL > Reference." > > David Seibert > Compuware Corporation File-AID product planner > Dave_Seibert@Compuware.com > > --------------------------- "Reality is that which, when you stop believing in it, doesn't go away". -- Philip K. Dick
"Peter J. Schoenster" <peter@schoenster.com> writes: > On 22 Mar 2001, at 10:05, Tom Lane wrote: >> There is *no* performance advantage of CHAR(n) over VARCHAR(n). > I wonder if this question of char/varchar is postgresql specific or > rdbms in general. It's definitely RDBMS-specific. My comment applied to Postgres, which stores CHAR(n) and VARCHAR(n) in essentially the same fashion --- it doesn't really exploit the fact that CHAR(n) is fixed-size. (Mainly because it's *not* fixed size in PG, what with TOAST, multibyte, etc.) On other DBMSes there could be a difference, especially if the DBMS has performance problems with variable-length fields. regards, tom lane
Folks - > On other DBMSes there could be a difference, especially if the DBMS > has > performance problems with variable-length fields. For example ... MS-SQL Server 7.0 requires 2 extra bytes to store the length of a string for VARCHAR. Thus, CHAR(3) takes 3 bytes (roughly) and VARCHAR(3) takes 5. Thus, if you knew that 90% of a particular field's entries were going to be the same length, you could save some storage space, and some processing power on scans and joins, by using CHAR instead of VARCHAR. (FYI, SQL Server 7 still has an 8K row limit; I'm not sure about SQL Server 2000). Now, Tom, at what point do character values begin to be stored "outside the table"? This is something I want to avoid, having had some bad experiences with pointers and DB corruption (although not with PGSQL). -Josh Berkus ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 565-7293 for law firms, small businesses fax 621-2533 and non-profit organizations. San Francisco