Thread: seeking advise on char vs text or varchar in search table
I have a table that contains a column for keywords that I expect to become quite large and will be used for web searches. I will either index the column or come up with a simple hashing algorithm add the hash key to the table and index that column. I am thinking the max length in the keyword column I need to support is 30, but the average would be less than10 Any suggestions on whether to use char(30), varchar(30) or text, would be appreciated. I am looking for the best performance option, not necessarily the most economical on disk. Or any other suggestions would be greatly appreciated. -- View this message in context: http://www.nabble.com/seeking-advise-on-char-vs-text-or-varchar-in-search-table-tf3618204.html#a10103002 Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
On 4/20/07, chrisj <chrisj.wood@sympatico.ca> wrote: > > I have a table that contains a column for keywords that I expect to become > quite large and will be used for web searches. I will either index the > column or come up with a simple hashing algorithm add the hash key to the > table and index that column. > > I am thinking the max length in the keyword column I need to support is 30, > but the average would be less than10 > > Any suggestions on whether to use char(30), varchar(30) or text, would be > appreciated. I am looking for the best performance option, not necessarily > the most economical on disk. Don't use char...it pads out the string to the length always. It also has no real advantage over varchar in any practical situation. Think of varchar as text with a maximum length...its no faster or slower but the database will throw out entries based on length (which can be good or a bad thing)...in this case, text feels better. Have you looked at tsearch2, gist, etc? merlin
On Apr 23, 2007, at 7:16 AM, Merlin Moncure wrote: > On 4/20/07, chrisj <chrisj.wood@sympatico.ca> wrote: >> >> I have a table that contains a column for keywords that I expect >> to become >> quite large and will be used for web searches. I will either >> index the >> column or come up with a simple hashing algorithm add the hash key >> to the >> table and index that column. >> >> I am thinking the max length in the keyword column I need to >> support is 30, >> but the average would be less than10 >> >> Any suggestions on whether to use char(30), varchar(30) or text, >> would be >> appreciated. I am looking for the best performance option, not >> necessarily >> the most economical on disk. > > Don't use char...it pads out the string to the length always. It > also has no real advantage over varchar in any practical situation. > Think of varchar as text with a maximum length...its no faster or > slower but the database will throw out entries based on length (which > can be good or a bad thing)...in this case, text feels better. AIUI, char, varchar and text all store their data in *exactly* the same way in the database; char only pads data on output, and in the actual tables it still contains the regular varlena header. The only reason I've ever used char in other databases is to save the overhead of the variable-length information, so I recommend to people to just steer clear of char in PostgreSQL. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
"Jim Nasby" <decibel@decibel.org> writes: > AIUI, char, varchar and text all store their data in *exactly* the same way in > the database; char only pads data on output, and in the actual tables it still > contains the regular varlena header. The only reason I've ever used char in > other databases is to save the overhead of the variable-length information, so > I recommend to people to just steer clear of char in PostgreSQL. Everything you said is correct except that char actually pads its data on input, not output. This doesn't actually make a lot of sense since we're storing it as a varlena so we could pad it on output and modify the data type functions to pretend the spaces are there without storing them. However it would only make a difference if you're storing variable length data in a char field in which case I would 100% agree with your conclusion and strongly recommend using varchar. The only reason I would think of using char is when the data should always be the same length, like a SSN or md5hash or something like that. In which case it's purely for the self-documenting notational convenience, not any performance reason. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com