Thread: best practices with index on varchar column
Hi everyone,
I hope it is the correct newsletter for this question.
Can I use an index on a varchar column to optimize the SELECT queries that use " column LIKE 'header%' "?
If yes what is the best tree algotithm to use ?
I don't care about optimising INSERT, DELETE and UPDATE queries, as they are only done at night when the load is very low.
Thank you very much for any help,
Benjamin Layet
On Tue, 22 Mar 2005 18:22:24 +0900, Layet Benjamin <benjamin@moonfactory.co.jp> wrote: > Can I use an index on a varchar column to optimize the SELECT queries that > use " column LIKE 'header%' "? > If yes what is the best tree algotithm to use ? Yes, that is the correct place. The best tree algorithm is B-Tree, which is the default. So no need for giving 'USING ...' to CREATE INDEX. The other types of indexes are either not trees (HASH), different and more complex (GiST, RTREE) kinds of trees which are there for different kinds of data (spatial, full text, etc). Remember to VACUUM ANALYZE this table from time to time, so the planner can judge efficiently whether to use this new index or not. Use EXPLAIN ANALYZE SELECT .... to see whether the index is really used. > I don't care about optimising INSERT, DELETE and UPDATE queries, as they are > only done at night when the load is very low. > Thank you very much for any help, Oh, they can benefit from the index anyhow. :) Regards, Dawid
> Can I use an index on a varchar column to optimize the SELECT queries > that > use " column LIKE 'header%' "? Yes > If yes what is the best tree algotithm to use ? Btree Note that if you want case insensitive matching you need to make an index on lower(column) and SELECT WHERE lower(column) LIKE 'header%' Locales may bite you.
PFC <lists@boutiquenumerique.com> writes: >> Can I use an index on a varchar column to optimize the SELECT queries >> that use " column LIKE 'header%' "? > Yes > Note that if you want case insensitive matching you need to make an index > on lower(column) and SELECT WHERE lower(column) LIKE 'header%' > Locales may bite you. Yes. If your database locale is not "C" then the default btree index behavior does not match up with what LIKE needs. In that case you need a special index using the appropriate "pattern_ops" opclass, eg CREATE INDEX test_index ON test_table (col varchar_pattern_ops); or if you want case insensitive matching CREATE INDEX test_index ON test_table (lower(col) varchar_pattern_ops); and then write the queries with lower() as PFC illustrates. *Don't* use ILIKE --- it basically can't use indexes at all. For more info see http://www.postgresql.org/docs/8.0/static/indexes-opclass.html regards, tom lane
I have an experience using LIKE in a VARCHAR column and select statement suffers a lot so I decided to go back in CHAR Note: my database has about 50 millions records a b tree index > Can I use an index on a varchar column to optimize the SELECT queries > that > use " column LIKE 'header%' "? Yes > If yes what is the best tree algotithm to use ? Btree Note that if you want case insensitive matching you need to make an index on lower(column) and SELECT WHERE lower(column) LIKE 'header%' Locales may bite you. ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org __________ NOD32 1.1023 (20050310) Information __________ This message was checked by NOD32 Antivirus System. http://www.nod32.com
On Wed, 23 Mar 2005 12:11:56 +0800, Michael Ryan S. Puncia <mpuncia@census.gov.ph> wrote: > > I have an experience using LIKE in a VARCHAR column and select statement > suffers a lot so I decided to go back in CHAR > > Note: my database has about 50 millions records a b tree index Strange... According to the PostgreSQL's documentation: Tip: There are no performance differences between these three types, apart from the increased storage size when using the blank-padded type. While character(n) has performance advantages in some other database systems, it has no such advantages in PostgreSQL. In most situations text or character varying should be used instead. To my best knowledge char and varchar are stored in a same way (4-byte length plus textual value), so using char should make tables bigger in your case. Then again, having each row exactly the same size makes it easier to delete and then later insert a new row in a same spot. Am I thinking correct? Is it a case where using char(n) makes that table avoid hmm fragmentation of some sort? Regards, Dawid
Dawid Kuroczko wrote: > On Wed, 23 Mar 2005 12:11:56 +0800, Michael Ryan S. Puncia > <mpuncia@census.gov.ph> wrote: > >>I have an experience using LIKE in a VARCHAR column and select statement >>suffers a lot so I decided to go back in CHAR > According to the PostgreSQL's documentation: > > Tip: There are no performance differences between these three types, > apart from the increased storage size when using the blank-padded type. > While character(n) has performance advantages in some other database > systems, it has no such advantages in PostgreSQL. In most situations text > or character varying should be used instead. > > > To my best knowledge char and varchar are stored in a same way > (4-byte length plus textual value), so using char should make tables > bigger in your case. Then again, having each row exactly the same > size makes it easier to delete and then later insert a new row in > a same spot. Am I thinking correct? Is it a case where using char(n) > makes that table avoid hmm fragmentation of some sort? There aren't any noticeable differences between char and varchar. MVCC doesn't overwrite rows anyway, so static size is irrelevant. In any case, PG's toast setup splits out large text fields and compresses them - so it's not that simple. -- Richard Huxton Archonet Ltd