Home > mailing lists

Thread: best practices with index on varchar column

best practices with index on varchar column

From

"Layet Benjamin"

Date:

22 March 2005, 09:23:06

Hi everyone,

I hope it is the correct newsletter for this question.

Can I use an index on a varchar column to optimize the SELECT queries that use " column LIKE 'header%' "?

If yes what is the best tree algotithm to use ?

I don't care about optimising INSERT, DELETE and UPDATE queries, as they are only done at night when the load is very low.

Thank you very much for any help,

Benjamin Layet

Re: best practices with index on varchar column

From

Dawid Kuroczko

Date:

22 March 2005, 09:49:42

On Tue, 22 Mar 2005 18:22:24 +0900, Layet Benjamin
<benjamin@moonfactory.co.jp> wrote:
> Can I use an index on a varchar column to optimize the SELECT queries that
> use " column LIKE 'header%'  "?
> If yes what is the best tree algotithm to use ?

Yes, that is the correct place.  The best tree algorithm is B-Tree,
which is the default.  So no need for giving 'USING ...' to CREATE INDEX.

The other types of indexes are either not trees (HASH), different
and more complex (GiST, RTREE) kinds of trees which are there
for different kinds of data (spatial, full text, etc).

Remember to VACUUM ANALYZE this table from time to time,
so the planner can judge efficiently whether to use this new
index or not.

Use EXPLAIN ANALYZE SELECT .... to see whether the index
is really used.

> I don't care about optimising INSERT, DELETE and UPDATE queries, as they are
> only done at night when the load is very low.
> Thank you very much for any help,

Oh, they can benefit from the index anyhow. :)

  Regards,
     Dawid

Re: best practices with index on varchar column

From

PFC

Date:

22 March 2005, 10:49:47

> Can I use an index on a varchar column to optimize the SELECT queries
> that
> use " column LIKE 'header%'  "?

    Yes

> If yes what is the best tree algotithm to use ?

    Btree

    Note that if you want case insensitive matching you need to make an index
on lower(column) and SELECT WHERE lower(column) LIKE 'header%'

    Locales may bite you.

Re: best practices with index on varchar column

From

Tom Lane

Date:

22 March 2005, 18:08:33

PFC <lists@boutiquenumerique.com> writes:
>> Can I use an index on a varchar column to optimize the SELECT queries
>> that use " column LIKE 'header%'  "?

>     Yes

>     Note that if you want case insensitive matching you need to make an index
> on lower(column) and SELECT WHERE lower(column) LIKE 'header%'

>     Locales may bite you.

Yes.  If your database locale is not "C" then the default btree index
behavior does not match up with what LIKE needs.  In that case you need
a special index using the appropriate "pattern_ops" opclass, eg

CREATE INDEX test_index ON test_table (col varchar_pattern_ops);

or if you want case insensitive matching

CREATE INDEX test_index ON test_table (lower(col) varchar_pattern_ops);

and then write the queries with lower() as PFC illustrates.  *Don't* use
ILIKE --- it basically can't use indexes at all.

For more info see
http://www.postgresql.org/docs/8.0/static/indexes-opclass.html

            regards, tom lane

Re: best practices with index on varchar column

From

"Michael Ryan S. Puncia"

Date:

23 March 2005, 09:16:32

I have an experience using LIKE in a VARCHAR column and select statement
suffers a lot so I decided to go back in CHAR

Note: my database has about 50 millions records a b tree index



> Can I use an index on a varchar column to optimize the SELECT queries
> that
> use " column LIKE 'header%'  "?

    Yes

> If yes what is the best tree algotithm to use ?

    Btree

    Note that if you want case insensitive matching you need to make an
index
on lower(column) and SELECT WHERE lower(column) LIKE 'header%'

    Locales may bite you.

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

__________ NOD32 1.1023 (20050310) Information __________

This message was checked by NOD32 Antivirus System.
http://www.nod32.com

Re: best practices with index on varchar column

From

Dawid Kuroczko

Date:

23 March 2005, 09:35:57

On Wed, 23 Mar 2005 12:11:56 +0800, Michael Ryan S. Puncia
<mpuncia@census.gov.ph> wrote:
>
> I have an experience using LIKE in a VARCHAR column and select statement
> suffers a lot so I decided to go back in CHAR
>
> Note: my database has about 50 millions records a b tree index

Strange...

According to the PostgreSQL's documentation:

     Tip: There are no performance differences between these three types,
apart from the increased storage size when using the blank-padded type.
While character(n) has performance advantages in some other database
systems, it has no such advantages in PostgreSQL. In most situations text
or character varying should be used instead.

To my best knowledge char and varchar are stored in a same way
(4-byte length plus textual value), so using char should make tables
bigger in your case.  Then again, having each row exactly the same
size makes it easier to delete and then later insert a new row in
a same spot.  Am I thinking correct?  Is it a case where using char(n)
makes that table avoid hmm fragmentation of some sort?

   Regards,
      Dawid

Re: best practices with index on varchar column

From

Richard Huxton

Date:

23 March 2005, 11:31:39

Dawid Kuroczko wrote:
> On Wed, 23 Mar 2005 12:11:56 +0800, Michael Ryan S. Puncia
> <mpuncia@census.gov.ph> wrote:
>
>>I have an experience using LIKE in a VARCHAR column and select statement
>>suffers a lot so I decided to go back in CHAR

> According to the PostgreSQL's documentation:
>
>      Tip: There are no performance differences between these three types,
> apart from the increased storage size when using the blank-padded type.
> While character(n) has performance advantages in some other database
> systems, it has no such advantages in PostgreSQL. In most situations text
> or character varying should be used instead.
>
>
> To my best knowledge char and varchar are stored in a same way
> (4-byte length plus textual value), so using char should make tables
> bigger in your case.  Then again, having each row exactly the same
> size makes it easier to delete and then later insert a new row in
> a same spot.  Am I thinking correct?  Is it a case where using char(n)
> makes that table avoid hmm fragmentation of some sort?

There aren't any noticeable differences between char and varchar. MVCC
doesn't overwrite rows anyway, so static size is irrelevant. In any
case, PG's toast setup splits out large text fields and compresses them
- so it's not that simple.

--
   Richard Huxton
   Archonet Ltd