Re: LIKE search and performance - Mailing list pgsql-performance

From PFC
Subject Re: LIKE search and performance
Date
Msg-id op.tsuqhpzycigqcu@apollo13
Whole thread Raw
In response to Re: LIKE search and performance  (Mark Lewis <mark.lewis@mir3.com>)
List pgsql-performance
> PG could scan the index looking for matches first and only load the
> actual rows if it found a match, but that could only be a possible win
> if there were very few matches, because the difference in cost between a
> full index scan and a sequential scan would need to be greater than the
> cost of randomly fetching all of the matching data rows from the table
> to look up the visibility information.

    If you need to do that kind of thing, ie. seq scanning a table checking
only one column among a large table of many columns, then don't use an
index. An index, being a btree, needs to be traversed in order (or else, a
lot of locking problems come up) which means some random accesses.

    So, you could make a table, with 2 columns, updated via triggers : your
text field, and the primary key of your main table. Scanning that would be
faster.

    Still, a better solution for searching in text is :

    - tsearch2 if you need whole words
    - trigrams for any substring match
    - xapian for full text search with wildcards (ie. John* = Johnny)

    Speed-wise those three will beat any seq scan on a large table by a huge
margin.

pgsql-performance by date:

Previous
From: Craig James
Date:
Subject: Re: LIKE search and performance
Next
From: Richard Huxton
Date:
Subject: Re: LIKE search and performance