pg_trgm performance - Mailing list pgsql-performance

From	Florian Weimer
Subject	pg_trgm performance
Date	January 15, 2007 06:16:59
Msg-id	82mz4k4mt7.fsf@mid.bfk.de
Responses	Re: pg_trgm performance
List	pgsql-performance

I've got a table with a few million rows, consisting of a single text
column.  The average length is about 17 characters.  As an experiment,
I put a trigram (pg_trgm) index on that column.  Unfortunately, queries
using the % operator are ridiculously slow unless they have a smallish
LIMIT (they take longer than an hour).  A full table scan with a
"WHERE similarity(...) >= 0.4" clause completes in just a couple of
minutes.  The queries select only a few hundred rows, so an index scan
ought to have a real chance of beating a sequential scan.
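
For readers unfamiliar with what similarity() measures, here is a rough
Python sketch of trigram similarity.  It is only an approximation based
on the documented pg_trgm behaviour (lowercase the input, split it into
words, pad each word with two leading spaces and one trailing space,
then compare the sets of three-character substrings).  The function
names are made up for illustration, and the ASCII-only word splitting
is a simplifying assumption, not what the module does internally.

```python
import re


def trigrams(text):
    """Return the set of trigrams for text (approximation of pg_trgm)."""
    grams = set()
    # Assumption: words are runs of ASCII letters and digits; the real
    # module depends on the database locale.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Two blanks in front and one behind, as the pg_trgm docs describe.
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by distinct trigrams across both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(sorted(trigrams("cat")))
    print(similarity("word", "two words"))
```

With strings of about 17 characters, each row contributes roughly 20
trigrams under this scheme, which gives a feel for how much data the
index has to cover per row.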

Am I missing something?  Or are trigrams just a poor match for my data
set?  Are the individual strings too long, maybe?

(This is with PostgreSQL 8.2.0, BTW.)

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99
