Re: pg_trgm version 1.2 - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: pg_trgm version 1.2
Msg-id CAHyXU0wzXikiKEe6Ffrp=qXRq3+jA7q+LeFr0HZoi4XSu4A+BA@mail.gmail.com
In response to pg_trgm version 1.2  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: pg_trgm version 1.2  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-hackers
On Sat, Jun 27, 2015 at 5:17 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> This patch implements version 1.2 of contrib module pg_trgm.
>
> This supports the triconsistent function, introduced in version 9.4 of the
> server, to make it faster to implement indexed queries where some keys are
> common and some are rare.
>
> I've included the paths to both upgrade and downgrade between 1.1 and 1.2,
> although after doing so you must close and restart the session before you
> can be sure the change has taken effect. There is no change to the on-disk
> index structure.
>
> This shows the difference it can make in some cases:
>
> create extension pg_trgm version "1.1";
>
> create table foo as select
>   md5(random()::text) || case when random()<0.000005 then 'lmnop' else '123' end ||
>   md5(random()::text) as bar
> from generate_series(1,10000000);
>
> create index on foo using gin (bar gin_trgm_ops);
>
> --some queries
>
> alter extension pg_trgm update to "1.2";
>
> --close, reopen, more queries
>
> select count(*) from foo where bar like '%12344321lmnabcddd%';
>
> V1.1: Time: 1743.691 ms  --- after repeated execution to warm the cache
> V1.2: Time:  2.839 ms      --- after repeated execution to warm the cache

Wow!  I'm going to test this.  I have some data sets for which trigram
searching isn't really practical: if the search string touches
trigrams with a lot of duplication, the algorithm can have trouble
beating brute-force searches.
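
A rough way to check whether a search string falls into that category,
assuming the foo table from the quoted example above. The two substrings
are simply picked from that test data. Note that show_trgm pads whole
words, so the trigrams pg_trgm actually extracts for a LIKE pattern may
differ somewhat; treat this as a sketch, not a description of the index
internals:

```sql
-- Assumes: pg_trgm is installed and table foo(bar) exists as in the quoted example.

-- List the trigrams pg_trgm derives from the plain search string.
select show_trgm('12344321lmnabcddd');

-- Estimate how common two of the 3-character substrings are in the data.
-- This scans the whole table, so it is slow on 10 million rows.
select count(*) filter (where bar like '%123%') as rows_with_123,
       count(*) filter (where bar like '%lmn%') as rows_with_lmn
from foo;
```

If one count is close to the table size and the other is tiny, the search
string mixes common and rare trigrams, which is the case the patch
describes as its target.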

Trigram searching is important: it's currently the only way to quickly
search string-encoded structures for partial strings.

merlin


