Home > mailing lists

Re: Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it - Mailing list pgsql-general

From	Jeff Janes
Subject	Re: Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it
Date	September 24, 2023 23:53:31
Msg-id	CAMkU=1x2VvNSH_wHprQhffjq9CQJmuADBCsXhn7d05v_jUXMzg@mail.gmail.com Whole thread Raw
In response to	Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it (Pavel Horal <pavel.horal@orchitech.cz>)
List	pgsql-general

Tree view

On Wed, May 24, 2023 at 4:35 PM Pavel Horal <pavel.horal@orchitech.cz> wrote:

I didn't see your email when first sent, and stumbled upon it while searching for something else. But it still might be worthwhile commenting even after all of this time.

Is my understanding correct that this happens only because pg_trgm is not able to actually determine if the matched item from the index search is actually much much longer than the query? Is there any way how the performance can be improved in this case? I thought that I can store number of trigrams in the index, but that is not being used by the query planner:

CREATE INDEX test_idx2 ON test USING GIN (value gin_trgm_ops, array_length(show_trgm(value), 1));

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM test WHERE value % 'lorem' AND array_length(show_trgm(value), 1) < array_length(show_trgm('lorem'), 1) / 0.5;

The main problem here is of expression type. You have an index using an expression returning an int, while you are comparing it to an expression returning a numeric. That inhibits the use of the index over that expression.

Just casting the type when creating the index is enough (given your test case) to get this to do what you want:

CREATE INDEX test_idx2 ON test USING GIN (value gin_trgm_ops, (array_length(show_trgm(value), 1)::numeric));

However, it would probably be more efficient to partition the table on the trigram count, rather than adding that count to the index. Then it could just skip any partition with too many trigrams.

Cheers,

Jeff

pgsql-general by date:

From: Laurenz Albe
Date: 24 September 2023, 18:18:58
Subject: Re: jdbc problem

From: Raivo Rebane
Date: 25 September 2023, 08:36:27
Subject: Right version of jdbc

Re: Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it - Mailing list pgsql-general

Previous

Next