Home > mailing lists

How to Optimize pg_trgm Performance - Mailing list pgsql-general

From	Igal @ Lucee.org
Subject	How to Optimize pg_trgm Performance
Date	January 29, 2018 02:42:17
Msg-id	0676f27e-118b-892b-f44d-c60ab13eb37c@lucee.org Whole thread Raw
Responses	Re: How to Optimize pg_trgm Performance
List	pgsql-general

Tree view

I want to use pg_trgm for auto-suggest functionality. I created a Materialized View with the information that I need, with the relevant columns being (keywords text, rank int). keywords is the column from which I build the tri-grams, and rank is some popularity factor so that popular results will show up higher than less popular results given the same tri-gram distance.

I want to return results in the order of [distance], [distance_word], [rank]. The input comes from the user and is not known in advance. My query is as follows:

SELECT title ,id ,(input <-> keywords) AS distance ,(input <<-> keywords) AS distance_word ,rank FROM (VALUES (cast('red pill' AS text))) consts(input) ,mv_autosuggest ORDER BY 3, 4, 5 LIMIT 20;

This gives me pretty good results, but it takes too long and is not likely to scale well.

I have created two indexes but neither seem to be used:

CREATE INDEX mv_autosuggest_keywords_tgrm_gist ON staging.mv_autosuggest USING gist (keywords gist_trgm_ops);CREATE INDEX mv_autosuggest_keywords_tgrm_gin ON staging.mv_autosuggest USING gin (keywords gin_trgm_ops);

This is the result of explain analyze:

QUERY PLAN | -------------------------------------------------------------------------------------------------------------------------------------------| Limit (cost=356.41..356.46 rows=20 width=51) (actual time=163.132..163.135 rows=20 loops=1) | -> Sort (cost=356.41..372.96 rows=6619 width=51) (actual time=163.130..163.131 rows=20 loops=1) | Sort Key: (('red pill'::text <-> mv_autosuggest.keywords)), (('red pill'::text <<-> mv_autosuggest.keywords)), mv_autosuggest.rank | Sort Method: top-N heapsort Memory: 28kB | -> Seq Scan on mv_autosuggest (cost=0.00..180.29 rows=6619 width=51) (actual time=0.263..161.289 rows=6619 loops=1) | Planning time: 0.139 ms | Execution time: 163.174 ms |

How can I improve the performance here?

Thank you,

Igal Sapir
Lucee Core Developer
Lucee.org

pgsql-general by date:

From: Martin Goodson
Date: 29 January 2018, 02:11:58
Subject: Re: FW: Setting up streaming replication problems

From: Bruce Momjian
Date: 29 January 2018, 06:48:29
Subject: Re: [GENERAL] pgpass file type restrictions

How to Optimize pg_trgm Performance - Mailing list pgsql-general

Previous

Next