Thread: query with pg_trgm sometimes very slow
Hi,

I have a table for addresses:

CREATE TABLE adressen.adresse (
    pool_pk     integer NOT NULL,
    adressnr_pk integer NOT NULL,
    anrede      varchar(8),
    vorname     varchar(50)  DEFAULT ''::character varying NOT NULL,
    name1       varchar(100) NOT NULL,
    name2       varchar(80)  DEFAULT ''::character varying NOT NULL,
    name3       varchar(80)  DEFAULT ''::character varying NOT NULL,
    strasse     varchar(80)  DEFAULT ''::character varying NOT NULL,
    plz         varchar(8)   DEFAULT ''::character varying NOT NULL,
    ort         varchar(80)  DEFAULT ''::character varying NOT NULL,
    changed     timestamptz,
    id2         integer
);

The primary key is on (pool_pk, adressnr_pk), and among some other indexes there is a trigram index:

CREATE INDEX trgm_adresse ON adressen.adresse USING gist
    (normalize_string((btrim((((((((normalize_string((((COALESCE((vorname)::text, ''::text) || ' '::text) || (name1)::text))::character varying, (-1)))::text
        || ' '::text) || (normalize_string((COALESCE((strasse)::text, ''::text))::character varying, (-2)))::text)
        || ' '::text) || (plz)::text)
        || ' '::text) || (normalize_string((COALESCE((ort)::text, ''::text))::character varying, (-3)))::text)))::character varying)
    gist_trgm_ops);

When I try to retrieve addresses similar to a new address, I use the following query:

SELECT pool_pk AS pool,
       adressnr_pk AS adrnr,
       vorname, name1, strasse, plz, ort,
       ratio_ld_adresse($1, $2, $3, $4, name1, strasse, plz, ort)::double precision AS ratio
FROM adressen.adresse
WHERE normalize_string(trim(
          normalize_string(coalesce(vorname::text, '') || ' ' || name1::text, -1)
          || ' ' || normalize_string(coalesce(strasse::text, ''), -2)
          || ' ' || plz::text
          || ' ' || normalize_string(coalesce(ort::text, ''), -3)))
      % normalize_string(trim(
          normalize_string($1::text)
          || ' ' || normalize_string(coalesce($2::text, ''), -2)
          || ' ' || $3::text
          || ' ' || normalize_string(coalesce($4::text, ''), -3)))
ORDER BY 1, 8 DESC, 2;

which means: take the normalized (lower case, no accents, ...) parts of the address, concatenate them to "<name> <street> <zip> <city>" and search for a similar address among the existing addresses. The index 'trgm_adresse' is built on the same expression. The function 'normalize_string' is written in plpythonu and doesn't make any database calls; the same goes for 'ratio_ld_adresse', which calculates the Levenshtein distance between two entire addresses.

Most of the time everything works fine, and one search (in about 500,000 addresses) takes about 2 to 5 seconds. But sometimes the search takes 50 or even 300 seconds. We have two machines with the same software installation (Ubuntu 14.10 server, Postgres 9.4.4): the production server has 4 GB memory and 2 processors, the test server 2 GB memory and 1 processor. Both are virtual machines on two different ESX servers. On both machines Postgres was installed out of the box (apt-get ...) without configuration modifications (except network interfaces in pg_hba.conf). The test machine is a little slower, but the very slow searches occur much more often there (~35 %) than on the production machine (~10 %).
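For readers unfamiliar with the % operator used above: pg_trgm compares the sets of three-character substrings ("trigrams") of the two strings and reports them as similar when the overlap exceeds a threshold (0.3 by default). A rough pure-Python sketch of the idea — illustrative only, the real extension has more details (word padding rules, the adjustable pg_trgm.similarity_threshold setting):

```python
def trigrams(s):
    # pg_trgm lower-cases the string and pads each word with
    # two leading and one trailing space before extracting trigrams
    out = set()
    for word in s.lower().split():
        padded = '  ' + word + ' '
        out.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return out

def similarity(a, b):
    # similarity = shared trigrams / total distinct trigrams
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def is_similar(a, b, threshold=0.3):
    # sketch of the % operator: true when similarity reaches the threshold
    return similarity(a, b) >= threshold
```

This is why the query can match addresses with small spelling differences: most trigrams survive a one-character change, so the similarity stays above the threshold.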
An EXPLAIN gives:

                                QUERY PLAN
Sort  (cost=2227.31..2228.53 rows=489 width=65)
  Sort Key: pool_pk, ((ratio_ld_adresse('Test 2'::character varying, 'Am Hang 12'::character varying, '12345'::character varying, 'Irgendwo'::character varying, name1, strasse, plz, ort))::double precision), adressnr_pk
  ->  Index Scan using trgm_adresse on adresse  (cost=1.43..2205.46 rows=489 width=65)
        Index Cond: ((normalize_string((btrim((((((((normalize_string((((COALESCE((vorname)::text, ''::text) || ' '::text) || (name1)::text))::character varying, (-1)))::text || ' '::text) || (normalize_string((COALESCE((strasse)::text, ''::text))::character varying, (-2)))::text) || ' '::text) || (plz)::text) || ' '::text) || (normalize_string((COALESCE((ort)::text, ''::text))::character varying, (-3)))::text)))::character varying, 0))::text % 'test 2 am hang 12 12345 irgendwo'::text)

which shows that the index is used and the result should arrive within a few seconds (a cost of 2228 is not very expensive).

When such a slow query is running, 'top' shows nearly 100 % wait and 'iotop' shows 3 - 8 MB/sec disk read by a process

postgres: vb vb 10.128.96.25(60435) FETCH

The Postgres log, which is configured to log every statement that takes longer than 5000 ms, shows

2015-09-02 13:44:48 CEST [25237-1] vb@vb LOG:  duration: 55817.191 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0xa2d61f6c"

Since I have never used a FETCH command in my life, this must come from pg_trgm or something inside it (GIN, GiST etc.).

If, while a slow query is running, one or more further instances of the same query are started, all of them hang and then return at the _same second_ some minutes later — even if the other queries are for different addresses.
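The ratio_ld_adresse function in the sort key above is the poster's own plpythonu function and its source is not shown; purely as an illustration, a Levenshtein-based similarity ratio of the kind it presumably computes could look like this (hypothetical pure-Python sketch, not the actual function):

```python
def levenshtein(a, b):
    # classic dynamic-programming edit distance, one row at a time
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[len(b)]

def ratio(a, b):
    # normalize the edit distance into a similarity in [0, 1]
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))
```

Note that such a function is called once per candidate row that survives the % filter, so its cost scales with the number of index matches, not with the table size.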
2015-08-31 09:09:00 GMT LOG:  duration: 98630.958 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0x7fb780a07e10"
2015-08-31 09:09:00 GMT LOG:  duration: 266887.136 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0x7fb780a95dd8"
2015-08-31 09:09:00 GMT LOG:  duration: 170311.627 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0x7fb780a77e10"
2015-08-31 09:09:00 GMT LOG:  duration: 72614.474 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0x7fb780a0ce10"
2015-08-31 09:09:00 GMT LOG:  duration: 78561.131 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0x7fb780a08da0"
2015-08-31 09:09:00 GMT LOG:  duration: 182392.792 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0x7fb78170c2b0"
2015-08-31 09:09:00 GMT LOG:  duration: 245632.530 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0x7fb7809ddc50"
2015-08-31 09:09:00 GMT LOG:  duration: 84760.400 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0x7fb7809f7dd8"
2015-08-31 09:09:00 GMT LOG:  duration: 176402.352 ms  execute <unnamed>: FETCH FORWARD 4096 IN "py:0x7fb7809fc668"

Does anyone have an idea how to solve this problem?

regards
Volker

--
Volker Böhm            Tel.: +49 4141 981155
Voßkuhl 5              Fax:  +49 4141 981154
21682 Stade            www.vboehm.de
                       mailto:volker@vboehm.de
On Wed, Sep 2, 2015 at 7:00 AM, Volker Böhm <volker@vboehm.de> wrote:
CREATE INDEX trgm_adresse ON adressen.adresse USING gist
    (normalize_string((btrim((((((((normalize_string((((COALESCE((vorname)::text, ''::text) || ' '::text) || (name1)::text))::character varying, (-1)))::text
        || ' '::text) || (normalize_string((COALESCE((strasse)::text, ''::text))::character varying, (-2)))::text)
        || ' '::text) || (plz)::text)
        || ' '::text) || (normalize_string((COALESCE((ort)::text, ''::text))::character varying, (-3)))::text)))::character varying)
    gist_trgm_ops);
You might have better luck with gin_trgm_ops than gist_trgm_ops. Have you tried that?
...
When such a slow query is running, 'top' shows nearly '100 % wait' and 'iotop' shows 3 - 8 MB/sec disk read by a process
postgres: vb vb 10.128.96.25(60435) FETCH
Also the postgres log, which was told to log every task longer than 5000 ms, shows
2015-09-02 13:44:48 CEST [25237-1] vb@vb LOG: duration: 55817.191 ms execute <unnamed>: FETCH FORWARD 4096 IN "py:0xa2d61f6c"
Since I never used a FETCH command in my life, this must be used by pg_trgm or something inside it (gin, gist etc.)
The FETCH is probably being added automatically by whatever Python library you are using to talk to PostgreSQL. Are you using a named cursor in Python? In any event, that is not the cause of the problem.
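To illustrate the point: a server-side (named) cursor streams the result set in batches, and the driver requests each batch with a FETCH FORWARD <n> statement — which is why FETCH appears in the server log even though the application never issues one. The batching pattern itself, sketched in pure Python (the function name here is illustrative, not any driver's actual API):

```python
def fetch_in_batches(rows, batch_size=4096):
    # mimics a server-side cursor: the client repeatedly asks the
    # server for the next batch_size rows ("FETCH FORWARD 4096")
    # instead of pulling the whole result set in one round trip
    it = iter(rows)
    while True:
        batch = []
        for _ in range(batch_size):
            try:
                batch.append(next(it))
            except StopIteration:
                break
        if not batch:
            return
        yield batch
```

With psycopg2, for example, passing a name to conn.cursor() creates such a server-side cursor, and its itersize attribute controls how many rows each FETCH pulls.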
Can you get the result of the indexed expression for a query that is slow?
On Wed, Sep 2, 2015 at 4:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Wed, Sep 2, 2015 at 7:00 AM, Volker Böhm <volker@vboehm.de> wrote:
>>
>> CREATE INDEX trgm_adresse ON adressen.adresse USING gist
>> (normalize_string((btrim((((((((normalize_string((((COALESCE((vorname)::text,
>> ''::text) || ' '::text) || (name1)::text))::character varying,
>> (-1)))::text || ' '::text) || (normalize_string((COALESCE((strasse)::text,
>> ''::text))::character varying, (-2)))::text) || ' '::text) || (plz)::text)
>> || ' '::text) || (normalize_string((COALESCE((ort)::text,
>> ''::text))::character varying, (-3)))::text)))::character varying)
>> gist_trgm_ops);
>
> You might have better luck with gin_trgm_ops than gist_trgm_ops.  Have you
> tried that?

I just had the exact same problem, and indeed gin fares much better.
On Thu, Sep 3, 2015 at 6:19 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Wed, Sep 2, 2015 at 4:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> On Wed, Sep 2, 2015 at 7:00 AM, Volker Böhm <volker@vboehm.de> wrote:
>>>
>>> CREATE INDEX trgm_adresse ON adressen.adresse USING gist
>>> (normalize_string((btrim((((((((normalize_string((((COALESCE((vorname)::text,
>>> ''::text) || ' '::text) || (name1)::text))::character varying,
>>> (-1)))::text || ' '::text) || (normalize_string((COALESCE((strasse)::text,
>>> ''::text))::character varying, (-2)))::text) || ' '::text) || (plz)::text)
>>> || ' '::text) || (normalize_string((COALESCE((ort)::text,
>>> ''::text))::character varying, (-3)))::text)))::character varying)
>>> gist_trgm_ops);
>>
>> You might have better luck with gin_trgm_ops than gist_trgm_ops.  Have you
>> tried that?
>
> I just had the exact same problem, and indeed gin fares much better.

Also, with 9.5 we will see much better worst-case performance from gin via Jeff's patch:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=97f3014647a5bd570032abd2b809d3233003f13f

(I had to abandon pg_trgm for a previous project and go with Solr; had this patch been in place, that would not have happened.)

merlin
On 09/08/2015 09:15 PM, Merlin Moncure wrote:
...
>> I just had the exact same problem, and indeed gin fares much better.
>
> Also, with 9.5 we will see much better worst-case performance from gin
> via Jeff's patch:
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=97f3014647a5bd570032abd2b809d3233003f13f
>
> (I had to abandon pg_trgm for a previous project and go with Solr;
> had this patch been in place, that would not have happened.)

Except that pg_trgm 1.2 is not included in 9.5, because it was committed in July (i.e. long after 9.5 was branched).

--
Tomas Vondra                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Sep 8, 2015 at 4:21 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 09/08/2015 09:15 PM, Merlin Moncure wrote:
> ...
>>> I just had the exact same problem, and indeed gin fares much better.
>>
>> Also, with 9.5 we will see much better worst-case performance from gin
>> via Jeff's patch:
>> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=97f3014647a5bd570032abd2b809d3233003f13f
>>
>> (I had to abandon pg_trgm for a previous project and go with Solr;
>> had this patch been in place, that would not have happened.)
>
> Except that pg_trgm 1.2 is not included in 9.5, because it was committed in
> July (i.e. long after 9.5 was branched).

Oops, thinko. It shouldn't be too difficult to back-patch though, if you're so inclined?

merlin