Inconsistent query times and spiky CPU with GIN tsvector search - Mailing list pgsql-performance

From Scott Rankin
Subject Inconsistent query times and spiky CPU with GIN tsvector search
Msg-id E919673F-1BC3-4525-84CA-51FD854F3D0C@motus.com
Responses Re: Inconsistent query times and spiky CPU with GIN tsvector search  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-performance
Hello all,

We are running PostgreSQL 9.4, and we have a table on which we do full-text searching using a GIN index on a tsvector
column:

CREATE TABLE public.location_search
(
    id bigint NOT NULL DEFAULT nextval('location_search_id_seq'::regclass),
    <snip some columns>…
    search_field_tsvector tsvector
)

and

CREATE INDEX location_search_tsvector_idx
    ON public.location_search USING gin
    (search_field_tsvector)
    TABLESPACE pg_default;

The search_field_tsvector column contains the data from the location's name and address:

to_tsvector('pg_catalog.english', COALESCE(NEW.name, '')) || to_tsvector(COALESCE(address, ''))
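
The NEW.name reference suggests the column is maintained by a row-level trigger. A minimal sketch of what such a trigger could look like follows. The function and trigger names are hypothetical, and the name and address columns are assumed to be among the snipped columns; this is not necessarily the trigger we actually run.

```sql
-- Hypothetical trigger function; names are illustrative only.
-- Assumes location_search has "name" and "address" columns (snipped above).
CREATE OR REPLACE FUNCTION location_search_tsvector_update() RETURNS trigger AS $$
BEGIN
    -- First call pins the 'english' configuration; the second call, as in the
    -- expression above, falls back to default_text_search_config.
    NEW.search_field_tsvector :=
        to_tsvector('pg_catalog.english', COALESCE(NEW.name, ''))
        || to_tsvector(COALESCE(NEW.address, ''));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Recompute the tsvector whenever a row is inserted or updated.
CREATE TRIGGER location_search_tsvector_trg
    BEFORE INSERT OR UPDATE ON public.location_search
    FOR EACH ROW EXECUTE PROCEDURE location_search_tsvector_update();
```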

This setup has been running very well, but as our load is getting heavier, the performance seems to be getting much
more inconsistent.  Our searches are run on a dedicated read replica, so this server is only doing queries against this
one table.  IO is very low, indicating to me that the data is all in memory.  However, we're getting some queries taking
upwards of 15-20 seconds, while the average is closer to 1 second.

A sample query that's running slowly is:

explain (analyze, buffers)
SELECT ls.location AS locationId FROM location_search ls
WHERE ls.client = 1363
AND ls.favorite = TRUE
AND search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*')
LIMIT 4;

And the EXPLAIN (ANALYZE, BUFFERS) output is:

Limit  (cost=39865.85..39877.29 rows=1 width=8) (actual time=4471.120..4471.120 rows=0 loops=1)
  Buffers: shared hit=25613
  ->  Bitmap Heap Scan on location_search ls  (cost=39865.85..39877.29 rows=1 width=8) (actual time=4471.117..4471.117 rows=0 loops=1)
        Recheck Cond: (search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*'::text))
        Filter: (favorite AND (client = 1363))
        Rows Removed by Filter: 74
        Heap Blocks: exact=84
        Buffers: shared hit=25613
        ->  Bitmap Index Scan on location_search_tsvector_idx  (cost=0.00..39865.85 rows=6 width=0) (actual time=4470.895..4470.895 rows=84 loops=1)
              Index Cond: (search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*'::text))
              Buffers: shared hit=25529
Planning time: 0.335 ms
Execution time: 4487.224 ms

I'm a little bit at a loss as to where to start with this - any suggestions would be hugely appreciated!

Thanks,
Scott

