Re: Table not using tsvector gin index and performance much worse than when it uses it. - Mailing list pgsql-performance

From Tom Lane
Subject Re: Table not using tsvector gin index and performance much worse than when it uses it.
Date
Msg-id 5326.1491434397@sss.pgh.pa.us
Whole thread Raw
In response to Table not using tsvector gin index and performance much worse thanwhen it uses it.  (rverghese <riyav@hotmail.com>)
Responses Re: Table not using tsvector gin index and performance much worsethan when it uses it.  (rverghese <riyav@hotmail.com>)
List pgsql-performance
rverghese <riyav@hotmail.com> writes:
> I have a table with 22k rows - not large at all. I have a couple of indices
> on it as well as a gin index on  a tsvector column. If I reindex the table
> and run a query it takes ~20ms to execute using the tsvector-gin index. By
> the end of the day, the planner decides not to use the gin index and uses
> the other indices on the table and the query takes ~80ms. If I reindex, the
> pattern repeats-it uses the gin index for a while for superior performance
> and then drops back to using the alternate ones.
> The ibloat on the index shows as 0.4 and wastedibytes is 0.  Less than 2K
> rows have been updated of the 22K since the last reindex but the performance
> has dropped since it is no longer using the gin index by mid-day.
> Any thoughts on why it chooses to use alternate indices with hardly any
> updates? And is there a way to force it to use the gin index without having
> to reindex it twice a day.

You haven't mentioned what PG version this is, nor specified how many
updates is "hardly any", so you shouldn't expect any very precise answers.
But I'm suspicious that the problem is bloat of the index's pending list;
the planner's cost estimate is (correctly) pretty sensitive to the length
of that list.  If so, you need to arrange for the pending list to get
flushed into the main index structure more often.  Depending on your PG
version, that can be done by
* vacuum
* auto-analyze (but I bet your version doesn't, or you would not be
  complaining)
* gin_clean_pending_list() (but you probably ain't got that either)

Or you could reduce gin_pending_list_limit to cause insert-time flushes to
happen more often, or in the extremum even disable fastupdate for that
index.  Those options would slow down updates to make search performance
more stable, so they're not panaceas.

See
https://www.postgresql.org/docs/current/static/gin-implementation.html#GIN-FAST-UPDATE
for your version, also the "GIN Tips" on the next page.

Personally I'd try tweaking gin_pending_list_limit first, if you have
a version that has that ... but YMMV.

            regards, tom lane


pgsql-performance by date:

Previous
From: rverghese
Date:
Subject: Table not using tsvector gin index and performance much worse thanwhen it uses it.
Next
From: rverghese
Date:
Subject: Re: Table not using tsvector gin index and performance much worsethan when it uses it.