Re: Performance regression between 8.3 and 8.4 on heavy text indexing - Mailing list pgsql-performance

From Guillaume Smet
Subject Re: Performance regression between 8.3 and 8.4 on heavy text indexing
Date
Msg-id 1d4e0c10908261459s33ac5c02vd4494ed1dd20d390@mail.gmail.com
In response to Re: Performance regression between 8.3 and 8.4 on heavy text indexing  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Performance regression between 8.3 and 8.4 on heavy text indexing  (gael@pilotsystems.net (Gaël Le Mignot))
List pgsql-performance
On Wed, Aug 26, 2009 at 6:29 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> gael@pilotsystems.net (Gaël Le Mignot) writes:
>> So it seems it was quite wrong about estimated matching rows (192 predicted, 10222 reals).
>
> Yup.  What's even more interesting is that it seems the real win would
> have been to use just the 'claude & duviau' condition (which apparently
> matched only 14 rows).  8.3 had no hope whatever of understanding that,
> it just got lucky.  8.4 should have figured it out, I'm thinking.
> Does it help if you increase the statistics target for fulltext_tsv?
> (Don't forget to re-ANALYZE after doing so.)

It could be interesting to run the query without the condition
(keywords_tsv @@ '''assassinat'''::tsquery) to see the estimate of
(fulltext_tsv @@ '''claud'' & ''duviau'''::tsquery) in 8.4.
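For instance, something along these lines. This is only a sketch: the full
query isn't quoted in this message, so SELECT * stands in for the real
select list and any other clauses are left out.

```sql
-- Sketch only: SELECT * is a placeholder for the original select list.
-- The keywords_tsv condition is dropped; only the fulltext_tsv condition is kept.
EXPLAIN ANALYZE
SELECT *
FROM libeindex
WHERE fulltext_tsv @@ '''claud'' & ''duviau'''::tsquery;
```

Comparing the estimated row count with the actual row count in the output
shows how far off the planner is for this condition alone.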

Btw, what Tom means by increasing the statistics target is running the
following statements, re-running your query after each step:
ALTER TABLE libeindex ALTER COLUMN fulltext_tsv SET STATISTICS 500;
ANALYZE;
-- run your query with EXPLAIN ANALYZE
ALTER TABLE libeindex ALTER COLUMN fulltext_tsv SET STATISTICS 1000;
ANALYZE;
-- run your query with EXPLAIN ANALYZE
ALTER TABLE libeindex ALTER COLUMN fulltext_tsv SET STATISTICS 5000;
ANALYZE;
-- run your query with EXPLAIN ANALYZE

to see whether a higher target improves the row estimates.

--
Guillaume
