On Thu, Sep 09, 2010 at 11:16:36AM -0400, Tom Lane wrote:
> Henk van Lingen <H.G.K.vanLingen@uu.nl> writes:
> > On Thu, Sep 09, 2010 at 10:50:52AM -0400, Tom Lane wrote:
> >>>> Well, there's your problem: the planner is off by a factor of about 500
> >>>> on its estimate of the number of rows matching this query, and that's
> >>>> what's causing it to pick the wrong plan. What you need to look into
> >>>> is getting that estimate to be more in sync with reality. Probably
> >>>> increasing the stats target for the message column would help.
>
> > But how can I get sane estimates for syslog data? Some searchstrings will
> > result in only a few hits, others in thousands of records or more.
>
> That's what ANALYZE is for ...
Yes, of course. But I don't see how the most_common_vals & freqs and the
histogram_bounds for a text field with syslog data make any sense when
doing a search for a substring. Increasing the number of entries in
those stats lists doesn't make any sense either, I presume.
Those stats should be based on analysis of the to_tsvector index to have
any meaning, I think.
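
To make this concrete, here is a rough sketch of how the statistics in
question could be inspected and how the stats target could be raised. The
table name syslog_events, the column name message, the 'english' text
search configuration and the target of 1000 are only assumptions for
illustration, not the actual schema or settings:

```sql
-- Hypothetical names: table "syslog_events", text column "message".

-- What ANALYZE currently stores for the whole-string values of the column.
SELECT most_common_vals, most_common_freqs, histogram_bounds
FROM pg_stats
WHERE tablename = 'syslog_events'
  AND attname = 'message';

-- Raise the per-column statistics target (1000 is an arbitrary example
-- value), then re-gather the statistics.
ALTER TABLE syslog_events ALTER COLUMN message SET STATISTICS 1000;
ANALYZE syslog_events;

-- Per-lexeme counts over the tsvector form of the column, which is closer
-- to what a full text search actually matches. The 'english' configuration
-- is an assumption; this scans the whole table and can be slow.
SELECT word, ndoc, nentry
FROM ts_stat('SELECT to_tsvector(''english'', message) FROM syslog_events')
ORDER BY ndoc DESC
LIMIT 20;
```

Comparing the first and the last result should show whether the stored
stats say anything useful about individual search terms.
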
Today I will look into the multicolumn index suggestion.
Regards,
--
Henk van Lingen, ICT-SC Netwerk & Telefonie, (o- -+
Universiteit Utrecht, Jenalaan 18a, room 0.12 /\ |
phone: +31-30-2538453 v_/_ |
http://henk.vanlingen.net/ http://www.tuxtown.net/netiquette/