I don't quite know how this data is used, but any constant factor seems like it
would be arbitrary. It sounds like a more principled algorithm would
be to use stats_target^2. But that has the same problem. Even
stats_target^1.5 would be too big for stats_target 10,000.
I think just using 10 is probably the right thing.
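
To put rough numbers on that, here is a small sketch of the candidate
rules. These helpers are hypothetical names I made up for illustration,
not anything in ts_typanalyze.c, and the ^1.5 case uses an integer
square root as an approximation:

```c
#include <assert.h>

/*
 * Hypothetical helpers, not PostgreSQL code: each returns the MCELEM
 * array size that one candidate rule would give for a stats target.
 */

/* Constant multiplier, as in num_mcelem = attstattarget * 100. */
static long
mcelem_linear(long stats_target, long multiplier)
{
    assert(stats_target >= 0 && multiplier >= 0);
    return stats_target * multiplier;
}

/* stats_target^2 */
static long
mcelem_squared(long stats_target)
{
    assert(stats_target >= 0);
    return stats_target * stats_target;
}

/* Integer floor of sqrt(n) by linear search; fine for small n. */
static long
isqrt_floor(long n)
{
    long r = 0;

    while ((r + 1) * (r + 1) <= n)
        r++;
    return r;
}

/* stats_target^1.5, approximated as target * floor(sqrt(target)). */
static long
mcelem_three_halves(long stats_target)
{
    assert(stats_target >= 0);
    return stats_target * isqrt_floor(stats_target);
}
```

At stats_target 10,000 the multiplier of 100 and the ^1.5 rule both
come to 1,000,000 lexemes, ^2 comes to 100,000,000, and a multiplier of
10 gives 100,000, which is what the multiplier of 100 gave at the old
maximum of 1,000.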
--
Greg
On 13 Dec 2008, at 13:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I started making the changes to increase the default and maximum stats
> targets 10X, as I believe was agreed to in this thread:
> http://archives.postgresql.org/pgsql-hackers/2008-12/msg00386.php
>
> I came across this bit in ts_typanalyze.c:
>
> /* We want statistic_target * 100 lexemes in the MCELEM array */
> num_mcelem = stats->attr->attstattarget * 100;
>
> I wonder whether the multiplier here should be changed? This code is
> new for 8.4, so we have zero field experience about what desirable
> lexeme counts are; but the prospect of up to a million lexemes in
> a pg_statistic entry doesn't seem quite right. I'm tempted to cut the
> multiplier to 10 so that the effective range of MCELEM sizes remains
> the same as what Jan had in mind when he wrote the code.
>
> regards, tom lane