"Richard Huxton" <dev@archonet.com> writes:
>> ?? Knowing that your previous guess was wrong doesn't tell you what the
>> right answer is, especially not for the somewhat-different question that
>> the next query is likely to provide.
> Surely if you used a seqscan on "where x=1" and only got 2 rows rather than
> the 3000 you were expecting, the only alternative is to try an index?
But if the next query is "where x=2", what do you do? Keep in mind that
the data distributions people have been having trouble with are
irregular: you can't conclude anything very reliable about x=2 based on
what you know about x=1.
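
To make that concrete, here is a small sketch in Python. The row counts
are invented for illustration (they are not taken from any real table),
and actual_rows and guess_from_previous are hypothetical helper names. It
only shows that, with an irregular distribution, the row count observed
for x=1 is a poor guess for any other value of x.

```python
# Hypothetical skewed column: value of x -> number of rows holding that value.
# These counts are made up purely to illustrate an irregular distribution.
ROWS_PER_VALUE = {1: 2, 2: 3000, 3: 15, 4: 1200, 5: 1}


def actual_rows(value):
    """True number of rows matching "where x = value" in the toy table."""
    return ROWS_PER_VALUE.get(value, 0)


def guess_from_previous(previous_value):
    """Naive feedback rule: assume the next value behaves like the last one seen."""
    return actual_rows(previous_value)


if __name__ == "__main__":
    # Having learned the true count for x=1, reuse it as the guess for other values.
    guess = guess_from_previous(1)
    for value in (2, 3, 4, 5):
        print(f"x={value}: guessed {guess} rows, actual {actual_rows(value)}")
```

Under these assumed counts, the guess carried over from x=1 is off by
orders of magnitude for x=2 and x=4, and nothing learned from x=1 says
in which direction to correct it.
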
> Thinking about it (along with Bruce's reply posted to the list) I guess the
> difference is whether you gather the statistics up-front during a vacuum, or
> build them as queries are used.
Stats gathered as a byproduct of individual queries might be useful if
you happen to get the exact same queries over again, but I doubt that
a succession of such results should be expected to build up a picture
that's complete enough to extrapolate to other queries. Stats gathered
by ANALYZE have the merit that they come from a process that's designed
specifically to give you a good statistical picture.
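
A rough Python sketch of the difference, again over an invented table.
This is not how ANALYZE is implemented; sampled_stats, estimate_rows and
feedback_stats are hypothetical names, and the sample size and seed are
arbitrary. It only contrasts a random sample over the whole column with
counts remembered from whichever queries happened to run.

```python
import random
from collections import Counter

# Hypothetical skewed column, expanded into one list entry per row.
ROWS_PER_VALUE = {1: 2, 2: 3000, 3: 15, 4: 1200, 5: 1}
TABLE = [v for v, n in ROWS_PER_VALUE.items() for _ in range(n)]


def sampled_stats(table, sample_size, seed=0):
    """Up-front pass: draw a random sample of rows and record value frequencies."""
    rng = random.Random(seed)
    sample = rng.sample(table, sample_size)
    counts = Counter(sample)
    return {value: count / sample_size for value, count in counts.items()}


def estimate_rows(stats, value, table_size):
    """Scale the sampled frequency of a value up to the full table size."""
    return stats.get(value, 0.0) * table_size


def feedback_stats(table, queried_values):
    """Query byproduct: exact counts, but only for values that were queried."""
    counts = Counter(table)
    return {value: counts[value] for value in queried_values}


if __name__ == "__main__":
    stats = sampled_stats(TABLE, sample_size=500)
    feedback = feedback_stats(TABLE, queried_values=[1])
    for value in sorted(ROWS_PER_VALUE):
        from_sample = estimate_rows(stats, value, len(TABLE))
        from_feedback = feedback.get(value, "unknown")
        print(f"x={value}: sample says {from_sample:.0f}, feedback says {from_feedback}")
```

The sample gives a usable estimate for the common values whether or not
anyone has queried them yet, though it can miss very rare values
entirely. The feedback table is exact for x=1 and silent about
everything else.
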

			regards, tom lane