From: "Tom Lane" <tgl@sss.pgh.pa.us>
> "Richard Huxton" <dev@archonet.com> writes:
> > Why doesn't PG (or any other system afaik) just have a first guess, run the
> > query and then if the costs are horribly wrong cache the right result.
>
> ?? Knowing that your previous guess was wrong doesn't tell you what the
> right answer is, especially not for the somewhat-different question that
> the next query is likely to provide.

Surely if you used a seqscan on "where x=1" and only got 2 rows rather than
the 3000 you were expecting, the only alternative is to try an index?

> The real problem here is simply that PG hasn't been keeping adequately
> detailed statistics. I'm currently working on improving that for 7.2...
> see discussions over in pghackers if you are interested.

Thinking about it (along with Bruce's reply posted to the list), I guess the
difference is whether you gather the statistics up front during a vacuum or
build them as queries are run. You're always going to need *something* to
base your first guess on anyway; the "learning" would only help you in
those cases where the distribution of values wasn't a normal curve.
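
To make the idea concrete, here is a rough sketch in Python of the kind of
feedback loop I have in mind. It is purely illustrative: the class, the method
names and the 5% threshold are all made up, and it has nothing to do with how
PG's planner actually works. It starts from an up-front guess, records the real
row count after a query runs, and uses that count the next time it sees the
same predicate.

```python
# Hypothetical sketch only: none of these names exist in PostgreSQL.
INDEX_THRESHOLD = 0.05  # assumed cutoff: prefer an index below 5% of the table


class FeedbackEstimator:
    def __init__(self, table_rows, default_selectivity):
        self.table_rows = table_rows
        # The up-front guess, e.g. from statistics gathered during a vacuum.
        self.default_selectivity = default_selectivity
        # Predicate text -> row count actually seen the last time it ran.
        self.observed = {}

    def estimate_rows(self, predicate):
        # Use the remembered count if this exact predicate has run before,
        # otherwise fall back to the up-front guess.
        if predicate in self.observed:
            return self.observed[predicate]
        return round(self.table_rows * self.default_selectivity)

    def choose_plan(self, predicate):
        selectivity = self.estimate_rows(predicate) / self.table_rows
        return "indexscan" if selectivity < INDEX_THRESHOLD else "seqscan"

    def record_actual(self, predicate, actual_rows):
        # Called after execution with the real row count.
        self.observed[predicate] = actual_rows


est = FeedbackEstimator(table_rows=10_000, default_selectivity=0.3)
first_plan = est.choose_plan("x=1")   # guess is about 3000 rows
est.record_actual("x=1", 2)           # the query really returned 2 rows
second_plan = est.choose_plan("x=1")  # same predicate, now uses the real count
other_plan = est.choose_plan("x=2")   # different predicate, back to the guess
```

Note that the last line also shows Tom's objection: what was learned about
"x=1" says nothing about "x=2", which still falls back to the first guess.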

Anyway, given that I'm up to my neck in work at the moment and I don't
actually know what I'm talking about, I'll shut up and get back to keeping
clients happy :-)

- Richard Huxton