Mischa,
> Okay, although given the track record of page-based sampling for
> n-distinct, it's a bit like looking for your keys under the streetlight,
> rather than in the alley where you dropped them :-)
Bad analogy, but funny.
The issue with page-based vs. pure random sampling is that sampling, for
example, 10% of rows purely at random would actually mean loading roughly
50% of pages. With 20% of rows, you might as well scan the whole table.
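
A back-of-the-envelope sketch of that arithmetic, in case it helps. The
assumptions are mine and not measured: every row is picked independently with
the same probability, and every page holds the same number of rows. The
function name is made up for illustration.

```python
def fraction_of_pages_touched(row_fraction: float, rows_per_page: int) -> float:
    """Expected fraction of pages holding at least one sampled row.

    Assumes (hypothetically) that each row is sampled independently with
    probability row_fraction and that all pages hold rows_per_page rows.
    A page is skipped only if none of its rows is picked, which happens
    with probability (1 - row_fraction) ** rows_per_page.
    """
    return 1.0 - (1.0 - row_fraction) ** rows_per_page


if __name__ == "__main__":
    # Illustrative page densities only; real tables vary with row width.
    for rows_per_page in (7, 20, 100):
        for row_fraction in (0.10, 0.20):
            touched = fraction_of_pages_touched(row_fraction, rows_per_page)
            print(f"{rows_per_page:>3} rows/page, "
                  f"{row_fraction:.0%} of rows -> {touched:.1%} of pages")
```

Under these assumptions, a 10% row sample already touches about half the
pages at only 7 rows per page, and the fraction climbs quickly as pages get
denser.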
Unless, of course, we use indexes for sampling, which seems like a *really
good* idea to me ....
--
--Josh
Josh Berkus
Aglio Database Solutions
San Francisco