Re: Gsoc2012 idea, tablesample - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Gsoc2012 idea, tablesample
Date
Msg-id 4FACEBA00200002500047BA4@gw.wicourts.gov
In response to Re: Gsoc2012 idea, tablesample  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Gsoc2012 idea, tablesample
Re: Gsoc2012 idea, tablesample
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:
> The trouble is, AFAICS, that you can't bound M very well without
> scanning the whole table.  I mean, it's bounded by theoretical
> limit, but that's it.
What would the theoretical limit be?  (block size - page header size
- minimum size of one tuple) / item pointer size?  So, on an 8KB
page, somewhere in the neighborhood of 1350?  Hmm.  If that's right,
that would mean a 1% random sample would need 13.5 probes per page,
meaning there wouldn't tend to be a lot of pages missed.  Still, the
technique for getting a random sample seems sound, unless someone
suggests something better.  Maybe we just want to go straight to a
seqscan to get to the pages we want to probe rather than reading
just the ones on the "probe list" in physical order?
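
A rough sketch of the arithmetic above, in Python.  The byte sizes
are assumptions picked to land near the 1350 figure (24-byte page
header, 24-byte minimum tuple, 6-byte item pointer), not values
checked against the source tree, and the function names are made up
for illustration.  The "page missed" estimate further assumes each
slot is probed independently with probability equal to the sample
fraction.

```python
def max_item_slots(block_size=8192, page_header=24, min_tuple=24, item_pointer=6):
    """Upper bound on item slots per page: (block - header - one tuple) / pointer.

    All default sizes are illustrative assumptions, in bytes.
    """
    return (block_size - page_header - min_tuple) // item_pointer


def expected_probes_per_page(slots, sample_fraction):
    """Expected number of probes landing on one page for a given sample fraction."""
    return slots * sample_fraction


def prob_page_missed(slots, sample_fraction):
    """Chance that no probe hits a page, assuming independent per-slot probes."""
    return (1.0 - sample_fraction) ** slots


if __name__ == "__main__":
    m = max_item_slots()
    print("bound on slots per page:", m)
    print("expected probes per page at 1%:", expected_probes_per_page(m, 0.01))
    print("chance a page gets no probe:", prob_page_missed(m, 0.01))
```

With a bound that loose, nearly every page ends up on the probe
list, which is the reason to consider a plain seqscan instead.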
-Kevin

