Re: Gsoc2012 idea, tablesample - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Gsoc2012 idea, tablesample
Date
Msg-id 5527.1336750941@sss.pgh.pa.us
In response to Re: Gsoc2012 idea, tablesample  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Robert Haas <robertmhaas@gmail.com> wrote:
>> The trouble is, AFAICS, that you can't bound M very well without
>> scanning the whole table.  I mean, it's bounded by theoretical
>> limit, but that's it.
> What would the theoretical limit be?  (block size - page header size
> - minimum size of one tuple) / item pointer size?  So, on an 8KB
> page, somewhere in the neighborhood of 1350?

Your math is off --- I get something less than 300, even if the tuples
are assumed to be empty of data.  (Header size 24 bytes, plus 4-byte
line pointer, so at least 28 bytes per tuple, so at most 292 on an 8K
page.)  But you still end up probing just about every page for a 1%
sample.
        regards, tom lane
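
For readers following the arithmetic, here is a minimal sketch of the
bound above and of why a 1% sample still touches nearly every page.  It
is an illustration, not PostgreSQL code.  The function names are
hypothetical, the 24-byte page header is an assumption that the message
does not state, and the second function assumes each tuple is picked
independently with the same probability.

```python
# Sizes in bytes. The tuple header and line pointer sizes come from the
# message; the page header size is an assumption added for comparison.
BLOCK_SIZE = 8192
TUPLE_HEADER_SIZE = 24
LINE_POINTER_SIZE = 4
ASSUMED_PAGE_HEADER_SIZE = 24


def max_tuples_per_page(block_size=BLOCK_SIZE, page_header=0):
    """Upper bound on tuples per page when every tuple carries no data."""
    per_tuple = TUPLE_HEADER_SIZE + LINE_POINTER_SIZE  # at least 28 bytes
    return (block_size - page_header) // per_tuple


def fraction_of_pages_probed(tuples_per_page, sample_fraction):
    """Chance that a page holds at least one sampled tuple, assuming each
    tuple is chosen independently with probability sample_fraction."""
    return 1.0 - (1.0 - sample_fraction) ** tuples_per_page


if __name__ == "__main__":
    m = max_tuples_per_page()
    print("bound ignoring the page header:", m)
    print("bound with the assumed page header:",
          max_tuples_per_page(page_header=ASSUMED_PAGE_HEADER_SIZE))
    print("pages probed for a 1% sample at the bound:",
          round(fraction_of_pages_probed(m, 0.01), 3))
```

Ignoring the page header reproduces the 292 figure; subtracting the
assumed header lowers it by one.  Under the independence assumption,
even a page holding only 100 tuples has a better-than-even chance of
containing a sampled tuple at a 1% rate, which is the point about
probing most of the table.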