Re: Gsoc2012 idea, tablesample - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Gsoc2012 idea, tablesample
Date
Msg-id CAM-w4HO_+XoqF8cYwNAibDtwWCNL1zBOCWFr_XJ4gnEbDw526A@mail.gmail.com
In response to Re: Gsoc2012 idea, tablesample  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Fri, May 11, 2012 at 6:16 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
>> MaxHeapTuplesPerPage?
>
> What about dead line pointers without corresponding tuples?

Actually we don't allow there to be more than MaxHeapTuplesPerPage
line pointers even if some of them are dead line pointers.

I think the argument back then was that there could be bugs in code
that isn't expecting more. Perhaps that doesn't justify baking the
limit into more places, though, when we could instead be removing
that assumption.

Also, if I absorbed enough of this conversation skimming backwards, it
seems the algorithm would be very inefficient with such a conservative
estimate. On a table with normal-width rows it would usually pick
tuples that don't exist and have to try again. As Tom pointed out,
that would mean even a small sample would have to read nearly the
whole table to find the sample.
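
To put rough numbers on that, here is a minimal sketch of the rejection
sampling being discussed, not the proposed implementation. It assumes a
default 8kB block size with a 24-byte page header, a 24-byte aligned
tuple header and 4-byte line pointers, which gives 291 slots per page.
The function names, the uniform tuples-per-page figure, and sampling
with replacement are all simplifications made up for illustration.

```python
import random

# Assumed layout constants for a default build (8kB pages, 8-byte alignment).
BLCKSZ = 8192
PAGE_HEADER = 24    # page header size in bytes
TUPLE_HEADER = 24   # aligned heap tuple header size in bytes
LINE_POINTER = 4    # size of one line pointer in bytes

# Upper bound on line pointers per page under the assumptions above.
MAX_HEAP_TUPLES_PER_PAGE = (BLCKSZ - PAGE_HEADER) // (TUPLE_HEADER + LINE_POINTER)


def hit_probability(tuples_per_page, max_slots=MAX_HEAP_TUPLES_PER_PAGE):
    """Chance that a uniformly chosen slot number points at a real tuple."""
    return tuples_per_page / max_slots


def expected_probes(sample_size, tuples_per_page, max_slots=MAX_HEAP_TUPLES_PER_PAGE):
    """Expected number of (block, offset) guesses needed to collect the sample."""
    return sample_size / hit_probability(tuples_per_page, max_slots)


def simulate(sample_size, n_pages, tuples_per_page, seed=0):
    """Guess random (block, offset) pairs until sample_size guesses hit a tuple.

    Hypothetical model: every page holds exactly tuples_per_page live tuples
    in slots 1..tuples_per_page, and the same tuple may be picked twice.
    Returns (total guesses, distinct pages read).
    """
    rng = random.Random(seed)
    probes = 0
    found = 0
    pages_read = set()
    while found < sample_size:
        block = rng.randrange(n_pages)
        offset = rng.randint(1, MAX_HEAP_TUPLES_PER_PAGE)
        probes += 1
        pages_read.add(block)  # the page must be read even if the slot is empty
        if offset <= tuples_per_page:
            found += 1
    return probes, len(pages_read)


if __name__ == "__main__":
    print("slots per page:", MAX_HEAP_TUPLES_PER_PAGE)
    print("expected probes for 100 rows:", expected_probes(100, 60))
    print("simulated (probes, pages read):", simulate(100, 1000, 60))
```

With, say, 60 tuples on each page, only 60 of the 291 possible slot
numbers are valid, so each sampled row costs about 4.85 guesses on
average, and every miss still reads a page. Wider rows make the ratio
worse.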


-- 
greg

