Re: Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
Date
Msg-id 21502.1438890302@sss.pgh.pa.us
In response to Re: Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 6 August 2015 at 20:14, Josh Berkus <josh@agliodbs.com> wrote:
>> Speaking from a user perspective, SYSTEM seems broken to me.  I can't
>> imagine using it for anything with that degree of variation in the
>> number of results returned, especially if it's possible to return zero
>> rows from a populated table.

> Please bear in mind you have requested a very small random sample of blocks.

Indeed.  My expectation about it is that you'd get the requested number of
rows *on average* over many tries (which is pretty much what Josh's
results show).  Since what SYSTEM actually returns must be a multiple of
the number of rows per page, if you make a request that's less than that
number of rows, you must get zero rows some of the time.  Otherwise the
sampling logic is cheating.
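
To make that argument concrete, here is a rough simulation sketch in Python.
It is not the PostgreSQL code; it assumes a simplified model in which each
page is kept independently with probability equal to the requested fraction,
and the table shape (100 pages of 50 rows) and the function name
system_sample are hypothetical, chosen only for illustration.

```python
import random

def system_sample(num_pages, rows_per_page, fraction, rng):
    """Simulate one block-level sample and return how many rows it yields.

    Simplified model (an assumption, not PostgreSQL's implementation):
    every page is kept independently with probability `fraction`, and a
    kept page contributes all of its rows.
    """
    pages_kept = sum(1 for _ in range(num_pages) if rng.random() < fraction)
    return pages_kept * rows_per_page

rng = random.Random(42)                  # fixed seed so runs are repeatable
num_pages, rows_per_page = 100, 50       # hypothetical 5000-row table
fraction = 0.001                         # requests 5 rows on average

trials = [system_sample(num_pages, rows_per_page, fraction, rng)
          for _ in range(20000)]
mean_rows = sum(trials) / len(trials)
zero_share = trials.count(0) / len(trials)

print(f"requested rows on average: {num_pages * rows_per_page * fraction:.1f}")
print(f"observed mean rows:        {mean_rows:.2f}")
print(f"share of empty samples:    {zero_share:.3f}")
```

Under this model every result is 0, 50, 100, ... rows, so a request that
averages 5 rows can only hit that average by coming back empty most of
the time.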

I do *not* think that we should force the sample to contain at least one
page, which is the only way that we could satisfy the complaint as stated.

Perhaps we need to adjust the documentation to make it clearer that
block-level sampling is not the thing to use if you want a sample that
doesn't amount to a reasonable number of blocks.  But I see absolutely
no evidence here that the sampling isn't behaving exactly as expected.
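
As a rough way to judge whether a request amounts to "a reasonable number of
blocks", one could compute the expected page count and the chance of an empty
sample before running the query. The sketch below uses the same simplified
assumption of independent per-page selection; the helper names
expected_pages and prob_empty_sample are hypothetical and not part of
PostgreSQL.

```python
def expected_pages(num_pages, fraction):
    """Expected number of pages in the sample under independent selection."""
    return num_pages * fraction

def prob_empty_sample(num_pages, fraction):
    """Probability that no page at all is selected: (1 - fraction) ** num_pages."""
    return (1.0 - fraction) ** num_pages

# Hypothetical cases: a tiny request on a small table versus a request
# that covers many pages of a larger table.
for num_pages, fraction in [(100, 0.001), (10000, 0.01)]:
    print(f"pages={num_pages:>6} fraction={fraction:<6} "
          f"expected pages={expected_pages(num_pages, fraction):8.2f} "
          f"P(empty)={prob_empty_sample(num_pages, fraction):.3g}")
```

When the expected page count is well below one, as in the first case, an
empty result is the common outcome rather than a fluke, and row-level
sampling is likely the better tool.
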
        regards, tom lane


