Re: TABLESAMPLE patch is really in pretty sad shape - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: TABLESAMPLE patch is really in pretty sad shape
Date
Msg-id 55B2E11E.1020007@2ndquadrant.com
Whole thread Raw
In response to Re: TABLESAMPLE patch is really in pretty sad shape  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: TABLESAMPLE patch is really in pretty sad shape
List pgsql-hackers
On 2015-07-25 00:36, Tom Lane wrote:
> I wrote:
>> Petr Jelinek <petr@2ndquadrant.com> writes:
>>> The only major difference that I see so far and I'd like you to
>>> incorporate that into your patch is that I renamed the SampleScanCost to
>>> SampleScanGetRelSize because that reflects much better the use of it, it
>>> isn't really used for costing, but for getting the pages and tuples of
>>> the baserel.
>
>> Good suggestion.  I was feeling vaguely uncomfortable with that name as
>> well, given what the functionality ended up being.
>
> After further thought it seemed like the right name to use is
> SampleScanGetSampleSize, so as to avoid confusion between the size of the
> relation and the size of the sample.  (The planner is internally not
> making such a distinction right now, but that doesn't mean we should
> propagate that fuzzy thinking into the API spec.)
>

Right, sounds good to me.

> Attached is a more-or-less-final version of the proposed patch.
> Major changes since yesterday:
>
> * I worked over the contrib modules and docs.
>
> * I thought of a reasonably easy way to do something with nonrepeatable
> sampling methods inside join queries: we can simply wrap the SampleScan
> plan node in a Materialize node, which will guard it against being
> executed more than once.  There might be a few corner cases where this
> doesn't work fully desirably, but in testing it seemed to do the right
> thing (and I added some regression tests about that).  This seems
> certainly a better answer than either throwing an error or ignoring
> the problem.
>

Seems reasonable.

I was wondering if we should perhaps cache the output of GetTsmRoutine 
as we call it up to 4 times in the planner now but it's relatively cheap 
call (basically just makeNode) so it's probably not worth it. In any 
case it's not something that should stop you from committing the patch. 
Thanks for all your work on this.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Mike Blackwell
Date:
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Next
From: Peter Eisentraut
Date:
Subject: Re: MultiXact member wraparound protections are now enabled