Re: sample scans and predicate locking - Mailing list pgsql-hackers

From Andres Freund
Subject Re: sample scans and predicate locking
Date
Msg-id 20190519222259.2gdlswllvf6z45rr@alap3.anarazel.de
Whole thread Raw
In response to Re: sample scans and predicate locking  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: sample scans and predicate locking
List pgsql-hackers
Hi,

On 2019-05-19 13:57:42 +1200, Thomas Munro wrote:
> On Sun, May 19, 2019 at 8:31 AM Andres Freund <andres@anarazel.de> wrote:
> > While looking at fixing [1] on master, I noticed the following
> > codeblock:
> >
> > static HeapScanDesc
> > heap_beginscan_internal(Relation relation, Snapshot snapshot,
> >                                                 int nkeys, ScanKey key,
> >                                                 ParallelHeapScanDesc parallel_scan,
> >                                                 bool allow_strat,
> >                                                 bool allow_sync,
> >                                                 bool allow_pagemode,
> >                                                 bool is_bitmapscan,
> >                                                 bool is_samplescan,
> >                                                 bool temp_snap)
> > ...
> >         /*
> >          * For a seqscan in a serializable transaction, acquire a predicate lock
> >          * on the entire relation. This is required not only to lock all the
> >          * matching tuples, but also to conflict with new insertions into the
> >          * table. In an indexscan, we take page locks on the index pages covering
> >          * the range specified in the scan qual, but in a heap scan there is
> >          * nothing more fine-grained to lock. A bitmap scan is a different story,
> >          * there we have already scanned the index and locked the index pages
> >          * covering the predicate. But in that case we still have to lock any
> >          * matching heap tuples.
> >          */
> >         if (!is_bitmapscan)
> >                 PredicateLockRelation(relation, snapshot);
> >
> > As you can see this only tests for is_bitmapscan, *not* for
> > is_samplescan. Which means we afaict currently every sample scan
> > predicate locks the entire relation.
> 
> Right, I just tested that.  That's not wrong though, is it?  It's just
> overly pessimistic.

Yea, I was mostly commenting on the fact that the comment doesn't
mention sample scans, so it looks a bit accidental.

I added a comment to master (as part of a fix, where this codepath was
entered inadvertently)


> > I think there's two possibilities here:
> >
> > 1) It's just the comment that's inaccurate, and it should really talk
> >    about both seqscans and sample scans. It should not be necessary to
> >    lock the whole relation, but I'm not sure the code otherwise takes
> >    enough care.
> >
> > 2) We should really not predicate lock the entire relation. In which
> >    case I think there might be missing PredicateLockTuple/Page calls.
> 
> Yeah, we could probably predicate-lock pages in
> heapam_scan_sample_next_block() and tuples in
> heapam_scan_sample_next_tuple(), instead of doing this.  Seems like a
> reasonable improvement for 13.  But... hmm....  There *might* be a
> theoretical argument about TABLESAMPLE(100) behaving differently if
> done per page rather than if done at relation level, wrt new pages
> added to the end later and therefore missed.  And then by logical
> extension, TABLESAMPLE of any percentage.  I'm not sure.

I don't think that's actually a problem, tablesample doesn't return
invisible rows. And the equivalent issue is true just as well for index
and bitmap heap scans?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Segfault on ANALYZE in SERIALIZABLE isolation
Next
From: David Rowley
Date:
Subject: Re: Statistical aggregate functions are not working with PARTIAL aggregation