Re: Tid scan improvements - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Tid scan improvements
Date
Msg-id 20190719182105.hbrjtw7fykfywblr@alap3.anarazel.de
Whole thread Raw
In response to Re: Tid scan improvements  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Tid scan improvements  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 2019-07-19 13:54:59 +1200, David Rowley wrote:
> On Thu, 18 Jul 2019 at 14:30, Andres Freund <andres@anarazel.de> wrote:
> > I think the AM part of this patch might be the wrong approach - it won't
> > do anything meaningful for an AM that doesn't directly map the ctid to a
> > specific location in a block (e.g. zedstore).  To me it seems the
> > callback ought to be to get a range of tids, and the tidrange scan
> > shouldn't do anything but determine the range of tids the AM should
> > return.
> 
> Sounds like that's going to require adding some new fields to
> HeapScanDescData, then some callback similar to heap_setscanlimits to
> set those fields.
> 
> Then, we'd either need to:
> 
> 1. Make the table_scan_getnextslot() implementations check the tuple
> falls within the range, or
> 2. add another callback that pays attention to the set TID range.

> The problem with #1 is that would add overhead to normal seqscans,
> which seems like a bad idea.
> 
> Did you imagined two additional callbacks, 1 to set the TID range,
> then one to scan it?  Duplicating the logic in heapgettup_pagemode()
> and heapgettup() looks pretty horrible, but I guess we could add a
> wrapper around it that loops until it gets the first tuple and bails
> once it scans beyond the final tuple.
> 
> Is that what you had in mind?

Yea, I was thinking of something like 2.  We already have a few extra
types of scan nodes (bitmap heap and sample scan), it'd not be bad to
add another. And as you say, they can just share most of the guts: For
heap I'd just implement pagemode, and perhaps split heapgettup_pagemode
into two parts (one to do the page processing, the other to determine
the relevant page).

You say that we'd need new fields in HeapScanDescData - not so sure
about that, it seems feasible to just provide the boundaries in the
call? But I think it'd also just be fine to have the additional fields.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Broken defenses against dropping a partitioning column
Next
From: Tom Lane
Date:
Subject: Re: [sqlsmith] Crash in mcv_get_match_bitmap