On 19 September 2018 at 18:04, Edmund Horner <ejrh00@gmail.com> wrote:
> I have been generally following this approach (handling more kinds of
> TID comparisons), and have found myself doing things like pairing up >
> with <, estimating how much of a table is covered by some set of >, <,
> or "> AND <" quals, etc. Things that I'm sure are handled in an
> advanced way by index paths; unfortunately I didn't see any easily
> reusable code in the index path code. So I've ended up writing
> special-case code for TID scans. Hopefully it will be worth it.

I don't think it would need to be as complex as the index matching
code.  Just looping over the quals and gathering up all compatible ctid
quals should be fine.  I imagine the more complex work, sorting the
quals by ctid and removing redundant quals that are already covered by
some range, would be done in the executor.
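
Roughly, for the planner side I'm picturing no more than a fragment
like the one below (assuming we're somewhere in the TID path generation
code with the RelOptInfo *rel in hand; IsCompatibleCtidQual is just a
made-up name for whatever test we settle on, not an existing function):

    List       *tidquals = NIL;
    ListCell   *lc;

    foreach(lc, rel->baserestrictinfo)
    {
        RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);

        /*
         * Collect any qual of the form ctid = ?, ctid < ?, ctid <= ?,
         * ctid > ? or ctid >= ? on this relation; everything else is
         * left for normal qual evaluation.
         */
        if (IsCompatibleCtidQual(rinfo, rel))
            tidquals = lappend(tidquals, rinfo);
    }

Sorting and de-duplicating whatever lands in that list would then be
the executor's problem.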

Probably the costing will get more complex.  At the moment it seems we
add a random_page_cost per ctid, but you'd probably need to make that
better: loop over the quals in each implicitly ANDed set, find the max
ctid among the > / >= quals and the min ctid among the < / <= quals,
take the page number from each, and charge seq_page_cost for each page
between the two, then add a random_page_cost for any remaining equality
quals.  The costs from the other OR branches can likely just be added
on.  This would double count if someone did WHERE ctid BETWEEN '(0,0)'
AND '(100,300)' OR ctid BETWEEN '(0,0)' AND '(100,300)', but the
current code seems to double count for duplicate ctids anyway.  It even
double counts when the ctid being compared to is on the same page as
another ctid, so I don't think that would be unacceptable.
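
To show the arithmetic I mean, here's a rough standalone sketch.  The
struct and helper are invented purely for illustration and aren't the
real costsize.c structures; seq_page_cost and random_page_cost just
mirror the GUC defaults:

    #include <stdbool.h>
    #include <stdio.h>

    /* Invented stand-in for one implicitly ANDed set of ctid quals. */
    typedef struct CtidQualSet
    {
        bool        has_range;      /* any > / >= and < / <= quals? */
        unsigned    lower_page;     /* page of max ctid among > / >= quals */
        unsigned    upper_page;     /* page of min ctid among < / <= quals */
        int         num_eq_quals;   /* remaining ctid = quals */
    } CtidQualSet;

    static const double seq_page_cost = 1.0;    /* default GUC values */
    static const double random_page_cost = 4.0;

    /*
     * Cost one ANDed set: pages spanned by the range at seq_page_cost
     * each, plus a random_page_cost per remaining equality qual.
     */
    static double
    cost_ctid_qual_set(const CtidQualSet *set)
    {
        double      cost = 0.0;

        if (set->has_range)
            cost += (set->upper_page - set->lower_page + 1) * seq_page_cost;
        cost += set->num_eq_quals * random_page_cost;

        return cost;
    }

    int
    main(void)
    {
        /* Two OR branches: a range over pages 0..100, plus one ctid = qual */
        CtidQualSet branches[] = {
            { true, 0, 100, 0 },
            { false, 0, 0, 1 }
        };
        double      total = 0.0;

        for (int i = 0; i < 2; i++)
            total += cost_ctid_qual_set(&branches[i]);  /* OR branches just add */

        printf("estimated cost: %g\n", total);
        return 0;
    }

The +1 on the page range is only there so a single-page range doesn't
cost zero; I'm not wedded to that detail.
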
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services