Re: Index Skip Scan - Mailing list pgsql-hackers

From James Coleman
Subject Re: Index Skip Scan
Date
Msg-id CAAaqYe_+q=iqDzd8cJsYpcG+4JtK1_zdv01EKtLJFZ4bcpmQHg@mail.gmail.com
Whole thread Raw
In response to Re: Index Skip Scan  (Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>)
List pgsql-hackers
> Is skip scan only possible for index-only scan?

I haven't see much discussion of this question yet. Is there a
particular reason to lock ourselves into thinking about this only in
an index only scan?

>> I think we can improve this,
>> and the skip scan can be strictly faster than index scan regardless of
>> the data. As a first approximation, imagine that we somehow skipped
>> equal tuples inside _bt_next instead of sending them to the parent
>> Unique node. This would already be marginally faster than Unique + Index
>> scan. A more practical implementation would be to remember our position
>> in tree (that is, BTStack returned by _bt_search) and use it to skip
>> pages in bulk. This looks straightforward to implement for a tree that
>> does not change, but I'm not sure how to make it work with concurrent
>> modifications. Still, this looks a worthwhile direction to me, because
>> if we have a strictly faster skip scan, we can just use it always and
>> not worry about our unreliable statistics. What do you think?
>>
>
> This is something to look at -- maybe there is a way to use
> btpo_next/btpo_prev instead/too in order to speed things up. Atm we just
> have the scan key in BTScanOpaqueData. I'll take a look after my
> upcoming vacation; feel free to contribute those changes in the meantime
> of course.

It seems to me also that the logic necessary for this kind of
traversal has other useful applications. For example, it should be
possible to build on that logic to allow and index like t(owner_fk,
created_at) to be used to execute the following query:

select *
from t
where owner_fk in (1,2,3)
order by created_at
limit 25

without needing to fetch all tuples satisfying "owner_fk in (1,2,3)"
and subsequently sorting them.


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] generated columns
Next
From: Alvaro Herrera
Date:
Subject: Re: New Defects reported by Coverity Scan for PostgreSQL