Re: Index Skip Scan - Mailing list pgsql-hackers

From Jesper Pedersen
Subject Re: Index Skip Scan
Date
Msg-id dfbf52b0-17b3-b216-db01-e96cf82132cf@redhat.com
Whole thread Raw
In response to Re: Index Skip Scan  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
Hi,

On 1/31/19 1:31 AM, Kyotaro HORIGUCHI wrote:
> At Wed, 30 Jan 2019 18:19:05 +0100, Dmitry Dolgov <9erthalion6@gmail.com> wrote in
<CA+q6zcVP18wYiO=aa+fz3GuncuTF52q1sufB7ise37TJPSDK1w@mail.gmail.com>
>> A bit of adjustment after nodes/relation -> nodes/pathnodes.
> 
> I had a look on this.
> 
> The name "index skip scan" is a different feature from the
> feature with the name on other prodcuts, which means "index scan
> with postfix key (of mainly of multi column key) that scans
> ignoring the prefixing part" As Thomas suggested I'd suggest that
> we call it "index hop scan". (I can accept Hopscotch, either:p)
> 
> Also as mentioned upthread by Peter Geoghegan, this could easly
> give worse plan by underestimation. So I also suggest that this
> has dynamic fallback function.  In such perspective it is not
> suitable for AM API level feature.
> 
> If all leaf pages are on the buffer and the average hopping
> distance is less than (expectedly) 4 pages (the average height of
> the tree), the skip scan will lose. If almost all leaf pages are
> staying on disk, we could win only by 2-pages step (skipping over
> one page).
> 
> =====
> As I'm writing the above, I came to think that it's better
> implement this as an pure executor optimization.
> 
> Specifically, let _bt_steppage count the ratio of skipped pages
> so far then if the ratio exceeds some threshold (maybe around
> 3/4) it gets into hopscotching mode, where it uses index scan to
> find the next page (rather than traversing). As mentioned above,
> I think using skip scan to go beyond the next page is a good
> bet. If the success ration of skip scans goes below some
> threshold (I'm not sure for now), we should fall back to
> traversing.
> 
> Any opinions?
> 
> ====
> 
> Some comments on the patch below.
> 
> +     skip scan approach, which is based on the idea of
> +     <ulink url="https://wiki.postgresql.org/wiki/Free_Space_Map_Problems">
> +     Loose index scan</ulink>. Rather than scanning all equal values of a key,
> +     as soon as a new value is found, it will search for a larger value on the
> 
> I'm not sure it is a good thing to put a pointer to rather
> unstable source in the documentation.
> 
> 
> This adds a new AM method but it seems avaiable only for ordered
> indexes, specifically btree. And it seems that the feature can be
> implemented in btgettuple since btskip apparently does the same
> thing. (I agree to Robert in the point in [1]).
> 
> [1] https://www.postgresql.org/message-id/CA%2BTgmobb3uN0xDqTRu7f7WdjGRAXpSFxeAQnvNr%3DOK5_kC_SSg%40mail.gmail.com
> 
> 
> Related to the above, it seems better that the path generation of
> skip scan is a part of index scan. Whether skip scan or not is a
> matter of index scan itself.
> 

Thanks for your valuable feedback ! And, calling it "Loose index scan" 
or something else is better.

Dmitry and I will look at this and take it into account for the next 
version.

For now, I have switched the CF entry to WoA.

Thanks again !

Best regards,
  Jesper


pgsql-hackers by date:

Previous
From: Jesper Pedersen
Date:
Subject: Re: pg_upgrade: Pass -j down to vacuumdb
Next
From: Robert Haas
Date:
Subject: Re: ATTACH/DETACH PARTITION CONCURRENTLY