Re: Index Skip Scan - Mailing list pgsql-hackers
From | Dmitry Dolgov |
---|---|
Subject | Re: Index Skip Scan |
Date | |
Msg-id | CA+q6zcV5XmtTFRVKdkK6nhd4Xf4D-SzzLHjvkkXFdnQJBV-pNw@mail.gmail.com Whole thread Raw |
In response to | Re: Index Skip Scan (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Index Skip Scan
(Sergei Kornilov <sk@zsrv.org>)
|
List | pgsql-hackers |
> On Fri, 12 Oct 2018 at 19:44, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Sep 13, 2018 at 11:40 AM Jesper Pedersen > <jesper.pedersen@redhat.com> wrote: > > > I noticed that current implementation doesn't > > > perform well when we have lots of small groups of equal values. Here is > > > the execution time of index skip scan vs unique over index scan, in ms, > > > depending on the size of group. The benchmark script is attached. > > > > > > group size skip unique > > > 1 2,293.85 132.55 > > > 5 464.40 106.59 > > > 10 239.61 102.02 > > > 50 56.59 98.74 > > > 100 32.56 103.04 > > > 500 6.08 97.09 > > > > > > > Yes, this doesn't look good. Using your test case I'm seeing that unique > > is being chosen when the group size is below 34, and skip above. > > I'm not sure exactly how the current patch is approaching the problem, > but it seems like it might be reasonable to do something like -- look > for a distinct item within the current page; if not found, then search > from the root of the tree for the next item > the current item. I'm not sure that I understand it correctly, can you elaborate please? From what I see it's quite similar to what's already implemented - we look for a distinct item on the page, and then search the index tree for a next distinct item. > On Wed, 10 Oct 2018 at 17:34, Dmitry Dolgov <9erthalion6@gmail.com> wrote: > > On Tue, 9 Oct 2018 at 18:13, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > > >> It looks like good idea, but then the node should be named "index scan" and > >> other info can be displayed in detail parts. It can be similar like "sort". > >> The combination of unique and index skip scan looks strange :) > > > > maybe we don't need special index skip scan node - maybe possibility to > > return unique values from index scan node can be good enough - some like > > "distinct index scan" - and the implementation there can be dynamic -skip > > scan, classic index scan, > > > > "index skip scan" is not good name if the implementaion is dynamic. > > Yeah, that's a valid point. The good part is that index skip scan is not really > a separate node, but just enhanced index only scan node. So indeed maybe it > would be better to call it Index Only Scan, but show in details that we apply > the skip scan strategy. Any other opinions about this? To make it more clean what I mean, see attached version of the patch. > >> I think it was query like > >> select count(*) from (select distinct x from tab) s > > Thanks, I'll take a look. I couldn't reproduce it either yet.
Attachment
pgsql-hackers by date: