Re: Index Skip Scan - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: Index Skip Scan
Date
Msg-id 20200210193056.l4slz4bspqv44gqu@localhost
Whole thread Raw
In response to Re: Index Skip Scan  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
> On Sat, Feb 08, 2020 at 01:31:02PM -0500, James Coleman wrote:
> On Sat, Feb 8, 2020 at 10:24 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> >
> > > On Sat, Feb 08, 2020 at 03:22:17PM +0100, Tomas Vondra wrote:
> > > So in this case the skip scan is ~15x slower than the usual plan (index
> > > only scan + unique). The reason why this happens is pretty simple - to
> > > estimate the number of groups we multiply the ndistinct estimates for
> > > the two columns (which both have n_distinct = 10000), but then we cap
> > > the estimate to 10% of the table. But when the columns are independent
> > > with high cardinalities that under-estimates the actual value, making
> > > the cost for skip scan much lower than it should be.
> >
> > The current implementation checks if we can find the next value on the
> > same page to do a shortcut instead of tree traversal and improve such
> > kind of situations, but I can easily imagine that it's still not enough
> > in some extreme situations.
>
> This is almost certainly rehashing already covered ground, but since I
> doubt it's been discussed recently, would you be able to summarize
> that choice (not to always get the next tuple by scanning from the top
> of the tree again) and the performance/complexity tradeoffs?

Yeah, this part of discussion happened already some time ago. The idea
[1] is to protect ourselves at least partially from incorrect ndistinct
estimations. Simply doing jumping over an index means that even if the
next key we're searching for is on the same page as previous, we still
end up doing a search from the root of the tree, which is of course less
efficient than just check right on the page before jumping further.

Performance tradeoff in this case is simple, we make regular use case
slightly slower, but can perform better in the worst case scenarios.
Complexity tradeoff was never discussed, but I guess everyone assumed
it's relatively straightforward to check the current page and return if
something was found before jumping.

[1]: https://www.postgresql.org/message-id/CA%2BTgmoY7QTHhzLWZupNSyyqFRBfMgYocg3R-6g%3DDRgT4-KBGqg%40mail.gmail.com



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: POC: GUC option for skipping shared buffers in core dumps
Next
From: Andres Freund
Date:
Subject: Re: POC: GUC option for skipping shared buffers in core dumps