Home > mailing lists

Re: Index Skip Scan - Mailing list pgsql-hackers

From	James Coleman
Subject	Re: Index Skip Scan
Date	February 8, 2020 21:31:02
Msg-id	CAAaqYe8ATjgwJJnUe5O2KBy8HAAm1bPfYaGvaZrBh56-XP21AQ@mail.gmail.com Whole thread Raw
In response to	Re: Index Skip Scan (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses	Re: Index Skip Scan
List	pgsql-hackers

Tree view

On Sat, Feb 8, 2020 at 10:24 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>
> > On Sat, Feb 08, 2020 at 03:22:17PM +0100, Tomas Vondra wrote:
> > So in this case the skip scan is ~15x slower than the usual plan (index
> > only scan + unique). The reason why this happens is pretty simple - to
> > estimate the number of groups we multiply the ndistinct estimates for
> > the two columns (which both have n_distinct = 10000), but then we cap
> > the estimate to 10% of the table. But when the columns are independent
> > with high cardinalities that under-estimates the actual value, making
> > the cost for skip scan much lower than it should be.
>
> The current implementation checks if we can find the next value on the
> same page to do a shortcut instead of tree traversal and improve such
> kind of situations, but I can easily imagine that it's still not enough
> in some extreme situations.

This is almost certainly rehashing already covered ground, but since I
doubt it's been discussed recently, would you be able to summarize
that choice (not to always get the next tuple by scanning from the top
of the tree again) and the performance/complexity tradeoffs?

Thanks,
James

pgsql-hackers by date:

From: Tom Lane
Date: 08 February 2020, 20:34:53
Subject: Re: Marking some contrib modules as trusted extensions

From: Masahiko Sawada
Date: 09 February 2020, 04:11:30
Subject: Re: Internal key management system

Re: Index Skip Scan - Mailing list pgsql-hackers

Previous

Next