RE: Index Skip Scan (new UniqueKeys) - Mailing list pgsql-hackers

From Floris Van Nee
Subject RE: Index Skip Scan (new UniqueKeys)
Date
Msg-id ef9954d83a9e42fabfac235bdd87d05a@opammb0561.comp.optiver.com
In response to Re: Index Skip Scan (new UniqueKeys)  (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses RE: Index Skip Scan (new UniqueKeys)
Re: Index Skip Scan (new UniqueKeys)
List pgsql-hackers
>
> Good point, thanks for looking at this. With the latest planner version there
> are indeed more possibilities to use skipping. It never occurred to me that
> some of those paths will still rely on the index scan returning the full data set. I'll look
> at this in detail and add verification to prevent putting something like this on top of
> skip scan in the next version.

I believe the required changes are something like those in the attached patch. There are a few things I've changed:
- build_uniquekeys was constructing the list incorrectly. For a DISTINCT a,b, it would create two unique keys, one with
a and one with b. However, it should be one unique key with (a,b).
- the uniquekeys list that is built still contains some redundant keys that are normally eliminated from the pathkeys
lists.
- the distinct_pathkeys may be NULL even though there's a possibility for skipping, but the uniquekeys would not be
created in this case. This makes the planner not choose skip scans even though it could, for example in queries like
SELECT DISTINCT ON (a) * FROM t1 WHERE a=1 ORDER BY a,b; Since a is constant, it's eliminated from the regular pathkeys.
- to combat the issues mentioned earlier, there's now a check in build_index_paths that checks whether the query_pathkeys
match the useful_pathkeys. Note that we have to use the pathkeys here rather than any of the unique keys. The unique
keys are only Expr nodes - they do not contain the necessary information about ordering. Due to the elimination of some
constant pathkeys, we have to search the attributes of the index to find the correct prefix to use in skipping.
- creating the skip scan path did not actually fill the Path's unique keys. It should just set this to
query_uniquekeys.
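
To illustrate the first point, here is a toy sketch of why two separate keys (a) and (b) claim more than a single key (a,b). This is not PostgreSQL code: Row, ToyUniqueKey, rows_unique_on and the sample data are hypothetical names made up for this example, and the real UniqueKey structure in the patch looks different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy row with two columns; stands in for a table with columns a and b. */
typedef struct { int a; int b; } Row;

/* Toy "unique key" (hypothetical): the columns that together identify a row. */
typedef struct { bool use_a; bool use_b; } ToyUniqueKey;

const ToyUniqueKey KEY_AB = { true,  true  };  /* one key over (a,b) */
const ToyUniqueKey KEY_A  = { true,  false };  /* separate key over a */
const ToyUniqueKey KEY_B  = { false, true  };  /* separate key over b */

/* True if the two rows agree on every column that is part of the key. */
bool same_on_key(const Row *x, const Row *y, ToyUniqueKey k)
{
    if (k.use_a && x->a != y->a) return false;
    if (k.use_b && x->b != y->b) return false;
    return true;
}

/* True if no two rows agree on all columns of the key. */
bool rows_unique_on(const Row *rows, size_t n, ToyUniqueKey k)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (same_on_key(&rows[i], &rows[j], k))
                return false;
    return true;
}

/* A possible result of SELECT DISTINCT a, b (made-up sample data). */
const Row distinct_ab[] = { {1, 1}, {1, 2}, {2, 1} };
const size_t distinct_ab_len = sizeof(distinct_ab) / sizeof(distinct_ab[0]);

/* The result is unique on (a,b), but not on a alone and not on b alone. */
void self_check(void)
{
    assert(rows_unique_on(distinct_ab, distinct_ab_len, KEY_AB));
    assert(!rows_unique_on(distinct_ab, distinct_ab_len, KEY_A));
    assert(!rows_unique_on(distinct_ab, distinct_ab_len, KEY_B));
}
```

So a list with two single-column keys would tell upper planner nodes that the output is unique on a and also unique on b, which DISTINCT a,b does not guarantee.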

I've attached the first two unique-keys patches (v9, 0001 and 0002), your patches rebased on v9 of unique keys
(0003-0006), and a diff patch (0007) that applies my suggested changes on top of them.

-Floris

