Re: Index Skip Scan - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Index Skip Scan
Date
Msg-id 20190725.202950.257050501.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Index Skip Scan  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Index Skip Scan  (Dmitry Dolgov <9erthalion6@gmail.com>)
List pgsql-hackers
Sorry, there's a too-hard-to-read part.

At Thu, 25 Jul 2019 20:17:37 +0900 (Tokyo Standard Time), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
<20190725.201737.192223037.horikyota.ntt@gmail.com>
> Hello.
> 
> At Wed, 24 Jul 2019 22:49:32 +0200, Dmitry Dolgov <9erthalion6@gmail.com> wrote in
<CA+q6zcXgwDMiowOGbr7gimTY3NV-LbcwP=rbma_L56pc+9p1Xw@mail.gmail.com>
> > > On Mon, Jul 22, 2019 at 7:10 PM Jesper Pedersen <jesper.pedersen@redhat.com> wrote:
> > >
> > > On 7/22/19 1:44 AM, David Rowley wrote:
> > > > Here are the comments I noted down during the review:
> > > >
> > > > cost_index:
> > > >
> > > > I know you've not finished here, but I think it'll need to adjust
> > > > tuples_fetched somehow to account for estimate_num_groups() on the
> > > > Path's unique keys. Any Eclass with an ec_has_const = true does not
> > > > need to be part of the estimate there as there can only be at most one
> > > > value for these.
> > > >
> > > > For example, in a query such as:
> > > >
> > > > SELECT x,y FROM t WHERE x = 1 GROUP BY x,y;
> > > >
> > > > you only need to perform estimate_num_groups() on "y".
> > > >
> > > > I'm really not quite sure on what exactly will be required from
> > > > amcostestimate() here.  The cost of the skip scan is not the same as
> > > > the normal scan. So other that API needs adjusted to allow the caller
> > > > to mention that we want skip scans estimated, or there needs to be
> > > > another callback.
> > > >
> > >
> > > I think this part will become more clear once the index skip scan patch
> > > is rebased, as we got the uniquekeys field on the Path, and the
> > > indexskipprefixy info on the IndexPath node.
> > 
> > Here is what I came up with to address the problems, mentioned above in this
> > thread. It passes tests, but I haven't tested it yet more thoughtful (e.g. it
> > occurred to me, that `_bt_read_closest` probably wouldn't work, if a next key,
> > that passes an index condition is few pages away - I'll try to tackle it soon).
> > Just another small step forward, but I hope it's enough to rebase on top of it
> > planner changes.
> > 
> > Also I've added few tags, mostly to mention reviewers contribution.
> 
> I have some comments.
> 
> +           * The order of columns in the index should be the same, as for
> +           * unique distincs pathkeys, otherwise we cannot use _bt_search
> +           * in the skip implementation - this can lead to a missing
> +           * records.
> 
> It seems that it is enough that distinct pathkeys is contained in
> index pathkeys. If it's right, that is almost checked in existing
> code:
> 
> > if (pathkeys_contained_in(needed_pathkeys, path->pathkeys))
> 
> It is perfect when needed_pathkeys is distinct_pathkeys.  So
> additional check is required if and only if it is not the case.

So I think the following change will work.

- +        /* Consider index skip scan as well */
- +        if (enable_indexskipscan &&
- +          IsA(path, IndexPath) &&
- +          ((IndexPath *) path)->indexinfo->amcanskip &&
- +          (path->pathtype == T_IndexOnlyScan ||
- +           path->pathtype == T_IndexScan) &&
- +          root->distinct_pathkeys != NIL)

+ +        if (enable_indexskipscan &&
+ +          IsA(path, IndexPath) &&
+ +          ((IndexPath *) path)->indexskipprefix >= 0 &&
+ +          (needed_pathkeys == root->distinct_pathkeys ||
+ +           pathkeys_contained_in(root->distinct_pathkeys,
+ +                       path->pathkeys)))

Additional comments on the condition above are:

> path->pathtype is always one of T_IndexOnlyScan or T_IndexScan so
> the check against them isn't needed. If you have concern on that,
> we can add that as Assert().
> 
> I feel uncomfortable to look into indexinfo there. Couldnd't we
> use indexskipprefix == -1 to signal !amcanskip from
> create_index_path?
> 
> 
> +          /*
> +           * XXX: In case of index scan quals evaluation happens after
> +           * ExecScanFetch, which means skip results could be fitered out
> +           */
> 
> Why can't we use skipscan path if having filter condition?  If
> something bad happens, the reason must be written here instead of
> what we do.
> 
> 
> +          if (path->pathtype == T_IndexScan &&
> +            parse->jointree != NULL &&
> +            parse->jointree->quals != NULL &&
> +            ((List *)parse->jointree->quals)->length != 0)
> 
> It's better to use list_length instead of peeping inside. It
> handles the NULL case as well. (The structure has recently
> changed but .length is not, though.)
> 
> 
> +     * If advancing direction is different from index direction, we must
> +     * skip right away, but _bt_skip requires a starting point.
> 
> It doesn't seem needed to me. Could you elaborate on the reason
> for that?
> 
> 
> +     * If advancing direction is different from index direction, we must
> +     * skip right away, but _bt_skip requires a starting point.
> +     */
> +    if (direction * indexonlyscan->indexorderdir < 0 &&
> +      !node->ioss_FirstTupleEmitted)
> 
> I'm confused by this. "direction" there is the physical scan
> direction (fwd/bwd) of index scan, which is already compensated
> by indexorderdir. Thus the condition means we do that when
> logical ordering (ASC/DESC) is DESC. (Though I'm not sure what
> "index direction" exactly means...)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs
Next
From: vignesh C
Date:
Subject: Re: Initdb failure