Re: [PATCH] Keeps tracking the uniqueness with UniqueKey - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date
Msg-id CAApHDvqO543rifM8LMYBW9DgSJaiX52V2C2sPRzsc2T4qd2Ytw@mail.gmail.com
In response to Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Dmitry Dolgov <9erthalion6@gmail.com>)
Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Andy Fan <zhihui.fan1213@gmail.com>)
List pgsql-hackers
On Sun, 24 May 2020 at 04:14, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>
> > On Fri, May 22, 2020 at 08:40:17AM +1200, David Rowley wrote:
> > I imagine we'll set some required UniqueKeys during
> > standard_qp_callback()
>
> In standard_qp_callback, because pathkeys are computed at this point I
> guess?

Yes. In particular, we set the pathkeys for DISTINCT clauses there.

> > and then we'll try to create some Skip Scan
> > paths (which are marked with UniqueKeys) if the base relation does not
> > already have UniqueKeys that satisfy the required UniqueKeys that were
> > set during standard_qp_callback().
>
> For a simple distinct query those UniqueKeys would be set based on
> distinct clause. If I understand correctly, the very same is implemented
> right now in create_distinct_paths, just after building all index paths,
> so wouldn't it be just a duplication?

I think we need to create the skip scan paths when we create the other
paths for base relations. We shouldn't be adjusting existing index
paths during create_distinct_paths().  The last code I saw for the
skip scans patch did something like if (IsA(path, IndexScanPath)) in
create_distinct_paths(), but that's only ever going to work when the
query is on a single relation. You'll never see IndexScanPaths in the
upper planner's paths when there are joins. You'd see join-type paths
instead.  It is possible to make use of skip scans for DISTINCT when
the query has joins. We'd just need to ensure the join does not
duplicate the unique rows from the skip scanned relation.
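
To illustrate the point about joins, here is a minimal, self-contained
sketch. It does not use the real planner structures: ToyPath,
ToyPathType and both functions are hypothetical stand-ins invented for
this example. It only shows that a check on the type of the topmost
path stops matching as soon as a join sits above the index scan.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified path types; not PostgreSQL's real node tags. */
typedef enum ToyPathType
{
    TOY_INDEX_SCAN,
    TOY_SEQ_SCAN,
    TOY_NEST_LOOP
} ToyPathType;

/* Hypothetical path node: scans have no children, joins have two. */
typedef struct ToyPath
{
    ToyPathType type;
    const struct ToyPath *outer;
    const struct ToyPath *inner;
} ToyPath;

/* Mimics an "is the top path an index scan?" test in the upper planner. */
static bool
toy_top_is_index_scan(const ToyPath *path)
{
    return path != NULL && path->type == TOY_INDEX_SCAN;
}

/* Reports whether an index scan exists anywhere below the given path. */
static bool
toy_contains_index_scan(const ToyPath *path)
{
    if (path == NULL)
        return false;
    if (path->type == TOY_INDEX_SCAN)
        return true;
    return toy_contains_index_scan(path->outer) ||
           toy_contains_index_scan(path->inner);
}
```

For a nested loop over an index scan and a seq scan,
toy_top_is_index_scan() is false even though toy_contains_index_scan()
is true. The second function is there only to make that visible; it is
not a suggestion to walk the path tree in create_distinct_paths().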

> In general UniqueKeys in the skip scan patch were created from
> distinctClause in build_index_paths (looks similar to what you've
> described) and then based on them created index skip scan paths. So my
> expectations were that the patch from this thread would work similarly.

The difference will be that you'd be setting some distinct_uniquekeys
in standard_qp_callback() to explicitly request that some skip scan
paths be created for the uniquekeys, whereas the patch here just skips
the DISTINCT step if the upper relation already has unique keys that
make the DISTINCT unnecessary. The skip scans patch should check
whether the uniquekeys set in standard_qp_callback() are already
covered by the RelOptInfo's uniquekeys.  If they are, then there's no
point in skip scanning as the rel is already unique for the
distinct_uniquekeys.
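
A rough sketch of that check, under loud assumptions: ToyUniqueKey and
every toy_* function below are hypothetical and model a unique key as a
plain set of column numbers. The real UniqueKey representation in the
patch is different. The sketch also assumes that "already covered"
means some unique key of the rel uses only columns that appear in the
requested DISTINCT columns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a UniqueKey: a set of column numbers. */
typedef struct ToyUniqueKey
{
    const int *cols;
    size_t ncols;
} ToyUniqueKey;

/* True if 'col' is one of the columns in 'key'. */
static bool
toy_col_in_key(int col, const ToyUniqueKey *key)
{
    for (size_t i = 0; i < key->ncols; i++)
    {
        if (key->cols[i] == col)
            return true;
    }
    return false;
}

/* True if every column of 'sub' also appears in 'super'. */
static bool
toy_key_is_subset(const ToyUniqueKey *sub, const ToyUniqueKey *super)
{
    for (size_t i = 0; i < sub->ncols; i++)
    {
        if (!toy_col_in_key(sub->cols[i], super))
            return false;
    }
    return true;
}

/*
 * True if the rel is already unique for the 'required' columns, i.e. one
 * of its unique keys is a non-empty subset of them.  Keys with no columns
 * are ignored in this sketch.
 */
static bool
toy_rel_satisfies_required(const ToyUniqueKey *relkeys, size_t nrelkeys,
                           const ToyUniqueKey *required)
{
    assert(required != NULL);

    for (size_t i = 0; i < nrelkeys; i++)
    {
        if (relkeys[i].ncols > 0 &&
            toy_key_is_subset(&relkeys[i], required))
            return true;
    }
    return false;
}

/* Skip scan paths are only worth building when the check above fails. */
static bool
toy_should_consider_skip_scan(const ToyUniqueKey *relkeys, size_t nrelkeys,
                              const ToyUniqueKey *required)
{
    return !toy_rel_satisfies_required(relkeys, nrelkeys, required);
}
```

With a rel whose only unique key is column 1, a DISTINCT on columns
(1, 2) is already satisfied and no skip scan is considered, while a
DISTINCT on column 2 alone is not satisfied, so skip scan paths would
still be worth creating.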

David


