Re: [PATCH] Keeps tracking the uniqueness with UniqueKey - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date
Msg-id 20200606091751.y6oxesi45xlffz5k@localhost
Whole thread Raw
In response to Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
List pgsql-hackers
> On Fri, Jun 05, 2020 at 12:26:15PM +1200, David Rowley wrote:
> On Mon, 25 May 2020 at 19:14, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> >
> > > On Mon, May 25, 2020 at 06:34:30AM +1200, David Rowley wrote:
> > > The difference will be that you'd be setting some distinct_uniquekeys
> > > in standard_qp_callback() to explicitly request that some skip scan
> > > paths be created for the uniquekeys, whereas the patch here just does
> > > not bother doing DISTINCT if the upper relation already has unique
> > > keys that state that the DISTINCT is not required. The skip scans
> > > patch should check if the RelOptInfo for the uniquekeys set in
> > > standard_qp_callback() are already mentioned in the RelOptInfo's
> > > uniquekeys.  If they are then there's no point in skip scanning as the
> > > rel is already unique for the distinct_uniquekeys.
> >
> > It sounds like it makes semantics of UniqueKey a bit more confusing,
> > isn't it? At the moment it says:
> >
> >      Represents the unique properties held by a RelOptInfo.
> >
> > With the proposed changes it would be "unique properties, that are held"
> > and "unique properties, that are requested", which are partially
> > duplicated, but stored in some different fields. From the skip scan
> > patch perspective it's probably doesn't make any difference, seems like
> > the implementation would be almost the same, just created UniqueKeys
> > would be of different type. But I'm afraid potentiall future users of
> > UniqueKeys could be easily confused.
>
> If there's some comment that says UniqueKeys are for RelOptInfos, then
> perhaps that comment just needs to be expanded to mention the Path
> uniqueness when we add the uniquekeys field to Path.

My concerns are more about having two different sets of distinct
uniquekeys:

* one prepared in standard_qp_callback for skip scan (I guess those
  should be added to PlannerInfo?)

* one in create_distinct_paths as per current implementation

with what seems to be similar content.

> I think the main point of basing skip scans on top of this uniquekeys
> patch is to ensure it's the right thing for the job. I don't think
> it's realistic to be maintaining two different sets of infrastructure
> which serve a very similar purpose. It's important we make UniqueKeys
> general purpose enough to support future useful forms of optimisation.
> Basing skip scans on it seems like a good exercise towards that. I'm
> not expecting that we need to make zero changes here to allow it to
> work well with skip scans.

Sure, no one suggests to have two ways of saying "this thing is unique".
I'm just trying to figure out how to make skip scan and uniquekeys play
together without having rough edges.



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Vacuuming the operating system documentation
Next
From: Justin Pryzby
Date:
Subject: Re: how to create index concurrently on paritioned table