Re: [PATCH] Keeps tracking the uniqueness with UniqueKey - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date
Msg-id CAApHDvr4OW_OUd_Rxp0d1hRgz+a4mm8+8uR7QoM2VqKFX08SqA@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Andy Fan <zhihui.fan1213@gmail.com>)
Responses Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, 3 Apr 2020 at 15:18, Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> The updated patch should fixed all the issues.  See the comments below for more
> information.
>
> On Tue, Mar 31, 2020 at 9:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
>>
>> + * can only guarantee the uniqueness without considering the null values. This
>> + * field is necessary for remove_useless_join & reduce_unique_semijions since
>> + * these cases don't care about the null values.
>>
>> Why is the field which stores the nullability of the key required for
>> code that does not care about the nullability of the key?
>>
>> Also please check your spelling of the word "join"
>>
>
> Actually I didn't find the spell error for "join"..

It was in reduce_unique_semijions. That should be
reduce_unique_semijoins. I see you fixed it in the patch though.

> 3.  remove_useless_groupby_columns maintains the parse->constraintDeps
> when it depends on primary key,  but UniqueKey doesn't maintain such data.
> since we have translation layer which should protect us from the concurrency issue
> and isolation issue.  Do we need to do that as well in UniqueKey?

I'm pretty sure that code is pretty bogus in
remove_useless_groupby_columns(). It perhaps was just copied from
check_functional_grouping(), where it is required. Looks like the
(ahem) author of d4c3a156c got that wrong... :-(

The reason check_functional_grouping() needs it is for things like
creating a view with a GROUP BY clause that has a column in the SELECT
list that is functionally dependant on the GROUP BY columns. e.g:

create table z (a int primary key, b int);
create view view_z as select a,b from z group by a;
alter table z drop constraint z_pkey;
ERROR:  cannot drop constraint z_pkey on table z because other objects
depend on it
DETAIL:  view view_z depends on constraint z_pkey on table z
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

Here that view would become invalid if the PK was dropped, so we must
record the dependency in that case.  Doing so is what causes that
error message.

For just planner smarts such as LEFT JOIN removal, Unique Joins, and
all this Unique Key stuff, we really don't need to record the
dependency as if the index or constraint is dropped, then that'll
cause a relcache invalidation and we'll see the invalidation when we
attempt to execute the cached plan. That will cause the statement to
be re-planned and we'll not see the unique index when we do that.

We should probably just get rid of that code in
remove_useless_groupby_columns() to stop people getting confused about
that.



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: WAL usage calculation patch
Next
From: Yuri Astrakhan
Date:
Subject: Re: Yet another fast GiST build