Thread: remove_useless_groupby_columns does not need to record constraint dependencies

Hi,

Over in [1], Tom and I had a discussion in response to some confusion
about why remove_useless_groupby_columns() goes to the trouble of
recording a dependency on the PRIMARY KEY constraint when removing
surplus columns from the GROUP BY clause.

The outcome was that we don't need to do this since
remove_useless_groupby_columns() is used only as a plan-time
optimisation, we don't need to record any dependency.  Unlike
check_functional_grouping(), where we must record the dependency as we
may end up with a VIEW with columns, e.g, in the select list which are
functionally dependant on a pkey constraint. In that case, we must
ensure the view is also removed, or that the constraint removal is
blocked. There's no such requirement for planner smarts, such as the
one in remove_useless_groupby_columns() as in that case we'll trigger
a relcache invalidation during ALTER TABLE DROP CONSTRAINT, which
cached plans will notice when they obtain their locks just before
execution begins.

To prevent future confusion, I'd like to remove dependency recording
code from remove_useless_groupby_columns() and update the misleading
comment. Likely this should also be backpatched to 9.6.

Does anyone think differently?

A patch to do this is attached.

[1] https://www.postgresql.org/message-id/CAApHDvr4OW_OUd_Rxp0d1hRgz+a4mm8+8uR7QoM2VqKFX08SqA@mail.gmail.com

Attachment
David Rowley <dgrowleyml@gmail.com> writes:
> Over in [1], Tom and I had a discussion in response to some confusion
> about why remove_useless_groupby_columns() goes to the trouble of
> recording a dependency on the PRIMARY KEY constraint when removing
> surplus columns from the GROUP BY clause.

> The outcome was that we don't need to do this since
> remove_useless_groupby_columns() is used only as a plan-time
> optimisation, we don't need to record any dependency.

Right.  I think it would be good for the comments to emphasize that
a relcache inval will be forced if the *index* underlying the pkey
constraint is dropped; the code doesn't care so much about the constraint
as such.  (This is also why it'd be safe to use a plain unique index
for the same optimization, assuming you can independently verify
non-nullness of the columns.  Maybe we should trash the existing coding
and just have it look for unique indexes + attnotnull flags.)

> To prevent future confusion, I'd like to remove dependency recording
> code from remove_useless_groupby_columns() and update the misleading
> comment. Likely this should also be backpatched to 9.6.

+1 for removing the dependency and improving the comments in HEAD.
Minus quite a lot for back-patching: this is not a bug fix, and
there's a nonzero risk that we've overlooked something.  I'd rather
find that out in beta testing than from bug reports against stable
branches.

            regards, tom lane



On Thu, 16 Apr 2020 at 03:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > Over in [1], Tom and I had a discussion in response to some confusion
> > about why remove_useless_groupby_columns() goes to the trouble of
> > recording a dependency on the PRIMARY KEY constraint when removing
> > surplus columns from the GROUP BY clause.
>
> > The outcome was that we don't need to do this since
> > remove_useless_groupby_columns() is used only as a plan-time
> > optimisation, we don't need to record any dependency.
>
> Right.  I think it would be good for the comments to emphasize that
> a relcache inval will be forced if the *index* underlying the pkey
> constraint is dropped; the code doesn't care so much about the constraint
> as such.  (This is also why it'd be safe to use a plain unique index
> for the same optimization, assuming you can independently verify
> non-nullness of the columns.

I've reworded the comment in the attached version.

> Maybe we should trash the existing coding
> and just have it look for unique indexes + attnotnull flags.)

I'd like to, but the timing seems off. Perhaps after we branch for PG14.

> > To prevent future confusion, I'd like to remove dependency recording
> > code from remove_useless_groupby_columns() and update the misleading
> > comment. Likely this should also be backpatched to 9.6.
>
> +1 for removing the dependency and improving the comments in HEAD.
> Minus quite a lot for back-patching: this is not a bug fix, and
> there's a nonzero risk that we've overlooked something.  I'd rather
> find that out in beta testing than from bug reports against stable
> branches.

That seems fair.

David

Attachment
David Rowley <dgrowleyml@gmail.com> writes:
> I've reworded the comment in the attached version.

LGTM.

            regards, tom lane



On Fri, 17 Apr 2020 at 02:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > I've reworded the comment in the attached version.
>
> LGTM.

Thanks for reviewing. Pushed.