Home > mailing lists

Re: Removing Functionally Dependent GROUP BY Columns - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: Removing Functionally Dependent GROUP BY Columns
Date	January 14, 2016 00:30:39
Msg-id	CAKJS1f_WgEuiazFEoXpTB_vTDZ_FHKrPxwZ0BiQdMwQ_4kmYmA@mail.gmail.com Whole thread
In response to	Re: Removing Functionally Dependent GROUP BY Columns (Julien Rouhaud <julien.rouhaud@dalibo.com>)
Responses	Re: Removing Functionally Dependent GROUP BY Columns
List	pgsql-hackers

Tree view

Many thanks for the thorough review!

On 12 January 2016 at 23:37, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:

In identify_key_vars()

+ /* Only PK constraints are of interest for now, see comment above */
+ if (con->contype != CONSTRAINT_PRIMARY)
+ continue;

I think the comment should better refer to
remove_useless_groupby_columns() (don't optimize further than what
check_functional_grouping() does).

I've improved this by explaining it more clearly.

+ /*
+ * Exit if the constraint is deferrable, there's no point in looking
+ * for another PK constriant, as there can only be one.
+ */

small typo on "constriant"

Fixed.

+ {
+ varlistmatches = bms_add_member(varlistmatches,
varlistidx);
+ found_col = true;
+ /* don't break, there might be dupicates */
+ }

small typo on "dupicates". Also, I thought transformGroupClauseExpr()
called in the parse analysis make sure that there isn't any duplicate in
the GROUP BY clause. Do I miss something?

No I missed something :) You are right. I've put a break; here and a comment to explain why.

in remove_useless_groupby_columns()

+ /*
+ * Skip if there were no Vars found in the GROUP BY clause which
belong
+ * to this relation. We can also skip if there's only 1 Var, as
there
+ * needs to be more than one Var for there to be any columns
which are
+ * functionally dependent on another column.
+ */

This comment doesn't sound very clear. I'm not a native english speaker,
so maybe it's just me.

Yes it was badly worded. I've fixed in the attached.

+ /*
+ * If we found any surplus Vars in the GROUP BY clause, then we'll build
+ * a new GROUP BY clause without these surplus Vars.
+ */
+ if (anysurplus)
+ {
+ List *new_groupby = NIL;
+
+ foreach (lc, root->parse->groupClause)
+ {
+ SortGroupClause *sgc = (SortGroupClause *) lfirst(lc);
+ TargetEntry *tle;
+ Var *var;
+
+ tle = get_sortgroupclause_tle(sgc, root->parse->targetList);
+ var = (Var *) tle->expr;
+
+ if (!IsA(var, Var))
+ continue;
+ [...]

if var isn't a Var, it needs to be added in new_groupby.

Yes you're right, well, at least I've written enough code to ensure that it's not needed.

Technically we could look inside non-Vars and check if the Expr is just made up of Vars from a single relation, and then we could also make that surplus if we find other Vars which make up the table's primary key. I didn't make these changes as I thought it was a less likely scenario. It wouldn't be too much extra code to add however. I've went and added an XXX comment to explain that there might be future optimisation opportunities in the future around this.

I've attached an updated patch.

David Rowley http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Training & Services

Attachment

prune_group_by_clause_90b5f3c_2016-01-14.patch

pgsql-hackers by date:

From: Tom Lane
Date: 14 January 2016, 00:05:35
Subject: Re: Truncation of identifiers

From: Gavin Flower
Date: 14 January 2016, 01:08:55
Subject: Re: Truncation of identifiers

Re: Removing Functionally Dependent GROUP BY Columns - Mailing list pgsql-hackers

Attachment

Previous

Next