Re: [PATCH] Keeps tracking the uniqueness with UniqueKey - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date
Msg-id CAApHDvqH=xpaCqoUK8ZECT989RaY_ZMygYpS1qvMVNr29n=DwQ@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Andy Fan <zhihui.fan1213@gmail.com>)
Responses Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Andy Fan <zhihui.fan1213@gmail.com>)
List pgsql-hackers
On Wed, 15 Apr 2020 at 12:19, Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
>> 2. I think 0002 is overly restrictive in its demands that
>> parse->hasAggs must be false. We should be able to just use a Group
>> Aggregate with unsorted input when the input_rel is unique on the
>> GROUP BY clause.  This will save on hashing and sorting.  Basically
>> similar to what we do for when a query contains aggregates without any
>> GROUP BY.
>>
>
> Yes,  This will be a perfect result,  the difficult is the current aggregation function
> execution is highly coupled with Agg node(ExecInitAgg) which is removed in the
> unique case.

This case here would be slightly different. It would be handled by
still creating a Group Aggregate path, but just not consider Hash
Aggregate and not Sort the input to the Group Aggregate path. Perhaps
that's best done by creating a new flag bit and using it in
create_grouping_paths() in the location where we set the flags
variable.  If you determine that the input_rel is unique for the GROUP
BY clause, and that there are aggregate functions, then set a flag,
e.g GROUPING_INPUT_UNIQUE. Likely there will be a few other flags that
you can skip setting in that function, for example, there's no need to
check if the input can sort, so no need for GROUPING_CAN_USE_SORT,
since you won't need to sort, likewise for GROUPING_CAN_USE_HASH. I'd
say there also is no need for checking if we can set
GROUPING_CAN_PARTIAL_AGG (What would be the point in doing partial
aggregation when there's 1 row per group?)   Then down in
add_paths_to_grouping_rel(), just add a special case before doing any
other code, such as:

if ((extra->flags & GROUPING_INPUT_UNIQUE) != 0 && parse->groupClause != NIL)
{
Path    *path = input_rel->cheapest_total_path;

add_path(grouped_rel, (Path *)
create_agg_path(root,
grouped_rel,
path,
grouped_rel->reltarget,
AGG_SORTED,
AGGSPLIT_SIMPLE,
parse->groupClause,
havingQual,
agg_costs,
dNumGroups));
return;
}

You may also want to consider the cheapest startup path there too so
that the LIMIT processing can do something smarter later in planning
(assuming cheapest_total_path != cheapest_startup_path (which you'd
need to check for)).

Perhaps it would be better to only set the GROUPING_INPUT_UNIQUE if
there is a groupClause, then just Assert(parse->groupClause != NIL)
inside that if.

David



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Race condition in SyncRepGetSyncStandbysPriority
Next
From: Noah Misch
Date:
Subject: Re: Race condition in SyncRepGetSyncStandbysPriority