Re: WIP: Aggregation push-down - take2 - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: WIP: Aggregation push-down - take2
Date
Msg-id 72e599d0-7766-b8bb-936a-848b06404780@enterprisedb.com
Whole thread Raw
In response to Re: WIP: Aggregation push-down - take2  (Antonin Houska <ah@cybertec.at>)
Responses Re: WIP: Aggregation push-down - take2
List pgsql-hackers
Hi,

I did a quick initial review of the v20 patch series. I plan to do a
more thorough review over the next couple days, if time permits. In
general I think the patch is in pretty good shape.

I've added a bunch of comments in a number of places - see the "review
comments" parts for each of the original parts. That should make it
easier to deal with all the items. I'll go through the main stuff here:

1) I was somewhat confused why we even need RelInfoList, when it merely
wraps existing fields, but I guess it's because we need multiple such
pairs - one for joins, one for grouped rels. Correct?

2) While reading the README, I was somewhat confused because it seems to
suggest we have to push the aggregate only to baserel level, but then it
also talks about pushing to other places (above joins). There's a couple
other places in the README that confused me a bit, see the XXX comments.

In general, I think the README focuses on explaining the motivation,
i.e. why we want to do this, but it's somewhat light on how it's done.
The other parts talk about the implementation in more detail.

3) I tweaked a couple places in allpaths.c to make it more readable, but
I admit that's a somewhat subjective measure, so feel free to undo that.

4) setup_base_grouped_rels compares bitmaps before looking at
reloptkind, which seems to be cheaper so maybe the checks should happen
in the opposite order (not a huge difference, though)

5) add_grouped_path seems to be a bit confusing, because the name makes
it look like it does about the same stuff as add_path/add_partial_path,
when that's not quite true

6) 0002 failed to add enable_agg_pushdown to the sample file, which
leads to a failure in regression tests

7) when I change enable_agg_pushdown to true and run regression tests, I
get a bunch of failures like

   ERROR:  WindowFunc found where not expected

Seems we don't handle window functions correctly somewhere, or maybe
setup_aggregate_pushdown should check/reject hasWindowFuncs too?

8) create_ordinary_grouping_paths changes when set_cheapest() gets
called, but I can't quite convince myself the change is correct. How
come it's correct to check pathlist instead of partial_pathlist (as before).

9) I see create_agg_sorted_path is quite picky about the subpath
pathkeys, essentially requiring it to be a prefix of group_pathkeys.
Seems unnecessary, no? Even if we sort/group on different pathkeys, that
reduces the cardinality, and we may do sort later (or just finalize
using hashagg).

Furthermore, we generally try creating a sort with the proper ordering
in other places - why not here? I mean, if subpath has pathkeys=A and we
need [A,B], we could try adding suitable IncrementalSort, no? Or even
full Sort, or something. Or is that not beneficial here?

10) I don't understand why create_agg_hashed_path limits the hashtable
size to work_mem - shouldn't it do something like cost_agg to account
for spilling to disk?

11) There's an unnecessary/unrelated change in trigger.c.

12) I improved/reworded a couple comments where I initially was unsure
what exactly that means. Hopefully I got it right.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: proposal: possibility to read dumped table's name from file
Next
From: Thomas Munro
Date:
Subject: Re: Reducing power consumption on idle servers