Re: Aggregate ORDER BY patch - Mailing list pgsql-hackers

From Hitoshi Harada
Subject Re: Aggregate ORDER BY patch
Date
Msg-id e08cc0400911150342v5ca55ddm843a2c8232d8abf5@mail.gmail.com
Whole thread Raw
In response to Aggregate ORDER BY patch  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: Aggregate ORDER BY patch
List pgsql-hackers
Here's my first review.

The patch was in context diff format and applied cleanly with a little
3 hunks in parse_expr.c. make succeeded without any warnings and make
check passed all 121 tests.

It implements as advertised, following SQL spec with extension of both
DISTINCT clause and ORDER BY clause are available in any aggregate
functions including user defined ones. It supports VIEWs by adding
code in ruleutils.c.

Questions here:
- agglevelsup?
We have aggregate capability that all arguments from upper level query
in downer level aggregate makes aggregate call itself to upper level
call, as a constant value in downer level. What if ORDER BY clause has
downer level Vars? For example:

regression=# select (select count(t1.four order by unique1) from tenk1
limit 1) from tenk1 t1 where unique1 < 10;?column?
----------   10000   10000   10000   10000   10000   10000   10000   10000   10000   10000
(10 rows)

regression=# select (select count(t1.four order by t1.unique1) from
tenk1 limit 1) from tenk1 t1 where unique1 < 10;?column?
----------      10
(1 row)

Is it sane? The result is consistent but surprised me a little. No
need to raise an error?

- order by 1?
Normal ORDER BY clause accepts constant integer as TargetEntry's
resno. The patch seems not to support it.

regression=# select array_agg(four order by 1) from tenk1 where unique1 < 10;      array_agg
-----------------------{0,2,1,2,1,0,1,3,3,0}
(1 row)

Shouldn't it be the same as normal ORDER BY?

Performance doesn't seem slowing down, though I don't have
quantitative test result.

Coding, almost all sane. Since its syntax and semantics are similar to
existing DISTINCT and ORDER BY features, parsing and transformation
code are derived from those area. The executor has few issues:

- #include in nodeAgg.c
executor/tuptable.h is added in the patch but required really?
I removed that line but still build without any warnings.

- process_ordered_aggregate_(single|multi)
It seems that the patch left process_sorted_aggregate() function as
process_ordered_aggregate_single() and added
process_ordered_aggregate_multi() for more than one input arguments
(actually target entries) case. Why have those two? Could we combine
them? Or I couldn't find convincing reason in comments.

And ParseFuncOrColumn() in parse_func.c now gets more complicated.
Since feature / semantics are similar, I bet we may share some code to
transform DISTINCT and ORDER BY with traditional code in
parse_clause.c, though I'm not sure nor don't have clear idea.
Especially, code around here
        save_next_resno = pstate->p_next_resno;        pstate->p_next_resno = attno + 1;

cheats pstate to transform clauses and I felt a bit fear.
- SortGroupClause.implicit
"implicit" member was added in SortGroupClause. I didn't find clear
reason to add this. Could you show a case to clarify this?

That's it for now.

Regards,



-- 
Hitoshi Harada


pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Hot standby, overflowed snapshots, testing
Next
From: Heikki Linnakangas
Date:
Subject: Re: Hot standby, race condition between recovery snapshot and commit