Re: [PATCH] distinct aggregates within a window function WIP - Mailing list pgsql-hackers

From Krasiyan Andreev
Subject Re: [PATCH] distinct aggregates within a window function WIP
Date
Msg-id CAN1PwonhxL191aRBbaLOD1aDS1vZOWFP70OjyQvUOskpcXa94g@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] distinct aggregates within a window function WIP  (Vik Fearing <vik.fearing@2ndquadrant.com>)
Responses Re: [PATCH] distinct aggregates within a window function WIP  (Surafel Temesgen <surafel3000@gmail.com>)
List pgsql-hackers
I have currently suspended development of this patch, based on it's review,
but I will continue development of the other Oliver Ford's work about adding support of respect/ignore nulls
for lag(),lead(),first_value(),last_value() and nth_value() and from first/last for nth_value() patch,
but I am not sure how to proceed with it's implementation and any feedback will be very helpful.

I have dropped support of from first/last for nth_value(), but also I reimplemented it in a different way,
by using negative number for the position argument, to be able to get the same frame in exact reverse order.
After that patch becomes much more simple and major concerns about precedence hack has gone,
but maybe it can be additionally simplified.

I have not renamed special bool type "ignorenulls", because it is probably not acceptable way for calling extra version
of window functions (but it makes things very easy and it can reuse frames), but I removed the other special bool type "fromlast".

Attached file is for PostgreSQL 13 (master git branch) and I will add it now to a March commit fest, to be able to track changes.
Everything works and patch is in very good shape, all tests are passed and also, I use it from some time for SQL analysis purposes
(because ignore nulls is one of the most needed feature in OLAP/BI area and Oracle, Amazon Redshift, even Informix have it).

After patch review and suggestions about what to do with special bool type and unreserved keywords, I will reimplement it, if needed.

На пн, 13.01.2020 г. в 18:19 Vik Fearing <vik.fearing@2ndquadrant.com> написа:
On 13/01/2020 15:19, Tom Lane wrote:
> Krasiyan Andreev <krasiyan@gmail.com> writes:
>> I want to propose to you an old patch for Postgres 11, off-site developed
>> by Oliver Ford,
>> but I have permission from him to publish it and to continue it's
>> development,
>> that allow distinct aggregates, like select sum(distinct nums) within a
>> window function.
> I started to respond by asking whether that's well-defined, but
> reading down further I see that that's not actually what the feature
> is: what it is is attaching DISTINCT to a window function itself.
> I'd still ask whether it's well-defined though, or even minimally
> sensible.  Window functions are generally supposed to produce one
> row per input row --- how does that square with the implicit row
> merging of DISTINCT?  They're also typically row-order-sensitive
> --- how does that work with DISTINCT? 


It's a little strange because the spec says:


<q>
If the window ordering clause or the window framing clause of the window
structure descriptor that describes the <window name or specification>
is present, then no <aggregate function> simply contained in <window
function> shall specify DISTINCT or <ordered set function>.
</q>


So it seems to be well defined if all you have is a partition.


But then it also says:


<q>
DENSE_RANK() OVER WNS is equivalent to the <window function>:
    COUNT (DISTINCT ROW ( VE 1 , ..., VE N ) )
    OVER (WNS1 RANGE UNBOUNDED PRECEDING)
</q>


And that kind of looks like a framing clause there.


> Also, to the extent that
> this is sensible, can't you get the same results already today
> with appropriate use of window framing options?


I don't see how.


I have sometimes wanted this feature so I am +1 on us getting at least a
minimal form of it.

--

Vik

Attachment

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: bool_plperl transform
Next
From: Wao
Date:
Subject: Re[2]: bool_plperl transform