Re: Windowing Function Patch Review -> Standard Conformance - Mailing list pgsql-hackers

From David Rowley
Subject Re: Windowing Function Patch Review -> Standard Conformance
Date
Msg-id 9BB7F4690A644063B48FBD57963652EA@amd64
Whole thread Raw
In response to Re: Windowing Function Patch Review -> Standard Conformance  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
> -----Original Message-----
> From: Heikki Linnakangas [mailto:heikki.linnakangas@enterprisedb.com]
> Sent: 26 November 2008 09:09
> To: Hitoshi Harada
> Cc: David Rowley; pgsql-hackers@postgresql.org
> Subject: Re: Windowing Function Patch Review -> Standard Conformance
>
> Hitoshi Harada wrote:
> > 2008/11/26 David Rowley <dgrowley@gmail.com>:
> >> I'm at a bit of a loss to what to do now. Should I wait for your work
> >> Heikki? Or continue validating this patch?
> >>
> >> The best thing I can think to do right now is continue and any problems
> I
> >> find you can add regression tests for, then if we keep your regression
> tests
> >> for Heikki's changes then we can validate those changes more quickly.
> >>
> >> Any thoughts? Better ideas?
> >
> > Thanks to your great tests, we now know much more about specification
> > and where to fail easily, so continuing makes sense but it may be good
> > time to take a rest and wait for Heikki's patch completing.
>
> Here's another updated patch, including all your bug fixes.
>
> There's two known issues:
> - ranking functions still don't treat peer rows correctly.
>
> - I commented out the "this function requires ORDER BY clause in the
> window" test in rank_up, because a window function shouldn't be poking
> into the WindowState struct like that. I wonder if it's really needed?
> In section 7.11, the SQL2008 spec says "if WD has no window ordering
> clause, then the window ordering is implementation-dependent, and *all
> rows are peers*". The regression test now fails because of this, but the
> current behavior actually seems correct to me.
>

Thanks for the updated patch. I've been doing a little testing over the
weekend.

I'm running into a small problem with passing results from normal aggregate
functions into the window function. Hitoshi and I saw this before:

SELECT depname,SUM(SUM(salary)) OVER (ORDER BY depname) FROM empsalary GROUP
BY depname;
ERROR:  variable not found in subplan target list

Hitoshi fixed this in his 2008-11-12 patch, but it seems to have found its
way back in.


I was also reading over the standard tonight. I've discovered that the
OFFSET in LEAD() and LAG() is optional. It should default to 1 if it is not
present. Oracle seems to support this.

SQL2008 says:
> If <lead or lag function> is specified, then:
> i) Let VE1 be <lead or lag extent> and let DT be the declared type of VE1.
> ii) Case:
> Scalar expressions 209
> WD 9075-2:200w(E)
> 6.10 <window function>
> If <offset> is specified, then let OFF be <offset>. The declared type of
>OFF shall be an
> exact numeric type with scale 0 (zero).
> 1)
> 2) Otherwise, let OFF be 1 (one).

Yet another variant of LEAD() and LAG() but I think well worth it for both
compliance to the standard and compatibility to Oracle.


I've also been thinking about the on-demand buffering for the window
functions that you talked about. In some of my previous performance tests I
noticed that LEAD with a LIMIT clause is not optimal as it will store all
the rows in the tuplestore before applying the LIMIT:

select timestamp,lead(timestamp,1) over (order by id) from bigtable order by
id limit 1;

Is there any way for the on-demand buffering to make the above query faster?
Really only the first 2 rows ordered by id need to be read.

I had previously thought this would be too difficult to implement as the
OFFSET can be variable, but maybe it's possible on-demand. Yet I'd imagine
it's difficult to know when to allow rows to exit the tuplestore as a
LAG(col, someVariableValue) might need those rows later.

My next task is to read more of the standard just in case there is anything
else we should know.

David.



> --
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: "David E. Wheeler"
Date:
Subject: Re: WIP: default values for function parameters
Next
From: "Pavel Stehule"
Date:
Subject: Re: WIP: default values for function parameters