Home > mailing lists

Thread: TODO items for window functions

TODO items for window functions

From

Tom Lane

Date:

28 December 2008, 15:26:08

The core window-functions patch is now committed and ready for wider
testing. However, there are a number of unfinished items, at least
some of which I'd like to see addressed before 8.4 release. In rough
order of importance:

* Support creation of user-defined window functions. I think this is
a "must have" for 8.4 --- we are not in the habit of building
nonextensible basic features. It doesn't seem that hard either.
I think all we need do is to allow "WINDOW" as an attribute keyword
in CREATE FUNCTION. Does anyone have an objection or a better idea?

* Implement support for non-default window framing clauses. Most likely
it's too late to consider getting the whole feature done for 8.4, but
I wonder whether we could support the restricted case of allowing just
these two combinationsBETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (which is default)BETWEEN UNBOUNDED PRECEDING AND
UNBOUNDEDFOLLOWING

I said earlier that we didn't really need to address this for 8.4,
but my thinking was flawed. The case I think is really important
is to allow last_value() to do something useful, and you can hardly
argue that it's useful without an ORDER BY to define which row in the
partition is "last".

* Investigate whether we should prohibit window functions in recursive
terms; check whether any of the committed prohibitions are unnecessary.

* Look at tuplestore performance issues. The tuplestore_in_memory()
thing is just a band-aid, we ought to try to solve it properly.
tuplestore_advance seems like a weak spot as well.

* Do we really need so much duplicated code between Agg and WindowAgg?
regards, tom lane

Re: TODO items for window functions

From

"David Rowley"

Date:

28 December 2008, 17:29:30

Tom Lane Wrote:
> The core window-functions patch is now committed and ready for wider
> testing.  However, there are a number of unfinished items, at least
> some of which I'd like to see addressed before 8.4 release.  In rough
> order of importance:
>
> * Support creation of user-defined window functions.  I think this is
> a "must have" for 8.4 --- we are not in the habit of building
> nonextensible basic features.  It doesn't seem that hard either.
> I think all we need do is to allow "WINDOW" as an attribute keyword
> in CREATE FUNCTION.  Does anyone have an objection or a better idea?
>
> * Implement support for non-default window framing clauses.  Most likely
> it's too late to consider getting the whole feature done for 8.4, but
> I wonder whether we could support the restricted case of allowing just
> these two combinations
>     BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (which is default)
>     BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
> I said earlier that we didn't really need to address this for 8.4,
> but my thinking was flawed.  The case I think is really important
> is to allow last_value() to do something useful, and you can hardly
> argue that it's useful without an ORDER BY to define which row in the
> partition is "last".
>
> * Investigate whether we should prohibit window functions in recursive
> terms; check whether any of the committed prohibitions are unnecessary.
>
> * Look at tuplestore performance issues. The tuplestore_in_memory()
> thing is just a band-aid, we ought to try to solve it properly.
> tuplestore_advance seems like a weak spot as well.
>
> * Do we really need so much duplicated code between Agg and WindowAgg?


Hitoshi and I did briefly talk about these two a few months back, but there
were other priorities at the time.

Unsure how difficult it is, maybe another one for a TODO, 8.4 or 8.5 I'm not
sure:

* Minimise sorts in a query such as:

david=# explain SELECT depname,
david-#        SUM(salary) OVER (ORDER BY salary),
david-#        SUM(salary) OVER (ORDER BY empno)
david-# FROM empsalary
david-# ORDER BY salary;                                        QUERY PLAN
----------------------------------------------------------------------------
----------------Sort  (cost=213.15..215.75 rows=1040 width=44)  Sort Key: salary  ->  WindowAgg  (cost=142.83..161.03
rows=1040width=44)        ->  Sort  (cost=142.83..145.43 rows=1040 width=44)              Sort Key: empno
-> WindowAgg  (cost=72.52..90.72 rows=1040 width=44)                    ->  Sort  (cost=72.52..75.12 rows=1040
width=44)                         Sort Key: salary                          ->  Seq Scan on empsalary
(cost=0.00..20.40
rows=1040 width=44)

In the above case it would be more efficient to evaluate windows with the
same order by clause as the final results last to eliminate the final sort.

In the above query Oracle 10g performs 2 sorts, DB2 and Sybase perform 3
sorts. We also perform 3.


Also perhaps more difficult? Maybe 8.5...

* Teach planner to decide which window to evaluate first based on costs.
Currently the first window in the query is evaluated first, there may be no
index to help sort the first window, but perhaps there are for other windows
in the query. This may allow an index scan instead of a seqscan -> sort.

I Oracle 10g seems to have the above capability but Sybase seems evaluate
the first window first. I've yet to look at DB2.

This would stop performance critical queries from looking like:

SELECT id,sum_value2,sum_value
FROM (
SELECT id,      SUM(value) OVER (ORDER BY idxcol) AS sum_value,      SUM(value2) OVER (ORDER BY no_idxcol) AS
sum_value2
FROM some_table
) t; -- Allow index scan by putting the indexed column as the 1st window

When they could look like:

SELECT id,      SUM(value2) OVER (ORDER BY no_idxcol) AS sum_value2,      SUM(value) OVER (ORDER BY idxcol) AS
sum_value
FROM some_table; -- Planner has the option to eval 2nd window first due to
index.


David.

Re: TODO items for window functions

From

Tom Lane

Date:

28 December 2008, 18:01:06

"David Rowley" <dgrowley@gmail.com> writes:
> Unsure how difficult it is, maybe another one for a TODO, 8.4 or 8.5 I'm not
> sure:
> * Minimise sorts in a query such as:

I'm not tremendously excited about improving that situation.  As the
code stands, the user can control what happens by ordering the WINDOW
clause appropriately, so it's not really a show-stopper.  Ideally we'd
do better automatically, but it's not easy in the current planner
structure --- we can only ask query_planner for one target sort order
and there's no good way to determine beforehand which of the possible
targets might be the best choice.  So at best this is a "maybe TODO" for
8.5 or later.

I do think the patch has probably left some low-hanging fruit on the
simpler end of the difficulty spectrum, namely when the window stuff
requires only one ordering that could be done either explicitly or
by an indexscan.  That choice should ideally be done with a proper
cost comparison taking any LIMIT into account.  I think right now
the LIMIT might not be accounted for, or might be considered even
when it shouldn't be because another sort is needed anyway.

But in any case, the tuplestore internal issues are probably the
more significant performance problems for the short term.
        regards, tom lane

Re: TODO items for window functions

From

"Hitoshi Harada"

Date:

29 December 2008, 03:08:33

2008/12/29 Tom Lane <tgl@sss.pgh.pa.us>:
> The core window-functions patch is now committed and ready for wider
> testing.  However, there are a number of unfinished items, at least
> some of which I'd like to see addressed before 8.4 release.  In rough
> order of importance:
>
> * Support creation of user-defined window functions.  I think this is
> a "must have" for 8.4 --- we are not in the habit of building
> nonextensible basic features.  It doesn't seem that hard either.
> I think all we need do is to allow "WINDOW" as an attribute keyword
> in CREATE FUNCTION.  Does anyone have an objection or a better idea?

The reason I and people decided window functions are not able to be
defined by user is whether Window functions API is ready for exposure.
If we agree with the current design is not changing for the long
future, I don't have any objection.
My only concern is for plpgsql. For c functions we can define user
functions but what about other pls?

> * Implement support for non-default window framing clauses.  Most likely
> it's too late to consider getting the whole feature done for 8.4, but
> I wonder whether we could support the restricted case of allowing just
> these two combinations
>        BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (which is default)
>        BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
> I said earlier that we didn't really need to address this for 8.4,
> but my thinking was flawed.  The case I think is really important
> is to allow last_value() to do something useful, and you can hardly
> argue that it's useful without an ORDER BY to define which row in the
> partition is "last".

Frame clause concern is shrinking situation. For shrinking frame, we
must do something to optimize aggregate not to recalculate from the
first of the frame. So as far as we stay in UNBOUNDED PRECEDING, it is
not so hard work, inclulding BETWEEN UNBOUNDED PRECEDING AND n
FOLLOWING.

And surveying sgml docs, I found this is not correct.

http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/ref/select.sgml?r1=1.112&r2=1.113

+     default framing behavior, which is equivalent to the framing clause
+     <literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</>.

the default frame with ORDER BY clause is RANGE BETWEEN UNBOUNDED
PRECEDING AND CURRENT ROW, as aggregates perform rows peer to the
current row.

Regards,


-- 
Hitoshi Harada

Re: TODO items for window functions

From

Tom Lane

Date:

29 December 2008, 12:44:38

"Hitoshi Harada" <umi.tanuki@gmail.com> writes:
> 2008/12/29 Tom Lane <tgl@sss.pgh.pa.us>:
>> * Support creation of user-defined window functions.  I think this is
>> a "must have" for 8.4 --- we are not in the habit of building
>> nonextensible basic features.  It doesn't seem that hard either.

> The reason I and people decided window functions are not able to be
> defined by user is whether Window functions API is ready for exposure.

Well, it seems about as stable as any other bit of new backend code ;-).
We never promise that backend-internal APIs will not change across
versions.

> My only concern is for plpgsql. For c functions we can define user
> functions but what about other pls?

I'm not concerned about making this stuff available at the PL level
(at least not yet).  C-level capability will be enough to satisfy
my concern for 8.4.

> And surveying sgml docs, I found this is not correct.

> http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/ref/select.sgml?r1=1.112&r2=1.113

> +     default framing behavior, which is equivalent to the framing clause
> +     <literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</>.

> the default frame with ORDER BY clause is RANGE BETWEEN UNBOUNDED
> PRECEDING AND CURRENT ROW, as aggregates perform rows peer to the
> current row.

What is the difference?  AFAICS the RANGE and ROWS keywords ought to be
equivalent if you are not specifying "expression PRECEDING" or
"expression FOLLOWING".
        regards, tom lane

Re: TODO items for window functions

From

Tom Lane

Date:

29 December 2008, 12:59:58

I wrote:
> * Support creation of user-defined window functions.  I think this is
> a "must have" for 8.4 --- we are not in the habit of building
> nonextensible basic features.  It doesn't seem that hard either.
> I think all we need do is to allow "WINDOW" as an attribute keyword
> in CREATE FUNCTION.  Does anyone have an objection or a better idea?

What I had in mind when I wrote that was something like
create [or replace] function mywindow(...) returns ...    as 'mymodule, 'mywindow'    language c    window;

but on reflection there seems to be a case also for
create [or replace] window function mywindow(...) returns ...    as 'mymodule, 'mywindow'    language c;

The main argument in favor of the latter is that window-ness will need
to be a fixed property of a function that you can't change except by
dropping and recreating it, because any existing views calling the
function will have its window-ness embedded in them in the form of
whether they use a FuncExpr or WindowFunc node to call it.  So it
doesn't feel quite like an optional attribute.

However, if we do that then for consistency we'd have to invent
DROP WINDOW FUNCTION, ALTER WINDOW FUNCTION, RENAME WINDOW FUNCTION,
COMMENT ON WINDOW FUNCTION, yadda yadda, and insist that you refer
to a function properly (with or without WINDOW) in each one of these
commands.  Which would be a real PITA to implement and document,
and I can't see that it's doing anything much for users either.

So I'm still leaning to the first way.  Comments?
        regards, tom lane

Re: TODO items for window functions

From

"Pavel Stehule"

Date:

29 December 2008, 13:20:03

2008/12/29 Tom Lane <tgl@sss.pgh.pa.us>:
> I wrote:
>> * Support creation of user-defined window functions.  I think this is
>> a "must have" for 8.4 --- we are not in the habit of building
>> nonextensible basic features.  It doesn't seem that hard either.
>> I think all we need do is to allow "WINDOW" as an attribute keyword
>> in CREATE FUNCTION.  Does anyone have an objection or a better idea?
>
> What I had in mind when I wrote that was something like
>
>        create [or replace] function mywindow(...) returns ...
>                as 'mymodule, 'mywindow'
>                language c
>                window;
>

+1

regards
Pavel Stehule

> but on reflection there seems to be a case also for
>
>        create [or replace] window function mywindow(...) returns ...
>                as 'mymodule, 'mywindow'
>                language c;
>
> The main argument in favor of the latter is that window-ness will need
> to be a fixed property of a function that you can't change except by
> dropping and recreating it, because any existing views calling the
> function will have its window-ness embedded in them in the form of
> whether they use a FuncExpr or WindowFunc node to call it.  So it
> doesn't feel quite like an optional attribute.
>
> However, if we do that then for consistency we'd have to invent
> DROP WINDOW FUNCTION, ALTER WINDOW FUNCTION, RENAME WINDOW FUNCTION,
> COMMENT ON WINDOW FUNCTION, yadda yadda, and insist that you refer
> to a function properly (with or without WINDOW) in each one of these
> commands.  Which would be a real PITA to implement and document,
> and I can't see that it's doing anything much for users either.
>
> So I'm still leaning to the first way.  Comments?
>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Re: TODO items for window functions

From

Andrew Dunstan

Date:

29 December 2008, 13:20:27


Tom Lane wrote:
> However, if we do that then for consistency we'd have to invent
> DROP WINDOW FUNCTION, ALTER WINDOW FUNCTION, RENAME WINDOW FUNCTION,
> COMMENT ON WINDOW FUNCTION, yadda yadda, and insist that you refer
> to a function properly (with or without WINDOW) in each one of these
> commands.  Which would be a real PITA to implement and document,
> and I can't see that it's doing anything much for users either.
>
> So I'm still leaning to the first way.  Comments?
>
>             
>   

I don't know that this matters so much unless you're going to have a 
seperate namespace for window functions. Otherwise, isn't WINDOW 
basically a noise word for these operations?

cheers

andrew

Re: TODO items for window functions

From

"Jaime Casanova"

Date:

29 December 2008, 13:31:04

On Mon, Dec 29, 2008 at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> * Support creation of user-defined window functions.  I think this is
>> a "must have" for 8.4 --- we are not in the habit of building
>> nonextensible basic features.  It doesn't seem that hard either.
>> I think all we need do is to allow "WINDOW" as an attribute keyword
>> in CREATE FUNCTION.  Does anyone have an objection or a better idea?
>
> What I had in mind when I wrote that was something like
>
>        create [or replace] function mywindow(...) returns ...
>                as 'mymodule, 'mywindow'
>                language c
>                window;
>

i don't understand this window function stuff well yet, but AFAIU it
is like an aggregate function that shows grouped values without
grouping rows (ok, maybe a very laizy or novice definition) but if
that is correct or near correct maybe we need to follow the same
pattern:

create function -- without any decoration
create aggregate maybe with a decoration of being window o create
window aggregate or something similar...

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: TODO items for window functions

From

Tom Lane

Date:

29 December 2008, 13:35:54

I wrote:
> * Investigate whether we should prohibit window functions in recursive
> terms; check whether any of the committed prohibitions are unnecessary.

I looked into these questions a bit.  As for the first, there doesn't
appear to be a compelling implementation reason to forbid it, and I
can't find anything in the spec that says to disallow it.  SQL:2008's
prohibition on aggregates in recursive terms is in 7.13 <query
expression> syntax rule 2) g) iii) 4), and there's nothing about window
functions in the vicinity.

The primary reason to forbid aggregates, so far as I can divine the
intent of the SQL committee, is that incremental evaluation of an
aggregate would give implementation-dependent results, ie you'd get a
different aggregate result depending on how many and which rows the
implementation chose to push through the recursion at a time.  It seems
like the same charge could be leveled against window functions.  On the
other hand it's at least possible to construct recursive queries in
which all the rows of a given window partition should get pushed through
together, so that you should get consistent answers despite the overall
incremental evaluation.  So I can't tell for sure if the committee
thought about that and intentionally decided to allow window functions
in recursive terms, or if their failure to forbid it was an oversight.
(My confidence in the unerring accuracy of the spec is not high at
the moment ;-).)  Nonetheless, the spec does not forbid it, so I feel
we shouldn't either.

As for the second point, I looked at every place that the committed
patch throws an error for queries or expressions containing window
functions.  Most are demonstrably per spec, or are necessary
implementation restrictions arising from the fact that we know an
expression isn't going to get fed through the full planner/executor
machinery (an example of the latter is ALTER COLUMN TYPE USING).
The only case that I think is debatable is that we are throwing
error for window functions used in a SELECT FOR UPDATE/FOR SHARE
query.  The corresponding error for aggregate functions is necessary
because the executor top level doesn't "see" the individual rows that
went into the aggregate, so there's no way to lock them.  In the
case of window functions no aggregation occurs, and so in principle
we could lock the rows.  However, consider something like this:
select x, lead(x) over() from table for update limit 1;

Because of the LIMIT, we'd only lock the first-returned row ...
but the values returned would also depend on the second row of the
table, which wouldn't get locked.  In general the results could
depend on any or all rows of the table but we might lock only some.
This seems to me to be at variance with how you'd expect SELECT FOR
UPDATE to behave, so I'm inclined to leave the prohibition in there
--- at least until someone comes up with a convincing use-case for
SELECT FOR UPDATE together with a window function, and explains why
he doesn't care about relevant rows possibly not getting locked.

Comments?
        regards, tom lane

Re: TODO items for window functions

From

"Hitoshi Harada"

Date:

29 December 2008, 13:54:24

2008/12/30 Tom Lane <tgl@sss.pgh.pa.us>:
> "Hitoshi Harada" <umi.tanuki@gmail.com> writes:
>> And surveying sgml docs, I found this is not correct.
>
>> http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/ref/select.sgml?r1=1.112&r2=1.113
>
>> +     default framing behavior, which is equivalent to the framing clause
>> +     <literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</>.
>
>> the default frame with ORDER BY clause is RANGE BETWEEN UNBOUNDED
>> PRECEDING AND CURRENT ROW, as aggregates perform rows peer to the
>> current row.
>
> What is the difference?  AFAICS the RANGE and ROWS keywords ought to be
> equivalent if you are not specifying "expression PRECEDING" or
> "expression FOLLOWING".

The difference is that RANGE ... CURRENT ROW contains all peers of the
current row, while ROWS ... CURRENT ROW doesn't contain them but the
current row itself only. See 7.11 rule 5-b.


Regards,


-- 
Hitoshi Harada

Re: TODO items for window functions

From

Tom Lane

Date:

29 December 2008, 14:00:23

Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> However, if we do that then for consistency we'd have to invent
>> DROP WINDOW FUNCTION, ALTER WINDOW FUNCTION, RENAME WINDOW FUNCTION,
>> COMMENT ON WINDOW FUNCTION, yadda yadda, and insist that you refer
>> to a function properly (with or without WINDOW) in each one of these
>> commands.

> I don't know that this matters so much unless you're going to have a 
> seperate namespace for window functions. Otherwise, isn't WINDOW 
> basically a noise word for these operations?

Well, the question is whether window functions are so different from
plain functions that we should treat them as a different kind of animal
for SQL-command purposes.  We do do that for aggregate functions, but
aggregates have some really fundamental effects on query semantics.
Consider

select sin(x) from table;        -- returns 1 row per table row
select sum(x) from table;        -- returns 1 row
select lead(x) over () from table;    -- returns 1 row per table row

In this sense window functions aren't that different from regular ones.

Window functions are also much more like regular functions than
aggregates in terms of what you have to specify to define one.

You could certainly argue the classification either way, but I think
that we should make a hard decision now: either window functions are
treated as a distinct object type (implying their own set of command
names and nuisance errors if you use the wrong one), or they are not a
distinct object type (implying that WINDOW is an attribute for CREATE
FUNCTION and not part of the command name).  If we are wishy-washy about
it and treat WINDOW as just a noise word in some contexts then we will
have user confusion.  The precedent that is bothering me here is all the
user confusion that has ensued over whether you can use ALTER TABLE to
operate on sequences and views.
        regards, tom lane

Re: TODO items for window functions

From

"Hitoshi Harada"

Date:

29 December 2008, 14:02:43

2008/12/30 Jaime Casanova <jcasanov@systemguards.com.ec>:
> On Mon, Dec 29, 2008 at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I wrote:
>>> * Support creation of user-defined window functions.  I think this is
>>> a "must have" for 8.4 --- we are not in the habit of building
>>> nonextensible basic features.  It doesn't seem that hard either.
>>> I think all we need do is to allow "WINDOW" as an attribute keyword
>>> in CREATE FUNCTION.  Does anyone have an objection or a better idea?
>>
>> What I had in mind when I wrote that was something like
>>
>>        create [or replace] function mywindow(...) returns ...
>>                as 'mymodule, 'mywindow'
>>                language c
>>                window;
>>
>
> i don't understand this window function stuff well yet, but AFAIU it
> is like an aggregate function that shows grouped values without
> grouping rows (ok, maybe a very laizy or novice definition) but if
> that is correct or near correct maybe we need to follow the same
> pattern:
>
> create function -- without any decoration
> create aggregate maybe with a decoration of being window o create
> window aggregate or something similar...
>

I prefer "create window function" because it is semantically readable
and window keyword is more similar to aggregate than immutable, or
strict.

And for drop/comment or so, I guess we don't need prepare window
keyword. It's because window function is represented in pg_proc
catalog only, whereas aggregate uses pg_proc and pg_aggregate. If
there weren't window keyword in DROP FUNCTION, we won't be worried
which function to be dropped?


Regards,




-- 
Hitoshi Harada

Re: TODO items for window functions

From

Tom Lane

Date:

29 December 2008, 14:03:20

"Jaime Casanova" <jcasanov@systemguards.com.ec> writes:
> i don't understand this window function stuff well yet, but AFAIU it
> is like an aggregate function that shows grouped values without
> grouping rows (ok, maybe a very laizy or novice definition) but if
> that is correct or near correct maybe we need to follow the same
> pattern:

> create function -- without any decoration
> create aggregate maybe with a decoration of being window o create
> window aggregate or something similar...

No, we can't really manage this as a variant of CREATE AGGREGATE
--- the information you need to specify to create a window function
is not at all like what you need to specify to create an aggregate.
        regards, tom lane

Re: TODO items for window functions

From

"Hitoshi Harada"

Date:

29 December 2008, 14:12:55

2008/12/30 Tom Lane <tgl@sss.pgh.pa.us>:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Tom Lane wrote:
>>> However, if we do that then for consistency we'd have to invent
>>> DROP WINDOW FUNCTION, ALTER WINDOW FUNCTION, RENAME WINDOW FUNCTION,
>>> COMMENT ON WINDOW FUNCTION, yadda yadda, and insist that you refer
>>> to a function properly (with or without WINDOW) in each one of these
>>> commands.
>
>> I don't know that this matters so much unless you're going to have a
>> seperate namespace for window functions. Otherwise, isn't WINDOW
>> basically a noise word for these operations?
>
> Well, the question is whether window functions are so different from
> plain functions that we should treat them as a different kind of animal
> for SQL-command purposes.  We do do that for aggregate functions, but
> aggregates have some really fundamental effects on query semantics.
> Consider
>
> select sin(x) from table;               -- returns 1 row per table row
> select sum(x) from table;               -- returns 1 row
> select lead(x) over () from table;      -- returns 1 row per table row
>
> In this sense window functions aren't that different from regular ones.
>
> Window functions are also much more like regular functions than
> aggregates in terms of what you have to specify to define one.
>
> You could certainly argue the classification either way, but I think
> that we should make a hard decision now: either window functions are
> treated as a distinct object type (implying their own set of command
> names and nuisance errors if you use the wrong one), or they are not a
> distinct object type (implying that WINDOW is an attribute for CREATE
> FUNCTION and not part of the command name).  If we are wishy-washy about
> it and treat WINDOW as just a noise word in some contexts then we will
> have user confusion.  The precedent that is bothering me here is all the
> user confusion that has ensued over whether you can use ALTER TABLE to
> operate on sequences and views.

Hmm, sequences and views are created by
CREATE SEQUENCE
CREATE VIEW
but
CREATE WINDOW FUNCTION
is not so. I mean, by this syntax user recognizes he creates one of
the plain function with window attribute. And as I said in the
previous mail, actually it is. The proiswindow attribute is used to
declare that it can handle window function API and nothing more,
everything is the same with plain ones.


Regards,


-- 
Hitoshi Harada

Re: TODO items for window functions

From

Tom Lane

Date:

29 December 2008, 14:19:39

"Hitoshi Harada" <umi.tanuki@gmail.com> writes:
> 2008/12/30 Tom Lane <tgl@sss.pgh.pa.us>:
>> What is the difference?  AFAICS the RANGE and ROWS keywords ought to be
>> equivalent if you are not specifying "expression PRECEDING" or
>> "expression FOLLOWING".

> The difference is that RANGE ... CURRENT ROW contains all peers of the
> current row, while ROWS ... CURRENT ROW doesn't contain them but the
> current row itself only. See 7.11 rule 5-b.

Hah, I had missed that fine point.  Okay, doc is wrong and I will fix.

Given that, I think that a suitable minimum implementation should cover
both the RANGE/ROWS distinction and the CURRENT ROW/UNBOUNDED FOLLOWING
distinction, ie I would like 8.4 to support
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROWRANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGROWS BETWEEN
UNBOUNDEDPRECEDING AND CURRENT ROWROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
 

(1 is the default, 2 and 4 behave the same unless I'm still missing
something.)  This doesn't seem too difficult to consider adding now,
and it will greatly increase the usefulness of frame-dependent
window functions.

Is this something you're interested in working on?  I can tackle it
if you don't have time now.
        regards, tom lane

Re: TODO items for window functions

From

"Hitoshi Harada"

Date:

29 December 2008, 14:26:54

2008/12/30 Tom Lane <tgl@sss.pgh.pa.us>:
> Hah, I had missed that fine point.  Okay, doc is wrong and I will fix.
>
> Given that, I think that a suitable minimum implementation should cover
> both the RANGE/ROWS distinction and the CURRENT ROW/UNBOUNDED FOLLOWING
> distinction, ie I would like 8.4 to support
>
>        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
>        RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
>        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
>        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
>
> (1 is the default, 2 and 4 behave the same unless I'm still missing
> something.)

My understanding is as well.

> Is this something you're interested in working on?  I can tackle it
> if you don't have time now.
>
>                        regards, tom lane
>

Sorry, over the new year days, I don't have time and will be remote.
Maybe from 3th or 4th I can work on this, so if you have time during
time I would like you to do it. Otherwise, I will.


Regards,

-- 
Hitoshi Harada

Re: TODO items for window functions

From

Tom Lane

Date:

29 December 2008, 14:30:05

"Hitoshi Harada" <umi.tanuki@gmail.com> writes:
> 2008/12/30 Tom Lane <tgl@sss.pgh.pa.us>:
>> Is this something you're interested in working on?  I can tackle it
>> if you don't have time now.

> Sorry, over the new year days, I don't have time and will be remote.
> Maybe from 3th or 4th I can work on this, so if you have time during
> time I would like you to do it. Otherwise, I will.

I have nothing pressing during this week; I'll see what I can get done.
        regards, tom lane

Re: TODO items for window functions

From

Alvaro Herrera

Date:

29 December 2008, 14:36:52

Tom Lane escribió:
> The core window-functions patch is now committed and ready for wider
> testing.  However, there are a number of unfinished items, at least
> some of which I'd like to see addressed before 8.4 release.  In rough
> order of importance:

[lots of discussion]

Perhaps I was a bit hasty -- I added the initial items mentioned in this
thread to the Todo page:

http://wiki.postgresql.org/wiki/Todo#Window_Functions

Perhaps people who has a clue should have a look for reworking the list.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: TODO items for window functions

From

Jeff Davis

Date:

29 December 2008, 15:59:28

On Mon, 2008-12-29 at 12:35 -0500, Tom Lane wrote:
> we could lock the rows.  However, consider something like this:
> 
>     select x, lead(x) over() from table for update limit 1;
> 
> Because of the LIMIT, we'd only lock the first-returned row ...
> but the values returned would also depend on the second row of the
> table, which wouldn't get locked.  In general the results could
> depend on any or all rows of the table but we might lock only some.
> This seems to me to be at variance with how you'd expect SELECT FOR
> UPDATE to behave, so I'm inclined to leave the prohibition in there
> --- at least until someone comes up with a convincing use-case for
> SELECT FOR UPDATE together with a window function, and explains why
> he doesn't care about relevant rows possibly not getting locked.
> 

How is that different from a subselect?

create table foo(a int, b int);
create table bar(c int, d int);

insert into foo values(1, 10);
insert into foo values(2, 20);
insert into bar values(100, 1000);

-- connection1
BEGIN;
select a, b, (select d from bar where c = 100) as d from foo where a = 1 for update;

-- connection2
BEGIN;
select a, b, (select d from bar where c = 100) as d from foo where a = 2 for update;

The single tuple in bar affects both results, but the second connection
is not blocked.

Regards,Jeff Davis

Re: TODO items for window functions

From

Tom Lane

Date:

30 December 2008, 12:59:33

I wrote:
> You could certainly argue the classification either way, but I think
> that we should make a hard decision now: either window functions are
> treated as a distinct object type (implying their own set of command
> names and nuisance errors if you use the wrong one), or they are not a
> distinct object type (implying that WINDOW is an attribute for CREATE
> FUNCTION and not part of the command name).  If we are wishy-washy about
> it and treat WINDOW as just a noise word in some contexts then we will
> have user confusion.  The precedent that is bothering me here is all the
> user confusion that has ensued over whether you can use ALTER TABLE to
> operate on sequences and views.

Apparently that analogy didn't impress anyone but me.  AFAICT the
majority opinion is that we should use the syntax
create [or replace] [window] function ...

but just ignore the distinction between regular functions and window
functions for all other function-related SQL commands.  Barring further
discussion, I'll make that happen in the next day or two.
        regards, tom lane

Re: TODO items for window functions

From

David Fetter

Date:

30 December 2008, 14:52:36

On Tue, Dec 30, 2008 at 11:59:22AM -0500, Tom Lane wrote:
> I wrote:
> > You could certainly argue the classification either way, but I
> > think that we should make a hard decision now: either window
> > functions are treated as a distinct object type (implying their
> > own set of command names and nuisance errors if you use the wrong
> > one), or they are not a distinct object type (implying that WINDOW
> > is an attribute for CREATE FUNCTION and not part of the command
> > name).  If we are wishy-washy about it and treat WINDOW as just a
> > noise word in some contexts then we will have user confusion.  The
> > precedent that is bothering me here is all the user confusion that
> > has ensued over whether you can use ALTER TABLE to operate on
> > sequences and views.
> 
> Apparently that analogy didn't impress anyone but me.  AFAICT the
> majority opinion is that we should use the syntax
> 
>     create [or replace] [window] function ...
> 
> but just ignore the distinction between regular functions and window
> functions for all other function-related SQL commands.  Barring further
> discussion, I'll make that happen in the next day or two.

Presumably psql should know about this change.  Should \df now include
windowing functions along with a boolean column that indicates whether
a function is a windowing function?  Should there be \dw[+] instead?

In either case, should the S option indicating "include system
functions only when S is present" (e.g. \dwS) apply?

I'm thinking yes on that last one.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: TODO items for window functions

From

"Robert Haas"

Date:

30 December 2008, 14:55:46

> Apparently that analogy didn't impress anyone but me.  AFAICT the
> majority opinion is that we should use the syntax
>
>        create [or replace] [window] function ...
>
> but just ignore the distinction between regular functions and window
> functions for all other function-related SQL commands.  Barring further
> discussion, I'll make that happen in the next day or two.

It impressed me.  I liked making WINDOW a flag that occurs later in
the statement a lot better.

...Robert

Re: TODO items for window functions

From

Tom Lane

Date:

30 December 2008, 15:58:56

David Fetter <david@fetter.org> writes:
> Presumably psql should know about this change.  Should \df now include
> windowing functions along with a boolean column that indicates whether
> a function is a windowing function?  Should there be \dw[+] instead?

> In either case, should the S option indicating "include system
> functions only when S is present" (e.g. \dwS) apply?

If people are going to start proposing that, I'm going to switch back to
the position that WINDOW should be treated as an attribute.  The whole
point of the discussion is that no one wanted to get into the game of
treating window functions as a separate classification in general.
        regards, tom lane

Re: TODO items for window functions

From

Tom Lane

Date:

30 December 2008, 22:38:33

"Robert Haas" <robertmhaas@gmail.com> writes:
>> Apparently that analogy didn't impress anyone but me.

> It impressed me. I liked making WINDOW a flag that occurs later in
> the statement a lot better.

I ended up going with the flag/attribute approach. The other would be
only marginally more work now, but I remain convinced that we'd have to
do more work later to deal with the issue that CREATE WINDOW FUNCTION
looks like "window function" is a distinct kind of SQL object. And
nobody seemed to want to propagate that distinction into all the places
it would logically have to go.

However ... having said that, there is more to David Fetter's gripe
about \df than I realized at first. Consider

regression=# \df nth_value List of functions Schema | Name | Result data type | Argument
datatypes

------------+-----------+------------------+---------------------pg_catalog | nth_value | anyelement |
anyelement,integer

(1 row)

Even without any consideration of user-defined window functions,
this seems a bit lacking: the user of nth_value() needs to know that
he has to write an OVER clause, and as things stand \df is not going
to give him the slightest hint about that. So I can see the argument
for reflecting window-ness into \df somehow.

I am not thrilled about inventing a new column for this, but how about
a display like so:

regression=# \df nth_value List of functions Schema | Name | Result data type |
Argumentdata types

------------+-----------+------------------+---------------------------------pg_catalog | nth_value | anyelement
|anyelement, integer OVER window

or some other addition that only shows up when needed.
regards, tom lane

Re: TODO items for window functions

From

"Pavel Stehule"

Date:

31 December 2008, 04:20:47

2008/12/31 Tom Lane <tgl@sss.pgh.pa.us>:
> "Robert Haas" <robertmhaas@gmail.com> writes:
>>> Apparently that analogy didn't impress anyone but me.
>
>> It impressed me.  I liked making WINDOW a flag that occurs later in
>> the statement a lot better.
>
> I ended up going with the flag/attribute approach.  The other would be
> only marginally more work now, but I remain convinced that we'd have to
> do more work later to deal with the issue that CREATE WINDOW FUNCTION
> looks like "window function" is a distinct kind of SQL object.  And
> nobody seemed to want to propagate that distinction into all the places
> it would logically have to go.
>
> However ... having said that, there is more to David Fetter's gripe
> about \df than I realized at first.  Consider
>
> regression=# \df nth_value
>                        List of functions
>   Schema   |   Name    | Result data type | Argument data types
> ------------+-----------+------------------+---------------------
>  pg_catalog | nth_value | anyelement       | anyelement, integer
> (1 row)
>
> Even without any consideration of user-defined window functions,
> this seems a bit lacking: the user of nth_value() needs to know that
> he has to write an OVER clause, and as things stand \df is not going
> to give him the slightest hint about that.  So I can see the argument
> for reflecting window-ness into \df somehow.
>
> I am not thrilled about inventing a new column for this, but how about
> a display like so:
>
> regression=# \df nth_value
>                        List of functions
>   Schema   |   Name    | Result data type |       Argument data types
> ------------+-----------+------------------+---------------------------------
>  pg_catalog | nth_value | anyelement       | anyelement, integer OVER window
>

+1

regards
Pavel Stehule

> or some other addition that only shows up when needed.
>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Re: TODO items for window functions

From

Heikki Linnakangas

Date:

31 December 2008, 07:21:29

Tom Lane wrote:
> I am not thrilled about inventing a new column for this, but how about
> a display like so:
> 
> regression=# \df nth_value
>                         List of functions
>    Schema   |   Name    | Result data type |       Argument data types 
> ------------+-----------+------------------+---------------------------------
>  pg_catalog | nth_value | anyelement       | anyelement, integer OVER window
> 
> or some other addition that only shows up when needed.

That looks like "OVER window" is associated with the "integer", like 
DEFAULT. I don't have any better suggestions, though.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: TODO items for window functions

From

Alvaro Herrera

Date:

31 December 2008, 08:02:58

Heikki Linnakangas escribió:
> Tom Lane wrote:
>> I am not thrilled about inventing a new column for this, but how about
>> a display like so:
>>
>> regression=# \df nth_value
>>                         List of functions
>>    Schema   |   Name    | Result data type |       Argument data types  
>> ------------+-----------+------------------+---------------------------------
>>  pg_catalog | nth_value | anyelement       | anyelement, integer OVER window
>>
>> or some other addition that only shows up when needed.
>
> That looks like "OVER window" is associated with the "integer", like  
> DEFAULT. I don't have any better suggestions, though.
                        List of functions   Schema   |   Name    | Result data type |        Argument data
types------------+-----------+------------------+-----------------------------------pg_catalog | nth_value | anyelement
     | (anyelement, integer) OVER window
 


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: TODO items for window functions

From

Tom Lane

Date:

31 December 2008, 12:05:27

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Heikki Linnakangas escribi�:
>> Tom Lane wrote:
>>> pg_catalog | nth_value | anyelement       | anyelement, integer OVER window
>> 
>> That looks like "OVER window" is associated with the "integer", like  
>> DEFAULT. I don't have any better suggestions, though.

>   pg_catalog | nth_value | anyelement       | (anyelement, integer) OVER window

Yeah, I had considered that too, and it has a distinct advantage for
parameterless functions like rank():
  Schema   | Name | Result data type | Argument data types 
------------+------+------------------+---------------------pg_catalog | rank | bigint           | OVER
windowpg_catalog| rank | bigint           | () OVER window
 

The latter is definitely clearer about what you're supposed to do.

However, it seems kind of inconsistent to do this for window functions
unless we also make \df start putting parens around the argument lists
for regular functions.  Comments?
        regards, tom lane

Re: TODO items for window functions

From

David Fetter

Date:

31 December 2008, 15:11:26

On Wed, Dec 31, 2008 at 11:04:41AM -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Heikki Linnakangas escribi�:
> >> Tom Lane wrote:
> >>> pg_catalog | nth_value | anyelement       | anyelement, integer OVER window
> >> 
> >> That looks like "OVER window" is associated with the "integer", like  
> >> DEFAULT. I don't have any better suggestions, though.
> 
> >   pg_catalog | nth_value | anyelement       | (anyelement, integer) OVER window
> 
> Yeah, I had considered that too, and it has a distinct advantage for
> parameterless functions like rank():
> 
>    Schema   | Name | Result data type | Argument data types 
> ------------+------+------------------+---------------------
>  pg_catalog | rank | bigint           | OVER window
>  pg_catalog | rank | bigint           | () OVER window
> 
> The latter is definitely clearer about what you're supposed to do.

+1 on the latter.

> However, it seems kind of inconsistent to do this for window functions
> unless we also make \df start putting parens around the argument lists
> for regular functions.  Comments?

Would parens around all the argument lists really be so bad?  I'm
thinking not.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: TODO items for window functions

From

Dimitri Fontaine

Date:

01 January 2009, 07:45:44

Hi,

Happy new year!

Le 31 déc. 08 à 17:04, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
> However, it seems kind of inconsistent to do this for window functions
> unless we also make \df start putting parens around the argument lists
> for regular functions.  Comments?

A way to distinguish between window functions "seeing" frames vs.
partitions, if possible, would sound quite useful after a docs reading
session.
AKA OVER(... order by ...) effect in some other thread.

--
dim

Re: TODO items for window functions

From

"Robert Haas"

Date:

01 January 2009, 09:42:47

> I am not thrilled about inventing a new column for this, but how about
> a display like so:
>
> regression=# \df nth_value
>                        List of functions
>   Schema   |   Name    | Result data type |       Argument data types
> ------------+-----------+------------------+---------------------------------
>  pg_catalog | nth_value | anyelement       | anyelement, integer OVER window
>
> or some other addition that only shows up when needed.

I think this whole idea is a bad one.  In the current release, you can do

DROP FUNCTION Name ( Argument data types )

...and it will work.  Maybe you will say that no one is doing this via
a script (which I wouldn't bet on, but it's possible) but I'm sure
people are doing it via cut and paste, because I have done exactly
this thing.  Any of the various proposals for hacking up Argument data
types will make this no longer true, and somebody will get confused.
I think you should bite the bullet and add a "type" column (f for
regular function and w for window?  could there be others in the
future?).

...Robert

Re: TODO items for window functions

From

Bruce Momjian

Date:

21 January 2009, 20:02:27

Robert Haas wrote:
> > I am not thrilled about inventing a new column for this, but how about
> > a display like so:
> >
> > regression=# \df nth_value
> >                        List of functions
> >   Schema   |   Name    | Result data type |       Argument data types
> > ------------+-----------+------------------+---------------------------------
> >  pg_catalog | nth_value | anyelement       | anyelement, integer OVER window
> >
> > or some other addition that only shows up when needed.
> 
> I think this whole idea is a bad one.  In the current release, you can do
> 
> DROP FUNCTION Name ( Argument data types )
> 
> ...and it will work.  Maybe you will say that no one is doing this via
> a script (which I wouldn't bet on, but it's possible) but I'm sure
> people are doing it via cut and paste, because I have done exactly
> this thing.  Any of the various proposals for hacking up Argument data
> types will make this no longer true, and somebody will get confused.
> I think you should bite the bullet and add a "type" column (f for
> regular function and w for window?  could there be others in the
> future?).

I assume this is still an open issue.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

Re: TODO items for window functions

From

Bruce Momjian

Date:

04 February 2009, 15:52:57

Robert Haas wrote:
> > I am not thrilled about inventing a new column for this, but how about
> > a display like so:
> >
> > regression=# \df nth_value
> >                        List of functions
> >   Schema   |   Name    | Result data type |       Argument data types
> > ------------+-----------+------------------+---------------------------------
> >  pg_catalog | nth_value | anyelement       | anyelement, integer OVER window
> >
> > or some other addition that only shows up when needed.
> 
> I think this whole idea is a bad one.  In the current release, you can do
> 
> DROP FUNCTION Name ( Argument data types )
> 
> ...and it will work.  Maybe you will say that no one is doing this via
> a script (which I wouldn't bet on, but it's possible) but I'm sure
> people are doing it via cut and paste, because I have done exactly
> this thing.  Any of the various proposals for hacking up Argument data
> types will make this no longer true, and somebody will get confused.
> I think you should bite the bullet and add a "type" column (f for
> regular function and w for window?  could there be others in the
> future?).

Are we doing anything for this for 8.4?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

Re: TODO items for window functions

From

Hitoshi Harada

Date:

05 February 2009, 00:35:36

2009/2/5 Bruce Momjian <bruce@momjian.us>:
> Robert Haas wrote:
>> > I am not thrilled about inventing a new column for this, but how about
>> > a display like so:
>> >
>> > regression=# \df nth_value
>> >                        List of functions
>> >   Schema   |   Name    | Result data type |       Argument data types
>> > ------------+-----------+------------------+---------------------------------
>> >  pg_catalog | nth_value | anyelement       | anyelement, integer OVER window
>> >
>> > or some other addition that only shows up when needed.
>>
>> I think this whole idea is a bad one.  In the current release, you can do
>>
>> DROP FUNCTION Name ( Argument data types )
>>
>> ...and it will work.  Maybe you will say that no one is doing this via
>> a script (which I wouldn't bet on, but it's possible) but I'm sure
>> people are doing it via cut and paste, because I have done exactly
>> this thing.  Any of the various proposals for hacking up Argument data
>> types will make this no longer true, and somebody will get confused.
>> I think you should bite the bullet and add a "type" column (f for
>> regular function and w for window?  could there be others in the
>> future?).
>
> Are we doing anything for this for 8.4?

I prefer adding column of type 'w'|'f' to attaching OVER keyword in
argument column, because type column approach is more general if it
may refer to not only window functions but also setof, trigger, and
those coming in the future (hypothetical set function maybe?). It
seems to me that the OVER keyword is not necessary to let him know he
needs it.

Regards,



-- 
Hitoshi Harada