Re: How do query optimizers affect window functions - Mailing list pgsql-general

From Igor Romanchenko
Subject Re: How do query optimizers affect window functions
Date
Msg-id CAP95Gq=VpHbtX8gvML0q13ugqx97PA_h58HZORUUAkvrOOjCbw@mail.gmail.com
Whole thread Raw
In response to How do query optimizers affect window functions  (Tianyin Xu <tixu@cs.ucsd.edu>)
List pgsql-general
On Wed, Nov 14, 2012 at 10:12 AM, Tianyin Xu <tixu@cs.ucsd.edu> wrote:
Hi, Postgresql,

I want to understand how the query optimizers affect the output of the window functions.

For example, set "cpu_tuple_cost = 50" in postgresql.conf and start the server, I apply the regress test (make installcheck). The test of window function fails.

Checking the diff and I found the output of the window functions are different. For example,

For the following query:

SELECT sum(unique1) over (rows between current row and unbounded following), unique1, four FROM tenk1 WHERE unique1 < 10;

The expected results are:

 sum | unique1 | four
-----+---------+------
  45 |       4 |    0
  41 |       2 |    2
  39 |       1 |    1
  38 |       6 |    2
  32 |       9 |    1
  23 |       8 |    0
  15 |       5 |    1
  10 |       3 |    3
   7 |       7 |    3
   0 |       0 |    0

But the real results are:

 sum | unique1 | four
-----+---------+------
  45 |       0 |    0   
  45 |       1 |    1   
  44 |       2 |    2   
  42 |       3 |    3   
  39 |       4 |    0   
  35 |       5 |    1   
  30 |       6 |    2   
  24 |       7 |    3   
  17 |       8 |    0   
   9 |       9 |    1   

There're altogether 6 queries in window test that outputs different query results.

I don't understand why the results are different. Intuitively, the queries show return the same results no matter what plan the optimizer choose.

I suspected the previous queries had some side effect on the latter one (e.g., change the current row), so I removed all the previous queries before this query in window.sql. But the result did not change.

Could anyone explain this behavior? Or point out how to investigate?

Thanks a lot!
Tianyin
 

Hi.
In short: if no explicit ordering specivied for a query the resulting set can be in any order. It is up to query optimizer to chose in what order the resulting tuples will be. 
The window function used in this test case rely on the order of the resulting set (it sums from current to the last) so it will generate different results for different query plans.

I think for this test cases (window functions) explicit ordering should be specified. In "normal" cases order dependent window functions are newer used without explicit ordering.

pgsql-general by date:

Previous
From: "Wang, Hao"
Date:
Subject: File system level copy
Next
From: Igor Romanchenko
Date:
Subject: Re: Using window functions to get the unpaginated count for paginated queries