Re: [PATCH] Negative Transition Aggregate Functions (WIP) - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date
Msg-id CAApHDvr_oSpvM-XXz43eCMX8n0EfshJ=9j+rxvGqCy91YR-YQw@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Negative Transition Aggregate Functions (WIP)  (Florian Pflug <fgp@phlo.org>)
Responses Re: [PATCH] Negative Transition Aggregate Functions (WIP)
List pgsql-hackers
On Sun, Jan 19, 2014 at 3:22 AM, Florian Pflug <fgp@phlo.org> wrote:
On Jan18, 2014, at 06:15 , David Rowley <dgrowleyml@gmail.com> wrote:> Note that after this fix the results for my quick benchmark now look like:
>
> create table num (num numeric(10,2) not null);
> insert into num (num) select * from generate_series(1,20000);
> select sum(num) over(order by num rows between current row and unbounded following) from num; -- 113 ms
> select sum(num / 10) over(order by num rows between current row and unbounded following) from num; -- 156ms
> select sum(num / 1) over(order by num rows between current row and unbounded following) from num; -- 144 ms
>
> So it seems to be much less prone to falling back to brute force forward
> transitions.
> It also seems the / 10 version must have had to previously do 1 brute
> force rescan but now it looks like it can do it in 1 scan.
>

I've performed some more benchmarks on this patch tonight. The results and full recreation scripts are attached along with the patch it was tested against.

The benchmarks contain a bit of a mix of different possible workloads, I've tried to show the worst and best case for everything I can think of that has a best and worst case. The best case with the patched version is naturally very good when the frame is unbounded following and the frame head moves down on each row, that is providing the inverse transition can actually take place.

I've noticed a slight performance regression on the worse case for max() and min() aggregates. Note that these inverse transition functions are implemented to return failure when the value being removed is equal to the current state. e.g, if max() is 10 and we remove 9, then we can report success, but if max is 10 and we remove 10 we report failure. The failed attempt to remove in this case will be costing us a little bit more than it was previously.

I'm not quite sure how likely it is to be a realistic workload, but a query such as:

select max(id) over (order by id DESC rows between current row and unbounded following) from test_numeric;

will fail to perform inverse transitions on *EVERY* row. 
For int types this regression is only about 2-3%, but it will likely increase as the comparison function gets more expensive. I wondered if it would be worth attempting to check the window's order by and compare that to the aggregate's sort operator and not perform inverse transitions in this case. Although a query such as this:

select first_value(id) over (order by id DESC rows between current row and unbounded following) from test_numeric;

would give the same results to the user and will likely be much much faster anyway, so perhaps detecting that is not worth it.

The performance regression seems a bit unavoidable in the case where the order by col is not the same as the max's column, but the 2 columns happen to sort the results in the same way. I can't really think of a way to detect this. But then I think you'll get back what you lost even if you could perform just 1 inverse transition in the entire processing of the window, so that case is probably quite unlikely to hit and if you compare it to the possible cases that it would improve.

I focused quite a bit on testing the performance of SUM(numeric) due to a few extra lines of code that I had to add into do_numeric_accum() to allow tracking of the maximum dscale'd numeric. My plain aggregate test on this showed no measurable regression, though there will be a couple of extra CPU cycles in there on each call to the function.

I didn't really bother doing much performance testing on aggregates like count(*) and sum(int) as these can always perform inverse transitions, the best cases for each of the other queries should be used as a guide for how much performance will increase.

I'm pretty happy to see that for processing as little as 20 thousand rows that the patch can increase performance by up to 1000 times! And this increase will just get bigger with more rows.

Here's some sample results from the attached txt file:

select sum(num_fixed_scale) over(order by id rows between current row and unbounded following) from test_numeric;
Results in miliseconds 

Patched: 
50.958
63.820
64.719
52.860
64.921

Unpatched:
61358.856
61180.370
62642.495
61416.385
62383.012

Comp. Patched Unpatched
average 59.4556 61796.2236 Avg Increase (times) 1039.367589 103936.76%
median 63.82 61416.385 Median Increase (times) 962.3375901 96233.76%

If anyone can think of any other workloads that they would like tested please let me know.

Regards

David Rowley.

Attachment

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: GIN improvements part 1: additional information
Next
From: Marti Raudsepp
Date:
Subject: Re: Proposal: variant of regclass