Home > mailing lists

Improving avg performance for numeric - Mailing list pgsql-hackers

From	Hadi Moshayedi
Subject	Improving avg performance for numeric
Date	March 16, 2013 07:15:59
Msg-id	CAK=1=WrmCkWc_xQXs_bpUyswCPr7O9zkLmm8Oa7_nT2vybvBEQ@mail.gmail.com Whole thread Raw
Responses	Re: Improving avg performance for numeric (Pavel Stehule <pavel.stehule@gmail.com>) Re: Improving avg performance for numeric (Pavel Stehule <pavel.stehule@gmail.com>)
List	pgsql-hackers

Tree view

Revisiting: http://www.postgresql.org/message-id/45661BE7.4050205@paradise.net.nz

I think the reasons which the numeric average was slow were:

(1) Using Numeric for count, which is slower than int8 to increment,

(2) Constructing/deconstructing arrays at each transition step.

This is also discussed at: http://www.citusdata.com/blog/53-postgres-performance-to-avg-or-to-sum-divided-by-count

So, I think we can improve the speed of numeric average by keeping the transition state as an struct in the aggregate context, and just passing the pointer to that struct from/to the aggregate transition function.

The attached patch uses this method.

I tested it using the data generated using:

CREATE TABLE avg_test AS SELECT (random() * 999)::decimal(5,2) as d FROM generate_series(1, 10000000) s;

After applying this patch, run time of "SELECT avg(d) FROM avg_test;" improves from 10.701 seconds to 5.204 seconds, which seems to be a huge improvement.

I think we may also be able to use a similar method to improve performance of some other numeric aggregates (like stddev). But I want to know your feedback first.

Is this worth working on?

Thanks,

-- Hadi

Attachment

numeric-avg-optimize.patch

pgsql-hackers by date:

From: Alvaro Herrera
Date: 16 March 2013, 04:17:03
Subject: Re: Materialized view assertion failure in HEAD

From: Pavel Stehule
Date: 16 March 2013, 07:56:57
Subject: Re: Improving avg performance for numeric

Improving avg performance for numeric - Mailing list pgsql-hackers

Attachment

Previous

Next