Re: Improving avg performance for numeric - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Improving avg performance for numeric
Date
Msg-id CAFj8pRDUFoa1wMc7vGFeQogvm3Og4Kq+kku-svtX9ZO8L5Yj_A@mail.gmail.com
Whole thread Raw
In response to Improving avg performance for numeric  (Hadi Moshayedi <hadi@moshayedi.net>)
Responses Re: Improving avg performance for numeric  (Hadi Moshayedi <hadi@moshayedi.net>)
List pgsql-hackers
Hello

here is a rebased patch. Hadi, please, can verify this version?

Regards

Pavel

p.s. Performance tests

postgres=# create table foo(a int, b float, c double precision, d numeric, gr int);
CREATE TABLE
postgres=#
postgres=# insert into foo select 1, 2.0, 3.0, 3.14, random()*10000 from generate_series(1,10000000);

postgres=# \d foo
          Table "public.foo"
 Column |       Type       | Modifiers
--------+------------------+-----------
 a      | integer          |
 b      | double precision |
 c      | double precision |
 d      | numeric          |
 gr     | integer          |


set work_mem to '2MB';

postgres=# show debug_assertions;
 debug_assertions
------------------
 off
(1 row)



postgres=# explain (analyze, timing off) select sum(a) from foo;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=4) (actual rows=1 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=4) (actual rows=10000000 loops=1)
 Total runtime: 1210.321 ms (1195.117 ms) -- patched (original)
(3 rows)

Time: 1210.709 ms
postgres=# explain (analyze, timing off) select sum(a) from foo group by gr;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 HashAggregate  (cost=233331.87..233431.71 rows=9984 width=8) (actual rows=10001 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=8) (actual rows=10000000 loops=1)
 Total runtime: 2923.987 ms (2952.292 ms)
(3 rows)

Time: 2924.384 ms

postgres=# explain (analyze, timing off) select avg(a) from foo;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=4) (actual rows=1 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=4) (actual rows=10000000 loops=1)
 Total runtime: 1331.627 ms (1312.140 ms)
(3 rows)

postgres=# explain (analyze, timing off) select avg(a) from foo group by gr;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 HashAggregate  (cost=233331.87..233456.67 rows=9984 width=8) (actual rows=10001 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=8) (actual rows=10000000 loops=1)
 Total runtime: 3139.296 ms (3079.479 ms)
(3 rows)

postgres=# explain (analyze, timing off) select sum(b) from foo;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=8) (actual rows=1 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=8) (actual rows=10000000 loops=1)
 Total runtime: 1327.841 ms (1339.214 ms)
(3 rows)

postgres=# explain (analyze, timing off) select sum(b) from foo group by gr;
                                             QUERY PLAN                                            
----------------------------------------------------------------------------------------------------
 HashAggregate  (cost=233331.87..233431.71 rows=9984 width=12) (actual rows=10001 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=12) (actual rows=10000000 loops=1)
 Total runtime: 3047.893 ms (3095.591 ms)
(3 rows)

postgres=# explain (analyze, timing off) select avg(b) from foo;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=8) (actual rows=1 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=8) (actual rows=10000000 loops=1)
 Total runtime: 1454.665 ms (1471.413 ms)
(3 rows)

postgres=# explain (analyze, timing off) select avg(b) from foo group by gr;
                                             QUERY PLAN                                            
----------------------------------------------------------------------------------------------------
 HashAggregate  (cost=233331.87..233456.67 rows=9984 width=12) (actual rows=10001 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=12) (actual rows=10000000 loops=1)
 Total runtime: 3282.838 ms (3187.157 ms)
(3 rows)

postgres=# explain (analyze, timing off) select sum(c) from foo;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=8) (actual rows=1 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=8) (actual rows=10000000 loops=1)
 Total runtime: 1348.555 ms (1364.585 ms)
(3 rows)

postgres=# explain (analyze, timing off) select sum(c) from foo group by gr;
                                             QUERY PLAN                                            
----------------------------------------------------------------------------------------------------
 HashAggregate  (cost=233331.87..233431.71 rows=9984 width=12) (actual rows=10001 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=12) (actual rows=10000000 loops=1)
 Total runtime: 3028.663 ms (3069.710 ms)
(3 rows)

postgres=# explain (analyze, timing off) select avg(c) from foo;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=8) (actual rows=1 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=8) (actual rows=10000000 loops=1)
 Total runtime: 1488.980 ms (1463.813 ms)
(3 rows)

postgres=# explain (analyze, timing off) select avg(c) from foo group by gr;
                                             QUERY PLAN                                            
----------------------------------------------------------------------------------------------------
 HashAggregate  (cost=233331.87..233456.67 rows=9984 width=12) (actual rows=10001 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=12) (actual rows=10000000 loops=1)
 Total runtime: 3252.972 ms (3149.986 ms)
(3 rows)

postgres=# explain (analyze, timing off) select sum(d) from foo;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=7) (actual rows=1 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=7) (actual rows=10000000 loops=1)
 Total runtime: 2301.769 ms (2784.430 ms)
(3 rows)

postgres=# explain (analyze, timing off) select sum(d) from foo group by gr;
                                             QUERY PLAN                                            
----------------------------------------------------------------------------------------------------
 HashAggregate  (cost=233331.87..233456.67 rows=9984 width=11) (actual rows=10001 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=11) (actual rows=10000000 loops=1)
 Total runtime: 4189.272 ms (4440.335 ms)
(3 rows)

postgres=# explain (analyze, timing off) select avg(d) from foo;
                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=7) (actual rows=1 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=7) (actual rows=10000000 loops=1)
 Total runtime: 2308.493 ms (5195.970 ms)
(3 rows)

postgres=# explain (analyze, timing off) select avg(d) from foo group by gr;
                                             QUERY PLAN                                            
----------------------------------------------------------------------------------------------------
 HashAggregate  (cost=233331.87..233456.67 rows=9984 width=11) (actual rows=10001 loops=1)
   ->  Seq Scan on foo  (cost=0.00..183332.58 rows=9999858 width=11) (actual rows=10000000 loops=1)
 Total runtime: 4179.978 ms (6828.398 ms)
(3 rows)


int, float, double 26829 ms (26675 ms) -- 0.5% slower .. statistic error .. cleaner code
numeric sum 6490 ms (7224 ms) --  10% faster
numeric avg 6487 ms (12023 ms) -- 46% faster





2013/8/22 Hadi Moshayedi <hadi@moshayedi.net>
Hello Pavel,

> > Do you think you could give this a review after CF1 ends, but before
> > September?  I hate to make Hadi wait just because I didn't see his patch.
>
> yes, I can.

When do you think you will have time to review this patch?

Thanks,
  -- Hadi

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: pg_system_identifier()
Next
From: David Fetter
Date:
Subject: Re: pg_restore multiple --function options