parallel aggregation for PostgreSQL 9.5 - Mailing list pgsql-announce
From | PostgreSQL - Hans-Jürgen Schönig |
---|---|
Subject | parallel aggregation for PostgreSQL 9.5 |
Date | |
Msg-id | 0B9D5042-8862-413F-AE1E-C1B2CBF0528A@cybertec.at |
List | pgsql-announce |
agg-1.0: Bringing multi-core to PostgreSQL aggregations
===========================================
Cybertec Schönig & Schönig GmbH (http://www.cybertec.at/) is proud to announce
the first version of "agg", which brings multi-core analytics to PostgreSQL 9.5.
"agg" can be loaded as an extension and scales aggregations across more than
one CPU core, speeding up queries significantly.
Tests have shown that queries on a 40-core box are up to 30 times faster than on
a single core. "agg" pushes the limits of PostgreSQL even further and
marks a significant milestone for analytical workloads.
An example:
==========
To show the true potential of agg we have compiled some benchmarking data:
Test case:
- 40 CPU cores (Intel)
- 200 million rows
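The table definition behind these numbers is not part of this announcement. As a rough sketch, a table of the same shape could be built as follows. It assumes a single integer column val1 with 40 evenly distributed values, which is consistent with 200 million rows and 5,000,000 rows per group in the query output; the real test data may have looked different.

```sql
-- Hypothetical reconstruction of the test table. Only the name t_agg and the
-- column val1 appear in the announcement; everything else is an assumption.
CREATE TABLE t_agg AS
    SELECT id % 40 AS val1          -- 40 distinct values, 5,000,000 rows each
    FROM generate_series(1, 200000000) AS id;

-- Refresh planner statistics after the bulk load.
ANALYZE t_agg;
```

For a quick trial on a smaller machine, lower the upper bound passed to generate_series.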
Running a single-core test:
agg=# SET agg.hash_workers = 1;
SET
agg=# SELECT val1, count(*) FROM t_agg GROUP BY 1 LIMIT 2;
 val1 |  count
------+---------
   34 | 5000000
   25 | 5000000
(2 rows)
Time: 55701.966 ms
With 10 cores, agg achieves an almost perfectly linear improvement (about 10 times faster):
agg=# SET agg.hash_workers = 10;
SET
agg=# SELECT val1, count(*) FROM t_agg GROUP BY 1 LIMIT 2;
 val1 |  count
------+---------
   34 | 5000000
   25 | 5000000
(2 rows)
Time: 5574.891 ms
With all 40 CPU cores at work, the stunning number of 200 million rows can be
aggregated in roughly 2.6 seconds. In other words, PostgreSQL crunched 77
million rows per second:
agg=# SET agg.hash_workers = 40;
SET
agg=# SELECT val1, count(*) FROM t_agg GROUP BY 1 LIMIT 2;
 val1 |  count
------+---------
   34 | 5000000
   25 | 5000000
(2 rows)
Time: 2596.967 ms
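The speedup implied by the three timings can be checked with plain SQL. The query below is only arithmetic on the numbers quoted above; the column names are made up for illustration and nothing in it depends on agg.

```sql
-- Speedup, parallel efficiency and throughput derived from the quoted timings.
SELECT workers,
       elapsed_ms,
       round(55701.966 / elapsed_ms, 2)                        AS speedup,
       round(55701.966 / elapsed_ms / workers * 100, 1)        AS efficiency_pct,
       round(200000000 / (elapsed_ms / 1000.0) / 1000000.0, 1) AS million_rows_per_sec
FROM (VALUES (1,  55701.966),
             (10,  5574.891),
             (40,  2596.967)) AS timings(workers, elapsed_ms)
ORDER BY workers;
```

With 10 workers the speedup comes out just under 10, and with 40 workers at roughly 21, which corresponds to about 77 million rows per second.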
Installation:
=========
"agg" can be downloaded freely from our website:
http://www.cybertec.at/en/products/agg-parallel-aggregations-postgresql/
It can be loaded into PostgreSQL 9.5 as a simple extension.
The module is completely transparent and does not require changes at the SQL
level.
Features and limitations:
===================
"agg" has been optimized to scale aggregations and sequential scans. It sits
between the optimizer and the executor, post-processing a standard PostgreSQL
execution plan.
If agg discovers a suitable plan, it replaces the standard routines with our
multi-core implementations. If a query is not suitable for parallel
execution, agg simply leaves the PostgreSQL plan as is.
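The announcement does not describe how the multi-core routines work internally. Purely as an illustration of the general idea behind parallel aggregation, the following sketch emulates four workers in plain SQL: each "worker" counts a disjoint slice of the rows, and the partial counts are then merged. The table t_demo, the id column, and the slicing by id % 4 are assumptions made for this example and say nothing about agg's actual strategy.

```sql
-- Small self-contained demo table (hypothetical, not the benchmark table).
CREATE TEMP TABLE t_demo AS
    SELECT id, id % 5 AS val1
    FROM generate_series(1, 1000) AS id;

-- Phase 1 (inner query): every slice produces partial counts per group.
-- Phase 2 (outer query): partial counts are summed into the final result.
SELECT val1, sum(partial_count) AS count
FROM (
    SELECT id % 4 AS worker, val1, count(*) AS partial_count
    FROM t_demo
    GROUP BY id % 4, val1
) AS partials
GROUP BY val1
ORDER BY val1;
```

The result is the same as that of SELECT val1, count(*) FROM t_demo GROUP BY val1, which is what makes count(*) a natural candidate for this kind of split-and-merge execution.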
Supported features:
- Parallel aggregations
- Support for FILTER
- Procedures containing suitable queries
- Parallel scanning of single tables
- Parallel scanning of partitioned tables
Not supported:
- Parallel joins
- SMP-aware CREATE INDEX, VACUUM, etc.
- Grouping sets
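To make the two lists more concrete, here are two queries against the benchmark table in standard PostgreSQL 9.5 syntax. Whether agg actually parallelizes the first one and leaves the second one alone is an assumption drawn from the lists above, not something verified here.

```sql
-- Aggregate with a FILTER clause (available since PostgreSQL 9.4).
-- According to the feature list, this kind of query should be eligible.
SET agg.hash_workers = 10;
SELECT val1,
       count(*)                             AS total,
       count(*) FILTER (WHERE val1 % 2 = 0) AS even_rows
FROM t_agg
GROUP BY 1;

-- GROUPING SETS is listed as unsupported, so a query like this would
-- presumably keep its standard single-core PostgreSQL plan.
SELECT val1, count(*)
FROM t_agg
GROUP BY GROUPING SETS ((val1), ());
```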
24x7 support:
=============
Cybertec Schönig & Schönig GmbH offers professional 24x7 support, consulting,
and training to professional users deploying PostgreSQL and "agg" in their
environments.
Contact office@cybertec.at for further information.
many thanks,
hans
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt