Home > mailing lists

Performance options for CPU bound multi-SUM query - Mailing list pgsql-general

From	Matt
Subject	Performance options for CPU bound multi-SUM query
Date	January 25, 2016 02:45:20
Msg-id	CA42A9B7-7DF0-4ED8-945C-08328BF9DDB4@gmail.com Whole thread Raw
Responses	Re: Performance options for CPU bound multi-SUM query Re: Performance options for CPU bound multi-SUM query
List	pgsql-general

Tree view

I have a warehousing case where data is bucketed by a key of an hourly timestamp and 3 other columns. In addition there are 32 numeric columns. The tables are partitioned on regular date ranges, and aggregated to the lowest resolution usable.

The principal use case is to select over a range of dates or hours, filter by the other key columns, and SUM() all 32 of the other columns. The execution plan shows the primary key index limits row selection efficiently, but the query appears CPU bound in performing all of those 32 SUM() aggregates.

I am looking at a couple of distributed PostgreSQL forks, but until those reach feature parity with 9.5 I am hoping to stay with single node PostgreSQL.

Are there any other options I can use to improve query times?

Server is 64GB RAM, with work_mem set to 1GB. All SSD storage, with separate RAID-10 volumes for OS, data, and indexes. Additional setting beyond defaults as follows.

default_statistics_target = 500
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
effective_cache_size = 48GB
work_mem = 1GB
wal_buffers = 16MB
checkpoint_segments = 128
shared_buffers = 16GB
max_connections = 20

pgsql-general by date:

From: Christophe Pettus
Date: 25 January 2016, 02:19:21
Subject: Re: CoC [Final v2]

From: Neil
Date: 25 January 2016, 04:00:15
Subject: Re: A motion

Performance options for CPU bound multi-SUM query - Mailing list pgsql-general

Previous

Next