Thread: New server optimization advice

New server optimization advice

From: Steve Crawford
Date: Fri, Jan 9, 2015
I will soon be migrating to some recently acquired hardware and seek
input from those who have gone before.

A quick overview: the dataset is ~100GB (~250 million tuples), with a
workload of about 2/3 writes, mostly single-record inserts into various
indexed tables, and 1/3 reads. Queries per second peak around 2,000,
and our application typically demands fast responses: for many of these
queries the timeout is set to 2 seconds, and the application moves
forward and recovers later if that is exceeded.

Although few in number, every hour there are dozens of import and
analysis queries involving multiple tables and tens of thousands of
records. These queries may take up to a few minutes on our current
hardware.

Old hardware is 4-core, 24GB RAM, battery-backed RAID-10 with four 15k
drives.

New hardware is quite different. 2x10-core E5-2660v3 @2.6GHz, 128GB
DDR4-2133 RAM and 800GB Intel DC P3700 NVMe PCIe SSD. In essence, the
dataset will fit in RAM and will be backed by exceedingly fast storage.

This new machine is very different from any we've had before, so any
current thinking on optimization would be appreciated. Do I leave the
indexes as-is and evaluate which ones to drop later? Any
recommendations on distribution and/or kernels (and kernel tuning)?
PostgreSQL tuning starting points? Whatever comes to mind.

Thanks,
Steve



Re: New server optimization advice

From: Claudio Freire
Date: Fri, Jan 9, 2015
On Fri, Jan 9, 2015 at 4:26 PM, Steve Crawford
<scrawford@pinpointresearch.com> wrote:
> New hardware is quite different. 2x10-core E5-2660v3 @2.6GHz, 128GB
> DDR4-2133 RAM and 800GB Intel DC P3700 NVMe PCIe SSD. In essence, the
> dataset will fit in RAM and will be backed by exceedingly fast storage.
>
> This new machine is very different from any we've had before, so any current
> thinking on optimization would be appreciated. Do I leave the indexes as-is and
> evaluate which ones to drop later? Any recommendations on distribution
> and/or kernels (and kernel tuning)? PostgreSQL tuning starting points?
> Whatever comes to mind.


That's always a good idea (don't optimize prematurely).

Still, you may want to tweak random_page_cost to bring it closer to
seq_page_cost, to get plans better suited to your exceedingly fast
storage (not to mention effective_cache_size, which should be a
given).
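
Concretely, a sketch of the kind of settings I mean (the numbers are
assumptions to benchmark against, not measured recommendations):

  # postgresql.conf -- illustrative starting points for NVMe-backed,
  # mostly-in-RAM data
  seq_page_cost = 1.0            # the default
  random_page_cost = 1.1         # nearly as cheap as sequential on a P3700
  effective_cache_size = 96GB    # roughly 3/4 of the 128GB of RAM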

You'll most likely be CPU-bound, so optimization will involve tweaking
data types.
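
For example (a hypothetical table, just to illustrate the kind of type
choices that cut per-tuple CPU cost):

  -- Leaner fixed-width types compare, hash and aggregate faster
  -- than numeric or text on a CPU-bound box.
  CREATE TABLE events (
      id           bigint      PRIMARY KEY,  -- rather than numeric(20,0)
      created_at   timestamptz NOT NULL,     -- rather than a text timestamp
      amount_cents bigint      NOT NULL      -- rather than numeric(12,2)
  );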

Since you mention lots of writes, I'd imagine you will also want to
tweak shared_buffers and checkpoint_segments to adapt them to your
NVMe card's buffering, and, as with everything new, test or research
the card's crash behavior (i.e., what happens when you pull the
plug). I've heard of SSD storage solutions that got hopelessly
corrupted by pull-the-plug tests, so be careful with that, but you
certainly want to know how this card would behave in a power outage.
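
A sketch of both, assuming write-heavy starting values (guesses to
tune from, not recommendations) and Brad Fitzpatrick's diskchecker.pl
for the pull-the-plug test:

  # postgresql.conf -- assumed starting points for a write-heavy load
  shared_buffers = 16GB                # well under the 128GB of RAM
  checkpoint_segments = 128            # spread checkpoints out
  checkpoint_completion_target = 0.9

  # pull-the-plug test: the listener runs on a second machine so the
  # record of acknowledged writes survives the crash of the box under test
  other$  ./diskchecker.pl -l
  dbbox$  ./diskchecker.pl -s other create test_file 500
  # ...cut power to dbbox, boot it back up, then:
  dbbox$  ./diskchecker.pl -s other verify test_file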


Re: New server optimization advice

From: Merlin Moncure
Date:
On Fri, Jan 9, 2015 at 1:48 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Fri, Jan 9, 2015 at 4:26 PM, Steve Crawford
> <scrawford@pinpointresearch.com> wrote:
>> New hardware is quite different. 2x10-core E5-2660v3 @2.6GHz, 128GB
>> DDR4-2133 RAM and 800GB Intel DC P3700 NVMe PCIe SSD. In essence, the
>> dataset will fit in RAM and will be backed by exceedingly fast storage.
>>
>> This new machine is very different from any we've had before, so any current
>> thinking on optimization would be appreciated. Do I leave the indexes as-is and
>> evaluate which ones to drop later? Any recommendations on distribution
>> and/or kernels (and kernel tuning)? PostgreSQL tuning starting points?
>> Whatever comes to mind.
>
>
> That's always a good idea (don't optimize prematurely).
>
> Still, you may want to tweak random_page_cost to bring it closer to
> seq_page_cost, to get plans better suited to your exceedingly fast
> storage (not to mention effective_cache_size, which should be a
> given).
>
> You'll most likely be CPU-bound, so optimization will involve tweaking
> data types.
>
> Since you mention lots of writes, I'd imagine you will also want to
> tweak shared_buffers and checkpoint_segments to adapt them to your
> NVMe card's buffering, and, as with everything new, test or research
> the card's crash behavior (i.e., what happens when you pull the
> plug). I've heard of SSD storage solutions that got hopelessly
> corrupted by pull-the-plug tests, so be careful with that, but you
> certainly want to know how this card would behave in a power outage.

The Intel DC-branded SSD drives have an excellent safety record so far
(for example, see here: http://lkcl.net/reports/ssd_analysis.html).
You should still test it carefully, though; hopefully that will
validate the previous results.

For fast SSDs, I'd also set effective_io_concurrency to 256.  This
only affects bitmap heap scans, but it can double or triple their
performance.  See: http://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azsTdQ@mail.gmail.com
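
For instance (the table and query are hypothetical; the point is that
only plans containing a Bitmap Heap Scan node benefit):

  -- effective_io_concurrency can be set per session as well as in
  -- postgresql.conf; 256 lets the scan prefetch heap pages deeply.
  SET effective_io_concurrency = 256;
  EXPLAIN (ANALYZE, BUFFERS)
    SELECT sum(amount) FROM orders WHERE customer_id = 42;
  -- look for "Bitmap Heap Scan on orders" in the plan output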

It'd be nice if you could bench and report some numbers for this
device, particularly:
- a large-scale (dataset at least 2x RAM) pgbench select-only test
  (-S), once with a single client and once with many clients
- a large-scale pgbench standard test, single client and many clients
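
Something like this would do it (scale 17500 is an assumption: at
roughly 15MB per scale unit that's ~260GB, about 2x your 128GB of
RAM; the client counts are guesses too):

  createdb bench
  pgbench -i -s 17500 bench          # initialize ~2x RAM of data

  pgbench -S -c 1  -T 300 bench      # select-only, single client
  pgbench -S -c 32 -T 300 bench      # select-only, many clients

  pgbench -c 1  -T 300 bench         # standard TPC-B-like, single client
  pgbench -c 32 -T 300 bench         # standard test, many clients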

merlin