Thread: Large PostgreSQL servers

Large PostgreSQL servers

From: Kjetil Nygård
Hi,

We are considering migrating some of our databases to PostgreSQL.

We wonder if someone could give some hardware / configuration specs for
large PostgreSQL installations.
We're interested in:
    - Number of CPUs
    - Memory on the server
    - shared_buffers
    - Size of the database on disk



PS: I have read in "PostgreSQL 9.0 High Performance" that one should not
use more than 8GB for shared_buffers, but Robert Haas's blog post and its
comments suggest that one can use a lot more:
http://rhaas.blogspot.com/2012/03/tuning-sharedbuffers-and-walbuffers.html



Regards,

Kjetil Nygård



Re: Large PostgreSQL servers

From: Frank Lanitz
On Wed, 21 Mar 2012 20:31:08 +0100
Kjetil Nygård <polpot78@gmail.com> wrote:

> We are considering migrating some of our databases to PostgreSQL.
>
> We wonder if someone could give some hardware / configuration specs
> for large PostgreSQL installations.
> We're interested in:
>     - Number of CPUs
>     - Memory on the server
>     - shared_buffers
>     - Size of the database on disk

I guess this depends heavily on how big your database is ...

Cheers,
Frank
--
Frank Lanitz <frank@frank.uvena.de>


Re: Large PostgreSQL servers

From: Kjetil Nygård
On Wed, 2012-03-21 at 20:45 +0100, Frank Lanitz wrote:
> On Wed, 21 Mar 2012 20:31:08 +0100
> Kjetil Nygård <polpot78@gmail.com> wrote:
>
> > We are considering migrating some of our databases to PostgreSQL.
> >
> > We wonder if someone could give some hardware / configuration specs
> > for large PostgreSQL installations.
> > We're interested in:
> >     - Number of CPUs
> >     - Memory on the server
> >     - shared_buffers
> >     - Size of the database on disk
>
> I guess this depends heavily on how big your database is ...

I know. But I just hope that someone will share some quantitative
numbers about their databases :-)

Our needs are not that great, but I'm curious how many resources
PostgreSQL can consume in practice.


-kny



Re: Large PostgreSQL servers

From: Steve Crawford
On 03/21/2012 12:31 PM, Kjetil Nygård wrote:
> Hi,
>
> We are considering to migrate some of our databases to PostgreSQL.
>
> We wonder if someone could give some hardware / configuration specs for
> large PostgreSQL installations...
You need to tell us a lot more than "large" (a speaker-dependent
description I've heard applied to everything from 100,000-record
databases up to multi-terabyte ones).

For rough starting approximations, how about:

-Estimated data size in GB and number of tuples (see the query sketch
after this list)

-Peak transactions/second required

-Max acceptable transaction latency

-Number of simultaneous connections required

-Nature of workload (OLAP, OLTP, primarily reads or writes)

-Nature of data - is it mostly "vanilla" int/date/text/etc. or is it
large blobs, full-text-search, GIS or the like?

-Maintenance window allowances (running business-hours only or 24x7)
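
If the data already lives in a PostgreSQL instance somewhere, the first
two items are easy to pull.  A minimal sketch (the database name is a
placeholder):

    -- total on-disk size of one database
    SELECT pg_size_pretty(pg_database_size('yourdb'));

    -- approximate live tuple counts per table, from the stats collector
    SELECT relname, n_live_tup
    FROM pg_stat_user_tables
    ORDER BY n_live_tup DESC;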

Cheers,
Steve


Re: Large PostgreSQL servers

From: John R Pierce
On 03/21/12 12:45 PM, Frank Lanitz wrote:
>>     - Number of CPUs
>>     - Memory on the server
>>     - shared_buffers
>>     - Size of the database on disk
> I guess this depends heavily on how big your database is ...

and on how much concurrent access.    48 CPU cores won't help if you're
only ever running 8 queries at once, regardless of how big the database is.



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: Large PostgreSQL servers

From: John R Pierce
On 03/21/12 12:31 PM, Kjetil Nygård wrote:
> We wonder if someone could give some hardware / configuration specs for
> large PostgreSQL installations.
> We're interested in:
>     - Number of CPUs
>     - Memory on the server
>     - shared_buffers
>     - Size of the database on disk

oh, and you left out a couple of /very/ important specifications...

     - number and speed of storage IO channels
     - number and speed (RPM) of disk drives



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast



Re: Large PostgreSQL servers

From: Kjetil Nygård
On Wed, 2012-03-21 at 13:06 -0700, John R Pierce wrote:
> On 03/21/12 12:31 PM, Kjetil Nygård wrote:
> > We wonder if someone could give some hardware / configuration specs for
> > large PostgreSQL installations.
> > We're interested in:
> >     - Number of CPUs
> >     - Memory on the server
> >     - shared_buffers
> >     - Size of the database on disk
>
> oh, and you left out a couple of /very/ important specifications...
>
>      - number and speed of storage IO channels
>      - number and speed (RPM) of disk drives


I understand that IO performance, transactions/s, 24/7 vs. office hours,
data complexity, etc. are also needed to really say how much of a
beating a database can handle.

I just hoped for some simple numbers, but other relevant performance
numbers, etc., would be nice as well :-)



Regards,
Kny



Re: Large PostgreSQL servers

From: Steve Crawford
On 03/21/2012 01:13 PM, Kjetil Nygård wrote:
>
> I just hoped for some simple numbers...
That's exactly what we want in order to help. Since you said you are
considering a migration, you must have a pretty good idea of your
current data and workload.

There is no "one-size-fits-all". Without some specifics we are like an
architect confronted by a client who wants a "structure" but won't say
whether that means an office tower, outhouse, concert hall or parking garage.

Cheers,
Steve


Re: Large PostgreSQL servers

From: Scott Marlowe
On Wed, Mar 21, 2012 at 2:13 PM, Kjetil Nygård <polpot78@gmail.com> wrote:
> I understand that IO performance, transactions/s, 24/7 vs. office hours,
> data complexity, etc. are also needed to really say how much of a
> beating a database can handle.
>
> I just hoped for some simple numbers, but other relevant performance
> numbers, etc., would be nice as well :-)

At my last job we ran a trio of mainline db servers, each with 48
Opteron cores (4x12 @ 2.1GHz), 128GB RAM, and 34 15k SAS drives on Areca
and LSI RAID controllers, as well as plain host-based SAS adapters.

With the RAID controllers we were able to hit somewhere in the 4k to
5k tps range with pgbench on a 40GB test db at around 50 to 64
connections.  Past 64 connections, the numbers would fall to the 2.5k
to 3k range as we headed toward 500 or so.
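
For reference, a run along those lines looks roughly like this
(illustrative flags, not the exact invocation we used; one pgbench scale
unit is roughly 15MB, so -s 2700 yields a ~40GB database, and "bench" is
a placeholder name):

    # build a ~40GB pgbench test database
    pgbench -i -s 2700 bench

    # 64 concurrent clients, 8 worker threads, run for 5 minutes
    pgbench -c 64 -j 8 -T 300 bench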

The actual application was on a ~300GB database, with memcache in front
of it.  When the memcache was working, load averaged about 4 to 12.
When memcache would die for whatever reason, the load would shoot up
to 300 to 500, and response times would go from sub-second to
multi-second.  But the db server would actually stay up under such
extreme load.

Re: Large PostgreSQL servers

From: John R Pierce
On 03/21/12 1:13 PM, Kjetil Nygård wrote:
> I just hoped for some simple numbers, but other relevant performance
> numbers, etc., would be nice as well :-)

42!



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast



Re: Large PostgreSQL servers

From: Merlin Moncure
On Wed, Mar 21, 2012 at 2:31 PM, Kjetil Nygård <polpot78@gmail.com> wrote:
> Hi,
>
> We are considering migrating some of our databases to PostgreSQL.
>
> We wonder if someone could give some hardware / configuration specs for
> large PostgreSQL installations.
> We're interested in:
>        - Number of CPUs
>        - Memory on the server
>        - shared_buffers
>        - Size of the database on disk
>
>
>
> PS: I have read in "PostgreSQL 9.0 High Performance" that one should not
> use more than 8GB for shared_buffers, but Robert Haas's blog post and its
> comments suggest that one can use a lot more:
> http://rhaas.blogspot.com/2012/03/tuning-sharedbuffers-and-walbuffers.html

If your database (or at least, the portion of it that sees regular
activity) fits completely in shared_buffers, it's a win, because shared
buffers are faster than the o/s filesystem cache and they don't have to
get paged in and out.  OTOH, if your database does not fit, you can get
performance issues relating to pages getting pushed in and out.
Another disadvantage of large shared_buffers settings is that it reduces
the amount of memory available for other things, like temporary demands
(sorts, large result sets) or cached structures like plpgsql plans.  Once you
go over 50% memory into shared, it's pretty easy to overcommit your
server and burn yourself.  Of course, 50% of a 256GB server is a very
different animal than 50% of a 4GB server.

Here's the takeaway for shared_buffers:
*) It's a nuanced setting.  For single-user workloads its effects are
usually undetectable.
*) It's more important for high-write-activity workloads.  For low-user,
high-read OLAP-type workloads, I usually set it lower, perhaps even to
256MB -- it doesn't help all that much and I'd rather have that memory
available on demand for the o/s.
*) Don't be afraid to buck the conventional wisdom if you're not seeing
the performance you think you should be getting (especially on writes).
Higher or lower shared_buffers can both work.
*) Lots of other variables are at play -- o/s page flush policy, for example.
*) It's unclear right now what the upcoming revolution in faster storage
means for database configuration and tuning.  My gut feeling is that it's
going to be generally less important as databases become primarily CPU-,
lock-, and algorithm- (query plan) bound.
*) Beware memory overcommit.
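
To make that concrete, here is the sort of starting point I mean,
assuming a 64GB box with a mixed workload.  Every number below is an
illustration to be validated against your own load, not a recommendation:

    # postgresql.conf -- illustrative starting points for a 64GB server
    shared_buffers = 8GB           # conventional-wisdom start; raise or
                                   # lower based on measured results
    effective_cache_size = 48GB    # planner hint: o/s cache + shared_buffers
    work_mem = 32MB                # per sort/hash, per connection -- it
                                   # multiplies fast, so beware overcommit
    maintenance_work_mem = 1GB     # vacuums and index builds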

merlin

Re: Large PostgreSQL servers

From: Scott Marlowe
On Thu, Mar 22, 2012 at 8:46 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> large result sets) or cached structures like plpgsql plans.  Once you
> go over 50% memory into shared, it's pretty easy to overcommit your
> server and burn yourself.  Of course, 50% of a 256GB server is a very
> different animal than 50% of a 4GB server.

There are other issues you run into with large shared_buffers as well.
If you've got a large shared_buffers setting, but only regularly hit a
small subset of your db (say 32GB shared_buffers but only hit 4G or so
regularly in your app) then it's quite possible that older
shared_buffer segments will get swapped out because they're not being
used.  Then, when the db goes to hit a page in shared_buffers, the OS
will have to swap it back in.  What was supposed to make your db much
faster has now made it much slower.

With Linux, the OS tends to swap out unused memory to make room for
file buffers.  While you can change the swappiness settings to 0 to
slow it down, the OS will eventually swap out the least used segments
anyway.  The only solution on large memory servers is often to just
turn off swap.
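
The knobs in question, for reference (standard Linux; the sysctl change
takes effect immediately):

    # discourage (but don't prevent) swapping
    sysctl -w vm.swappiness=0
    echo 'vm.swappiness = 0' >> /etc/sysctl.conf   # persist across reboots

    # the blunt instrument: turn off swap entirely
    swapoff -a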

Re: Large PostgreSQL servers

From: Merlin Moncure
On Thu, Mar 22, 2012 at 10:02 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Thu, Mar 22, 2012 at 8:46 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> large result sets) or cached structures like plpgsql plans.  Once you
>> go over 50% memory into shared, it's pretty easy to overcommit your
>> server and burn yourself.  Of course, 50% of a 256GB server is a very
>> different animal than 50% of a 4GB server.
>
> There are other issues you run into with large shared_buffers as well.
> If you've got a large shared_buffers setting, but only regularly hit a
> small subset of your db (say 32GB shared_buffers but only hit 4G or so
> regularly in your app) then it's quite possible that older
> shared_buffer segments will get swapped out because they're not being
> used.  Then, when the db goes to hit a page in shared_buffers, the OS
> will have to swap it back in.  What was supposed to make your db much
> faster has now made it much slower.
>
> With Linux, the OS tends to swap out unused memory to make room for
> file buffers.  While you can change the swappiness settings to 0 to
> slow it down, the OS will eventually swap out the least used segments
> anyway.  The only solution on large memory servers is often to just
> turn off swap.

Right -- but my take on that is that hacking the o/s to disable swap
is dealing with the symptoms of a problem related to server
misconfiguration.

In particular it probably means shared_buffers is set too high...the
o/s thinks it needs that memory more than you do and it may very well
be right.  The o/s doesn't swap for fun -- it does so when there is
memory pressure and things are under stress.  Generally, unused
memory *should* get swapped out...of course there are exceptions, for
example if you want zero-latency access to an important table that is
only touched once a day.  But those cases are pretty rare.  On systems
with very fast storage (SSD), removing swap is even more unreasonable
-- the penalty for going to storage is less and the server could use
that memory for other things.
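
Either way, it's worth watching actual swap activity before deciding
anything.  The standard tools are enough:

    # si/so columns show swap-in/swap-out activity per interval
    vmstat 5

    # how much swap is in use overall
    free -m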

merlin

Re: Large PostgreSQL servers

From: Scott Marlowe
On Thu, Mar 22, 2012 at 9:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Thu, Mar 22, 2012 at 10:02 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> There are other issues you run into with large shared_buffers as well.
>> If you've got a large shared_buffers setting, but only regularly hit a
>> small subset of your db (say 32GB shared_buffers but only hit 4G or so
>> regularly in your app) then it's quite possible that older
>> shared_buffer segments will get swapped out because they're not being
>> used.  Then, when the db goes to hit a page in shared_buffers, the OS
>> will have to swap it back in.  What was supposed to make your db much
>> faster has now made it much slower.
>>
>> With Linux, the OS tends to swap out unused memory to make room for
>> file buffers.  While you can change the swappiness settings to 0 to
>> slow it down, the OS will eventually swap out the least used segments
>> anyway.  The only solution on large memory servers is often to just
>> turn off swap.
>
> Right -- but my take on that is that hacking the o/s to disable swap
> is dealing with the symptoms of a problem related to server
> misconfiguration.

You can configure a big-memory Linux server any way you want.  After a
while, they seem to go crazy anyway and start swapping even when
you've told them not to.

> In particular it probably means shared_buffers is set too high...the
> o/s thinks it needs that memory more than you do and it may very well
> be right.

I've had machines with 128GB RAM and 4GB of shared_buffers start
swapping for no apparent reason and just fall over.  There's no memory
pressure or anything; kswapd just decides to go nuts and starts swapping.

This was on Ubuntu 10.04 and 12.04 and RHEL 5.2 through 5.latest with
all updates.  These machines typically had ~90GB+ of kernel cache and
zero memory pressure.

Re: Large PostgreSQL servers

From: Merlin Moncure
On Thu, Mar 22, 2012 at 10:57 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Thu, Mar 22, 2012 at 9:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Thu, Mar 22, 2012 at 10:02 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>> There are other issues you run into with large shared_buffers as well.
>>> If you've got a large shared_buffers setting, but only regularly hit a
>>> small subset of your db (say 32GB shared_buffers but only hit 4G or so
>>> regularly in your app) then it's quite possible that older
>>> shared_buffer segments will get swapped out because they're not being
>>> used.  Then, when the db goes to hit a page in shared_buffers, the OS
>>> will have to swap it back in.  What was supposed to make your db much
>>> faster has now made it much slower.
>>>
>>> With Linux, the OS tends to swap out unused memory to make room for
>>> file buffers.  While you can change the swappiness settings to 0 to
>>> slow it down, the OS will eventually swap out the least used segments
>>> anyway.  The only solution on large memory servers is often to just
>>> turn off swap.
>>
>> Right -- but my take on that is that hacking the o/s to disable swap
>> is dealing with the symptoms of a problem related to server
>> misconfiguration.
>
> You can configure a big-memory Linux server any way you want.  After a
> while, they seem to go crazy anyway and start swapping even when
> you've told them not to.
>
>> In particular it probably means shared_buffers is set too high...the
>> o/s thinks it needs that memory more than you do and it may very well
>> be right.
>
> I've had machines with 128GB RAM and 4GB of shared_buffers start
> swapping for no apparent reason and just fall over.  There's no memory
> pressure or anything; kswapd just decides to go nuts and starts swapping.
>
> This was on Ubuntu 10.04 and 12.04 and RHEL 5.2 through 5.latest with
> all updates.  These machines typically had ~90GB+ of kernel cache and
> zero memory pressure.

hm, that's interesting -- noted.  I'll keep an eye out for that.

merlin

Re: Large PostgreSQL servers

From: Vick Khera
On Wed, Mar 21, 2012 at 3:31 PM, Kjetil Nygård <polpot78@gmail.com> wrote:
> We wonder if someone could give some hardware / configuration specs for
> large PostgreSQL installations.
> We're interested in:
>        - Number of CPUs
>        - Memory on the server
>        - shared_buffers
>        - Size of the database on disk

Just yesterday I purchased this system:

   Details:
   CPU:  2 x Opteron 6212 (2.6GHz, 8-Core, G34, 16MB L3 Cache) 115W TDP, 32nm
   RAM:  32GB (8 x 4GB) Operating at 1333MHz Max (DDR3-1333 ECC
Registered DIMMs)
   NIC:  Intel 82576 Dual-Port Gigabit Ethernet Controller - Integrated
   Management:  Integrated IPMI 2.0 & KVM with Dedicated LAN
   PCIe 2.0 x8 - 1:  LSI 9265-8i 6Gb/s SAS/SATA RAID (8-Port Int) with
1GB DDR3 Cache (BBU Consumes 2nd PCI Slot)
   PCIe 2.0 x8 - 2:  No Item Selected
   NOTE:  SAS Drives or More Than 6 SATA Drives Require Controller
(See PCIe 2.0 Slot)
   Hot-Swap Drive - 1:  40GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
   Hot-Swap Drive - 2:  40GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
   Hot-Swap Drive - 3:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
   Hot-Swap Drive - 4:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
   Hot-Swap Drive - 5:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
   Hot-Swap Drive - 6:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
   Hot-Swap Drive - 7:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
   Hot-Swap Drive - 8:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
   Optical Drive:  Low-Profile DVD-ROM Drive
   Power Supply:  Redundant 700W Power Supply with PMBus - 80 PLUS
Gold Certified
   **** Additional Components ****
   Fastpath:  LSI FastPath Software License (Requires 926x 928x controller)

 drives 1+2 RAID 1 for boot + OS
 drives 3+4 RAID 1 for db logs
 drives 5-8 RAID 10 for data

The expected maximum size of the stored data is going to be about
100GB, so the goal was to make the data area about 2x that size, since
SSDs work optimally with lots of room to spare.

My largest current server has 22GB RAM, and I specify 5GB as
shared_buffers.  The most important things in tuning are to get your
random_page_cost right and to have enough checkpoint segments for your
write load.
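
On an all-SSD box like this one, that works out to something like the
following (illustrative values, not measurements; checkpoint_segments is
the 9.x-era knob, later replaced by max_wal_size):

    # postgresql.conf -- illustrative values for an all-SSD server
    random_page_cost = 1.5       # the default of 4.0 assumes spinning
                                 # disks; SSD random reads cost nearly
                                 # the same as sequential reads
    checkpoint_segments = 32     # the default of 3 is far too low for
                                 # a busy write load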

Re: Large PostgreSQL servers

From: Merlin Moncure
On Fri, Mar 23, 2012 at 10:01 AM, Vick Khera <vivek@khera.org> wrote:
> On Wed, Mar 21, 2012 at 3:31 PM, Kjetil Nygård <polpot78@gmail.com> wrote:
>> We wonder if someone could give some hardware / configuration specs for
>> large PostgreSQL installations.
>> We're interested in:
>>        - Number of CPUs
>>        - Memory on the server
>>        - shared_buffers
>>        - Size of the database on disk
>
> Just yesterday I purchased this system:
>
>   Details:
>   CPU:  2 x Opteron 6212 (2.6GHz, 8-Core, G34, 16MB L3 Cache) 115W TDP, 32nm
>   RAM:  32GB (8 x 4GB) Operating at 1333MHz Max (DDR3-1333 ECC
> Registered DIMMs)
>   NIC:  Intel 82576 Dual-Port Gigabit Ethernet Controller - Integrated
>   Management:  Integrated IPMI 2.0 & KVM with Dedicated LAN
>   PCIe 2.0 x8 - 1:  LSI 9265-8i 6Gb/s SAS/SATA RAID (8-Port Int) with
> 1GB DDR3 Cache (BBU Consumes 2nd PCI Slot)
>   PCIe 2.0 x8 - 2:  No Item Selected
>   NOTE:  SAS Drives or More Than 6 SATA Drives Require Controller
> (See PCIe 2.0 Slot)
>   Hot-Swap Drive - 1:  40GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
>   Hot-Swap Drive - 2:  40GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
>   Hot-Swap Drive - 3:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
>   Hot-Swap Drive - 4:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
>   Hot-Swap Drive - 5:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
>   Hot-Swap Drive - 6:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
>   Hot-Swap Drive - 7:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
>   Hot-Swap Drive - 8:  160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
>   Optical Drive:  Low-Profile DVD-ROM Drive
>   Power Supply:  Redundant 700W Power Supply with PMBus - 80 PLUS
> Gold Certified
>   **** Additional Components ****
>   Fastpath:  LSI FastPath Software License (Requires 926x 928x controller)
>
>  drives 1+2 RAID 1 for boot + OS
>  drives 3+4 RAID 1 for db logs
>  drives 5-8 RAID 10 for data
>
> The expected maximum size of the stored data is going to be about
> 100GB, so the goal was to make the data area about 2x that size, since
> SSDs work optimally with lots of room to spare.
>
> My largest current server has 22GB RAM, and I specify 5GB as
> shared_buffers.  The most important things in tuning are to get your
> random_page_cost right and to have enough checkpoint segments for your
> write load.

would love to see some performance #s from your server when it's all set up...

merlin

Re: Large PostgreSQL servers

From: Vick Khera
On Fri, Mar 23, 2012 at 11:28 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> would love to see some performance #s from your server when it's all set up...

I plan on doing some.  This will also be my very first non-FreeBSD
server (it is also running a custom app we just bought, and that
requires CentOS), so this is all very new to me.

Any recommended tests to run?  I was going to try to get Greg Smith
some time on this box before it goes into production to see how his
tests come out, but I haven't heard from him in a while.