Thread: Large PostgreSQL servers
Hi,

We are considering migrating some of our databases to PostgreSQL. We wonder if someone could give some hardware / configuration specs for large PostgreSQL installations. We're interested in:

- Number of CPUs
- Memory on the server
- shared_buffers
- Size of the database on disk

PS: I have read in "PostgreSQL 9.0 High Performance" that one should not use more than 8GB for shared_buffers, but Robert Haas's blog post and its comments say that one can use a lot more:
http://rhaas.blogspot.com/2012/03/tuning-sharedbuffers-and-walbuffers.html

Regards,
Kjetil Nygård
On Wed, 21 Mar 2012 20:31:08 +0100 Kjetil Nygård <polpot78@gmail.com> wrote:

> We are considering migrating some of our databases to PostgreSQL.
>
> We wonder if someone could give some hardware / configuration specs
> for large PostgreSQL installations.
> We're interested in:
> - Number of CPUs
> - Memory on the server
> - shared_buffers
> - Size of the database on disk

I guess this depends very much on how big your database is ...

Cheers,
Frank

--
Frank Lanitz <frank@frank.uvena.de>
On Wed, 2012-03-21 at 20:45 +0100, Frank Lanitz wrote:
> On Wed, 21 Mar 2012 20:31:08 +0100 Kjetil Nygård <polpot78@gmail.com> wrote:
>
> > We are considering migrating some of our databases to PostgreSQL.
> >
> > We wonder if someone could give some hardware / configuration specs
> > for large PostgreSQL installations.
> > We're interested in:
> > - Number of CPUs
> > - Memory on the server
> > - shared_buffers
> > - Size of the database on disk
>
> I guess this depends very much on how big your database is ...

I know. But I just hope that someone will share some quantitative numbers about their databases :-)

Our needs are not the greatest, but I'm curious about how many resources PostgreSQL can consume in practice.

-kny
On 03/21/2012 12:31 PM, Kjetil Nygård wrote:
> Hi,
>
> We are considering migrating some of our databases to PostgreSQL.
>
> We wonder if someone could give some hardware / configuration specs for
> large PostgreSQL installations...

You need to tell us a lot more than "large" (a speaker-dependent description I've heard applied to everything from 100,000-record databases up to multi-terabyte ones). For rough starting approximations, how about:

- Estimated data size in GB and number of tuples
- Peak transactions/second required
- Max acceptable transaction latency
- Number of simultaneous connections required
- Nature of workload (OLAP, OLTP, primarily reads or writes)
- Nature of data - is it mostly "vanilla" int/date/text/etc., or is it large blobs, full-text search, GIS or the like?
- Maintenance window allowances (running business hours only or 24x7)

Cheers,
Steve
On 03/21/12 12:45 PM, Frank Lanitz wrote:
>> - Number of CPUs
>> - Memory on the server
>> - shared_buffers
>> - Size of the database on disk
> I guess this depends very much on how big your database is ...

and how much concurrent access. 48 CPU cores won't help if you're only ever making 8 queries at once, regardless of how big the database is.

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
On 03/21/12 12:31 PM, Kjetil Nygård wrote:
> We wonder if someone could give some hardware / configuration specs for
> large PostgreSQL installations.
> We're interested in:
> - Number of CPUs
> - Memory on the server
> - shared_buffers
> - Size of the database on disk

oh, and you left out a couple of /very/ important specifications...

- number and speed of storage IO channels
- number and speed (RPM) of disk drives

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
On Wed, 2012-03-21 at 13:06 -0700, John R Pierce wrote:
> On 03/21/12 12:31 PM, Kjetil Nygård wrote:
> > We wonder if someone could give some hardware / configuration specs for
> > large PostgreSQL installations.
> > We're interested in:
> > - Number of CPUs
> > - Memory on the server
> > - shared_buffers
> > - Size of the database on disk
>
> oh, and you left out a couple of /very/ important specifications...
>
> - number and speed of storage IO channels
> - number and speed (RPM) of disk drives

I understand that IO performance, transactions/s, 24/7 vs. office hours, data complexity etc. are also needed to really say how much of a beating a database can handle.

I just hoped for some simple numbers, but other relevant performance numbers etc. would be nice as well :-)

Regards,
Kny
On 03/21/2012 01:13 PM, Kjetil Nygård wrote:
>
> I just hoped for some simple numbers...

That's exactly what we want in order to help. Since you said you are considering a migration, you must have a pretty good idea of your current data and workload. There is no "one-size-fits-all". Without some specifics we are like an architect confronted by a client who wants a "structure" but has no idea if that means an office tower, outhouse, concert hall or parking garage.

Cheers,
Steve
On Wed, Mar 21, 2012 at 2:13 PM, Kjetil Nygård <polpot78@gmail.com> wrote:
> I understand that IO performance, transactions/s, 24/7 vs. office hours,
> data complexity etc. are also needed to really say how much of a beating
> a database can handle.
>
> I just hoped for some simple numbers, but other relevant performance
> numbers etc. would be nice as well :-)

At my last job we ran a trio of mainline db servers with 48 Opteron cores (4 x 12-core at 2.1GHz), 128G RAM, and 34 15k SAS drives on Areca and LSI RAID controllers, as well as plain host-based SAS adapters. With the RAID controllers we were able to hit somewhere in the 4k to 5k tps range with pgbench on a 40G test db at around 50 to 64 connections. As we went past 64 connections, the numbers would fall to the 2.5k to 3k range as we headed towards 500 or so.

The actual application ran on a ~300G database, with memcache in front of it. When memcache was working, load averaged about 4 to 12. When memcache would die for whatever reason, the load would shoot up to 300 to 500, and response times would go from sub-second to multi-second. But the db server would actually stay up under such extreme load.
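(For reference, benchmark numbers like those are typically produced with pgbench runs along these lines. This is only a sketch -- the scale factor and client counts below are assumptions chosen to approximate a ~40G test database, not the commands actually used:

    # initialize a pgbench database; one scale unit is roughly 15MB,
    # so a scale factor around 2700 gives on the order of 40GB of data
    createdb pgbench_test
    pgbench -i -s 2700 pgbench_test

    # run for 5 minutes with 64 client connections on 8 worker threads
    # and report transactions per second at the end
    pgbench -c 64 -j 8 -T 300 pgbench_test

Varying -c across runs is how curves like "4-5k tps at 50-64 clients, falling off towards 500" are usually measured.)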
On 03/21/12 1:13 PM, Kjetil Nygård wrote:
> I just hoped for some simple numbers, but other relevant performance
> numbers etc. would be nice as well :-)

42!

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
On Wed, Mar 21, 2012 at 2:31 PM, Kjetil Nygård <polpot78@gmail.com> wrote:
> Hi,
>
> We are considering migrating some of our databases to PostgreSQL.
>
> We wonder if someone could give some hardware / configuration specs for
> large PostgreSQL installations.
> We're interested in:
> - Number of CPUs
> - Memory on the server
> - shared_buffers
> - Size of the database on disk
>
> PS: I have read in "PostgreSQL 9.0 High Performance" that one should not
> use more than 8GB for shared_buffers, but Robert Haas's blog post and its
> comments say that one can use a lot more:
> http://rhaas.blogspot.com/2012/03/tuning-sharedbuffers-and-walbuffers.html

If your database (or at least the portion of it that sees regular activity) fits completely in shared_buffers, it's a win, because shared buffers are faster than the o/s filesystem cache and pages don't have to get paged in and out. OTOH, if your database does not fit, you can get performance issues from pages getting pushed in and out. Another disadvantage of large shared_buffers settings is that they reduce the amount of memory available for other things, like temporary demands (sorts, large result sets) or cached structures such as plpgsql plans. Once you put more than 50% of memory into shared buffers, it's pretty easy to overcommit your server and burn yourself. Of course, 50% of a 256GB server is a very different animal than 50% of a 4GB server.

Here's the takeaway for shared_buffers:

*) It's a nuanced setting. For single-user workloads its effects are usually undetectable.
*) It's more important for high-write-activity workloads. For low-user, high-read OLAP-type workloads I usually set it lower, perhaps even to 256MB -- it doesn't help all that much, and I'd rather have that memory available on demand for the o/s.
*) Don't be afraid to buck the conventional wisdom if you're not seeing the performance you think you should be getting (especially on writes). Higher or lower shared_buffers can both work.
*) Lots of other variables are at play -- o/s page flush policy, for example.
*) It's unclear right now what the upcoming revolution in faster storage means for database configuration and tuning. My gut feeling is that it will generally become less important as databases become primarily CPU-, lock-, and algorithm- (query plan-) bound.
*) Beware memory overcommit.

merlin
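(To make the discussion above concrete, here is a minimal postgresql.conf sketch of the kind of starting point being described, assuming a hypothetical dedicated server with 64GB of RAM -- every value here is an assumption to be validated against your own workload, not a recommendation from this thread:

    shared_buffers = 8GB           # well under 50% of RAM; raise or lower based on measurement
    work_mem = 64MB                # per-sort/per-hash memory, kept modest to avoid overcommit
    maintenance_work_mem = 1GB     # for vacuum, index builds, etc.
    effective_cache_size = 48GB    # planner hint about OS cache size, not an allocation

For a read-mostly OLAP box of the sort mentioned above, shared_buffers might instead be dropped to a few hundred MB, leaving the rest to the filesystem cache.)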
On Thu, Mar 22, 2012 at 8:46 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> large result sets) or cached structures such as plpgsql plans. Once you
> put more than 50% of memory into shared buffers, it's pretty easy to
> overcommit your server and burn yourself. Of course, 50% of a 256GB
> server is a very different animal than 50% of a 4GB server.

There are other issues you run into with large shared_buffers as well. If you've got a large shared_buffers setting but only regularly hit a small subset of your db (say 32GB shared_buffers but only 4G or so touched regularly by your app), then it's quite possible that older shared_buffers segments will get swapped out because they're not being used. Then, when the db goes to hit a page in shared_buffers, the OS has to swap it back in. What was supposed to make your db much faster has now made it much slower.

With Linux, the OS tends to swap out unused memory to make room for file buffers. While you can set swappiness to 0 to slow this down, the OS will eventually swap out the least-used segments anyway. The only solution on large-memory servers is often to just turn off swap.
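(For what it's worth, the Linux knobs being debated here look roughly like this -- a sketch only; whether they should be used at all is exactly the point of disagreement in the following messages:

    # discourage (but not prevent) swapping of anonymous memory
    sysctl -w vm.swappiness=0
    echo 'vm.swappiness = 0' >> /etc/sysctl.conf   # make it persistent across reboots

    # the heavier hammer: disable swap entirely
    swapoff -a

With swap disabled, keep in mind that the OOM killer becomes the failure mode if memory really does run out.)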
On Thu, Mar 22, 2012 at 10:02 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Thu, Mar 22, 2012 at 8:46 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> large result sets) or cached structures such as plpgsql plans. Once you
>> put more than 50% of memory into shared buffers, it's pretty easy to
>> overcommit your server and burn yourself. Of course, 50% of a 256GB
>> server is a very different animal than 50% of a 4GB server.
>
> There are other issues you run into with large shared_buffers as well.
> If you've got a large shared_buffers setting but only regularly hit a
> small subset of your db (say 32GB shared_buffers but only 4G or so
> touched regularly by your app), then it's quite possible that older
> shared_buffers segments will get swapped out because they're not being
> used. Then, when the db goes to hit a page in shared_buffers, the OS
> has to swap it back in. What was supposed to make your db much faster
> has now made it much slower.
>
> With Linux, the OS tends to swap out unused memory to make room for
> file buffers. While you can set swappiness to 0 to slow this down, the
> OS will eventually swap out the least-used segments anyway. The only
> solution on large-memory servers is often to just turn off swap.

Right -- but my take on that is that hacking the o/s to disable swap is treating the symptoms of a problem caused by server misconfiguration. In particular it probably means shared_buffers is set too high... the o/s thinks it needs that memory more than you do, and it may very well be right. The o/s doesn't swap for fun -- it does so when there are memory pressures and things are under stress. Generally, unused memory *should* get swapped out. Of course there are exceptions, for example if you want zero-latency access to an important table that is only touched once a day, but those cases are pretty rare.

On systems with very fast storage (ssd), removing swap is even more unreasonable -- the penalty for going to storage is lower and the server could use that memory for other things.

merlin
On Thu, Mar 22, 2012 at 9:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Thu, Mar 22, 2012 at 10:02 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> There are other issues you run into with large shared_buffers as well.
>> If you've got a large shared_buffers setting but only regularly hit a
>> small subset of your db (say 32GB shared_buffers but only 4G or so
>> touched regularly by your app), then it's quite possible that older
>> shared_buffers segments will get swapped out because they're not being
>> used. Then, when the db goes to hit a page in shared_buffers, the OS
>> has to swap it back in. What was supposed to make your db much faster
>> has now made it much slower.
>>
>> With Linux, the OS tends to swap out unused memory to make room for
>> file buffers. While you can set swappiness to 0 to slow this down, the
>> OS will eventually swap out the least-used segments anyway. The only
>> solution on large-memory servers is often to just turn off swap.
>
> Right -- but my take on that is that hacking the o/s to disable swap
> is treating the symptoms of a problem caused by server misconfiguration.

You can configure a big-memory Linux server any way you want. After a while, they seem to go crazy anyway and start swapping even when you've told them not to.

> In particular it probably means shared_buffers is set too high... the
> o/s thinks it needs that memory more than you do, and it may very well
> be right.

I've had machines with 128GB RAM and a 4GB shared_buffers start swapping for no apparent reason and just fall over. There's no memory pressure, etc.; kswapd just decides to go nuts and start swapping.

This was on Ubuntu 10.04 and 12.04 and RHEL 5.2 through 5.latest with all updates. These machines typically had ~90GB+ of kernel cache and zero memory pressure.
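(If you want to check whether a box of yours is doing this, standard tools are enough to watch for it -- nothing Postgres-specific, just a sketch:

    # the si/so columns show pages swapped in/out per second; nonzero values
    # while there is plenty of free memory are the symptom described here
    vmstat 5

    # overall swap usage
    grep -i swap /proc/meminfo

Comparing that against free memory and kernel cache size shows whether kswapd is swapping despite having plenty of room.)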
On Thu, Mar 22, 2012 at 10:57 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Thu, Mar 22, 2012 at 9:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> Right -- but my take on that is that hacking the o/s to disable swap
>> is treating the symptoms of a problem caused by server misconfiguration.
>
> You can configure a big-memory Linux server any way you want. After a
> while, they seem to go crazy anyway and start swapping even when
> you've told them not to.
>
>> In particular it probably means shared_buffers is set too high... the
>> o/s thinks it needs that memory more than you do, and it may very well
>> be right.
>
> I've had machines with 128GB RAM and a 4GB shared_buffers start
> swapping for no apparent reason and just fall over. There's no memory
> pressure, etc.; kswapd just decides to go nuts and start swapping.
>
> This was on Ubuntu 10.04 and 12.04 and RHEL 5.2 through 5.latest with
> all updates. These machines typically had ~90GB+ of kernel cache and
> zero memory pressure.

hm, that's interesting -- noted. I'll keep an eye out for that.

merlin
On Wed, Mar 21, 2012 at 3:31 PM, Kjetil Nygård <polpot78@gmail.com> wrote:
> We wonder if someone could give some hardware / configuration specs for
> large PostgreSQL installations.
> We're interested in:
> - Number of CPUs
> - Memory on the server
> - shared_buffers
> - Size of the database on disk

Just yesterday I purchased this system:

Details:
CPU: 2 x Opteron 6212 (2.6GHz, 8-Core, G34, 16MB L3 Cache) 115W TDP, 32nm
RAM: 32GB (8 x 4GB) Operating at 1333MHz Max (DDR3-1333 ECC Registered DIMMs)
NIC: Intel 82576 Dual-Port Gigabit Ethernet Controller - Integrated
Management: Integrated IPMI 2.0 & KVM with Dedicated LAN
PCIe 2.0 x8 - 1: LSI 9265-8i 6Gb/s SAS/SATA RAID (8-Port Int) with 1GB DDR3 Cache (BBU Consumes 2nd PCI Slot)
PCIe 2.0 x8 - 2: No Item Selected
    NOTE: SAS Drives or More Than 6 SATA Drives Require Controller (See PCIe 2.0 Slot)
Hot-Swap Drive - 1: 40GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
Hot-Swap Drive - 2: 40GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
Hot-Swap Drive - 3: 160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
Hot-Swap Drive - 4: 160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
Hot-Swap Drive - 5: 160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
Hot-Swap Drive - 6: 160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
Hot-Swap Drive - 7: 160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
Hot-Swap Drive - 8: 160GB Intel 320 Series MLC (3Gb/s) 2.5" SATA SSD
Optical Drive: Low-Profile DVD-ROM Drive
Power Supply: Redundant 700W Power Supply with PMBus - 80 PLUS Gold Certified
**** Additional Components ****
Fastpath: LSI FastPath Software License (Requires 926x/928x controller)

drives 1+2: RAID 1 for boot + OS
drives 3+4: RAID 1 for db logs
drives 5-8: RAID 10 for data

The expected maximum size of the stored data is about 100GB, so the goal was to have the data area about 2x that limit, since these are SSDs and work optimally with lots of room to spare.

My largest current server has 22GB RAM and I specify 5GB as shared_buffers. The most important things in tuning are to get your random_page_cost right and to have enough checkpoint segments for your write load.
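(A hedged sketch of the settings referred to in that last paragraph, with SSD-ish values -- the numbers are illustrative assumptions for an all-SSD box like the one above, not the actual configuration used:

    shared_buffers = 5GB               # the figure mentioned for the 22GB-RAM server
    random_page_cost = 1.5             # close to seq_page_cost, since SSD random reads are cheap
    checkpoint_segments = 64           # more WAL segments between checkpoints for a heavy write load
    checkpoint_completion_target = 0.9 # spread checkpoint writes out over the interval

On 9.x, checkpoint_segments is the knob for WAL volume between checkpoints; raising it trades longer crash recovery for less frequent checkpoint I/O.)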
On Fri, Mar 23, 2012 at 10:01 AM, Vick Khera <vivek@khera.org> wrote:
> On Wed, Mar 21, 2012 at 3:31 PM, Kjetil Nygård <polpot78@gmail.com> wrote:
>> We wonder if someone could give some hardware / configuration specs for
>> large PostgreSQL installations.
>
> Just yesterday I purchased this system:
>
> [...]
>
> drives 1+2: RAID 1 for boot + OS
> drives 3+4: RAID 1 for db logs
> drives 5-8: RAID 10 for data
>
> The expected maximum size of the stored data is about 100GB, so the goal
> was to have the data area about 2x that limit, since these are SSDs and
> work optimally with lots of room to spare.
>
> My largest current server has 22GB RAM and I specify 5GB as
> shared_buffers. The most important things in tuning are to get your
> random_page_cost right and to have enough checkpoint segments for your
> write load.

would love to see some performance #s from your server when it's all set up...

merlin
On Fri, Mar 23, 2012 at 11:28 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> would love to see some performance #s from your server when it's all set up...

I plan on doing some. This will also be my very first non-FreeBSD server (it is also running a custom app we just bought, and that requires CentOS), so this is all very new to me.

Any recommended tests to run? I was going to try to get Greg Smith some time on this box before it goes into production to see how his tests come out, but I haven't heard from him in a while.