Thread: performance tuning: shared_buffers, sort_mem; swap
so a while back, i think i recall a post from Bruce Momjian describing an
optimum setup for shared_buffers at 25% of available physical memory and
sort_mem at 4% of available physical memory. the caveat was to watch for
swap activity when these values were being set.

so my question is, how much swapping is too much, and is there a way to
determine whether postgres is the process causing the swapping? hmm. maybe
that's two questions?... :)

-tfo
Thomas O'Connell wrote:
> so a while back, i think i recall a post from Bruce Momjian describing
> an optimum setup for shared_buffers at 25% of available physical memory
> and sort_mem at 4% of available physical memory.

Yes, those were the numbers to start with.

> the caveat was to watch for swap activity when these values were being
> set. so my question is, how much swapping is too much, and is there a
> way to determine whether postgres is the process causing the swapping?

Well, it doesn't really matter who is causing the swapping. If you have
more of a load on your machine than RAM can hold, you are better off
reducing your PostgreSQL shared buffers.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
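For a concrete feel for those numbers, here is a minimal sketch (not
anything PostgreSQL ships) that applies the 25% / 4% rule of thumb. It
assumes a Linux box, since it reads MemTotal out of /proc/meminfo, and it
assumes the 7.2-era units: shared_buffers counted in 8 kB buffers, sort_mem
in kilobytes.

    /* Rough starting-point calculator for shared_buffers and sort_mem,
     * following the 25% / 4% rule of thumb discussed above.  Assumes
     * Linux (reads MemTotal from /proc/meminfo); the file parsing and
     * output format are illustrative assumptions only.
     */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/meminfo", "r");
        char line[256];
        long mem_kb = 0;

        if (f == NULL)
        {
            perror("/proc/meminfo");
            return 1;
        }
        while (fgets(line, sizeof(line), f) != NULL)
            if (sscanf(line, "MemTotal: %ld kB", &mem_kb) == 1)
                break;
        fclose(f);

        if (mem_kb <= 0)
        {
            fprintf(stderr, "could not find MemTotal\n");
            return 1;
        }

        /* shared_buffers is counted in 8 kB buffers; sort_mem in kB. */
        long shared_buffers = (mem_kb / 4) / 8;   /* 25% of RAM, in 8 kB buffers */
        long sort_mem = mem_kb * 4 / 100;         /* 4% of RAM, in kB */

        printf("shared_buffers = %ld\n", shared_buffers);
        printf("sort_mem = %ld\n", sort_mem);
        return 0;
    }

On a 512 MB machine this works out to shared_buffers = 16384 (128 MB worth
of 8 kB buffers) and sort_mem of roughly 20000 kB.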
In article <200208131556.g7DFuH008873@candle.pha.pa.us>,
 pgman@candle.pha.pa.us (Bruce Momjian) wrote:

> Well, it doesn't really matter who is causing the swapping. If you have
> more of a load on your machine than RAM can hold, you are better off
> reducing your PostgreSQL shared buffers.

So the idea would be:

1. start with the numbers above.
2. benchmark postgres on the machine with those numbers set (creating
   enough load to require plenty of resource use in shared_buffers/sort_mem)
3. monitor swap.
4. if heavy swapping occurs, reduce the amount of shared memory allocated
   to shared_buffers/sort_mem.

right?

on sort of a side note, here's the situation i've got:

i'm currently running postgres on a couple of boxes with decent RAM and
processors. each postgres box, though, is also running several Apache
servers. the Apache servers are running web applications that hit postgres,
so when load on the box is high, it's caused by both Apache and postgres.

we've had the issue before where postgres will die under heavy load
(meaning Apache is logging several requests per minute and stressing
postgres, too) with the error about how probably we don't have shared
memory configured appropriately.

is it possible to set the kernel resources and shared_buffers such that
postgres won't be the point of failure when trying to access more shared
memory than is currently available?

i guess the issue is: when kernel resources are maxed out, does postgres'
architecture mean that when an IPC call fails, it will be the piece of the
system to go down? e.g., if SHMALL/SHMMAX are configured to allow 128MB
shared memory on a box with 512MB RAM, plus a little extra to provide for
Apache, then if postgres is set to have 128MB shared memory, is it a
problem with our settings if postgres crashes when load is high? meaning,
could it be that Apache is using up the extra SHMALL/SHMMAX and postgres
doesn't really have 128MB of shared memory to work with?

the trick, then, would seem to be to monitor swapping, but also to monitor
overall shared memory usage at the upper limits of available resources.

sorry to ramble on. i'm just trying to get a high performance database
running in a stable environment... :)

-tfo
Thomas O'Connell wrote:
> In article <200208131556.g7DFuH008873@candle.pha.pa.us>,
>  pgman@candle.pha.pa.us (Bruce Momjian) wrote:
>
> > Well, it doesn't really matter who is causing the swapping. If you have
> > more of a load on your machine than RAM can hold, you are better off
> > reducing your PostgreSQL shared buffers.
>
> So the idea would be:
>
> 1. start with the numbers above.
> 2. benchmark postgres on the machine with those numbers set (creating
>    enough load to require plenty of resource use in shared_buffers/sort_mem)
> 3. monitor swap.
> 4. if heavy swapping occurs, reduce the amount of shared memory
>    allocated to shared_buffers/sort_mem.
>
> right?

Yes.

> on sort of a side note, here's the situation i've got:
>
> i'm currently running postgres on a couple of boxes with decent RAM and
> processors. each postgres box, though, is also running several Apache
> servers. the Apache servers are running web applications that hit
> postgres, so when load on the box is high, it's caused by both Apache
> and postgres.
>
> we've had the issue before where postgres will die under heavy load
> (meaning Apache is logging several requests per minute and stressing
> postgres, too) with the error about how probably we don't have shared
> memory configured appropriately.
>
> is it possible to set the kernel resources and shared_buffers such that
> postgres won't be the point of failure when trying to access more shared
> memory than is currently available?
>
> i guess the issue is: when kernel resources are maxed out, does
> postgres' architecture mean that when an IPC call fails, it will be the
> piece of the system to go down? e.g., if SHMALL/SHMMAX are configured to
> allow 128MB shared memory on a box with 512MB RAM, plus a little extra
> to provide for Apache, then if postgres is set to have 128MB shared
> memory, is it a problem with our settings if postgres crashes when load
> is high? meaning, could it be that Apache is using up the extra
> SHMALL/SHMMAX and postgres doesn't really have 128MB of shared memory to
> work with?
>
> the trick, then, would seem to be to monitor swapping, but also to
> monitor overall shared memory usage at the upper limits of available
> resources.

Assuming you are running 7.2.X, PostgreSQL grabs all its resources on start
and doesn't grab anything additional. There is some Linux kernel code that
starts to kill processes when resources get low, and that perhaps is what
you are seeing.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Thomas O'Connell wrote:
>> the caveat was to watch for swap activity when these values were being
>> set. so my question is, how much swapping is too much, and is there a
>> way to determine whether postgres is the process causing the swapping?

> Well, it doesn't really matter who is causing the swapping. If you have
> more of a load on your machine than RAM can hold, you are better off
> reducing your PostgreSQL shared buffers.

Right. Any swapping is too much, except maybe under absolute peak load
conditions where you're prepared to accept performance degradation. If the
machine is swapping, that probably means it's already been forced to skinny
the kernel disk cache down to the minimum (since kernel disk buffers
normally are allowed to occupy whatever RAM is left over beyond the needs
of active processes). You want a fat disk cache, not a skinny one ...

			regards, tom lane
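To put a number on "is the box swapping", the sketch below samples the
kernel's swap-in/swap-out counters ten seconds apart and reports the
difference. It assumes a Linux kernel that exposes pswpin/pswpout in
/proc/vmstat; the file name, field names, and ten-second interval are
illustrative assumptions, not anything PostgreSQL provides. (The si/so
columns of the vmstat utility report the same thing.)

    /* Minimal swap-activity check: sample the kernel's swap-in/swap-out
     * counters twice and report the difference.  Assumes /proc/vmstat
     * exists and contains pswpin/pswpout lines; sketch only.
     */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static void read_swap(long *in, long *out)
    {
        FILE *f = fopen("/proc/vmstat", "r");
        char name[64];
        long value;

        *in = *out = -1;
        if (f == NULL)
            return;
        while (fscanf(f, "%63s %ld", name, &value) == 2)
        {
            if (strcmp(name, "pswpin") == 0)
                *in = value;
            else if (strcmp(name, "pswpout") == 0)
                *out = value;
        }
        fclose(f);
    }

    int main(void)
    {
        long in1, out1, in2, out2;

        read_swap(&in1, &out1);
        sleep(10);
        read_swap(&in2, &out2);

        if (in1 < 0 || in2 < 0)
        {
            fprintf(stderr, "could not read swap counters\n");
            return 1;
        }
        printf("pages swapped in:  %ld\n", in2 - in1);
        printf("pages swapped out: %ld\n", out2 - out1);
        return 0;
    }

Anything consistently nonzero under normal load is "too much" in the sense
Tom and Andrew describe here.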
"Thomas O'Connell" <tfo@monsterlabs.com> writes: > we've had the issue before where postgres will die under heavy load > (meaning Apache is logging several requests per minute and stressing > postgres, too) with the error about how probably we don't have shared > memory configured appropriately. Uh, which error exactly? AFAIK all the shared-memory-related messages would only come out at postmaster startup, not during peak load. The system is designed to ensure that those resources are all reserved during postmaster start. regards, tom lane
On Tue, Aug 13, 2002 at 10:42:12AM -0500, Thomas O'Connell wrote:
> the caveat was to watch for swap activity when these values were being
> set. so my question is, how much swapping is too much, and is there a

Any swapping is too much on any system that is supposed to be fast. Swap
overhead is just terrible. If you're swapping on your Postgres machine,
either find something that you can kill off, or reduce your shared buffers
(or both).

A

--
----
Andrew Sullivan                          87 Mowat Avenue
Liberty RMS                              Toronto, Ontario Canada
<andrew@libertyrms.info>                 M6K 3E3
                                         +1 416 646 3304 x110
In article <27214.1029257785@sss.pgh.pa.us>, tgl@sss.pgh.pa.us (Tom Lane)
wrote:

> Uh, which error exactly? AFAIK all the shared-memory-related messages
> would only come out at postmaster startup, not during peak load. The
> system is designed to ensure that those resources are all reserved
> during postmaster start.

Sorry, I should've included the whole error in the first place:

"IpcMemoryCreate: shmget(key=5432001, size=72368128, 03600) failed:
Invalid argument

This error usually means that PostgreSQL's request for a shared memory
segment exceeded your kernel's SHMMAX parameter. You can either reduce the
request size or reconfigure the kernel with larger SHMMAX. To reduce the
request size (currently 72368128 bytes), reduce PostgreSQL's shared_buffers
parameter (currently 8192) and/or its max_connections parameter (currently
256). If the request size is already small, it's possible that it is less
than your kernel's SHMMIN parameter, in which case raising the request size
or reconfiguring SHMMIN is called for.

The PostgreSQL Administrator's Guide contains more information about shared
memory configuration."

We've gotten this when the machine (a Red Hat 6.2 machine with kernel
2.4.17, dual Pentium III 550 MHz processors, and 512 MB RAM) is under great
duress.
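One way to see whether the kernel, rather than PostgreSQL, is the thing
saying no is to try the same shmget() call by hand with the size from the
error message. The sketch below does that with an arbitrary test key
(0x0dead001, chosen here so it won't collide with the postmaster's
segment); it is an illustration, not PostgreSQL code, and a failure points
at SHMMAX/SHMALL or momentary memory pressure rather than at the database
itself.

    /* Probe whether the kernel will grant a shared memory segment of
     * the size PostgreSQL is asking for (72368128 bytes in the error
     * above).  The key is an arbitrary test value, not PostgreSQL's.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <string.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(int argc, char **argv)
    {
        size_t size = (argc > 1) ? (size_t) atol(argv[1]) : 72368128;
        int id = shmget(0x0dead001, size, IPC_CREAT | IPC_EXCL | 0600);

        if (id < 0)
        {
            printf("shmget(%lu bytes) failed: %s\n",
                   (unsigned long) size, strerror(errno));
            return 1;
        }
        printf("shmget(%lu bytes) succeeded\n", (unsigned long) size);
        shmctl(id, IPC_RMID, NULL);   /* clean up the test segment */
        return 0;
    }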
In article <200208131556.g7DFuH008873@candle.pha.pa.us>,
 pgman@candle.pha.pa.us (Bruce Momjian) wrote:

> Yes, those were the numbers to start with.

a quick question about the numbers and postgresql.conf, too. will resetting
the numbers in postgresql.conf require a restart, or is a pg_ctl reload
sufficient? will pg_ctl reload free up any previously claimed shared memory
segments?

thanks!

-tfo
No, you have to restart the postmaster for the changes to take effect. I
think SHOW will show you the current values.

---------------------------------------------------------------------------

Thomas O'Connell wrote:
> In article <200208131556.g7DFuH008873@candle.pha.pa.us>,
>  pgman@candle.pha.pa.us (Bruce Momjian) wrote:
>
> > Yes, those were the numbers to start with.
>
> a quick question about the numbers and postgresql.conf, too. will
> resetting the numbers in postgresql.conf require a restart, or is a
> pg_ctl reload sufficient? will pg_ctl reload free up any previously
> claimed shared memory segments?
>
> thanks!
>
> -tfo

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
In article <tfo-549650.13201613082002@news.hub.org>, Thomas O'Connell
<tfo@monsterlabs.com> wrote:

> "IpcMemoryCreate: shmget(key=5432001, size=72368128, 03600) failed:
> Invalid argument
>
> This error usually means that PostgreSQL's request for a shared memory
> segment exceeded your kernel's SHMMAX parameter. You can either reduce
> the request size or reconfigure the kernel with larger SHMMAX. To reduce
> the request size (currently 72368128 bytes), reduce PostgreSQL's
> shared_buffers parameter (currently 8192) and/or its max_connections
> parameter (currently 256). If the request size is already small, it's
> possible that it is less than your kernel's SHMMIN parameter, in which
> case raising the request size or reconfiguring SHMMIN is called for.

also, is there a way to figure out the request size the postmaster will use
when starting up? it doesn't seem to come directly from the shared_buffers
value.

for instance, if i set shared_buffers to 4096, this should mean postmaster
will require 4096 * 8192 = 33554432 bytes, right? but when i start it, i
get the error above with a request size of 38371328. what's the extra
padding from? sort_mem, etc?

-tfo
"Thomas O'Connell" <tfo@monsterlabs.com> writes: > In article <27214.1029257785@sss.pgh.pa.us>, > tgl@sss.pgh.pa.us (Tom Lane) wrote: >> Uh, which error exactly? AFAIK all the shared-memory-related messages >> would only come out at postmaster startup, not during peak load. The >> system is designed to ensure that those resources are all reserved >> during postmaster start. > Sorry, I should've included the whole error in the first place: > "IpcMemoryCreate: shmget(key=5432001, size=72368128, 03600) failed: > Invalid argument Hmm. That's definitely a startup-time error. The only way that code could be executed later than postmaster startup is if you suffer a database crash and the postmaster is trying to reinitialize the system with a fresh shared-memory arena. That would say that this isn't your primary problem, but a consequence of a crash that'd already occurred. I am curious why you'd get "Invalid argument" (EINVAL), as presumably these are the same arguments that the kernel accepted on the previous cycle of life. But that's probably not the issue to focus on. My bet for the cause of the database crash is the same as Bruce's: the Linux kernel is known to "kill -9" processes at random when it gets under sufficiently heavy memory pressure. If it happens to select a database backend to kill, the postmaster will interpret the backend's unexpected exit as a crash, and will force a database restart. Perhaps when the postmaster tries to reallocate the shmem segment a few milliseconds later, the kernel still thinks it's under load and rejects a shmem request that it'd normally have accepted. (That last bit is just speculation though.) Possible solutions: (a) buy more RAM and/or increase available swap space (I'm not sure whether more swap, without more physical RAM, actually helps; anyone know?); (b) reduce peak load by reducing max_connections and/or scaling back your other servers; (c) switch to another OS --- I don't think the *BSD kernels have this brain-damaged idea about how to cope with low memory... regards, tom lane
In article <1101.1029272567@sss.pgh.pa.us>, tgl@sss.pgh.pa.us (Tom Lane)
wrote:

> Hmm. That's definitely a startup-time error. The only way that code
> could be executed later than postmaster startup is if you suffer a
> database crash and the postmaster is trying to reinitialize the system
> with a fresh shared-memory arena. That would say that this isn't your
> primary problem, but a consequence of a crash that'd already occurred.

Interesting. Particularly interesting because postgres actually
intelligently restarts itself after a crash under duress. We've gotten this
error every time, and postgres is always running properly after a minute or
two of downtime. I've always thought this message was why it died in the
first place, but I guess it's related to a startup failure after the first
crash, instead.

> I am curious why you'd get "Invalid argument" (EINVAL), as presumably
> these are the same arguments that the kernel accepted on the previous
> cycle of life. But that's probably not the issue to focus on.

Right. I think this is related to your speculation, below.

> If it happens to select a database backend to kill, the postmaster will
> interpret the backend's unexpected exit as a crash, and will force a
> database restart.

I guess this is what we're seeing, then. Right before the IPC error, there
are usually several of these:

"NOTICE: Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend died
        abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am going to
        terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query."

I had always thought this just meant that postmaster children were dying.
Does it instead mean that the main backend server is dying repeatedly?
i.e., is this the forced database restart you mention above?

> Perhaps when the postmaster tries to reallocate the shmem segment a few
> milliseconds later, the kernel still thinks it's under load and rejects
> a shmem request that it'd normally have accepted. (That last bit is
> just speculation though.)

I think this is pretty good speculation, considering that after things
settle down a bit, it perks right up. Wow, this is all great stuff to know.

> Possible solutions: (a) buy more RAM and/or increase available swap
> space (I'm not sure whether more swap, without more physical RAM,
> actually helps; anyone know?); (b) reduce peak load by reducing
> max_connections and/or scaling back your other servers; (c) switch to
> another OS --- I don't think the *BSD kernels have this brain-damaged
> idea about how to cope with low memory...

Well, our solution for the time being has been to have saner rate-limiting
so that the web server is not even able to pound the database as much. In
essence, we were experiencing DoS attacks, meaning requests were coming
several times a minute from the same IP. We still accept a reasonable
number of requests for a public web application server, but we've managed
to stop the crashing, for now.

Still, all of this is great added knowledge to the quest for better tuning.
I was under the mistaken impression that my bad memory math was somehow
responsible for postgres being the point of failure during the stress.
Lucky me, as a DBA, to learn otherwise!

Thanks!

-tfo
"Thomas O'Connell" <tfo@monsterlabs.com> writes: > for instance, if i set shared_buffers to 4096, this should mean > postmaster will require 4096 * 8192 = 33554432 bytes, right? but when i > start it, i get the error above with a request size of 38371328. what's > the extra padding from? sort_mem, etc? No, sort_mem is backend-local. postgresql.conf lists these parameters as affecting shared memory size: # # Shared Memory Size # #shared_buffers = 64 # 2*max_connections, min 16 #max_fsm_relations = 100 # min 10, fsm is free space map #max_fsm_pages = 10000 # min 1000, fsm is free space map #max_locks_per_transaction = 64 # min 10 #wal_buffers = 8 # min 4 This list misses max_connections which also has nontrivial influence. The multiplier for shared_buffers is somewhat higher than 8192, because there's a buffer management block (bufhdr) to allocate as well as the physical data block. The lock table is sized on the basis of max_locks_per_transaction * max_connections entries, and there are other shared data structures that are per-backend so are sized as multiples of max_connections. I think the multiplier for max_fsm_pages is 6 bytes; not sure about max_fsm_relations but it's probably a couple dozen bytes per relation. There are also a number of fixed-size data structures in shared memory, and I believe we throw on a 10% fudge factor for good luck after we're done counting up what we think we need. You could work out the effective multipliers for these parameters with a little bit of experimental effort... regards, tom lane
> postgresql.conf lists these parameters as affecting shared memory size:
>
> #       Shared Memory Size

ah, yes. i probably should've guessed.

> You could work out the effective multipliers for these parameters with a
> little bit of experimental effort...

working on it.

thanks for all the information. even seeing the above wouldn't have helped
quite as much as everything else you mentioned. today alone, i've boosted
the stability and performance of our development server tremendously.

thanks for all your help!

-tfo
"Thomas O'Connell" <tfo@monsterlabs.com> writes: >> If it happens to select >> a database backend to kill, the postmaster will interpret the backend's >> unexpected exit as a crash, and will force a database restart. > I guess this is what we're seeing, then. Right before the IPC error, > there are usually several of these: > "NOTICE: Message from PostgreSQL backend: > The Postmaster has informed me that some other backend > died abnormally and possibly corrupted shared memory. > is this the forced database restart you mention above? Yup. The actual sequence of events is: 1. Some backend dies (which in Unix terms is "exits with nonzero status"). We're hypothesizing that the kernel kills it with the equivalent of a "kill -9", but coredumps and other untimely ends would also produce nonzero exit status. Zero status means a normal exit(0) call. 2. The postmaster gets a report that one of its child processes quit. Seeing the nonzero status, it assumes the worst and begins the recovery fire drill. The first step is for it to send SIGQUIT to all its other children and wait for them to exit. 3. The other backends receive SIGQUIT, spit out the "The Postmaster has informed me ..." message to their connected clients, and immediately exit(). 4. When the postmaster has received exit reports for all of its children, it releases its existing shared memory segment and then begins the same procedure it would normally use at startup --- of which one of the first steps is to try to acquire a shmem segment with shmget(). What you are seeing is that that fails. Therefore, the "postmaster has informed me" messages are also post-crash noise. What you want to look for is what happened to trigger this whole sequence. What I'd expect to see is something like the attached log, which I made by issuing a manual kill -9 against a perfectly innocent backend: <<normal operation up to here>> LOG: server process (pid 1700) was terminated by signal 9 LOG: terminating any other active server processes WARNING: Message from PostgreSQL backend: The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory. I have rolled back the current transaction and am going to terminate your database system connection and exit. Please reconnect to the database system and repeat your query. LOG: all server processes terminated; reinitializing shared memory and semaphores LOG: database system was interrupted at 2002-08-13 16:43:33 EDT LOG: checkpoint record is at 0/1C129B8 LOG: redo record is at 0/1C129B8; undo record is at 0/0; shutdown FALSE LOG: next transaction id: 5641; next oid: 156131 LOG: database system was not properly shut down; automatic recovery in progress LOG: ReadRecord: record with zero length at 0/1C129F8 LOG: redo is not required LOG: database system is ready <<normal operation resumes>> (This is with development sources, which label the messages a little differently than prior releases, but you should see pretty much the same text in your postmaster log.) The "terminated by signal 9" part, which the postmaster prints out when it gets a child-death report with a nonzero exit status, is the actually useful information in this series. I have only one "postmaster informed me" message because there was only one other live backend, but in general you might see a bunch of 'em. So, what are you seeing to start the avalanche? regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: [on shared memory size, deleted] Tom> There are also a number of fixed-size data structures in shared Tom> memory, and I believe we throw on a 10% fudge factor for good Tom> luck after we're done counting up what we think we need. Tom> You could work out the effective multipliers for these parameters Tom> with a little bit of experimental effort... I don't often ask a question, but it's been bugging me for some time: is there any reason why PostgreSQL must use SysV shared memory? Coming from the BSD camp, I've often pondered why it doesn't use BSD-style shared memory (which is often easier to allocate in the BSD world). Dave. -- ============================================================================ |David Gilbert, Velocet Communications. | Two things can only be | |Mail: dgilbert@velocet.net | equal if and only if they | |http://daveg.ca | are precisely opposite. | =========================================================GLO================
David Gilbert <dgilbert@velocet.ca> writes:
> I don't often ask a question, but it's been bugging me for some time:
> is there any reason why PostgreSQL must use SysV shared memory?
> Coming from the BSD camp, I've often pondered why it doesn't use
> BSD-style shared memory (which is often easier to allocate in the BSD
> world).

Well, I must say this is the first time I've heard of "BSD-style shared
memory". What are the syscalls? How portable is it? Does it have the
semantics we need (specifically, the ability to associate an ID with a
shmem segment, and the ability to discover whether any other processes are
attached to an existing shmem segment)?

			regards, tom lane
On Tue, 13 Aug 2002, Tom Lane wrote:

> Well, I must say this is the first time I've heard of "BSD-style shared
> memory".

Probably he's talking about mmap. And I think that everything that anyone
has to say on that topic is already in the archives. :-)

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> David Gilbert <dgilbert@velocet.ca> writes: >> I don't often ask a question, but it's been bugging me for some >> time: is there any reason why PostgreSQL must use SysV shared >> memory? Coming from the BSD camp, I've often pondered why it >> doesn't use BSD-style shared memory (which is often easier to >> allocate in the BSD world). Tom> Well, I must say this is the first time I've heard of "BSD-style Tom> shared memory". What are the syscalls? How portable is it? Tom> Does it have the semantics we need (specifically, the ability to Tom> associate an ID with a shmem segment, and the ability to discover Tom> whether any other processes are attached to an existing shmem Tom> segment)? The canonical example is to mmap() a file multiply. BSD will guarentee this is shared memory. Of course, to address the concern that you really don't need a file, the MAP_ANON flag can be given (meaning that there is no file). With anonymous memory, you pass the file descriptor through your fork()'s. With non-anonymous memory, you can mmap() it multiply from different processes and get the same effect. With a glance at the man page, you'd have MAP_SHARED | MAP_ANON | MAP_NOSYNC | MAP_HASSEMAPHORE. One bonus of bsd shared memory (when anonymous) is that it dies with the process ... even in the worst case. Dave. -- ============================================================================ |David Gilbert, Velocet Communications. | Two things can only be | |Mail: dgilbert@velocet.net | equal if and only if they | |http://daveg.ca | are precisely opposite. | =========================================================GLO================
David Gilbert <dgilbert@velocet.ca> writes:
>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> Tom> Well, I must say this is the first time I've heard of "BSD-style
> Tom> shared memory".

> The canonical example is to mmap() a file multiply.

Oh. Yes, as Curt said, this issue has been beaten to death already; see the
archives. You could in fact use a MAP_ANON region as a replacement for SysV
shared memory, but AFAICT you would lose a rather important interlock
against starting a new postmaster when old backends remain alive. Ugly as
the SysV API is, it does some things we need rather nicely.

> One bonus of BSD shared memory (when anonymous) is that it dies with
> the process ... even in the worst case.

But in fact we don't *want* it to be anonymous.

			regards, tom lane
In article <1806.1029277058@sss.pgh.pa.us>, tgl@sss.pgh.pa.us (Tom Lane)
wrote:

> So, what are you seeing to start the avalanche?

Indeed, here it is:

DEBUG:  incomplete startup packet
DEBUG:  pq_recvbuf: unexpected EOF on client connection
DEBUG:  server process (pid 6867) was terminated by signal 9

That's very good to know. This has been a partial mystery to me for some
time. I've been trying to tweak our numbers, stress the system, and I've
been coming at it from the wrong end.

Thanks again, Tom!

-tfo
On Wed, 14 Aug 2002, Tom Lane wrote:

> > One bonus of BSD shared memory (when anonymous) is that it dies with
> > the process ... even in the worst case.
>
> But in fact we don't *want* it to be anonymous.

Just as a little further explanation here, postgres on startup will check
for an existing SHM segment and use the information from that to help
recover (since its existence alone is an indication that an unclean
shutdown occurred). But yeah, dig through the archives and you'll see more
than you ever wanted to know about this topic.

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
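As a rough illustration of that startup check (the idea, not PostgreSQL's
actual code), a new postmaster can look up the segment for its key and ask
how many processes are still attached before deciding whether it is safe to
proceed. The key 5432001 below matches the one in the error message earlier
in the thread.

    /* Sketch of the SysV interlock: find an existing segment by key and
     * report how many processes are still attached to it.
     */
    #include <stdio.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        int id = shmget(5432001, 0, 0);   /* look up, don't create */
        struct shmid_ds buf;

        if (id < 0)
        {
            printf("no existing segment for key 5432001\n");
            return 0;
        }
        if (shmctl(id, IPC_STAT, &buf) < 0)
        {
            perror("shmctl");
            return 1;
        }
        printf("segment exists, %ld process(es) attached\n",
               (long) buf.shm_nattch);
        if (buf.shm_nattch > 0)
            printf("old backends may still be running\n");
        return 0;
    }

A nonzero shm_nattch is exactly the kind of "old backends may still be
alive" signal the SysV API gives for free and an anonymous mmap() region
cannot.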