Thread: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
From: "Kevin Grittner"
Date:
[Forwarding to the -hackers list. Please respond there.]

Craig Ringer <ringerc@ringerc.id.au> wrote:
> On 08/09/2012 04:24 AM, Kevin Grittner wrote:
>> http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
>>
> Can we please please PLEASE link to that as a comment above
> max_connections?
>
> Last time this came up, nobody was happy with the wording of a
> comment, so nothing got done. It's a real usability wart - causing
> real-world performance and reliability problems - that people
> unwittingly raise max_connections to absurd levels because they get
> no warnings, hints, or guidance of any sort.

I see that we currently have five links to wiki.postgresql.org in the
release notes and four more in the rest of the docs. Are people OK
with adding this link to the docs on max_connections? (Feel free to
improve the page before answering if you have qualms about its
specifics.)

We do seem to get an awful lot of posts (between here and
StackOverflow) from people who assume they need one database
connection per active user, and who are then surprised that
performance is horrible. If we get consensus on linking to this, I'll
put together a patch to add a brief comment in the docs with a link
to the wiki.

-Kevin
Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
From: Magnus Hagander
Date:
On Thu, Aug 9, 2012 at 4:22 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> [Forwarding to the -hackers list. Please respond there.]
>
> Craig Ringer <ringerc@ringerc.id.au> wrote:
>> On 08/09/2012 04:24 AM, Kevin Grittner wrote:
>>> http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
>>>
>> Can we please please PLEASE link to that as a comment above
>> max_connections?
>>
>> Last time this came up, nobody was happy with the wording of a
>> comment, so nothing got done. It's a real usability wart - causing
>> real-world performance and reliability problems - that people
>> unwittingly raise max_connections to absurd levels because they
>> get no warnings, hints, or guidance of any sort.
>
> I see that we currently have five links to wiki.postgresql.org in
> the release notes and four more in the rest of the docs. Are people
> OK with adding this link to the docs on max_connections? (Feel free
> to improve the page before answering if you have qualms about its
> specifics.)

I think we should do that in cases where the text is "volatile", such
as when it refers to external products or to things that are likely
to change. I think this one qualifies reasonably well.

However, I wonder if we should somehow keep track of the pages that
*are* linked from the documentation. I assume you just grepped this
time, or something like that? Perhaps we should assign all such pages
a specific category on the wiki, so we can police them a bit extra
hard.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
From: Robert Haas
Date:
On Thu, Aug 9, 2012 at 10:22 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> I see that we currently have five links to wiki.postgresql.org in
> the release notes and four more in the rest of the docs. Are people
> OK with adding this link to the docs on max_connections? (Feel free
> to improve the page before answering if you have qualms about its
> specifics.)

I'm generally not in favor of linking to the wiki unless there's some
reason that it wouldn't be appropriate to include the material in the
documentation in some form. I don't see that that's the case here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
From: "Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> I see that we currently have five links to wiki.postgresql.org in
>> the release notes and four more in the rest of the docs. Are
>> people OK with adding this link to the docs on max_connections?
>> (Feel free to improve the page before answering if you have qualms
>> about its specifics.)
>
> I'm generally not in favor of linking to the wiki unless there's
> some reason that it wouldn't be appropriate to include the material
> in the documentation in some form. I don't see that that's the case
> here.

All right, but we *really* need to actually get *something* into the
docs on this, preferably back-patched. In the -admin thread which
prompted Craig to plead for a link to the wiki page, the OP thought
it was reasonable to worry about how to configure the OOM killer to
deal with the situation in which his 600 connections used all 32GB of
RAM *plus* the 32GB of swap space he had configured. Imagine what
performance must be like by the time things get to *that* point!

The OP had clearly read the docs, because he was attempting to take
advice from the section on Linux memory overcommit. But as far as I
have seen, there is nothing in the docs to suggest connection
pooling. It is far better to avoid the OOM killer through connection
pooling than to tweak the OOM killer's configuration, yet you
wouldn't have a clue about that from the docs.

I actually think the issue merits space in the docs roughly matching
the wiki page. Perhaps we could find a place in the Server Setup and
Operation chapter to more-or-less include the current wiki page
contents, and reference *that* from the max_connections docs?

-Kevin
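[To make the arithmetic behind that failure mode concrete, here is a
back-of-the-envelope sketch in Python; the per-backend figures are
assumptions chosen purely for illustration, not measurements from the
-admin thread:]

    # Assumed figures, for illustration only; real per-backend usage
    # varies widely with work_mem, query complexity, and workload.
    connections = 600
    work_mem_mb = 64   # hypothetical work_mem; each sort/hash step may use this much
    overhead_mb = 10   # assumed per-backend baseline (stack, catalog caches, etc.)

    # Worst case: every connection runs a query holding one full
    # work_mem allocation at once (a complex query can hold several).
    peak_gb = connections * (work_mem_mb + overhead_mb) / 1024
    print(f"potential peak: {peak_gb:.0f} GB")   # ~43 GB, before shared_buffers

[Numbers like these only stay survivable if most of the 600 sessions
sit idle, which is exactly the slack a connection pooler removes the
need for.]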
Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
From: Robert Haas
Date:
On Thu, Aug 9, 2012 at 11:27 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Robert Haas <robertmhaas@gmail.com> wrote:
>> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>>> I see that we currently have five links to wiki.postgresql.org
>>> in the release notes and four more in the rest of the docs. Are
>>> people OK with adding this link to the docs on max_connections?
>>
>> I'm generally not in favor of linking to the wiki unless there's
>> some reason that it wouldn't be appropriate to include the
>> material in the documentation in some form. I don't see that
>> that's the case here.
>
> All right, but we *really* need to actually get *something* into
> the docs on this, preferably back-patched. In the -admin thread
> which prompted Craig to plead for a link to the wiki page, the OP
> thought it was reasonable to worry about how to configure the OOM
> killer to deal with the situation in which his 600 connections used
> all 32GB of RAM *plus* the 32GB of swap space he had configured.
> Imagine what performance must be like by the time things get to
> *that* point!

Sure, I'm not opposed to documenting it. Putting it in the actual
documentation is not a demotion relative to putting a link to it in
the documentation! As for back-patching, we do that regularly with
documentation patches, as the situation seems to warrant.

> The OP had clearly read the docs, because he was attempting to take
> advice from the section on Linux memory overcommit. But as far as I
> have seen, there is nothing in the docs to suggest connection
> pooling. It is far better to avoid the OOM killer through
> connection pooling than to tweak the OOM killer's configuration,
> yet you wouldn't have a clue about that from the docs.

Sounds easily fixable. Let's add a <note> to that section right after
the first paragraph.

> I actually think the issue merits space in the docs roughly
> matching the wiki page. Perhaps we could find a place in the Server
> Setup and Operation chapter to more-or-less include the current
> wiki page contents, and reference *that* from the max_connections
> docs?

I think the page as it exists today needs some rewriting for tone,
but the contents are mostly OK. Also, I am a bit doubtful about the
advice on sizing the connection pool as applied to small servers:
surely it's not sane to recommend that a single-processor system with
one disk should have max_connections = 3. At least, *I* don't think
that's sane.

At least in HEAD/9.2, I like the idea of adding a section to Server
Setup and Operation to address this. Perhaps something like "Managing
Server Load" or "Preventing Server Overload". I think there's quite a
bit we could say in such a chapter apart from the stuff on maximum
connections, and it would be valuable to have that material in our
documentation rather than scattered around the Internet in wiki
articles and random Greg Smith web pages. I'm not sure we want to
back-patch an entire new chapter, but let's put something together
first and then figure out what makes sense.

A systemic problem with our documentation is that it tells you how to
do X, rather than what you ought to do. This is a good example of
that, but not the only one. We need to be careful about putting
advice in the docs that is one person's opinion rather than an actual
best practice, but I think there is plenty of material on which
enough consensus exists that it really ought to be in the docs. Like,
who really uses REINDEX on a production system, rather than CREATE
INDEX CONCURRENTLY + DROP INDEX? Yet you won't find that suggested
anywhere in our documentation, AFAIK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
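[For reference, a minimal sketch of the CREATE INDEX CONCURRENTLY +
DROP INDEX swap Robert describes, shown here via psycopg2; the
connection string, table, and index names are hypothetical. Note that
CREATE INDEX CONCURRENTLY refuses to run inside a transaction block,
hence autocommit:]

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")   # hypothetical DSN
    conn.autocommit = True   # CONCURRENTLY cannot run in a transaction block
    cur = conn.cursor()

    # Build the replacement index without holding a long exclusive
    # lock, then swap it in for the old (e.g. bloated) one.
    cur.execute("CREATE INDEX CONCURRENTLY orders_cust_idx_new"
                " ON orders (customer_id)")
    cur.execute("DROP INDEX orders_cust_idx")
    cur.execute("ALTER INDEX orders_cust_idx_new RENAME TO orders_cust_idx")

[One caveat: this swap does not apply directly to an index backing a
primary key or unique constraint, which needs constraint-level
handling.]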
Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
From: "Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> we *really* need to actually get *something* into the docs on
>> this,
>
> Sure, I'm not opposed to documenting it. Putting it in the actual
> documentation is not a demotion relative to putting a link to it in
> the documentation!

I didn't figure it was; my emphasis was because this has been raised
before and nothing happened for want of consensus on particular
wording, so users were left with no guidance. I don't want this to
continue to be a victim of "the perfect is the enemy of the good"
syndrome.

>> preferably back-patched.
>
> As for back-patching, we do that regularly with documentation
> patches, as the situation seems to warrant.

Yeah, I was arguing that this is one of those situations where it is
warranted.

> Let's add a <note> to that section right after the first paragraph.

OK.

>> I actually think the issue merits space in the docs roughly
>> matching the wiki page. Perhaps we could find a place in the
>> Server Setup and Operation chapter to more-or-less include the
>> current wiki page contents, and reference *that* from the
>> max_connections docs?
>
> I think the page as it exists today needs some rewriting for tone,
> but the contents are mostly OK.

Would you like to edit the tone in wiki form? (If the tone is off, I
suspect it should be fixed in both places.)

> Also, I am a bit doubtful about the advice on sizing the connection
> pool as applied to small servers: surely it's not sane to recommend
> that a single-processor system with one disk should have
> max_connections = 3. At least, *I* don't think that's sane.

I'm not sure it's wrong when combined with this: "Remember that this
'sweet spot' is for the number of connections that are actively doing
work. ... You should always make max_connections a bit bigger than
the number of connections you enable in your connection pool. That
way there are always a few slots available for direct connections for
system maintenance and monitoring." Where would you expect the knee
to be for connections concurrently doing work on a single-core,
single-drive system?

> At least in HEAD/9.2, I like the idea of adding a section to Server
> Setup and Operation to address this. Perhaps something like
> "Managing Server Load" or "Preventing Server Overload". I'm not
> sure we want to back-patch an entire new chapter, but let's put
> something together first and then figure out what makes sense.

As long as we're not drifting (again) into "enemy of the good"
territory. Perhaps we should address this issue first, since it is
particularly acute, and then add others as separate patches?

> A systemic problem with our documentation is that it tells you how
> to do X, rather than what you ought to do. This is a good example
> of that, but not the only one. Like, who really uses REINDEX on a
> production system, rather than CREATE INDEX CONCURRENTLY + DROP
> INDEX? Yet you won't find that suggested anywhere in our
> documentation, AFAIK.

Good points.

-Kevin
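[For reference, the wiki page's starting-point formula that this
exchange is about -- the same 2*CPUs+disks rule Robert names in his
next message -- sketched with hypothetical hardware plugged in:]

    def suggested_pool_size(core_count, effective_spindle_count):
        """Starting point from the Number_Of_Database_Connections
        wiki page: connections *actively doing work*, not total
        sessions."""
        return core_count * 2 + effective_spindle_count

    print(suggested_pool_size(1, 1))    # 3  -- the small-server case Robert questions
    print(suggested_pool_size(16, 40))  # 72 -- a mid-size server with a large array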
Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
From: Robert Haas
Date:
On Thu, Aug 9, 2012 at 1:11 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> I didn't figure it was; my emphasis was because this has been
> raised before and nothing happened for want of consensus on
> particular wording, so users were left with no guidance. I don't
> want this to continue to be a victim of "the perfect is the enemy
> of the good" syndrome.

So, to get the ball rolling here, I spent some time on this today and
added a paragraph to the Linux memory overcommit section of the
documentation. I back-patched it to 9.0; there were additional merge
conflicts in 8.4 which I did not bother to resolve. There may be room
to further improve what I did here; suggestions are welcome. I think
we probably still need to add something to the max_connections
documentation; I have not done that.

>> Also, I am a bit doubtful about the advice on sizing the
>> connection pool as applied to small servers: surely it's not sane
>> to recommend that a single-processor system with one disk should
>> have max_connections = 3. At least, *I* don't think that's sane.
>
> I'm not sure it's wrong when combined with this: "Remember that
> this 'sweet spot' is for the number of connections that are
> actively doing work. ... You should always make max_connections a
> bit bigger than the number of connections you enable in your
> connection pool. That way there are always a few slots available
> for direct connections for system maintenance and monitoring."
> Where would you expect the knee to be for connections concurrently
> doing work on a single-core, single-drive system?

I don't know. But my experience with our customers is that people are
often forced to set the size of the connection pool far larger than
that formula would suggest. Many people are doing transaction-level
pooling, and those people have to look at how many multi-statement
transactions they've got and think about the peak value for that
quantity. It's still worth using pooling because it reduces the
number of simultaneous connections, but it's not going to reduce it
to the kind of values you're talking about.

Also, consider that transactions aren't all the same length. Suppose
90% of your queries execute in 50ms and 10% execute in 60s. Even
though it's less efficient, you've got to set the connection pool
large enough that at least some of the 50ms queries can continue to
get processed even if the maximum number of 60s queries that you ever
expect to see in parallel are already running. This may seem like a
theoretical problem, but we have customers who use connection pools
to get the number of simultaneous connections down to, say, 800. I
guarantee you that these people do not have 200 CPUs and 400 disks,
but they're smart people and they find that smaller pool sizes don't
work.

Sure, we can say that the fine print tells you 2*CPUs+disks is not
REALLY the formula you should use, but it's just so far off what I
see in the field that I have a hard time thinking it really helps
people to give them that as a starting point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
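[Robert's mixed-workload point can be made concrete with Little's law
(L = lambda * W: average concurrency equals arrival rate times
service time). The workload numbers below are invented purely for
illustration:]

    # Invented workload: 10 queries/sec total, 90% short, 10% long.
    qps = 10.0
    short_share, short_secs = 0.9, 0.05   # 50 ms queries
    long_share,  long_secs  = 0.1, 60.0   # 60 s queries

    short_slots = qps * short_share * short_secs  # ~0.45 connections busy on average
    long_slots  = qps * long_share * long_secs    # ~60 connections pinned

    # Even at a modest 10 qps, the slow 10% alone pins ~60
    # connections, and the pool needs headroom above that so short
    # queries aren't starved behind them -- far more than a
    # CPUs-and-disks formula would suggest.
    print(f"minimum sensible pool: well above {long_slots:.0f}")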
Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
From: "Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>>> Also, I am a bit doubtful about the advice on sizing the
>>> connection pool as applied to small servers: surely it's not sane
>>> to recommend that a single-processor system with one disk should
>>> have max_connections = 3. At least, *I* don't think that's sane.
>>
>> I'm not sure it's wrong when combined with this: "Remember that
>> this 'sweet spot' is for the number of connections that are
>> actively doing work. ... You should always make max_connections a
>> bit bigger than the number of connections you enable in your
>> connection pool. That way there are always a few slots available
>> for direct connections for system maintenance and monitoring."
>> Where would you expect the knee to be for connections concurrently
>> doing work on a single-core, single-drive system?
>
> I don't know. But my experience with our customers is that people
> are often forced to set the size of the connection pool far larger
> than that formula would suggest. Many people are doing
> transaction-level pooling, and those people have to look at how
> many multi-statement transactions they've got and think about the
> peak value for that quantity. It's still worth using pooling
> because it reduces the number of simultaneous connections, but it's
> not going to reduce it to the kind of values you're talking about.
> Also, consider that transactions aren't all the same length.
> Suppose 90% of your queries execute in 50ms and 10% execute in 60s.
> Even though it's less efficient, you've got to set the connection
> pool large enough that at least some of the 50ms queries can
> continue to get processed even if the maximum number of 60s queries
> that you ever expect to see in parallel are already running. This
> may seem like a theoretical problem, but we have customers who use
> connection pools to get the number of simultaneous connections down
> to, say, 800. I guarantee you that these people do not have 200
> CPUs and 400 disks, but they're smart people and they find that
> smaller pool sizes don't work.

It is something which has to be considered, and I don't think it's
theoretical at all. Here's how we deal with it. We don't use a plain
FIFO queue for our transaction requests, but a prioritized FIFO with
ten levels of priority (0 to 9). The highest priority (9) is reserved
for "utility" requests -- where a running transaction needs to spin
off a related transaction to do some work for it. For the lowest
level (0) we normally allocate only a single connection, and it is
used for very long-running reports which we want to queue to run one
at a time. As examples of how we categorize queries: filling a large
list in an interactive application runs at priority 3, while
translating a key which must cause a description on the screen to
display runs at priority 8. Normal single-row updates and deletes
from an interactive application run at priority 5.

Each connection in the pool has a worker thread and is assigned a
minimum priority that it will handle. When all threads are busy and
transaction requests are queued, any thread completing a database
transaction pulls from the front of the highest-priority queue with a
waiting request to run a transaction, looking only at priorities
which are not "beneath it". If there are no waiting requests of high
enough priority, the thread waits for one to arrive. (A rough sketch
of this scheme appears below.)
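[A rough sketch of the prioritized-FIFO scheme just described, with
Python threads standing in for pool workers; the class and helper
names are illustrative, not taken from Kevin's actual
implementation:]

    import threading
    from collections import deque

    NUM_PRIORITIES = 10   # 0 (long-running reports) .. 9 ("utility" requests)

    class PrioritizedPool:
        def __init__(self):
            self.queues = [deque() for _ in range(NUM_PRIORITIES)]
            self.work_available = threading.Condition()

        def submit(self, priority, transaction_fn):
            # FIFO within each priority level.
            with self.work_available:
                self.queues[priority].append(transaction_fn)
                self.work_available.notify_all()

        def next_request(self, min_priority):
            # Block until a request at or above this worker's minimum
            # priority exists, then take the oldest request from the
            # highest non-empty eligible queue.
            with self.work_available:
                while True:
                    for p in range(NUM_PRIORITIES - 1, min_priority - 1, -1):
                        if self.queues[p]:
                            return self.queues[p].popleft()
                    self.work_available.wait()

    def worker(pool, min_priority):
        # One worker thread per pooled connection; each runs only
        # transactions at or above its assigned minimum priority.
        while True:
            run_transaction = pool.next_request(min_priority)
            run_transaction()

[A single worker started with min_priority=0 gives the one-at-a-time
report lane; the rest of the pool would be started with higher
minimums.]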
We have found that the formula I presented, when combined with
transactional request queuing like I describe here, gives us our best
performance. I don't have the exact numbers in front of me at the
moment, but on a machine with 16 cores and a 40-drive array (heavily
cached, so that the effective spindle count was lower than that),
servicing thousands of concurrent web users at hundreds of tps, we
improved performance significantly by dropping our connection pool
size from about 60 to about 30, in addition to the separate pool of
six connections handling logical replication from about 80 sources.
That was a real-life production situation, but we also ran a series
of benchmarks and found that across a pretty wide spectrum of
situations the formula I gave fits neatly. If someone is using 800
connections for, say, a 32-core machine with a 200-drive array, I
would suspect that they would get a lot of benefit from a smarter
connection pool.

> Sure, we can say that the fine print tells you 2*CPUs+disks is not
> REALLY the formula you should use, but it's just so far off what I
> see in the field that I have a hard time thinking it really helps
> people to give them that as a starting point.

The point is that it *is* generally really close to the numbers we
have seen here in both benchmarks and production, and I have gotten
comments both on and off the lists from people who have told me that
they tried the formula against their benchmark results and found that
it fit well. Now, this may be dependent on OS or hardware in some way
I haven't recognized, and it seems likely to need adjustment under
9.2, but so far, on Linux with hard drives, it has been a useful
guide for me. It might be instructive to figure out what the
difference is in environments where much larger pools have been found
beneficial.

-Kevin
Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
From: Robert Haas
Date:
On Tue, Sep 4, 2012 at 11:15 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> It is something which has to be considered, and I don't think it's
> theoretical at all. Here's how we deal with it. We don't use a
> plain FIFO queue for our transaction requests, but a prioritized
> FIFO with ten levels of priority (0 to 9). The highest priority (9)
> is reserved for "utility" requests -- where a running transaction
> needs to spin off a related transaction to do some work for it. For
> the lowest level (0) we normally allocate only a single connection,
> and it is used for very long-running reports which we want to queue
> to run one at a time. As examples of how we categorize queries:
> filling a large list in an interactive application runs at priority
> 3, while translating a key which must cause a description on the
> screen to display runs at priority 8. Normal single-row updates and
> deletes from an interactive application run at priority 5.
>
> Each connection in the pool has a worker thread and is assigned a
> minimum priority that it will handle. When all threads are busy and
> transaction requests are queued, any thread completing a database
> transaction pulls from the front of the highest-priority queue with
> a waiting request to run a transaction, looking only at priorities
> which are not "beneath it". If there are no waiting requests of
> high enough priority, the thread waits for one to arrive.

I well believe that with this sort of sophisticated system you can
make the connection pool much smaller and get a benefit out of it.
However, I think it's quite rare for people to have a system this
sophisticated, and I suspect that's why I typically see much larger
pool sizes.

Here's my other thought about this: we talk a lot about how a system
with 32 cores and 40 drives can't do more than 72 things at once, and
that's absolutely true. But I think much of the reason PostgreSQL
users get a benefit out of connection pooling is unrelated to that
effect. What I think we're really working around, in many cases, is
internal lock contention. That's why people are talking about
adjusting the formula for 9.2. It's not that a system with 72
resources can suddenly do more than 72 things; it's that in the old
world lock contention could easily make it a loser to have even half
that many tasks running at once, and now that's less true. Hopefully
we'll make further improvements in the future and it'll become even
less true still.

So is the real issue the hardware limits of the server, or the limits
of our software? The former is certainly in the mix, but I personally
believe the latter has a lot more to do with pool-size selection than
we typically credit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company