Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17) - Mailing list pgsql-hackers
| From | Kevin Grittner |
|---|---|
| Subject | Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17) |
| Date | |
| Msg-id | 5045D4DE0200002500049E0A@gw.wicourts.gov |
| In response to | Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17) (Robert Haas <robertmhaas@gmail.com>) |
| Responses | Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17) |
| List | pgsql-hackers |
Robert Haas <robertmhaas@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>>> Also, I am a bit doubtful about the advice on sizing the
>>> connection pool as applied to small servers: surely it's not
>>> sane to recommend that a single-processor system with one disk
>>> should have max_connections = 3. At least, *I* don't think
>>> that's sane.
>>
>> I'm not sure it's wrong when combined with this: "Remember that
>> this "sweet spot" is for the number of connections that are
>> actively doing work. ... You should always make max_connections
>> a bit bigger than the number of connections you enable in your
>> connection pool. That way there are always a few slots available
>> for direct connections for system maintenance and monitoring."
>> Where would you expect the knee to be for connections
>> concurrently actively doing work on a single-core, single-drive
>> system?
>
> I don't know. But my experience with our customers is that people
> are often forced to set the size of the connection pool far larger
> than what that formula would suggest. Many people are doing
> transaction-level pooling, and for those people, they've got to
> look at how many multi-statement transactions they've got and
> think about what the peak value for that quantity is. It's still
> worth using pooling because it reduces the number of simultaneous
> connections, but it's not going to reduce it to the kind of values
> you're talking about. Also, consider that transactions aren't all
> the same length. Suppose 90% of your queries execute in 50ms, and
> 10% execute in 60s. Even though it's less efficient, you've got
> to set the connection pool large enough that at least some of the
> 50ms queries can continue to get processed even if the maximum
> number of 60s queries that you ever expect to see in parallel are
> already running. This may seem like a theoretical problem, but we
> have customers who use connection pools to get the number of
> simultaneous connections down to, say, 800. I guarantee you that
> these people do not have 200 CPUs and 400 disks, but they're smart
> people and they find that smaller pool sizes don't work.

It is something which has to be considered, and I don't think it's theoretical at all. Here's how we deal with it.

We don't use a plain FIFO queue for our transaction requests, but a prioritized FIFO with 10 levels of priority (0 to 9). The highest priority (9) is reserved for "utility" requests -- where a running transaction needs to spin off a related transaction to do some work for it. For the lowest level (0) we normally allocate only a single connection, and it is used for very long-running reports which we want to queue to run one at a time. As examples of how we categorize queries, filling a large list in an interactive application runs at priority 3, while translating a key to the description which must display on the screen runs at priority 8. Normal single-row updates and deletes from an interactive application run at priority 5.

Each connection in the pool has a worker thread and is assigned a minimum priority that it will handle. When all threads are busy and transaction requests are queued, any thread completing a database transaction pulls from the front of the highest-priority queue with a waiting request to run a transaction, looking only at priorities which are not "beneath it". If there are no waiting requests of high enough priority, the thread waits for one to arrive.
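A minimal Python sketch of the scheme described above, for readers who want the mechanics spelled out. This is not the actual pool code from wicourts.gov; the names (`PrioritizedPool`, `submit`, `worker`) and the threading details are illustrative assumptions:

```python
import threading
from collections import deque

class PrioritizedPool:
    """Connection pool fed by a prioritized FIFO with levels 0-9.

    Priority 9 is reserved for "utility" transactions spun off by
    running transactions; level 0 gets a single connection for
    long-running reports queued one at a time.  Each worker thread
    owns one database connection and a minimum priority it handles.
    """

    LEVELS = 10

    def __init__(self):
        self._ready = threading.Condition()
        self._queues = [deque() for _ in range(self.LEVELS)]

    def submit(self, priority, transaction):
        """Queue a callable taking a connection; FIFO within a level."""
        with self._ready:
            self._queues[priority].append(transaction)
            # Wake all workers: only those whose floor is low enough
            # may take this request, so a single notify could starve it.
            self._ready.notify_all()

    def _take(self, min_priority):
        # Scan from the highest level down to this worker's floor and
        # pull the oldest request from the first non-empty queue.
        for level in range(self.LEVELS - 1, min_priority - 1, -1):
            if self._queues[level]:
                return self._queues[level].popleft()
        return None

    def worker(self, connection, min_priority):
        """Loop run in one thread per pooled connection."""
        while True:
            with self._ready:
                txn = self._take(min_priority)
                while txn is None:   # nothing at or above our floor
                    self._ready.wait()
                    txn = self._take(min_priority)
            txn(connection)          # execute outside the lock
```

Starting one worker with `min_priority=0` on a dedicated connection, and the rest with higher floors, reproduces the allocation described above: the report queue drains one at a time while the other workers skip past it to higher-priority work.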
We have found that the formula I presented, when combined with transactional request queuing like I describe here, gives us our best performance. I don't have the exact numbers in front of me at the moment, but on a machine with 16 cores and a 40-drive array (but heavily cached, so that the effective spindle count was lower than that), servicing thousands of concurrent web users at hundreds of tps, we improved performance significantly by dropping our connection pool size from about 60 to about 30, in addition to the separate pool of six which handles logical replication from about 80 sources. That was a real-life production situation, but we ran a series of benchmarks and found that across a pretty wide spectrum of situations the formula I gave fits pretty neatly. If someone is using 800 connections for, say, a 32-core machine with a 200-drive array, I would suspect that they would get a lot of benefit from a smarter connection pool.

> Sure, we can say, well, the fine print tells you that 2*CPUs+disks
> is not REALLY the formula you should use, but it's just so far off
> what I see in the field that I have a hard time thinking it's
> really helping people to give them that as a starting point.

The point is that it *is* generally really close to the numbers we have seen here in both benchmarks and production, and I have gotten comments both on and off the lists from people who have told me that they tried that formula against their benchmark results and found that it fit well. Now, this may be dependent on OS or hardware in some way that I haven't recognized, and it seems likely to need adjustment under 9.2, but so far on Linux with hard drives it has been a useful guide for me. It might be instructive to figure out what the difference is in environments where much larger pools have been found beneficial.

-Kevin
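For reference, the starting-point formula the thread is debating, applied to the two machines mentioned above. A sketch only: the effective spindle count of the heavily cached 40-drive array is not given in the message, so the values below are illustrative.

```python
def pool_size(core_count, effective_spindle_count):
    # The formula under discussion: (2 * CPUs) + disks, counting only
    # connections that are actively doing work.
    return core_count * 2 + effective_spindle_count

# 16 cores; the 40-drive array was heavily cached, so assume (for
# illustration) an effective spindle count near zero:
print(pool_size(16, 0))    # -> 32, near the ~30-connection pool that won

# The hypothetical 32-core, 200-drive box from the message:
print(pool_size(32, 200))  # -> 264, still far below an 800-connection pool
```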