Thread: Simmultanous Connections (fwd)
Can anyone here help? Vince. ---------- Forwarded message ---------- Date: Mon, 10 Jan 2000 08:52:06 +0000 From: Jude Weaver <exec@shreve.net> To: webmaster@postgresql.org Subject: Simmultanous Connections We are a company that writes academic software . We are converting our software to use either PostgreSQL or MySQL. We are leaning toward PostgreSQL, but, I still have several questions. I hope someone can answer these for me. 1. I have read the Q&A for postgreSQL and would like to know the difference between a temporary and a permanant connection. Do you have a connection when you open the database or only when the frontend sends a job to the backend? If 32 people are running a module that opens a database is that 32 connections or will it vary as users read and write to the database? 2. I saw in the Q&A that to run more than 32 simmultanous connects could be a big drain on our re- sources. Our Linux boxes , in general, are Intel 166 to 500s, 128MG of RAM and 6.2 to 13 GIG. Can anyone tell me roughly how much resources per connection does PostgreSQL use? 3. If I have 90 teachers posting grades at the same time, (the grade posting program opens 5 dif- ferent databases) and 25 secretaries and administrators poking around in assorted databases looking at information, will postgresql handle that much traffic? I would appreciate any information you can give me, Thank you - Jude Weaver.
> Can anyone here help? > > Vince. > > ---------- Forwarded message ---------- > Date: Mon, 10 Jan 2000 08:52:06 +0000 > From: Jude Weaver <exec@shreve.net> > To: webmaster@postgresql.org > Subject: Simmultanous Connections > > We are a company that writes academic software . We are converting our > software to use either PostgreSQL or MySQL. We are leaning toward > PostgreSQL, but, I still have several questions. > I hope someone can answer these for me. > > 1. I have read the Q&A for postgreSQL and would like to know the > difference between a temporary > and a permanant connection. Do you have a connection when you open > the database or only when > the frontend sends a job to the backend? If 32 people are running > a module that opens a database > is that 32 connections or will it vary as users read and write to > the database? Sounds like she may looking at postgres in PHP - at least PHP uses that temporary and permanant connection concept. My experience is that PHP persistent connections are not worth it - the time to establish a new connection is pretty small, and stale connections can cause problems. > 2. I saw in the Q&A that to run more than 32 simmultanous connects could > be a big drain on our re- > sources. Our Linux boxes , in general, are Intel 166 to 500s, 128MG > of RAM and 6.2 to 13 GIG. > Can anyone tell me roughly how much resources per connection does > PostgreSQL use? If an idle psql connection is left open, we're looking at about 1 MB RAM plus 4MB swap on my linux box. As I noted above, I'd generally recommend against persistent connections when there are more than a few users. Sounds like the machines have the capacity for what sounds like a fairly small task. Of course, there would generally be only one server machine, so I would recommend choosing one of the faster ones. But it should be stable and usable ath eith end of the spectrum, at least from my experience. > 3. If I have 90 teachers posting grades at the same time, (the grade > posting program opens 5 dif- > ferent databases) and 25 secretaries and administrators poking > around in assorted databases > looking at information, will postgresql handle that much traffic? Postgres should handle that easily. Just my $0.02 worth. Hope it's helpful. -- Karl DeBisschop <kdebisschop@alert.infoplease.com> 617.832.0332 (Fax: 617.956.2696) Information Please - your source for FREE online reference http://www.infoplease.com - Your Ultimate Fact Finder http://kids.infoplease.com - The Great Homework Helper Netsaint Plugins Development http://netsaintplug.sourceforge.net
On Mon, 10 Jan 2000, Vince Vielhaber wrote: > 1. I have read the Q&A for postgreSQL and would like to know the > difference between a temporary and a permanant connection. Do you have > a connection when you open the database or only when the frontend > sends a job to the backend? If 32 people are running a module that > opens a database is that 32 connections or will it vary as users read > and write to the database? You will have 32 connections open to the backend ... > 2. I saw in the Q&A that to run more than 32 simmultanous connects > could be a big drain on our re- sources. Our Linux boxes , in general, > are Intel 166 to 500s, 128MG of RAM and 6.2 to 13 GIG. Can anyone > tell me roughly how much resources per connection does PostgreSQL use? It depends on what the connections are doing...if someone is doing a 'SELECT...ORDER BY', it will take more resources then if you are doing something that doesn't involve any sort routines... > 3. If I have 90 teachers posting grades at the same time, (the grade > posting program opens 5 dif- ferent databases) and 25 secretaries and > administrators poking around in assorted databases looking at > information, will postgresql handle that much traffic? 5 different databases, vs 5 different tables? 5 different databases will mean 90 x 5 (450) connections opened up...whereas 5 tables would be just 90 connections... ... but, either way, will it handle that much traffic? give it enough RAM, and I personally don't see why not, but I've yet to hit *that* kind of a load on it. Right now, I have PostgreSQL setup to handle several databases, and the postmaster processes each take up ~4-5Meg: hub> ps aux | grep data pgsql 895 0.0 0.2 4508 1416 d0- S 6:52AM 0:00.98 /home/database/v pgsql 896 0.0 0.2 3976 1308 d0- I 6:52AM 0:00.02 /home/database/v When I open up a session/connection to a database, I'm seeing: pgsql 71041 5.1 0.4 5028 3492 ?? R 11:40AM 0:00.54 /home/database/v pgsql 71032 0.0 0.4 4992 3148 ?? S 11:40AM 0:00.02 /home/database/v pgsql 71034 0.0 0.4 4980 2976 ?? S 11:40AM 0:00.02 /home/database/v Now, I always get this backwards/confused, but...the first value (ie. 4508) is the binary size, which is mis-informed due to the use of shared libraries... the important one is the second value (ie. 1416), which, again, if I recall correctly, is the datasize...for the udmsearch database, just starting up 'psql udmsearch', each database is taking <3.5Meg...depending on the sizes of your queries and whatnot, figure that I'd need 3.5Meg*450 (~1.5gig) of memory on this machine to handle it (I have half of that now)...bear in mind that not all 450 connections would be active, so there is room for some processes to be swap'd out and whatnot.. My personal opinion is that there isn't anything that PostgreSQL hasn't been able to handle so far, to the best of my knowledge...my next step for my system is to go dual-processor, and bring on a full gig of RAM, but my machine also does alot more then just PostgreSQL :) Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
At 11:46 AM 1/10/00 -0500, Karl DeBisschop wrote: >Sounds like she may looking at postgres in PHP - at least PHP uses >that temporary and permanant connection concept. My experience is >that PHP persistent connections are not worth it - the time to >establish a new connection is pretty small, and stale connections can >cause problems. Boy, persistent connections in AOLserver sure help a lot (ask Lamar Owen!). If stale connections cause problems in your PHP environment, then the PHP persistent connection implementation needs some work. Forking a new backend is actually considerably more expensive then just passing back the PID of an existing backend... On Sun Solaris systems, forking is about 25 times as costly as starting up a new thread (according to data from Sun). Of course, returning an existing persistent db connection's even cheaper than starting a new thread. And that comparative cost will vary between OS. But not necessarily in a direction favoring more forking :) I sent her a private note saying she really probably shouldn't be looking at MySQL for her application, presumably having a real transaction-based db is a Good Thing when maintaining a database of student grades. Told her she should be looking at various real RDBMS solutions and should leave MySQL out of the picture entirely (while also telling her I thought PG would work fine for her needs, of course). - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
Don Baccus wrote: > > At 11:46 AM 1/10/00 -0500, Karl DeBisschop wrote: > > >Sounds like she may looking at postgres in PHP - at least PHP uses > >that temporary and permanant connection concept. My experience is > >that PHP persistent connections are not worth it - the time to > >establish a new connection is pretty small, and stale connections can > >cause problems. > Boy, persistent connections in AOLserver sure help a lot (ask Lamar > Owen!). If stale connections cause problems in your PHP environment, > then the PHP persistent connection implementation needs some work. Let's work some math. Under AOLserver, using the pooled connection paradigm that it uses, for 5 databases, you would need to define 5 pools. You then can control how many instances of each pool can be opened at any given time. So, if all databases need the same number of connections average, you raise the max on pool instances until users quit getting busy messages during normal usage -- which usually , for a small number of users (~25 here), is only 2 or 3 instances. The persistent pooled model avoids fork() penalties -- after all, there is overhead there, regardless of how small that overhead may be. I have gone as far as reducing the instances to 1 here -- it's amazing how few people actually do simultaneous accesses! I currently am running with an instance max of 3 -- and users get busy's very rarely. With 90 users on a single database with 5 tables, an instance max of 10-20 would probably give less than a 10% busy rate. And, as you add more RAM, you can up your instance max to adjust. I don't know how close to the AOLserver model PHP is (I think it is pretty close, as the beta of PHP4 is buildable to run as a module under AOLserver), but the concept of pooled persistent connections is a sound one, and eliminates some grief (as long as you watch your transactions -- don't want two connections that happen to share a pool instance to share a transaction roolback!). Plus, you can service that required nnumber of users at varying satisfaction/busy levels depending upon your current server resources. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
> Boy, persistent connections in AOLserver sure help a lot (ask Lamar > Owen!). If stale connections cause problems in your PHP environment, > then the PHP persistent connection implementation needs some work. This isn't really a hackers issue, so I'll try to be brief but also give a little more info than I originally did. Maybe any further discussion would be best placed in pgsql-general. Basically, I think it may depend on the use - for our website, we get connections from a variety of sources - most of them don't repeat for a long time, if ever. Which means a bunch sit around at any given time, never to be reused. If the new connections come fast enough, this can translate to real problems unless they timeout quickly, which defeats the purpose. That being said, maybe the PHP implementaion does need some work, or maybe there are site parameters we could tune to make it work. But whenever we use it, we do eventually end up in trouble as a result. So, personally, I don't recommend it in situations where alot of different clients will be connecting to the DBMS - at least if low maintennence is a key goal. > Forking a new backend is actually considerably more expensive then > just passing back the PID of an existing backend... >From the point of view of the server, absolutely. But that connection time is still a very small part of the user's total trransaction time. And, although I am making alot of guesses as to the nature of the planned DB will be, my guess is that overall machine load will not be so high that the process forking becomes critical. My guess is that support will be hard to come by in alot of public school environments, so I'd guess their building for trouble-free operation before speed. > I sent her a private note saying she really probably shouldn't be looking > at MySQL for her application, presumably having a real transaction-based > db is a Good Thing when maintaining a database of student grades. Told > her she should be looking at various real RDBMS solutions and should leave > MySQL out of the picture entirely (while also telling her I thought PG > would work fine for her needs, of course). That's a good summary of my intended take-home point as well, though you said it much more clearly. All the rest was just personal experience that applies to our environment but my not apply to yours or hers. Karl
At 01:35 PM 1/10/00 -0500, Lamar Owen wrote: >I don't know how close to the AOLserver model PHP is (I think it is >pretty close, as the beta of PHP4 is buildable to run as a module under >AOLserver), but the concept of pooled persistent connections is a sound >one, and eliminates some grief (as long as you watch your transactions >-- don't want two connections that happen to share a pool instance to >share a transaction roolback!). Spoken like a long-suffering user of AOLserver's original postgres driver :) I've solved this particular problem in the latest version of the driver, and other problems related to backends crashing and the like. This is why I suggest that if there are problems with PHPs persistent database connections and Postgres that the PHP implementation of such connections needs work. I know from experience that persistent pooled connections can be implemented in a non-robust fashion (the old postgres driver for AOLserver) but I also know that they can be made robust, from personal experience. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
At 01:38 PM 1/10/00 -0500, Karl DeBisschop wrote: >This isn't really a hackers issue, so I'll try to be brief but also >give a little more info than I originally did. Maybe any further >discussion would be best placed in pgsql-general. Perhaps. I'll give one brief answer here, though. It probably doesn't hurt the developers to see how their product is used in real-life scenarios anyway... >Basically, I think it may depend on the use - for our website, we get >connections from a variety of sources - most of them don't repeat for >a long time, if ever. Which means a bunch sit around at any given >time, never to be reused. If the new connections come fast enough, >this can translate to real problems unless they timeout quickly, which >defeats the purpose. >That being said, maybe the PHP implementaion does need some work, or >maybe there are site parameters we could tune to make it work. But >whenever we use it, we do eventually end up in trouble as a result. My short answer: yes, it does need work if it works as you describe. The whole point of pooling persistent connections is to allow re-use. It sounds like either PHP makes it hard/impossible or that (maybe?) you folks haven't quite figured out how fully exploit their implementation of pooled connections. >So, personally, I don't recommend it in situations where alot of >different clients will be connecting to the DBMS - at least if low >maintennence is a key goal. The problem isn't persistent connections, the problem is the particular implementation you're using. AOLserver's implementation is trouble free, for Postgres, Sybase, Oracle, and Solid. And totally transparent to scripts and dynamic pages (other than SQL differences due to the dbs themselves). The PHP folks are making it available within AOLserver, as Lamar Owen has pointed out. If they also plug into the AOLserver implementation of pooled persistent database connections, then PHP users will also have a platform available which reliably supports such connections. >> Forking a new backend is actually considerably more expensive then >> just passing back the PID of an existing backend... > >>From the point of view of the server, absolutely. But that connection >time is still a very small part of the user's total trransaction time. Depends on how you're using the database. If you're using it to personalize pages, for instance, you'll be using a lot of simple, quick selects. If you're only using the database for complicated, slow queries then perhaps you're right. Let's put it this way ... folks who have a lot more experience than me at running very busy database-backed web sites have observed that it DOES make a large difference in the scalability of a site. These, though, are sites make heavy use of the database when serving up pages. If forking weren't a problem, the Apache folks wouldn't've bothered building modPerl, for instance... >And, although I am making alot of guesses as to the nature of the >planned DB will be, my guess is that overall machine load will not be >so high that the process forking becomes critical. My guess is that >support will be hard to come by in alot of public school environments, >so I'd guess their building for trouble-free operation before speed. Nothing to disagree with here, other than the fact that my own personal experience tells me that persistent connections needn't be a source of trouble. If they are PHP users, though, and if the site really is using PHP as you suspect, then they should probably avoid them if your experience is an accurate reflection of the state of the implementation of persistent connections available to PHP users. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
When I was researching PG vs MySQL - the big kicker was transaction support (mySQL doesn't have it). It looks like MySQL is faster than PG and so is great for serving data to a web site. But if you also need to perform updates to multiple tables, you probably will want to use transactions - so use PostgreSQL. I chose PostgreSQL for this reason. The PostgreSQL team is incredible. To try and contribute a little, we switched our development from the released software to the PG development software (so we can report, and sometimes fix problems) - any bugs we discovered but couldn't fix ourselves were fixed within hours. Our own modifications were checked and entered in their development tree in less than an hour. Cool. The code itself is of very high quality. Steve Vince Vielhaber wrote: > Can anyone here help? > > Vince. > > ---------- Forwarded message ---------- > Date: Mon, 10 Jan 2000 08:52:06 +0000 > From: Jude Weaver <exec@shreve.net> > To: webmaster@postgresql.org > Subject: Simmultanous Connections > > We are a company that writes academic software . We are converting our > software to use either PostgreSQL or MySQL. We are leaning toward > PostgreSQL, but, I still have several questions. > I hope someone can answer these for me. > > 1. I have read the Q&A for postgreSQL and would like to know the > difference between a temporary > and a permanant connection. Do you have a connection when you open > the database or only when > the frontend sends a job to the backend? If 32 people are running > a module that opens a database > is that 32 connections or will it vary as users read and write to > the database? > > 2. I saw in the Q&A that to run more than 32 simmultanous connects could > be a big drain on our re- > sources. Our Linux boxes , in general, are Intel 166 to 500s, 128MG > of RAM and 6.2 to 13 GIG. > Can anyone tell me roughly how much resources per connection does > PostgreSQL use? > > 3. If I have 90 teachers posting grades at the same time, (the grade > posting program opens 5 dif- > ferent databases) and 25 secretaries and administrators poking > around in assorted databases > looking at information, will postgresql handle that much traffic? > > I would appreciate any information you can give me, > Thank you - Jude Weaver. > > ************
Karl DeBisschop writes:> Sounds like she may looking at postgres in PHP - at least PHP uses> that temporary and permanantconnection concept. My experience is> that PHP persistent connections are not worth it - the time to> establisha new connection is pretty small, and stale connections can> cause problems.> > > 2. I saw in the Q&A that torun more than 32 simmultanous connects could> > be a big drain on our re-> > sources. Our Linux boxes , in general,are Intel 166 to 500s, 128MG> > of RAM and 6.2 to 13 GIG.> > Can anyone tell me roughly how much resourcesper connection does> > PostgreSQL use?> > If an idle psql connection is left open, we're looking at about 1 MB>RAM plus 4MB swap on my linux box.> > As I noted above, I'd generally recommend against persistent> connections when thereare more than a few users. As an example, I have systems with 2 or 3 hundred simultaneos connections and besides being short time connections it's impossible to have 200 or 300 backends running at the same time. In this case, I had to create a proxy to use few connections. I have AF_INET and AF_UNIX versions. []'s Mateus Cordeiro Inssa --------------------- Linux User: 76186 Kernel: 2.3.36 ICQ (Licq): 15243895 --------------------- mateus@ifnet.com.br mateus@cwb.fnn.net Tue Jan 11 08:45:00 EDT 2000
At 08:45 AM 1/11/00 -0200, Mateus Cordeiro Inssa wrote: > As an example, I have systems with 2 or 3 hundred simultaneos >connections and besides being short time connections it's impossible to >have 200 or 300 backends running at the same time. Again, the problem isn't persistent connections but rather an lousy implementation of pooled persistent connections. > In this case, I had to create a proxy to use few connections. Another approach is to throttle the number of connections in the persistent pool manager. This is how AOLserver deals with the problem. You tell it the max number of connections to fire up and only that many handles are doled out to threads, the rest waiting for others to complete. There's another parameter which places a ceiling on the number of threads allowed to wait for a pool connection, which allows me to return a "too busy" message to the user if I so choose. Of course, if a server starts getting too many of these it's time to upgrade to something faster, to dig into one's queries looking for needless inefficiency, or maybe to remember that you forgot to say "vacuum analyze" (who, me?) Some folks like to roll their own. I'm lazy and picked a web server that has already solved such problems for me. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.