Thread: Re: [ADMIN] When postgres will be faster?
I'm not concern very much about speed of Postgres but mostly about its connection schema. Every new connect to database postgres forks another children. It's impossible to work with different databases. On my production site I work with persistent connections between http (mod_perl) <-> postgres and quite satisfies with efficiency - I have 20 httpd running and 20 db backends accordingly. This requires some memory, but I could live. Now other developers want to use postgres as a db backend in their Web applications and also want to have persistence to some another databases. If you have N databases and M httpd servers, you will end with N*M DB backends. This is too much and I'm afraid my solution could be scalable. MySQL seems could works with several databases. I don't know if it's possible to have a pool of db childrens, which connected to, say, template1 database and children could switch to requested database on demand. This would require some modification of DBD driver of course, but I think it's not hard. I'm working on very big project with many databases involved, current traffic is more than 2 mln. pageviews and most of them dynamic. We expect about 5x more requests and I really need scalable solution. Is anybody working on COBRA interface to postgres ? CORBA is just a magic word for me :-) Could it be a magic wand ? Regards, Oleg On Mon, 29 Nov 1999, Marcin Mazurek - Multinet SA - Poznan wrote: > Date: Mon, 29 Nov 1999 14:27:55 +0100 (CET) > From: Marcin Mazurek - Multinet SA - Poznan <m.mazurek@multinet.pl> > Cc: pgsql-admin@postgreSQL.org > Subject: Re: [ADMIN] When postgres will be faster? > > On Mon, 29 Nov 1999 sk.list@comset.net wrote: > > Yes! But I recommend backend pool too. What is it? The postmaster task runs now > > backend for each query. Good. But After query backend finished. I recommend to > > stay backend running within a some timeout. If the next query occured > > the postmaster redirect query to any idle backend or run a new one unless. Then > > backend serve some connections it shut down itself, this prevents memory leaks. > Somebody advised me to do such thing with servlets, holding pool of > connections in one srvlet and give them as they are needed, but frankly > speaking i have no idea how to do it. Does anybodyhas such examples with > Connection pools? > mazek > > Marcin Mazurek > > -- > administrator > MULTINET SA o/Poznan > http://www.multinet.pl/ > > > ************ > _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
On Mon, 29 Nov 1999, Oleg Bartunov wrote: > I'm not concern very much about speed of Postgres but mostly > about its connection schema. Every new connect to database postgres > forks another children. It's impossible to work with different > databases. On my production site I work with persistent connections > between http (mod_perl) <-> postgres and quite satisfies with efficiency - > I have 20 httpd running and 20 db backends accordingly. > This requires some memory, but I could live. Now other developers > want to use postgres as a db backend in their Web applications and > also want to have persistence to some another databases. > If you have N databases and M httpd servers, you will end with > N*M DB backends. This is too much and I'm afraid my solution > could be scalable. MySQL seems could works with several databases. I use (not for production, though) Zope and Postgres (little non spectacular demo is here: http://sun.med.ru/cgi-bin/Zope.cgi/phd01) Zope can maintain a database connection or a pool ofdatabase connections. If there is no activity on a connection within a long period (few hours) Zope closes the connection and reopens it on next access. Oleg. ---- Oleg Broytmann http://members.xoom.com/phd2/ phd2@earthling.net Programmers don't die, they justGOSUB without RETURN.
Hi! On 29-Nov-99 Oleg Bartunov wrote: > I'm not concern very much about speed of Postgres but mostly > about its connection schema. Every new connect to database postgres > forks another children. It's impossible to work with different fork and fork/exec are some different. postmaster forks and execute backend binary. > databases. On my production site I work with persistent connections > between http (mod_perl) <-> postgres and quite satisfies with efficiency - > I have 20 httpd running and 20 db backends accordingly. > This requires some memory, but I could live. Now other developers I have >100 connections in peak load. Not all of them use postgres. If I use pconnect I lost my RAM ;-) > want to use postgres as a db backend in their Web applications and > also want to have persistence to some another databases. > If you have N databases and M httpd servers, you will end with > N*M DB backends. This is too much and I'm afraid my solution Why? Why N*M? After disconnect the persistent connection backend should not finish but next connection opens other bata base? Or i misunderstood? > I don't know if it's possible to have a pool of db childrens, > which connected to, say, template1 database and children could > switch to requested database on demand. This would require some > modification of DBD driver of course, but I think it's not hard. Hmmm... There is 2 ways to support pool. 1. FORK only. Postmaster and postgres are same binary. postmaster accept connection and forked. Parent creates structure with child pid, descriptors etc... Child becomes backend. When child finish the request it send signal (smem,fifo etc) to parent. Parent set IDLE flag to child structure. When next connection accepted parent seek through list of child to find first idle one. parent clear IDLE flag and fd_dup file descriptors to backend's. Child structure contain call counter and time stamp of start and last call time. If call counter exceeds N or time exceeds T all descriptors becomes closed. Child catch SIGPIPE on closed descriptors and finish. Parent scans list of structures and check time stamps to stop idle backends or start new one (to have pool of idle backends). 2. Fork/exec. I dont know. But it possible too. Same like previous. So, if backend works with one database only and cannot reconnect - add 'database' field to child structure described above. Or add keywords CONNECT/DISCONNECT to language. Hmm... I was sure backend can server more then 1 database sequentially. SKiller -------------------------- Sergei Keler WebMaster of "ComSet" E-Mail: skiller@comset.net http://www.comset.net --------------------------
On Tue, 30 Nov 1999 sk.list@comset.net wrote: > Date: Tue, 30 Nov 1999 18:11:36 +0300 (MSK) > From: sk.list@comset.net > To: Oleg Bartunov <oleg@sai.msu.su> > Cc: pgsql-hackers@postgreSQL.org, pgsql-admin@postgreSQL.org, > Marcin Mazurek - Multinet SA - Poznan <m.mazurek@multinet.pl> > Subject: Re: [ADMIN] When postgres will be faster? > > Hi! > > On 29-Nov-99 Oleg Bartunov wrote: > > I'm not concern very much about speed of Postgres but mostly > > about its connection schema. Every new connect to database postgres > > forks another children. It's impossible to work with different > > fork and fork/exec are some different. postmaster forks and execute backend > binary. > > > databases. On my production site I work with persistent connections > > between http (mod_perl) <-> postgres and quite satisfies with efficiency - > > I have 20 httpd running and 20 db backends accordingly. > > This requires some memory, but I could live. Now other developers > > I have >100 connections in peak load. Not all of them use postgres. If I use > pconnect I lost my RAM ;-) > > > want to use postgres as a db backend in their Web applications and > > also want to have persistence to some another databases. > > If you have N databases and M httpd servers, you will end with > > N*M DB backends. This is too much and I'm afraid my solution > > Why? Why N*M? After disconnect the persistent connection backend should not > finish but next connection opens other bata base? Or i misunderstood? persistent connections are never disconnected during httpd children's life, that's what I need for performance reason. every httpd children holds their own connection to specific database and there are no method (well, AFAIK) to share connection between childrens (see discussion in modperl mailing list for today and yesterday). If you need to work with another database you have to open new connection, because postgres doesn't works with several database through one connection. Latest version of Mysql could do this and you could explicitly specify database name "select something from database.table" Simple experiment with psql like 1. psql db12. look at process list - you'll see something like: 19714 ? S 0:00/usr/local/pgsql/bin/postgres localhost megera db1 idle3. \c db24. again look at process list: 19718 ? S 0:00/usr/local/pgsql/bin/postgres localhost megera db2 idle new process is forked. I dont' know backend internals, probably it's possible using libpq interface to switch between databases through one connection, but I suspect it could be difficult to 'hide' so nice feature :-) > > > I don't know if it's possible to have a pool of db childrens, > > which connected to, say, template1 database and children could > > switch to requested database on demand. This would require some > > modification of DBD driver of course, but I think it's not hard. > > Hmmm... There is 2 ways to support pool. > 1. FORK only. > Postmaster and postgres are same binary. postmaster accept connection and > forked. Parent creates structure with child pid, descriptors etc... Child > becomes backend. When child finish the request it send signal (smem,fifo etc) > to parent. Parent set IDLE flag to child structure. When next connection > accepted parent seek through list of child to find first idle one. parent clear > IDLE flag and fd_dup file descriptors to backend's. Child structure contain > call counter and time stamp of start and last call time. If call counter exceeds > N or time exceeds T all descriptors becomes closed. Child catch SIGPIPE on > closed descriptors and finish. Parent scans list of structures and check time > stamps to stop idle backends or start new one (to have pool of idle backends). > > 2. Fork/exec. > I dont know. But it possible too. Same like previous. > > So, if backend works with one database only and cannot reconnect - add > 'database' field to child structure described above. Or add keywords > CONNECT/DISCONNECT to language. Hmm... I was sure backend can server more then > 1 database sequentially. > I suggest postgres experts comment this topic. We really need to work with different databases using one connection. Postgres is rather good scalable DB engine and IMO it's worth to have such feature like DB pooling. Once postgres support db pooling it would be possible to develope/modify various interfaces to work with httpd. I'm using mod_perl, apache, perl, DBI, ApacheDBI and now looking for CORBA :-) regards, Oleg > SKiller > -------------------------- > Sergei Keler > WebMaster of "ComSet" > E-Mail: skiller@comset.net > http://www.comset.net > -------------------------- > _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov wrote: > I suggest postgres experts comment this topic. We really need to work > with different databases using one connection. Postgres is rather good > scalable DB engine and IMO it's worth to have such feature like > DB pooling. Once postgres support db pooling it would be possible The AOLserver webserver/application server already fully supports pooled database connections to PostgreSQL. AOLserver is fully multithreaded, and allows a configurable number of database connections to be persistently pooled. There can be multiple pools available, each connecting to a single database. AOLserver dynamically manages the pools, with maximum number of pools and pool persistence timeout configurable. This allows many thousands of http connections to share a limited number of database connections, thanks to AOLserver's multithreaded front end. AOLserver will happily coexist with apache, just by binding to another port. The performance increase is on the order of 100 times faster than plain CGI using the perl Pg module. AOLserver features tight database integration through a tcl and C API. The tcl API has specialized database connection commands, http connection commands, thread creation-mutex-destruction-etc commands, and many other highly useful (for web scripts) commands that make even tcl a good web scripting language. www.aolserver.com, or aolserver.lcs.mit.edu. While it might be tempting to lift code out of AOLserver to do pooling, AOLserver is under the dual APL/GPL license -- such code could be GPL'd, but not BSD'd. But, AOLserver's source does give you an example of how such pooling can be accomplished from a client-side libpq-using program. The only problem is the issue of libpq's thread-safety or lack thereof (in practice, the thread-safety issue doesn't show until you hit a high load). Ask Vince about AOLserver :-). -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
At 11:44 AM 11/30/99 -0500, Lamar Owen wrote: >Oleg Bartunov wrote: >> I suggest postgres experts comment this topic. We really need to work >> with different databases using one connection. Postgres is rather good >> scalable DB engine and IMO it's worth to have such feature like >> DB pooling. Once postgres support db pooling it would be possible > >The AOLserver webserver/application server already fully supports pooled >database connections to PostgreSQL. > >AOLserver is fully multithreaded, and allows a configurable number of >database connections to be persistently pooled. There can be multiple >pools available, each connecting to a single database. AOLserver >dynamically manages the pools, with maximum number of pools and pool >persistence timeout configurable. > >This allows many thousands of http connections to share a limited number >of database connections, thanks to AOLserver's multithreaded front end. And there's a great toolset from Ars Digita that runs under AOLserver. I've ported part of it to Postgres. You can see one of the modules in action, a bulletin board module, at http://dsl-dhogaza.pacifier.net/bboard Unfortunately, portions of the Ars Digita toolkit use outer joins fairly heavily. I was somewhat saddened to hear that outer joins apparently won't make it into V7 after all, because I was planning to port the entire toolkit when V7 made its debut. I still may do so, because you can mechanically translate the queries to not be dependent on outer joins, but it makes doing a port a heck of a lot more tedious. The Ars Digita toolkit contains, among other things, a very robust e-commerce module which is in use at some large, Oracle-based web sites. It would be cool to make this available for Postgres... (The toolkit's GPL'd, BTW) - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
[Charset KOI8-R unsupported, filtering to ASCII...] > Hi! > > On 29-Nov-99 Oleg Bartunov wrote: > > I'm not concern very much about speed of Postgres but mostly > > about its connection schema. Every new connect to database postgres > > forks another children. It's impossible to work with different > > fork and fork/exec are some different. postmaster forks and execute backend > binary. postmaster forks() and does not do an exec(). -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Hi! On 30-Nov-99 Bruce Momjian wrote: > [Charset KOI8-R unsupported, filtering to ASCII...] >> Hi! >> >> On 29-Nov-99 Oleg Bartunov wrote: >> > I'm not concern very much about speed of Postgres but mostly >> > about its connection schema. Every new connect to database postgres >> > forks another children. It's impossible to work with different >> >> fork and fork/exec are some different. postmaster forks and execute backend >> binary. > > postmaster forks() and does not do an exec(). >From postmaster log: FindExec: found "/usr/comset/dbase/bin/postgres" using argv[0] ps ax|grep pos 10665 ? R 0:01 /usr/comset/dbase/bin/postgres main.comset.com polithit pol 13329 ? S 0:24 /usr/comset/dbase/bin/postmaster -i -D/usr/comset/dbase/dat These samples push me thinking it was fork/exec... :-( SKiller -------------------------- Sergei Keler WebMaster of "ComSet" E-Mail: skiller@comset.net http://www.comset.net --------------------------
Hi! On 30-Nov-99 Oleg Bartunov wrote: > I suggest postgres experts comment this topic. We really need to work > with different databases using one connection. Postgres is rather good > scalable DB engine and IMO it's worth to have such feature like > DB pooling. Once postgres support db pooling it would be possible > to develope/modify various interfaces to work with httpd. > I'm using mod_perl, apache, perl, DBI, ApacheDBI and now looking > for CORBA :-) If backend/db pooling will be made by postgres developers it will be a great step to speed up www-based application using postgresql. ;-) Really. So, When the pooling realized in postmaster other application NOT nessesary to modify to speed up connection process... I read comments about AOL server. Good. But this should be postgresql feature. SKiller -------------------------- Sergei Keler WebMaster of "ComSet" E-Mail: skiller@comset.net http://www.comset.net --------------------------
> > postmaster forks() and does not do an exec(). > > >From postmaster log: > > FindExec: found "/usr/comset/dbase/bin/postgres" using argv[0] > > ps ax|grep pos > > 10665 ? R 0:01 /usr/comset/dbase/bin/postgres main.comset.com polithit pol > 13329 ? S 0:24 /usr/comset/dbase/bin/postmaster -i -D/usr/comset/dbase/dat > > These samples push me thinking it was fork/exec... :-( We re-exec the postmaster so it has an absolute path, which is sometimes needed for dynamic loading. We also need 5 paramaters to we can do ps display if forked backends. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Hi! On 01-Dec-99 Bruce Momjian wrote: >> > postmaster forks() and does not do an exec(). >> >> >From postmaster log: >> >> FindExec: found "/usr/comset/dbase/bin/postgres" using argv[0] >> >> ps ax|grep pos >> >> 10665 ? R 0:01 /usr/comset/dbase/bin/postgres main.comset.com polithit >> pol >> 13329 ? S 0:24 /usr/comset/dbase/bin/postmaster -i >> -D/usr/comset/dbase/dat >> >> These samples push me thinking it was fork/exec... :-( > > We re-exec the postmaster so it has an absolute path, which is sometimes > needed for dynamic loading. We also need 5 paramaters to we can do ps > display if forked backends. But you know several ways to send parameters to child process... So, You have: 1. Shared memory 2. Fifo file like /tmp/.s.PGSQL.ctl.pid-of-backend 3. Additional unnamed pipe opened for child 4. Signals like SIGUSR1 etc to force fetch parameters from somewhere. So, in addition I found thet there is not nessesary to create a dynamic list of child pool. You have static/dynamic linear array of backend running ;-). Waw! Possible to add some additional info to this structure about pooled backend (I offer before) to manage pool. I hope I dig postgresql code this weekend to have ideas offer for developers more constructively. I think p.3 shown before is preferable. The main() of backend should gentle read this pipe. Pipes in Unix have more that 1024 bytes buffer... while (readCommand(....)) { initBackend(...); // Same as parse args... doQuery(..); finishBackend(...); } readCommand() should use select() with timeout for check pipe. Then signal received it set flag on and select() loop may finish on it with 0 returned. SKiller -------------------------- Sergei Keler WebMaster of "ComSet" E-Mail: skiller@comset.net http://www.comset.net --------------------------
This email is moved off of pgsql-admin and left only on pgsql-hackers, where it belongs... Sergei...we look forward to seeing patches that demonstrate, and possibly implement, that which you are proposing...it would give us, I think, a much clearer idea of what you are thinking :) On Thu, 2 Dec 1999 sk.list@comset.net wrote: > Hi! > > On 01-Dec-99 Bruce Momjian wrote: > >> > postmaster forks() and does not do an exec(). > >> > >> >From postmaster log: > >> > >> FindExec: found "/usr/comset/dbase/bin/postgres" using argv[0] > >> > >> ps ax|grep pos > >> > >> 10665 ? R 0:01 /usr/comset/dbase/bin/postgres main.comset.com polithit > >> pol > >> 13329 ? S 0:24 /usr/comset/dbase/bin/postmaster -i > >> -D/usr/comset/dbase/dat > >> > >> These samples push me thinking it was fork/exec... :-( > > > > We re-exec the postmaster so it has an absolute path, which is sometimes > > needed for dynamic loading. We also need 5 paramaters to we can do ps > > display if forked backends. > > But you know several ways to send parameters to child process... > So, You have: > 1. Shared memory > 2. Fifo file like /tmp/.s.PGSQL.ctl.pid-of-backend > 3. Additional unnamed pipe opened for child > 4. Signals like SIGUSR1 etc to force fetch parameters from somewhere. > > So, in addition I found thet there is not nessesary to create a dynamic list of > child pool. You have static/dynamic linear array of backend running ;-). Waw! > Possible to add some additional info to this structure about pooled backend (I > offer before) to manage pool. > > I hope I dig postgresql code this weekend to have ideas offer for developers > more constructively. > > I think p.3 shown before is preferable. The main() of backend should gentle > read this pipe. Pipes in Unix have more that 1024 bytes buffer... > > while (readCommand(....)) { > initBackend(...); // Same as parse args... > doQuery(..); > finishBackend(...); > } > > readCommand() should use select() with timeout for check pipe. Then signal > received it set flag on and select() loop may finish on it with 0 returned. > > SKiller > -------------------------- > Sergei Keler > WebMaster of "ComSet" > E-Mail: skiller@comset.net > http://www.comset.net > -------------------------- > > ************ > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org