Thread: Re: [ADMIN] When postgres will be faster?

Re: [ADMIN] When postgres will be faster?

From
Oleg Bartunov
Date:
I'm not concern very much about speed of Postgres but mostly
about its connection schema. Every new connect to database postgres
forks another children. It's impossible to work with different
databases. On my production site I work with persistent connections
between http (mod_perl) <-> postgres and quite satisfies with efficiency -
I have 20 httpd running and 20 db backends accordingly. 
This requires some memory, but I could live. Now other developers
want to use postgres as a db backend in their Web applications and
also want to have persistence to some another databases. 
If you have N databases and M httpd servers, you will end with
N*M DB backends. This is too much and I'm afraid my solution
could be scalable. MySQL seems could works with several databases.
I don't know if it's possible to have a pool of db childrens,
which connected to, say, template1 database and children could
switch to requested database on demand. This would require some
modification of DBD driver of course, but I think it's not hard.
I'm working on very big project with many databases involved,
current traffic is more than 2 mln. pageviews and most of them
dynamic. We expect about 5x more requests and I really need scalable
solution. Is anybody working on COBRA interface to postgres ?
CORBA is just a magic word for me :-) Could it be a magic wand ?
Regards,
    Oleg


On Mon, 29 Nov 1999, Marcin Mazurek - Multinet SA - Poznan wrote:

> Date: Mon, 29 Nov 1999 14:27:55 +0100 (CET)
> From: Marcin Mazurek - Multinet SA - Poznan <m.mazurek@multinet.pl>
> Cc: pgsql-admin@postgreSQL.org
> Subject: Re: [ADMIN] When postgres will be faster?
> 
> On Mon, 29 Nov 1999 sk.list@comset.net wrote:
> > Yes! But I recommend backend pool too. What is it? The postmaster task runs now
> > backend for each query. Good. But After query backend finished. I recommend to
> > stay backend running within a some timeout. If the next query occured
> > the postmaster redirect query to any idle backend or run a new one unless. Then
> > backend serve some connections it shut down itself, this prevents memory leaks.
> Somebody advised me to do such thing with servlets, holding pool of
> connections in one srvlet and give them as they are needed, but frankly
> speaking i have no idea how to do it. Does anybodyhas such examples with
> Connection pools?
> mazek
> 
>  Marcin Mazurek
> 
> -- 
> administrator
> MULTINET SA o/Poznan
> http://www.multinet.pl/
> 
> 
> ************
> 

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83



Re: [HACKERS] Re: [ADMIN] When postgres will be faster?

From
Oleg Broytmann
Date:
On Mon, 29 Nov 1999, Oleg Bartunov wrote:
> I'm not concern very much about speed of Postgres but mostly
> about its connection schema. Every new connect to database postgres
> forks another children. It's impossible to work with different
> databases. On my production site I work with persistent connections
> between http (mod_perl) <-> postgres and quite satisfies with efficiency -
> I have 20 httpd running and 20 db backends accordingly. 
> This requires some memory, but I could live. Now other developers
> want to use postgres as a db backend in their Web applications and
> also want to have persistence to some another databases. 
> If you have N databases and M httpd servers, you will end with
> N*M DB backends. This is too much and I'm afraid my solution
> could be scalable. MySQL seems could works with several databases.
  I use (not for production, though) Zope and Postgres (little non
spectacular demo is here: http://sun.med.ru/cgi-bin/Zope.cgi/phd01)  Zope can maintain a database connection or a pool
ofdatabase
 
connections. If there is no activity on a connection within a long period
(few hours) Zope closes the connection and reopens it on next access.

Oleg.
----    Oleg Broytmann     http://members.xoom.com/phd2/     phd2@earthling.net          Programmers don't die, they
justGOSUB without RETURN.
 



Re: [ADMIN] When postgres will be faster?

From
sk.list@comset.net
Date:
Hi!

On 29-Nov-99 Oleg Bartunov wrote:
> I'm not concern very much about speed of Postgres but mostly
> about its connection schema. Every new connect to database postgres
> forks another children. It's impossible to work with different

fork and fork/exec are some different. postmaster forks and execute backend
binary.

> databases. On my production site I work with persistent connections
> between http (mod_perl) <-> postgres and quite satisfies with efficiency -
> I have 20 httpd running and 20 db backends accordingly. 
> This requires some memory, but I could live. Now other developers

I have >100 connections in peak load. Not all of them use postgres. If I use
pconnect I lost my RAM ;-)

> want to use postgres as a db backend in their Web applications and
> also want to have persistence to some another databases. 
> If you have N databases and M httpd servers, you will end with
> N*M DB backends. This is too much and I'm afraid my solution

Why? Why N*M? After disconnect the persistent connection backend should not
finish but next connection opens other bata base? Or i misunderstood?

> I don't know if it's possible to have a pool of db childrens,
> which connected to, say, template1 database and children could
> switch to requested database on demand. This would require some
> modification of DBD driver of course, but I think it's not hard.

Hmmm... There is 2 ways to support pool.
1. FORK only.
Postmaster and postgres are same binary. postmaster accept connection and
forked. Parent creates structure with child pid, descriptors etc... Child
becomes backend. When child finish the request it send signal (smem,fifo etc)
to parent. Parent set IDLE flag to child structure. When next connection
accepted parent seek through list of child to find first idle one. parent clear
IDLE flag and fd_dup file descriptors to backend's. Child structure contain
call counter and time stamp of start and last call time. If call counter exceeds
N or time exceeds T all descriptors becomes closed. Child catch SIGPIPE on
closed descriptors and finish. Parent scans list of  structures and check time
stamps to stop idle backends or start new one (to have pool of idle backends).

2. Fork/exec.
I dont know. But it possible too. Same like previous.

So, if backend works with one database only and cannot reconnect - add
'database' field to child structure described above. Or add keywords 
CONNECT/DISCONNECT to language. Hmm... I was sure backend can server more then
1 database sequentially.

SKiller
--------------------------
Sergei Keler
WebMaster of "ComSet"
E-Mail: skiller@comset.net
http://www.comset.net
--------------------------


Re: [ADMIN] When postgres will be faster?

From
Oleg Bartunov
Date:
On Tue, 30 Nov 1999 sk.list@comset.net wrote:

> Date: Tue, 30 Nov 1999 18:11:36 +0300 (MSK)
> From: sk.list@comset.net
> To: Oleg Bartunov <oleg@sai.msu.su>
> Cc: pgsql-hackers@postgreSQL.org, pgsql-admin@postgreSQL.org,
>     Marcin Mazurek - Multinet SA - Poznan <m.mazurek@multinet.pl>
> Subject: Re: [ADMIN] When postgres will be faster?
> 
> Hi!
> 
> On 29-Nov-99 Oleg Bartunov wrote:
> > I'm not concern very much about speed of Postgres but mostly
> > about its connection schema. Every new connect to database postgres
> > forks another children. It's impossible to work with different
> 
> fork and fork/exec are some different. postmaster forks and execute backend
> binary.
> 
> > databases. On my production site I work with persistent connections
> > between http (mod_perl) <-> postgres and quite satisfies with efficiency -
> > I have 20 httpd running and 20 db backends accordingly. 
> > This requires some memory, but I could live. Now other developers
> 
> I have >100 connections in peak load. Not all of them use postgres. If I use
> pconnect I lost my RAM ;-)
> 
> > want to use postgres as a db backend in their Web applications and
> > also want to have persistence to some another databases. 
> > If you have N databases and M httpd servers, you will end with
> > N*M DB backends. This is too much and I'm afraid my solution
> 
> Why? Why N*M? After disconnect the persistent connection backend should not
> finish but next connection opens other bata base? Or i misunderstood?

persistent connections are never disconnected during httpd children's life,
that's what I need for performance reason. every httpd children holds
their own connection to specific database and there are no method 
(well, AFAIK) to share connection between childrens (see discussion in
modperl mailing list for today and yesterday). If you need to work with 
another database you have to open new connection, because postgres doesn't 
works with several database through one connection. Latest version of Mysql
could do this and you could explicitly specify database name  "select something from database.table"
Simple experiment with psql like 1. psql db12. look at process list - you'll see something like:     19714  ?  S
0:00/usr/local/pgsql/bin/postgres localhost megera db1 idle3. \c db24. again look at process list:     19718  ?  S
0:00/usr/local/pgsql/bin/postgres localhost megera db2 idle new process is forked.
 
I dont' know backend internals, probably it's possible using libpq interface
to switch between databases through one connection, but I suspect it could be
difficult to 'hide' so nice feature :-)

> 
> > I don't know if it's possible to have a pool of db childrens,
> > which connected to, say, template1 database and children could
> > switch to requested database on demand. This would require some
> > modification of DBD driver of course, but I think it's not hard.
> 
> Hmmm... There is 2 ways to support pool.
> 1. FORK only.
> Postmaster and postgres are same binary. postmaster accept connection and
> forked. Parent creates structure with child pid, descriptors etc... Child
> becomes backend. When child finish the request it send signal (smem,fifo etc)
> to parent. Parent set IDLE flag to child structure. When next connection
> accepted parent seek through list of child to find first idle one. parent clear
> IDLE flag and fd_dup file descriptors to backend's. Child structure contain
> call counter and time stamp of start and last call time. If call counter exceeds
> N or time exceeds T all descriptors becomes closed. Child catch SIGPIPE on
> closed descriptors and finish. Parent scans list of  structures and check time
> stamps to stop idle backends or start new one (to have pool of idle backends).
> 
> 2. Fork/exec.
> I dont know. But it possible too. Same like previous.
> 
> So, if backend works with one database only and cannot reconnect - add
> 'database' field to child structure described above. Or add keywords 
> CONNECT/DISCONNECT to language. Hmm... I was sure backend can server more then
> 1 database sequentially.
> 

I suggest postgres experts  comment this topic. We really need to work
with different databases using one connection. Postgres is rather good
scalable DB engine and IMO it's worth to have such feature like
DB pooling. Once postgres support db pooling it would be possible
to develope/modify various interfaces to work with httpd.
I'm using mod_perl, apache, perl, DBI, ApacheDBI and now looking 
for CORBA :-)

regards,
    Oleg

> SKiller
> --------------------------
> Sergei Keler
> WebMaster of "ComSet"
> E-Mail: skiller@comset.net
> http://www.comset.net
> --------------------------
> 

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83



Re: [HACKERS] Re: [ADMIN] When postgres will be faster?

From
Lamar Owen
Date:
Oleg Bartunov wrote:
> I suggest postgres experts  comment this topic. We really need to work
> with different databases using one connection. Postgres is rather good
> scalable DB engine and IMO it's worth to have such feature like
> DB pooling. Once postgres support db pooling it would be possible

The AOLserver webserver/application server already fully supports pooled
database connections to PostgreSQL.

AOLserver is fully multithreaded, and allows a configurable number of
database connections to be persistently pooled.  There can be multiple
pools available, each connecting to a single database.  AOLserver
dynamically manages the pools, with maximum number of pools and pool
persistence timeout configurable.

This allows many thousands of http connections to share a limited number
of database connections, thanks to AOLserver's multithreaded front end.

AOLserver will happily coexist with apache, just by binding to another
port.

The performance increase is on the order of 100 times faster than plain
CGI using the perl Pg module.

AOLserver features tight database integration through a tcl and C API.
The tcl API has specialized database connection commands, http
connection commands, thread creation-mutex-destruction-etc commands, and
many other highly useful (for web scripts) commands that make even tcl a
good web scripting language.  www.aolserver.com, or
aolserver.lcs.mit.edu.

While it might be tempting to lift code out of AOLserver to do pooling,
AOLserver is under the dual APL/GPL license -- such code could be GPL'd,
but not BSD'd.  But, AOLserver's source does give you an example of how
such pooling can be accomplished from a client-side libpq-using program.

The only problem is the issue of libpq's thread-safety or lack thereof
(in practice, the thread-safety issue doesn't show until you hit a high
load).

Ask Vince about AOLserver :-).

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11


Re: [HACKERS] Re: [ADMIN] When postgres will be faster?

From
Don Baccus
Date:
At 11:44 AM 11/30/99 -0500, Lamar Owen wrote:
>Oleg Bartunov wrote:
>> I suggest postgres experts  comment this topic. We really need to work
>> with different databases using one connection. Postgres is rather good
>> scalable DB engine and IMO it's worth to have such feature like
>> DB pooling. Once postgres support db pooling it would be possible
>
>The AOLserver webserver/application server already fully supports pooled
>database connections to PostgreSQL.
>
>AOLserver is fully multithreaded, and allows a configurable number of
>database connections to be persistently pooled.  There can be multiple
>pools available, each connecting to a single database.  AOLserver
>dynamically manages the pools, with maximum number of pools and pool
>persistence timeout configurable.
>
>This allows many thousands of http connections to share a limited number
>of database connections, thanks to AOLserver's multithreaded front end.

And there's a great toolset from Ars Digita that runs under AOLserver.

I've ported part of it to Postgres.  You can see one of the modules
in action, a bulletin board module, at http://dsl-dhogaza.pacifier.net/bboard

Unfortunately, portions of the Ars Digita toolkit use outer joins fairly
heavily.  I was somewhat saddened to hear that outer joins apparently
won't make it into V7 after all, because I was planning to port the
entire toolkit when V7 made its debut.  I still may do so, because
you can mechanically translate the queries to not be dependent on
outer joins, but it makes doing a port a heck of a lot more tedious.

The Ars Digita toolkit contains, among other things, a very robust
e-commerce module which is in use at some large, Oracle-based web
sites.  It would be cool to make this available for Postgres...

(The toolkit's GPL'd, BTW)



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [ADMIN] When postgres will be faster?

From
Bruce Momjian
Date:
[Charset KOI8-R unsupported, filtering to ASCII...]
> Hi!
> 
> On 29-Nov-99 Oleg Bartunov wrote:
> > I'm not concern very much about speed of Postgres but mostly
> > about its connection schema. Every new connect to database postgres
> > forks another children. It's impossible to work with different
> 
> fork and fork/exec are some different. postmaster forks and execute backend
> binary.

postmaster forks() and does not do an exec().


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [ADMIN] When postgres will be faster?

From
sk.list@comset.net
Date:
Hi!

On 30-Nov-99 Bruce Momjian wrote:
> [Charset KOI8-R unsupported, filtering to ASCII...]
>> Hi!
>> 
>> On 29-Nov-99 Oleg Bartunov wrote:
>> > I'm not concern very much about speed of Postgres but mostly
>> > about its connection schema. Every new connect to database postgres
>> > forks another children. It's impossible to work with different
>> 
>> fork and fork/exec are some different. postmaster forks and execute backend
>> binary.
> 
> postmaster forks() and does not do an exec().

>From postmaster log:

FindExec: found "/usr/comset/dbase/bin/postgres" using argv[0]

ps ax|grep pos

10665  ?  R    0:01 /usr/comset/dbase/bin/postgres main.comset.com polithit pol
13329  ?  S    0:24 /usr/comset/dbase/bin/postmaster -i -D/usr/comset/dbase/dat

These samples push me thinking it was fork/exec... :-(

SKiller
--------------------------
Sergei Keler
WebMaster of "ComSet"
E-Mail: skiller@comset.net
http://www.comset.net
--------------------------


Re: [ADMIN] When postgres will be faster?

From
sk.list@comset.net
Date:
Hi!

On 30-Nov-99 Oleg Bartunov wrote:

> I suggest postgres experts  comment this topic. We really need to work
> with different databases using one connection. Postgres is rather good
> scalable DB engine and IMO it's worth to have such feature like
> DB pooling. Once postgres support db pooling it would be possible
> to develope/modify various interfaces to work with httpd.
> I'm using mod_perl, apache, perl, DBI, ApacheDBI and now looking 
> for CORBA :-)

If backend/db pooling will be made by postgres developers it will be a great
step to speed up www-based application using postgresql. ;-) Really.

So, When the pooling realized in postmaster other application NOT nessesary to
modify to speed up connection process... I read comments about AOL server.
Good. But this should be postgresql feature.

SKiller
--------------------------
Sergei Keler
WebMaster of "ComSet"
E-Mail: skiller@comset.net
http://www.comset.net
--------------------------


Re: [ADMIN] When postgres will be faster?

From
Bruce Momjian
Date:
> > postmaster forks() and does not do an exec().
> 
> >From postmaster log:
> 
> FindExec: found "/usr/comset/dbase/bin/postgres" using argv[0]
> 
> ps ax|grep pos
> 
> 10665  ?  R    0:01 /usr/comset/dbase/bin/postgres main.comset.com polithit pol
> 13329  ?  S    0:24 /usr/comset/dbase/bin/postmaster -i -D/usr/comset/dbase/dat
> 
> These samples push me thinking it was fork/exec... :-(

We re-exec the postmaster so it has an absolute path, which is sometimes
needed for dynamic loading.  We also need 5 paramaters to we can do ps
display if forked backends.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [ADMIN] When postgres will be faster?

From
sk.list@comset.net
Date:
Hi!

On 01-Dec-99 Bruce Momjian wrote:
>> > postmaster forks() and does not do an exec().
>> 
>> >From postmaster log:
>> 
>> FindExec: found "/usr/comset/dbase/bin/postgres" using argv[0]
>> 
>> ps ax|grep pos
>> 
>> 10665  ?  R    0:01 /usr/comset/dbase/bin/postgres main.comset.com polithit
>> pol
>> 13329  ?  S    0:24 /usr/comset/dbase/bin/postmaster -i
>> -D/usr/comset/dbase/dat
>> 
>> These samples push me thinking it was fork/exec... :-(
> 
> We re-exec the postmaster so it has an absolute path, which is sometimes
> needed for dynamic loading.  We also need 5 paramaters to we can do ps
> display if forked backends.

But you know several ways to send parameters to child process...
So, You have:
1. Shared memory
2. Fifo file like /tmp/.s.PGSQL.ctl.pid-of-backend
3. Additional unnamed pipe opened for child
4. Signals like SIGUSR1 etc to force fetch parameters from somewhere.

So, in addition I found thet there is not nessesary to create a dynamic list of
child pool. You have static/dynamic linear array of backend running ;-). Waw!
Possible to add some additional info to this structure about pooled backend (I
offer before) to manage pool. 

I hope I dig postgresql code this weekend to have ideas offer for developers
more constructively.

I think p.3 shown before is preferable. The main() of backend should gentle
read this pipe. Pipes in Unix have more that 1024 bytes buffer...

while (readCommand(....)) { initBackend(...);   // Same as parse args... doQuery(..); finishBackend(...);
}

readCommand() should use select() with timeout for check pipe. Then signal
received it set flag on and select() loop may finish on it with 0 returned.

SKiller
--------------------------
Sergei Keler
WebMaster of "ComSet"
E-Mail: skiller@comset.net
http://www.comset.net
--------------------------


Re: [ADMIN] When postgres will be faster?

From
The Hermit Hacker
Date:
This email is moved off of pgsql-admin and left only on pgsql-hackers,
where it belongs...

Sergei...we look forward to seeing patches that demonstrate, and possibly
implement, that which you are proposing...it would give us, I think, a
much clearer idea of what you are thinking :)



On Thu, 2 Dec 1999 sk.list@comset.net wrote:

> Hi!
> 
> On 01-Dec-99 Bruce Momjian wrote:
> >> > postmaster forks() and does not do an exec().
> >> 
> >> >From postmaster log:
> >> 
> >> FindExec: found "/usr/comset/dbase/bin/postgres" using argv[0]
> >> 
> >> ps ax|grep pos
> >> 
> >> 10665  ?  R    0:01 /usr/comset/dbase/bin/postgres main.comset.com polithit
> >> pol
> >> 13329  ?  S    0:24 /usr/comset/dbase/bin/postmaster -i
> >> -D/usr/comset/dbase/dat
> >> 
> >> These samples push me thinking it was fork/exec... :-(
> > 
> > We re-exec the postmaster so it has an absolute path, which is sometimes
> > needed for dynamic loading.  We also need 5 paramaters to we can do ps
> > display if forked backends.
> 
> But you know several ways to send parameters to child process...
> So, You have:
> 1. Shared memory
> 2. Fifo file like /tmp/.s.PGSQL.ctl.pid-of-backend
> 3. Additional unnamed pipe opened for child
> 4. Signals like SIGUSR1 etc to force fetch parameters from somewhere.
> 
> So, in addition I found thet there is not nessesary to create a dynamic list of
> child pool. You have static/dynamic linear array of backend running ;-). Waw!
> Possible to add some additional info to this structure about pooled backend (I
> offer before) to manage pool. 
> 
> I hope I dig postgresql code this weekend to have ideas offer for developers
> more constructively.
> 
> I think p.3 shown before is preferable. The main() of backend should gentle
> read this pipe. Pipes in Unix have more that 1024 bytes buffer...
> 
> while (readCommand(....)) {
>   initBackend(...);   // Same as parse args...
>   doQuery(..);
>   finishBackend(...);
> }
> 
> readCommand() should use select() with timeout for check pipe. Then signal
> received it set flag on and select() loop may finish on it with 0 returned.
> 
> SKiller
> --------------------------
> Sergei Keler
> WebMaster of "ComSet"
> E-Mail: skiller@comset.net
> http://www.comset.net
> --------------------------
> 
> ************
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org