Thread: fork() bad
Hi,

I'm probably doing something wrong here. My process establishes a
connection with a PostgreSQL backend and then fork()s (twice, actually).
To make things even more complicated (though I don't know if it affects
things), my connection handle is a global variable in a
dynamically-loaded shared object.

I tried PQreset()ing the connections after forking, and then I'd try a
BEGIN, but I'd get the following:

NOTICE:  BeginTransactionBlock and not in default state

What does that mean?

So, what's the deal with fork()ing and connections?

--
Richi Plana, Systems Administrator
Mosaic Communications, Inc.
mailto:richip@mozcom.com
Richi Plana wrote:
> My process establishes a connection with a PostgreSQL backend and then
> fork()s (twice, actually).
> [ rest snipped ]
> So, what's the deal with fork()ing and connections?

Well, I've not looked at the code, but I should be right here.

If you fork, you will have two processes pumping data down the same
connection to the database, because fork() simply duplicates the file
descriptor reference; it does not make you a new connection. If two
fork()ed processes try to send or retrieve data at the same time,
everything will break: the database has no way of knowing what is going
on and will see a mangled transmission.
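A minimal sketch of the failure mode described above, for illustration
only (the conninfo string and queries are made up; nothing here is from
the original code):

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <libpq-fe.h>

int main(void)
{
    /* One connection, opened BEFORE the fork -- the broken pattern. */
    PGconn *conn = PQconnectdb("dbname=test");   /* conninfo is illustrative */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connect failed: %s", PQerrorMessage(conn));
        return 1;
    }

    fork();

    /* Parent and child now hold the SAME socket: PQsocket() reports
     * the same descriptor in both processes. */
    fprintf(stderr, "pid %d talks on fd %d\n",
            (int) getpid(), PQsocket(conn));

    /* Both processes now issue commands down that one socket, so the
     * backend sees two interleaved command streams from what it thinks
     * is a single client. Errors like "BeginTransactionBlock and not
     * in default state" are one possible symptom. */
    PGresult *res = PQexec(conn, "BEGIN");
    PQclear(res);
    res = PQexec(conn, "SELECT 1");
    PQclear(res);

    /* Whichever process calls PQfinish() first also sends a protocol
     * terminate message, ending the backend session out from under
     * the other process. */
    PQfinish(conn);
    return 0;
}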
Hi,

On Mon, 22 Mar 1999, M Simms wrote:

|o| If you fork, you will have two processes pumping data down the
|o| same connection to the database, because fork() simply duplicates
|o| the file descriptor reference; it does not make you a new
|o| connection. If two fork()ed processes try to send or retrieve data
|o| at the same time, everything will break.

Looks like the general consensus is that fork()ing is a bad thing where
PostgreSQL is concerned. So what I did was refrain from opening a
connection to the backend until AFTER the process fork()ed.

As some of you may know, I'm hacking Ascend RADIUS 2.01 to look up a
PostgreSQL database for authentication and log to PG for accounting.
Normally, RADIUS fork()s once for Accounting and fork()s for each
Authentication request. That's a lot of fork()ing and establishing of
connections to the backend. It's slow, but it's better than junking
whatever code I've written so far.

If anyone can give a better suggestion, I'm all ears. Also, if anyone
wants the code when it's done, try asking. ;^)

--
Richi Plana, Systems Administrator
Mosaic Communications, Inc.
mailto:richip@mozcom.com
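A minimal sketch of the connect-after-fork pattern described above. The
handler name, the conninfo string, and the users/passwd table are all
hypothetical stand-ins, not the actual RADIUS code:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <libpq-fe.h>

/* Hypothetical per-request handler: the child owns its own connection
 * for its whole (short) lifetime, so no descriptor is ever shared. */
static void handle_auth_request(void)
{
    /* The connection is opened AFTER the fork, in the child only. */
    PGconn *conn = PQconnectdb("dbname=radius");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "child %d: %s", (int) getpid(),
                PQerrorMessage(conn));
        _exit(1);
    }

    /* Table and query are illustrative only. */
    PGresult *res = PQexec(conn,
        "SELECT passwd FROM users WHERE login = 'someuser'");
    if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0)
        printf("child %d: found %s\n", (int) getpid(),
               PQgetvalue(res, 0, 0));
    PQclear(res);

    PQfinish(conn);   /* each child tears down its own connection */
    _exit(0);
}

int main(void)
{
    for (int i = 0; i < 3; i++) {   /* one child per request */
        if (fork() == 0)
            handle_auth_request();
    }
    while (wait(NULL) > 0)          /* reap the children */
        ;
    return 0;
}

This is safe, but it pays the full backend startup and connection cost
on every single request, which is exactly the slowness complained about
above.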
At 17:48 +0200 on 22/03/1999, Richi Plana wrote:

> As some of you may know, I'm hacking Ascend RADIUS 2.01 to look up a
> PostgreSQL database for authentication and log to PG for accounting.
> Normally, RADIUS fork()s once for Accounting and fork()s for each
> Authentication request. That's a lot of fork()ing and establishing
> connections to the backend.
>
> If anyone can give a better suggestion, I'm all ears.

Why don't you try to synchronize access to the connection between the
various processes? You know, lock it with an exclusive lock, on an
inter-process basis, such that when one process accesses it, the others
have to wait. Or you can have a few connections open, so that the
bottleneck is wider. You know, like you would treat any shared object
in an inter-process environment?

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
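A minimal sketch of the exclusive-lock idea, assuming a flock()-based
lock; the lock-file path and the run_locked helper are made up for
illustration. Note the assumption built into it: each locked section
must complete a full command/response cycle, so that no partial
protocol data is left on the shared socket for the next process to
trip over.

#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>
#include <libpq-fe.h>

/* Hypothetical helper: serialize use of the one shared connection by
 * taking an exclusive flock() on a well-known lock file. The process
 * holding the lock may talk to the backend; the rest sleep in flock(). */
static void run_locked(PGconn *conn, const char *sql)
{
    int lockfd = open("/var/run/pgconn.lock", O_CREAT | O_RDWR, 0600);
    if (lockfd < 0) {
        perror("open lock file");
        return;
    }

    flock(lockfd, LOCK_EX);             /* blocks until we own the lock */

    PGresult *res = PQexec(conn, sql);  /* only one process is in here,
                                         * and PQexec consumes the whole
                                         * response before returning */
    PQclear(res);

    flock(lockfd, LOCK_UN);             /* wake the next waiter */
    close(lockfd);
}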
Hello,

Richi Plana wrote:
> [ previous discussion snipped ]
>
> Looks like the general consensus is that fork()ing is a bad thing
> where PostgreSQL is concerned. So what I did was refrain from opening
> a connection to the backend until AFTER the process fork()ed.
>
> If anyone can give a better suggestion, I'm all ears.

Would it be possible to create a "connection pool"? Sort of: have an
array of connections to pgsql, and mark one of them as "in use" right
before you fork. You may have to stick the "in use" marks in shared
memory, so that after the fork()ed process is done with its pgsql
connection it marks it as "free" again, and it can be re-used by the
next process.

I hope I'm making some sense :)

Regards,
Gerard Saraber
gsaraber@glasscity.net
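A minimal sketch of that idea, with the "in use" flags in anonymous
shared memory. The pool size, conninfo string, and request loop are
assumptions; it also relies on the claim being made by the parent
before the fork (so there is no race on claiming) and on each child
consuming complete results before exiting (so the connection is handed
back idle):

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <libpq-fe.h>

#define POOL_SIZE 4   /* assumption: small fixed pool */

int main(void)
{
    /* "in use" flags live in shared memory so a child's write to its
     * flag is visible to the parent after the child exits. */
    int *in_use = mmap(NULL, POOL_SIZE * sizeof(int),
                       PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (in_use == MAP_FAILED) { perror("mmap"); return 1; }

    /* Open the whole pool up front, once. */
    PGconn *pool[POOL_SIZE];
    for (int i = 0; i < POOL_SIZE; i++) {
        pool[i] = PQconnectdb("dbname=radius");
        in_use[i] = 0;
    }

    for (int req = 0; req < 10; req++) {        /* fake request loop */
        int slot = -1;
        for (int i = 0; i < POOL_SIZE; i++)     /* find a free slot */
            if (!in_use[i]) { slot = i; break; }
        if (slot < 0) { wait(NULL); req--; continue; }  /* pool busy: reap one */

        in_use[slot] = 1;   /* parent claims the slot BEFORE forking */
        if (fork() == 0) {
            /* Child: use ONLY its claimed connection, and leave the
             * wire idle (full result consumed) before giving it back. */
            PGresult *res = PQexec(pool[slot], "SELECT 1");
            PQclear(res);
            in_use[slot] = 0;   /* hand the slot back via shared memory */
            _exit(0);
        }
    }
    while (wait(NULL) > 0)
        ;
    return 0;
}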
Hi, Herouth.

On Mon, 22 Mar 1999, Herouth Maoz wrote:

|o| Why don't you try to synchronize access to the connection between
|o| the various processes? You know, lock it with an exclusive lock, on
|o| an inter-process basis, such that when one process accesses it, the
|o| others have to wait. Or you can have a few connections open, so
|o| that the bottleneck is wider. You know, like you would treat any
|o| shared object in an inter-process environment?

It kinda defeats the purpose of allowing RADIUS to fork() if I do
locking. I've no benchmarks to prove it, but if I allow it to execute
one process at a time via locking, that would probably slow the other
processes down. (i.e. Should the waiting process block? If so, when
will it try again? Are the overheads of establishing a connection
really that big?)

--
Richi Plana, Systems Administrator
Mosaic Communications, Inc.
mailto:richip@mozcom.com
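That last question can be answered empirically rather than guessed at.
A crude micro-benchmark sketch, timing PQconnectdb()/PQfinish() pairs
against the real backend (the iteration count and conninfo string are
assumptions):

#include <stdio.h>
#include <sys/time.h>
#include <libpq-fe.h>

int main(void)
{
    const int iterations = 20;      /* assumption: enough to average out */
    struct timeval t0, t1;

    gettimeofday(&t0, NULL);
    for (int i = 0; i < iterations; i++) {
        /* One full connect/disconnect cycle, as a per-request
         * child would pay it. */
        PGconn *conn = PQconnectdb("dbname=radius");
        if (PQstatus(conn) != CONNECTION_OK)
            fprintf(stderr, "attempt %d failed: %s", i,
                    PQerrorMessage(conn));
        PQfinish(conn);
    }
    gettimeofday(&t1, NULL);

    double ms = (t1.tv_sec - t0.tv_sec) * 1000.0
              + (t1.tv_usec - t0.tv_usec) / 1000.0;
    printf("%d connections in %.1f ms (%.1f ms each)\n",
           iterations, ms, ms / iterations);
    return 0;
}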
At 18:29 +0200 on 22/03/1999, Richi Plana wrote:

> It kinda defeats the purpose of allowing RADIUS to fork() if I do
> locking. I've no benchmarks to prove it, but if I allow it to execute
> one process at a time via locking, that would probably slow the other
> processes down. (i.e. Should the waiting process block? If so, when
> will it try again? Are the overheads of establishing a connection
> really that big?)

It always depends on the case at hand. You are using the current
version of Postgres, not one of the snapshots, right? Well, in the
current version, any update to a table locks it, so other processes
doing the same operations are blocked anyway. If your process tree
generates a lot of children that each have to write a record to the
same table, they will be blocked anyway. So why not at least save the
price of starting another postgres process and establishing a
connection?

BTW, working in a three-tier environment has a similar effect to
locking. It's the old "message passing" vs. "memory sharing" debate in
disguise.

Herouth