Thread: fork() bad

fork() bad

From
Richi Plana
Date:
Hi,

I'm probably doing something wrong here.

My process establishes a connection with a PostgreSQL backend and then
fork()s (twice, actually). To make things even more complicated (though I
don't know if it affects things), my connection handle is a global
variable in a dynamically-loaded shared object.

I tried PQreset()ing the connections after forking and then I'd try a
begin, but I'd get the following:

NOTICE:  BeginTransactionBlock and not in default state

What does that mean?


So, what's the deal with fork()ing and connections?


L   L Richi Plana 8^)         ,-,-.     ,-,-.     ,-,-.     ,-,-.     ,-
LL LL Systems Administrator  / / \ \   / / \ \   / / \ \   / / \ \   / /
LLLLL Mosaic Communications, Inc. \ \ / /   \ \ / /   \ \ / /   \ \ / /
LLLLL mailto:richip@mozcom.com     `-'-'     `-'-'     `-'-'     `-'-'
------------------------------------------------------------------------
P G P Key available at http://www2.mozcom.com/~richip/richip.asc
Tired of Spam? Join this CAUCE! http://www.cauce.org/


Re: [GENERAL] fork() bad

From
M Simms
Date:
>
> Hi,
>
> I'm probably doing something wrong here.
>
> My process establishes a connection with a PostgreSQL backend and then
> fork()s (twice, actually). To make things even more complicated (though I
> don't know if it affects things), my connection handle is a global
> variable in a dynamically-loaded shared object.
>
> I tried PQreset()ing the connections after forking and then I'd try a
> begin but I'd get the ff.:
>
> NOTICE:  BeginTransactionBlock and not in default state
>
> What does that mean?
>
>
> So, what's the deal with fork()ing and connections?
>

Well, I've not looked at the code, but I should be right here.

If you fork, you will have two processes pumping data down the same
connection to the database, because fork() simply duplicates the
file descriptor reference; it does not make you a new connection. If two
fork()ed processes try to send or retrieve data at the same
time, everything will break, as the database obviously will not know
what the hell is going on and will get a mangled transmission.
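
To make the descriptor-sharing point concrete, here is a minimal sketch that
uses a plain pipe as a stand-in for the socket behind a connection handle. It
does not touch libpq at all; the names shared_descriptor_demo and send_tagged
are made up for this illustration. Parent and child both write through the
descriptor inherited across fork(), and the reader, playing the backend, finds
both senders' bytes in one stream.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MSGS 100

/* Write MSGS copies of the byte `tag` to fd; returns 0 on success. */
static int send_tagged(int fd, char tag)
{
    for (int i = 0; i < MSGS; i++) {
        if (write(fd, &tag, 1) != 1)
            return -1;
    }
    return 0;
}

/*
 * Hypothetical demo: the pipe stands in for a database connection.
 * After fork(), parent and child write through the SAME pipe, so the
 * reader sees a single stream carrying bytes from both processes.
 * Stores how many bytes came from each sender; returns 0 on success.
 */
int shared_descriptor_demo(int *from_parent, int *from_child)
{
    int fds[2];

    assert(from_parent != NULL && from_child != NULL);
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: fds[1] refers to the parent's pipe, not to a new one. */
        _exit(send_tagged(fds[1], 'c') == 0 ? 0 : 1);
    }

    int rc = send_tagged(fds[1], 'P');
    close(fds[1]);            /* parent's copy of the write end */
    waitpid(pid, NULL, 0);    /* child's copies are closed when it exits */

    *from_parent = 0;
    *from_child = 0;
    char ch;
    while (read(fds[0], &ch, 1) == 1) {   /* read until end of stream */
        if (ch == 'P')
            (*from_parent)++;
        else if (ch == 'c')
            (*from_child)++;
    }
    close(fds[0]);
    return rc;
}
```

Both counts come back as 100, in whatever order the scheduler interleaved the
writes. With a real connection the two streams would be protocol messages, and
the backend has no way to tell which process sent which bytes.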

Re: [GENERAL] fork() bad

From
Richi Plana
Date:
Hi,

On Mon, 22 Mar 1999, M Simms wrote:

|o| > My process establishes a connection with a PostgreSQL backend and then
|o| > fork()s (twice, actually). To make things even more complicated (though I
|o| > don't know if it affects things), my connection handle is a global
|o| > variable in a dynamically-loaded shared object.
|o| >
|o| > I tried PQreset()ing the connections after forking and then I'd try a
|o| > begin but I'd get the ff.:

|o| If you fork, you will have two processes pumping data down the
|o| same connection to the database, cos the fork() will simply
|o| duplicate the file descriptor reference, not make you a new
|o| connection. If two forked() processes try and send data or
|o| retrieve data at the same time, everything will break, as the
|o| database obviously will not know what the hell is going on and
|o| will get a mangled transmission

Looks like the general consensus is that fork()ing is a bad thing where
PostgreSQL is concerned. So what I did was refrain from opening a
connection to the backend until AFTER the process has fork()ed.

As some of you may know, I'm hacking Ascend RADIUS 2.01 to look up a
PostgreSQL database for authentication and log to PG for accounting.
Normally, RADIUS fork()s once for Accounting and fork()s for each
Authentication request. That's a lot of fork()ing and establishing
connections to the backend. It's slow, but it's better than junking
whatever code I've written so far.

If anyone can give a better suggestion, I'm all ears. Also, if anyone
wants the code when it's done, try asking. ;^)



Re: [GENERAL] fork() bad

From
Herouth Maoz
Date:
At 17:48 +0200 on 22/03/1999, Richi Plana wrote:


> As some of you may know, I'm hacking Ascend RADIUS 2.01 to look up a
> PostgreSQL database for authentication and log to PG for accounting.
> Normally, RADIUS fork()s once for Accounting and fork()s for each
> Authentication request. That's a lot of fork()ing and establishing
> connections to the backend. It's slow, but it's better than junking
> whatever code I've written so far.
>
> If anyone can give a better suggestion, I'm all ears. Also, if anyone
> wants the code when it's done, try asking. ;^)

Why don't you try to synchronize access to the connection between the
various processes? You know, lock it in an exclusive lock, on an
inter-process basis, such that when one process accesses it, the others
have to wait. Or you can have a few connections open, so that the
bottleneck is wider. You know, like you would treat any shared object in an
inter-process environment?
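
One way to picture the exclusive inter-process lock suggested here is the
sketch below. It is only an illustration under assumptions: the lock is an
fcntl() record lock on a temporary file, and a counter stored in that file
stands in for the shared connection, so no libpq calls are involved. The
names set_lock, locked_increment and run_locked_workers are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Take (F_WRLCK) or drop (F_UNLCK) a whole-file lock; blocks until granted. */
static int set_lock(int fd, short type)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;   /* l_start = 0 and l_len = 0: the entire file */
    return fcntl(fd, F_SETLKW, &fl);
}

/* One "exchange" on the shared resource: read-modify-write of a counter. */
static int locked_increment(int fd)
{
    long n;

    if (set_lock(fd, F_WRLCK) != 0)
        return -1;
    int ok = pread(fd, &n, sizeof n, 0) == (ssize_t)sizeof n;
    if (ok) {
        n++;
        ok = pwrite(fd, &n, sizeof n, 0) == (ssize_t)sizeof n;
    }
    if (set_lock(fd, F_UNLCK) != 0)
        return -1;
    return ok ? 0 : -1;
}

/* Fork nproc workers that each do `rounds` exchanges; returns the final count. */
long run_locked_workers(int nproc, int rounds)
{
    char path[] = "/tmp/fork_lock_XXXXXX";
    long n = 0;

    assert(nproc > 0 && rounds > 0);
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);             /* the open descriptor keeps the file alive */
    if (pwrite(fd, &n, sizeof n, 0) != (ssize_t)sizeof n) {
        close(fd);
        return -1;
    }

    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            /* fcntl() locks belong to a process, so each child contends. */
            for (int r = 0; r < rounds; r++) {
                if (locked_increment(fd) != 0)
                    _exit(1);
            }
            _exit(0);
        }
    }
    while (wait(NULL) > 0)
        ;                     /* reap every worker */

    if (pread(fd, &n, sizeof n, 0) != (ssize_t)sizeof n)
        n = -1;
    close(fd);
    return n;
}
```

Four workers doing 200 exchanges each leave the counter at 800, because a
waiting process simply blocks in fcntl() until the holder releases the lock.
Whether that serialization costs more than opening a fresh connection per
child is something only a measurement can settle.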

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma



Re: [GENERAL] fork() bad

From
Gerard Saraber
Date:
Hello,

Richi Plana wrote:
>
> Hi,
>
[ previous discussion snipped ]

>
> Looks like the general consensus is fork()ing is a bad thing where
> PostgreSQL is concerned. So what I did was refrained from opening a
> connection to the backend until AFTER the process fork()ed.
>
> As some of you may know, I'm hacking Ascend RADIUS 2.01 to look up a
> PostgreSQL database for authentication and log to PG for accounting.
> Normally, RADIUS fork()s once for Accounting and fork()s for each
> Authentication request. That's a lot of fork()ing and establishing
> connections to the backend. It's slow, but it's better than junking
> whatever code I've written so far.
>
> If anyone can give a better suggestion, I'm all ears. Also, if anyone
> wants the code when it's done, try asking. ;^)
>

Would it be possible to create a "connection pool"? That is, have an array
of connections to pgsql, and mark one of them as "in use" right before
you fork.
You may have to stick the "in use" marks in shared memory, so that after
the fork()ed process is done with the pgsql connection it marks it as
"free" again, so it can be re-used by the next process.
I hope I'm making some sense :)
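
A rough sketch of the bookkeeping half of this idea follows, assuming an
anonymous shared mapping created before fork() holds the "in use" marks. The
connections themselves are left out; pool_init, pool_acquire, pool_release
and pool_demo are hypothetical names, and the slot index stands in for an
entry in the array of connection handles.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define POOL_SIZE 4

/* One "in use" mark per pooled connection, in memory shared across fork(). */
static atomic_int *pool_flags;

int pool_init(void)
{
    void *p = mmap(NULL, POOL_SIZE * sizeof(atomic_int),
                   PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    pool_flags = p;
    for (int i = 0; i < POOL_SIZE; i++)
        atomic_init(&pool_flags[i], 0);
    return 0;
}

/* Parent only, right before fork(): claim a free slot, or -1 if none. */
int pool_acquire(void)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (atomic_load(&pool_flags[i]) == 0) {
            atomic_store(&pool_flags[i], 1);
            return i;
        }
    }
    return -1;
}

/* Child, when it is done with connection number `slot`: mark it free. */
void pool_release(int slot)
{
    assert(slot >= 0 && slot < POOL_SIZE);
    atomic_store(&pool_flags[slot], 0);
}

/* Fork one worker per request; returns the number of slots still marked
 * "in use" after every worker has exited, or -1 on error. */
int pool_demo(int requests)
{
    if (pool_init() != 0)
        return -1;

    for (int r = 0; r < requests; r++) {
        int slot;
        while ((slot = pool_acquire()) < 0) {
            /* Pool exhausted: wait for some worker to finish and release. */
            if (wait(NULL) < 0)
                return -1;
        }
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Child: would talk to connection number `slot` here. */
            pool_release(slot);
            _exit(0);
        }
    }
    while (wait(NULL) > 0)
        ;                     /* reap the remaining workers */

    int busy = 0;
    for (int i = 0; i < POOL_SIZE; i++)
        busy += atomic_load(&pool_flags[i]);
    munmap(pool_flags, POOL_SIZE * sizeof(atomic_int));
    return busy;
}
```

Only the parent ever claims a slot, so claiming needs no extra lock; each
child only clears its own mark. Note that the marks say nothing about the
connection handle itself: fork() gives the child a private copy of the
handle's memory, so whatever state the child changes in its copy is not seen
by the parent or by the next child that reuses the slot.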

Regards,
Gerard Saraber
gsaraber@glasscity.net

Re: [GENERAL] fork() bad

From
Richi Plana
Date:
Hi, Herouth.

On Mon, 22 Mar 1999, Herouth Maoz wrote:

|o| > As some of you may know, I'm hacking Ascend RADIUS 2.01 to look up a
|o| > PostgreSQL database for authentication and log to PG for accounting.
|o| > Normally, RADIUS fork()s once for Accounting and fork()s for each
|o| > Authentication request. That's a lot of fork()ing and establishing
|o| > connections to the backend. It's slow, but it's better than junking
|o| > whatever code I've written so far.
|o| >
|o| > If anyone can give a better suggestion, I'm all ears. Also, if anyone
|o| > wants the code when it's done, try asking. ;^)
|o|
|o| Why don't you try to synchronize access to the connection between the
|o| various processes? You know, lock it in an exclusive lock, on an
|o| inter-process basis, such that when one process accesses it, the others
|o| have to wait. Or you can have a few connections open, so that the
|o| bottleneck is wider. You know, like you would treat any shared object in an
|o| inter-process environment?

It kinda defeats the purpose of allowing RADIUS to fork() if I do locking.
I've no benchmarks to prove it, but if I allow only one process to execute
at a time via locking, that would probably slow the other processes down.
(i.e., should the waiting process block? If so, when will it try again? Is
the overhead of establishing a connection really that big?)



Re: [GENERAL] fork() bad

From
Herouth Maoz
Date:
At 18:29 +0200 on 22/03/1999, Richi Plana wrote:


> It kinda defeats the purpose of allowing RADIUS to fork() if I do locking.
> I've no benchmarks to prove it, but if I allow it to execute one process
> at a time via locking, that would probably slow the other processes down.
> (ie. Should the waiting process block? If so, when will it try again? Are
> the overheads to establishing a connection really that big?)

It always depends on the case at hand. You are using the current version of
postgres, not one of the snapshots, right? Well, in the current version,
any update to a table locks it, so other processes doing the same
operation are blocked anyway. If your process tree generates a lot of
children that each have to write a record to the same table, they will be
blocked anyway. So why not at least save the price of starting another
postgres process and establishing a connection?

BTW, working in a three-tier environment has much the same effect as locking.
It's the old "message passing" vs. "memory sharing" debate in disguise.

Herouth