Thread: Sharing database handles across forked child processes
How does Postgres handle sharing database handles across child processes? That is, if I have a process that opens a connection to the database and then forks a few child processes, what happens? Can the child processes safely use the handle? If one child closes the handle, what happens to the handle in all the other children? The parent? This isn't a great thing to do, I realize, but I'm wedging database access into an existing heavily fork-bound perl program, so my hands are somewhat tied architecturally. (If it means I have to constantly test to see if the handle's valid, and may have to deal with a handle randomly going away on me, I can handle that -- I'm more worried about data corruption and deadlock problems here, stuff I can't reasonably catch at the application level) -Dan
On Tue, Nov 13, 2007 at 12:02:31PM -0500, dan@sidhe.org wrote:
> How does Postgres handle sharing database handles across child processes?
> That is, if I have a process that opens a connection to the database and
> then forks a few child processes, what happens?
>
> Can the child processes safely use the handle?

No.

> If one child closes the handle, what happens to the handle in all the
> other children? The parent?

Just closing the file descriptor is OK. Just forgetting about it is OK too. Best to just ignore that you have it open at all.

> This isn't a great thing to do, I realize, but I'm wedging database access
> into an existing heavily fork-bound perl program, so my hands are somewhat
> tied architecturally. (If it means I have to constantly test to see if the
> handle's valid, and may have to deal with a handle randomly going away on
> me, I can handle that -- I'm more worried about data corruption and
> deadlock problems here, stuff I can't reasonably catch at the application
> level)

You're going to need a separate handle for each process. Two processes writing to the same socket won't work. Maybe just set up a table indexed by PID and make sure you only use your own. Or after a fork() do a "close $dbh->getfd()" (untested).

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
> -- John F Kennedy
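The "table indexed by PID" idea from the message above could be sketched roughly as follows. This is a minimal illustration, not code from the thread: `handle_for_this_process` and the `$connect` callback are hypothetical names, and a plain string stands in for a real database handle so the sketch runs without a database.

```perl
use strict;
use warnings;

# Per-process handle table, keyed by PID.
# $connect is a hypothetical callback that opens a fresh handle; in real
# code it might wrap DBI->connect(...).
my %handle_for_pid;

sub handle_for_this_process {
    my ($connect) = @_;
    # $$ is the current PID. A forked child inherits a copy of the table,
    # but its own PID has no entry yet, so it opens its own handle.
    $handle_for_pid{$$} //= $connect->();
    return $handle_for_pid{$$};
}

# Usage with a stand-in "connection" (a plain string, no database needed)
my $handle = handle_for_this_process(sub { "connection opened by PID $$" });
print "$handle\n";
```

Note that this only keeps a child from using the parent's handle. The child still holds an inherited copy of the parent's entry, and that copy is destroyed when the child exits, so the cleanup problem discussed elsewhere in the thread still has to be handled separately.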
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Tue, Nov 13, 2007 at 12:02:31PM -0500, dan@sidhe.org wrote:
>> How does Postgres handle sharing database handles across child processes?
>> That is, if I have a process that opens a connection to the database and
>> then forks a few child processes, what happens?
>>
>> Can the child processes safely use the handle?

> No.

For some time now, libpq has set FD_CLOEXEC on the socket connection to the backend, which ensures that child processes won't be able to mess up the parent's database connection. However it sounded like Dan might be doing fork without exec, in which case he's definitely at risk ...

regards, tom lane
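To illustrate why FD_CLOEXEC does not help with fork without exec, here is a small sketch using only core Perl modules. It is an assumption-laden stand-in: a Unix socketpair plays the role of the libpq connection (`$client`) and the backend (`$server`), and no real database or libpq is involved.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
use POSIX ();
use Socket;

# Stand-in for the client/server link; no real database is involved.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

# Set close-on-exec explicitly on the "client" socket.
my $flags = fcntl($client, F_GETFD, 0) or die "fcntl F_GETFD: $!";
fcntl($client, F_SETFD, $flags | FD_CLOEXEC) or die "fcntl F_SETFD: $!";

my $pid = fork();
die "fork: $!" unless defined $pid;
if ($pid == 0) {
    # Child: there was no exec, so the flag changes nothing here and the
    # inherited descriptor is fully usable.
    close $server;
    syswrite($client, "from child\n");
    POSIX::_exit(0);
}
waitpid($pid, 0);

# The "backend" end receives whatever the child wrote.
my $line = <$server>;
print "backend side received: $line";
```

The flag only takes effect at exec time, so a child that keeps running the same program image can still write into the parent's connection.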
> Martijn van Oosterhout <kleptog@svana.org> writes:
>> On Tue, Nov 13, 2007 at 12:02:31PM -0500, dan@sidhe.org wrote:
>>> How does Postgres handle sharing database handles across child
>>> processes?
>>> That is, if I have a process that opens a connection to the database
>>> and then forks a few child processes, what happens?
>>>
>>> Can the child processes safely use the handle?
>
>> No.
>
> For some time now, libpq has set FD_CLOEXEC on the socket connection to
> the backend, which ensures that child processes won't be able to mess up
> the parent's database connection. However it sounded like Dan might be
> doing fork without exec, in which case he's definitely at risk ...

Yep, this is a fork without exec. And the child processes often aren't even doing any database access -- the database connection's opened and held, then a child is forked off, and the child 'helpfully' closes the handle during the child's global destruction phase.

Am I at any risk in the parent process? That is, if the parent's got some transaction open, the child is forked, then the child either issues (perhaps in error) a command to the database or shuts the handle down, am I going to see any sort of corruption of the data on the back end?

I fully realize this is a Bad Thing, no argument there -- I'm just trying to get a feel for my failure modes. If it's just going to be that the parent sees the handle go away that's one thing, if I'm going to see weird interleaving of commands from the parent and child or the back end's going to get confused enough to corrupt the database it's something else entirely.

-Dan
> Yep, this is a fork without exec. And the child processes often aren't
> even doing any database access -- the database connection's opened and
> held, then a child is forked off, and the child 'helpfully' closes the
> handle during the child's global destruction phase.
>
> Am I at any risk in the parent process?

Yes. But there is an easy solution, assuming you are using DBI:

  $dbh->{InactiveDestroy} = 1;

This tells DBI not to do anything special when inside of DESTROY. Set it on the kids immediately after forking.

> "the child processes often aren't even doing any database access"
                       ^^^^^^^^^^^^

Often aren't? This should be "never", period, unless the parent contracts to stop doing database access after the fork. You can't have two processes sharing a handle.

Note also that InactiveDestroy should not be your first choice. Far better to do the forking before the database connection whenever possible. If they both need access, you can also disconnect, fork, and have both reconnect afterwards.

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200711131332
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
On Tue, Nov 13, 2007 at 01:18:25PM -0500, dan@sidhe.org wrote:
> Yep, this is a fork without exec. And the child processes often aren't
> even doing any database access -- the database connection's opened and
> held, then a child is forked off, and the child 'helpfully' closes the
> handle during the child's global destruction phase.

Yes, that happens.

> Am I at any risk in the parent process? That is, if the parent's got some
> transaction open, the child is forked, then the child either issues
> (perhaps in error) a command to the database or shuts the handle down, am
> I going to see any sort of corruption of the data on the back end?

Well, corruption of the backend is unlikely, but you're likely to get a lot of strange errors on your other connections. That's why I suggested closing the filehandle behind the back of the library. Or better, dup2() /dev/null over the top of it. Then during global destruction it'll just think the DB went away. It's a bit of a hack though.

> I fully realize this is a Bad Thing, no argument there -- I'm just trying
> to get a feel for my failure modes. If it's just going to be that the
> parent sees the handle go away that's one thing, if I'm going to see weird
> interleaving of commands from the parent and child or the back end's going
> to get confused enough to corrupt the database it's something else
> entirely.

I think the effect is comparable to two people typing into the same shell, and each only getting half the output back. Sure, you're unlikely to lose anything big, but do you want to risk it?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
> -- John F Kennedy
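The dup2() over /dev/null trick mentioned above might look something like this in Perl. It is a sketch under assumptions: `neutralize_fd` is a hypothetical helper name, and a socketpair stands in for the database connection so it runs without Postgres.

```perl
use strict;
use warnings;
use POSIX qw(dup2 O_RDWR);
use Socket;

# Point an inherited descriptor at /dev/null in the current process only.
# The descriptor number stays valid, so later writes or closes done by
# library cleanup no longer reach the real connection.
sub neutralize_fd {
    my ($fd) = @_;
    my $null = POSIX::open('/dev/null', O_RDWR);
    defined $null or die "open /dev/null: $!";
    defined dup2($null, $fd) or die "dup2: $!";
    POSIX::close($null);
    return;
}

# Stand-in connection: $client is "our" socket, $server is the far end.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

my $pid = fork();
die "fork: $!" unless defined $pid;
if ($pid == 0) {
    neutralize_fd(fileno($client));
    # This stray write now lands in /dev/null, not on the connection.
    syswrite($client, "stray command from child\n");
    POSIX::_exit(0);
}
waitpid($pid, 0);

# The parent's descriptor is untouched and still reaches the far end.
syswrite($client, "parent query\n");
my $line = <$server>;
print "server saw: $line";
```

With DBD::Pg the socket's descriptor number is, as far as I know, exposed as `$dbh->{pg_socket}`; treat that as an assumption to check against your driver's documentation before relying on it.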
On Nov 13, 2007, at 1:18 PM, dan@sidhe.org wrote:
> Yep, this is a fork without exec. And the child processes often aren't
> even doing any database access -- the database connection's opened and
> held, then a child is forked off, and the child 'helpfully' closes the
> handle during the child's global destruction phase.

What's your programming language? If it is perl using the DBI, you *must* close the handle on the child else perl's object destroy will try to close the handle by doing a shutdown on the connection, which will muck up your parent. The voodoo to make this happen is this:

  $dbh->{InactiveDestroy} = 1;
  $dbh = undef;

Also note that for some reason, this invalidates any prepared statements in the parent DBI object, so you need to make sure you don't have any, or just re-open the handle on the parent too.
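The destructor behaviour described above can be imitated without a database. The sketch below is a toy model, not DBI's implementation: `ToyHandle` is a made-up class whose DESTROY sends a one-byte goodbye over a socketpair unless its `InactiveDestroy` flag is set, and `bytes_seen_by_server` is a hypothetical helper that reports what the far end received after a child dropped its inherited copy.

```perl
use strict;
use warnings;
use IO::Handle;
use POSIX ();
use Socket;

# ToyHandle is a stand-in for a DBI handle, NOT DBI itself.
package ToyHandle;

sub new {
    my ($class, $sock) = @_;
    return bless { sock => $sock, InactiveDestroy => 0 }, $class;
}

sub DESTROY {
    my ($self) = @_;
    return if $self->{InactiveDestroy};   # flag set: leave the link alone
    syswrite($self->{sock}, "X");         # otherwise say goodbye on the wire
}

package main;

# Fork a child that drops its inherited handle, then return whatever the
# server end of the connection received.
sub bytes_seen_by_server {
    my ($set_flag) = @_;
    socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";
    my $dbh = ToyHandle->new($client);

    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        $dbh->{InactiveDestroy} = 1 if $set_flag;
        undef $dbh;                       # runs DESTROY in the child
        POSIX::_exit(0);
    }
    waitpid($pid, 0);

    # Keep the parent's own cleanup quiet so only the child is measured.
    $dbh->{InactiveDestroy} = 1;
    $server->blocking(0);
    my $buf = '';
    my $n = sysread($server, $buf, 16);
    return defined $n ? $buf : '';
}

printf "without flag: server saw '%s'\n", bytes_seen_by_server(0);
printf "with flag:    server saw '%s'\n", bytes_seen_by_server(1);
```

In the toy model the child's goodbye reaches the shared connection only when the flag is left unset, which mirrors why the real attribute is set in the child right after fork. Newer DBI releases also offer an `AutoInactiveDestroy` attribute intended to handle this case automatically; check the DBI documentation for your version.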
>> Yep, this is a fork without exec. And the child processes often aren't
>> even doing any database access -- the database connection's opened and
>> held, then a child is forked off, and the child 'helpfully' closes the
>> handle during the child's global destruction phase.
>>
>> Am I at any risk in the parent process?
>
> Yes. But there is an easy solution, assuming you are using DBI:
>
> $dbh->{InactiveDestroy} = 1;
>
> This tells DBI not to do anything special when inside of DESTROY. Set
> on the kids immediately after forking.

I don't currently have a wedge into the parts of the programs that're forking. I'd hoped to avoid having to, but at this point I'm thinking that was a touch naive. (I'm also thinking I may want to hassle Rafael into putting a post-fork handler into 5.10, but that's a separate issue)

>> "the child processes often aren't even doing any database access"
>                       ^^^^^^^^^^^^
>
> Often aren't? This should be "never", period, unless the parent contracts
> to stop doing database access after the fork. You can't have two processes
> sharing a handle.

The child processes are supposed to get their own handles; there's some caching involved, but the cache checks pids. That doesn't mean the children all do get their own handles, just that they're supposed to.

Regardless, at this point I'm sufficiently convinced that things will potentially be bad (or at least annoying) enough that it warrants fixing it now, rather than just putting it off and relying on error traps.

-Dan