On 27.01.2012 18:46, Robert Haas wrote:
> On Sun, Jan 15, 2012 at 1:01 PM, Joachim Wieland <joe@mcknight.de> wrote:
>> In parallel restore, the master closes its own connection to the
>> database before forking off worker processes, just as it does now. In
>> parallel dump, however, we need to hold the master's connection open
>> so that we can detect deadlocks. The issue is that somebody could have
>> requested an exclusive lock after the master initially requested a
>> shared lock on all tables. Therefore, the worker process also requests
>> a shared lock on the table with NOWAIT, and if this fails, we know
>> that a conflicting lock request arrived in the meantime and that we
>> need to abort the dump.
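
To make that mechanism concrete, the worker-side probe amounts to
something like the sketch below. This is just an illustration over a
plain libpq connection; lock_table_nowait() is a name I made up, not
what the patch calls it, and qualified_name is assumed to be already
safely quoted. It also has to run inside the worker's dump transaction,
since LOCK TABLE requires a transaction block.

#include <stdbool.h>
#include <stdio.h>
#include <libpq-fe.h>

/* Hypothetical worker-side probe, for illustration only. */
static bool
lock_table_nowait(PGconn *conn, const char *qualified_name)
{
    char        query[1024];
    PGresult   *res;
    bool        ok;

    snprintf(query, sizeof(query),
             "LOCK TABLE %s IN ACCESS SHARE MODE NOWAIT", qualified_name);
    res = PQexec(conn, query);

    /*
     * If somebody queued a conflicting lock after the master took its
     * initial ACCESS SHARE lock, NOWAIT makes this fail immediately
     * instead of the worker blocking behind the waiter.
     */
    ok = (PQresultStatus(res) == PGRES_COMMAND_OK);
    PQclear(res);
    return ok;
}
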
>
> I think this is an acceptable limitation, but the window where it can
> happen seems awfully wide right now. As things stand, it seems like
> we don't try to lock the table in the child until we're about to
> access it, which means that, on a large database, we could dump out
> 99% of the database and then be forced to abort the dump because of a
> conflicting lock on the very last table. We could fix that by having
> every child lock every table right at the beginning, so that all
> possible failures of this type would happen before we do any work, but
> that will eat up a lot of lock table space. It would be nice if the
> children could somehow piggyback on the parent's locks, but I don't
> see any obvious way to make that work. Maybe we just have to live
> with it the way it is, but I worry that people whose dumps fail 10
> hours into a 12 hour parallel dump are going to be grumpy.
If the master process keeps the locks it acquired at the beginning, then
whenever a child fails to get its lock on a table, you could fall back
to dumping that table over the master's connection.
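
A minimal sketch of what I mean, reusing the hypothetical
lock_table_nowait() from above; dump_table() and queue_for_master() are
likewise made-up names:

static void
worker_dump_table(PGconn *worker_conn, const char *qualified_name)
{
    if (lock_table_nowait(worker_conn, qualified_name))
        dump_table(worker_conn, qualified_name); /* normal parallel path */
    else
        queue_for_master(qualified_name); /* master still holds the lock */
}

The tables that fall back get dumped serially over the master's
connection, but that only penalizes the contended tables instead of
aborting the whole run.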
-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com