Re: parallel pg_restore - WIP patch - Mailing list pgsql-hackers

From Russell Smith
Subject Re: parallel pg_restore - WIP patch
Date
Msg-id 48DCDBF8.1020501@pws.com.au
In response to Re: parallel pg_restore - WIP patch  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: parallel pg_restore - WIP patch  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
Andrew Dunstan wrote:
>> Do we know why we experience "tuple concurrently updated" errors if we
>> spawn threads too fast?
>>   
>
> No. That's an open item.

Okay, I'll see if I can have a little more of a look into it.  No
promises as the restore isn't playing nicely.
>
>>
>> the memory context is shared across all threads.  Which means that it's
>> possible the memory contexts are stomping on each other.  My GDB skills
>> are not up to being able to reproduce this in a gdb session as there are
>> forks going on all over the place.  And if you process them in a serial
>> fashion, there aren't any errors.  I'm not sure of the fix for this.
>> But in a parallel environment it doesn't seem possible to store the
>> memory context in the AH.
>>   
>
>
> There are no threads, hence nothing is shared. fork() creates a new
> process, not a new thread, and all they share are file descriptors.
>
> However, there does seem to be something odd happening with the
> compression lib, which I will investigate. Thanks for the report.

I'm sorry, I meant processes there.  I'm aware there are no threads. 
But my feeling was that when you forked with open files you got all of
the open file properties, including positions, and as you dupped the
descriptor, you share all that it's pointing to with every other copy of
the descriptor.  My brief research on that shows that in 2005 there was
a kernel mailing list discussion on this issue. 
http://mail.nl.linux.org/kernelnewbies/2005-09/msg00479.html was quite
informative for me.  I again could be wrong but worth a read.  If it is
true, then the file needs to be reopened by each child, it can't use the
duplicated descriptor.  I haven't had a chance to implement and test this
as it's late here.  But I'd take a stab that it will solve the
compression library problems.
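
A minimal sketch of the behaviour I mean, assuming a POSIX system.  None
of this is from the patch or from pg_restore: the function names, the
scratch file path and the 100-byte / 10-byte sizes are made up purely for
illustration.  The child reads either through the descriptor it inherited
across fork() or through its own open() of the same path, and the parent
then checks where its own file offset ended up.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a 100-byte scratch file from a mkstemp()
 * template and return a descriptor positioned at offset 0, or -1. */
static int make_scratch_file(char *path_template)
{
    char zeros[100] = {0};
    int fd = mkstemp(path_template);

    if (fd < 0)
        return -1;
    if (write(fd, zeros, sizeof zeros) != (ssize_t) sizeof zeros ||
        lseek(fd, 0, SEEK_SET) < 0)
    {
        close(fd);
        unlink(path_template);
        return -1;
    }
    return fd;
}

/* Fork a child that reads 10 bytes, either through the inherited
 * descriptor (reopen == 0) or through its own open() of the same path
 * (reopen != 0).  Returns the parent's file offset after the child has
 * exited, or -1 on any error. */
long parent_offset_after_child_read(int reopen)
{
    char path[] = "/tmp/fork_offset_demo_XXXXXX";   /* illustrative path */
    char buf[10];
    int status;
    off_t pos;
    pid_t pid;
    int fd = make_scratch_file(path);

    if (fd < 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }
    if (pid == 0)
    {
        /* Child: the inherited fd shares its offset with the parent;
         * a fresh open() gets an independent offset. */
        int cfd = reopen ? open(path, O_RDONLY) : fd;

        if (cfd < 0 || read(cfd, buf, sizeof buf) != (ssize_t) sizeof buf)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the child, then look at our own offset. */
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }
    pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    unlink(path);
    return (long) pos;
}
```

With the inherited descriptor (reopen == 0) the parent's offset comes back
as 10, because the child's read moved the shared position.  With the child
reopening the file (reopen != 0) the parent's offset stays at 0.  That is
the difference I'd expect to matter once several children are reading the
same archive file.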

I hope this helps, not hinders

Russell.

