Thread: small parallel restore optimization

small parallel restore optimization

From

Andrew Dunstan

Date:

06 March 2009, 13:01:42

Here's a little optimization for the parallel restore code, that 
inhibits reopening the archive file unless the worker will actually need 
to read from the file (i.e. a data member). It seems to work OK on both 
Linux and Windows, and I propose to apply it in a day or two.

I've seen a recent error that suggests we are clobbering memory 
somewhere in the parallel code, as well as Olivier Prennant's reported 
error that suggests the same thing, although I'm blessed if I can see 
where it might be. Maybe some more eyeballs on the code would help.


cheers

andrew





Index: pg_backup_archiver.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.165
diff -c -u -r1.165 pg_backup_archiver.c
cvs diff: conflicting specifications of output style
--- pg_backup_archiver.c        5 Mar 2009 14:51:10 -0000       1.165
+++ pg_backup_archiver.c        6 Mar 2009 16:51:41 -0000
@@ -3471,8 +3471,11 @@        *        * Note: on Windows, since we are using threads not processes, this        *
*doesn't*close the original file pointer but just open a new 
 
one.
+        *
+        * Only do this if we actually need to read the file for data.        */
-       (AH->ReopenPtr) (AH);
+       if ( te->section == SECTION_DATA )
+          (AH->ReopenPtr) (AH);       /*        * We need our own database connection, too
@@ -3490,7 +3493,9 @@       PQfinish(AH->connection);       AH->connection = NULL;
-       (AH->ClosePtr) (AH);
+       /* close the file if we reopened it */
+       if ( te->section == SECTION_DATA )
+          (AH->ClosePtr) (AH);       if (retval == 0 && AH->public.n_errors)          retval = WORKER_IGNORED_ERRORS;

Re: small parallel restore optimization

From

Tom Lane

Date:

06 March 2009, 13:20:27

Andrew Dunstan <andrew@dunslane.net> writes:
> Here's a little optimization for the parallel restore code, that 
> inhibits reopening the archive file unless the worker will actually need 
> to read from the file (i.e. a data member). It seems to work OK on both 
> Linux and Windows, and I propose to apply it in a day or two.

I think you should close the file immediately at fork if you're not
going to reopen it --- otherwise it's a foot-gun waiting to fire.
IOW, not this, but something more like
if (te->section == SECTION_DATA)    (AH->ReopenPtr) (AH);else    (AH->ClosePtr) (AH);
... worker task ...
if (te->section == SECTION_DATA)    (AH->ClosePtr) (AH);

> I've seen a recent error that suggests we are clobbering memory 
> somewhere in the parallel code, as well as Olivier Prennant's reported 
> error that suggests the same thing, although I'm blessed if I can see 
> where it might be. Maybe some more eyeballs on the code would help.

Can you put together even a weakly reproducible test case?  Something
that only fails every tenth or hundredth time would still help.
        regards, tom lane

Re: small parallel restore optimization

From

Guillaume Smet

Date:

06 March 2009, 13:59:02

On Fri, Mar 6, 2009 at 6:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Can you put together even a weakly reproducible test case?  Something
> that only fails every tenth or hundredth time would still help.

It seems that Olivier can reproduce the problem at will on Unixware. I
don't know if it's easy to find useful information to debug the
problem on this platform though.

See http://archives.postgresql.org/pgsql-hackers/2009-03/msg00201.php

--
Guillaume

Re: small parallel restore optimization

From

ohp@pyrenet.fr

Date:

07 March 2009, 03:41:32

On Fri, 6 Mar 2009, Guillaume Smet wrote:

> Date: Fri, 6 Mar 2009 18:58:58 +0100
> From: Guillaume Smet <guillaume.smet@gmail.com>
> To: Tom Lane <tgl@sss.pgh.pa.us>
> Cc: Andrew Dunstan <andrew@dunslane.net>,
>     PostgreSQL-development <pgsql-hackers@postgresql.org>, ohp@pyrenet.fr
> Subject: Re: [HACKERS] small parallel restore optimization
> 
> On Fri, Mar 6, 2009 at 6:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Can you put together even a weakly reproducible test case?  Something
>> that only fails every tenth or hundredth time would still help.
not sure, none of my tests did fail at the same place.
the only thing I could come  with is a calloc(1,12) that seems to alloc 
mem for filename, in that case sdewitte.dmp; so  the alloc is not counting 
the null char at the end.
not sure it could explain everything though  >
> It seems that Olivier can reproduce the problem at will on Unixware. I
> don't know if it's easy to find useful information to debug the
> problem on this platform though.
>
> See http://archives.postgresql.org/pgsql-hackers/2009-03/msg00201.php
>
>

-- 
Olivier PRENANT                    Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges                +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                       +33-6-07-63-80-64 (GSM)
FRANCE                          Email: ohp@pyrenet.fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)

Re: small parallel restore optimization

From

Tom Lane

Date:

07 March 2009, 12:46:53

ohp@pyrenet.fr writes:
> the only thing I could come  with is a calloc(1,12) that seems to alloc 
> mem for filename, in that case sdewitte.dmp; so  the alloc is not counting 
> the null char at the end.

Where do you see that?  The memtool dump you sent doesn't show it AFAICS.
        regards, tom lane

Re: small parallel restore optimization

From

Andrew Dunstan

Date:

07 March 2009, 19:51:14


Tom Lane wrote:
> ohp@pyrenet.fr writes:
>   
>> the only thing I could come  with is a calloc(1,12) that seems to alloc 
>> mem for filename, in that case sdewitte.dmp; so  the alloc is not counting 
>> the null char at the end.
>>     
>
> Where do you see that?  The memtool dump you sent doesn't show it AFAICS.
>
>             
>   

And the only copying of the filename that I can find is done with strdup().

cheers

andrew

Re: small parallel restore optimization

From

Andrew Dunstan

Date:

08 March 2009, 21:45:52

Tom Lane wrote:
>> I've seen a recent error that suggests we are clobbering memory 
>> somewhere in the parallel code, as well as Olivier Prennant's reported 
>> error that suggests the same thing, although I'm blessed if I can see 
>> where it might be. Maybe some more eyeballs on the code would help.
>>     
>
> Can you put together even a weakly reproducible test case?  Something
> that only fails every tenth or hundredth time would still help.
>
>             
>   

I have found the source of the problem I saw. dumputils.c:fmtId() uses a 
static PQExpBuffer which it initialises the first time it's called. This 
gets clobbered by simultaneous calls by Windows threads.

I could just make it auto and set it up on each call, but that could 
result in a non-trivial memory leak ... it's probably called a great 
many times. Or I could provide a parallel version where we pass in a 
PQExpBuffer that we create, one per thread, and is used by anything 
called by the parallel code. That seems like a bit of a potential 
footgun, though.

Has anyone got a better plan?

cheers

andrew

Re: small parallel restore optimization

From

Tom Lane

Date:

08 March 2009, 22:20:40

Andrew Dunstan <andrew@dunslane.net> writes:
> I have found the source of the problem I saw. dumputils.c:fmtId() uses a 
> static PQExpBuffer which it initialises the first time it's called. This 
> gets clobbered by simultaneous calls by Windows threads.

Ugh.  But that doesn't explain the original trouble report on Unixware.

> I could just make it auto and set it up on each call, but that could 
> result in a non-trivial memory leak ... it's probably called a great 
> many times. Or I could provide a parallel version where we pass in a 
> PQExpBuffer that we create, one per thread, and is used by anything 
> called by the parallel code. That seems like a bit of a potential 
> footgun, though.

I think we should try hard to keep this localized to fmtId(), rather
than changing the callers --- the latter would be a huge readability
hit.  Is there a reasonable way to have fmtId use thread-local storage
for its PQExpBuffer pointer on Windows?
        regards, tom lane

Re: small parallel restore optimization

From

Alvaro Herrera

Date:

08 March 2009, 22:28:46

Andrew Dunstan wrote:

> I have found the source of the problem I saw. dumputils.c:fmtId() uses a  
> static PQExpBuffer which it initialises the first time it's called. This  
> gets clobbered by simultaneous calls by Windows threads.
>
> I could just make it auto and set it up on each call, but that could  
> result in a non-trivial memory leak ... it's probably called a great  
> many times. Or I could provide a parallel version where we pass in a  
> PQExpBuffer that we create, one per thread, and is used by anything  
> called by the parallel code. That seems like a bit of a potential  
> footgun, though.

Could you use a different static PQExpBuffer on each thread?
pthread_getspecific sort of thing, only to be used on Windows.

BTW does fmtQualifiedId have the same problem?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: small parallel restore optimization

From

Andrew Dunstan

Date:

08 March 2009, 22:45:51


Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>   
>> I have found the source of the problem I saw. dumputils.c:fmtId() uses a  
>> static PQExpBuffer which it initialises the first time it's called. This  
>> gets clobbered by simultaneous calls by Windows threads.
>>
>> I could just make it auto and set it up on each call, but that could  
>> result in a non-trivial memory leak ... it's probably called a great  
>> many times. Or I could provide a parallel version where we pass in a  
>> PQExpBuffer that we create, one per thread, and is used by anything  
>> called by the parallel code. That seems like a bit of a potential  
>> footgun, though.
>>     
>
> Could you use a different static PQExpBuffer on each thread?
> pthread_getspecific sort of thing, only to be used on Windows.
>   

For MSVC we could declare it with "_declspec(thread)" and it would be 
allocated in thread-local storage, but it looks like this isn't 
supported on Mingw.

> BTW does fmtQualifiedId have the same problem?
>
>   

Yes, but it's not called in threaded code - it's only in pg_dump and 
only pg_restore is threaded.

cheers

andrew

Re: small parallel restore optimization

From

Tom Lane

Date:

08 March 2009, 23:16:07

Andrew Dunstan <andrew@dunslane.net> writes:
> Alvaro Herrera wrote:
>> Could you use a different static PQExpBuffer on each thread?
>> pthread_getspecific sort of thing, only to be used on Windows.

> For MSVC we could declare it with "_declspec(thread)" and it would be 
> allocated in thread-local storage, but it looks like this isn't 
> supported on Mingw.

Some googling suggests that thread-local storage is supported on latest
release (not clear if it's exactly the same syntax though :-().

Worst case, we could say that parallel restore isn't supported on
mingw.  I'm not entirely sure why we bother with that platform at all...
        regards, tom lane

Re: small parallel restore optimization

From

Alvaro Herrera

Date:

08 March 2009, 23:34:28

Tom Lane wrote:

> Worst case, we could say that parallel restore isn't supported on
> mingw.  I'm not entirely sure why we bother with that platform at all...

I think you're confusing it with cygwin ...

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: small parallel restore optimization

From

Magnus Hagander

Date:

09 March 2009, 06:48:23

Alvaro Herrera wrote:
> Tom Lane wrote:
> 
>> Worst case, we could say that parallel restore isn't supported on
>> mingw.  I'm not entirely sure why we bother with that platform at all...
> 
> I think you're confusing it with cygwin ...

Yeah. Much as I hate working around the quirks of mingw, I definitely
think we need to keep that platform.

(yes, cygwin is another story, but let's not repeat that discussion now)

//Magnus

Re: small parallel restore optimization

From

Magnus Hagander

Date:

09 March 2009, 06:53:17

Andrew Dunstan wrote:
> 
> 
> Alvaro Herrera wrote:
>> Andrew Dunstan wrote:
>>
>>  
>>> I have found the source of the problem I saw. dumputils.c:fmtId()
>>> uses a  static PQExpBuffer which it initialises the first time it's
>>> called. This  gets clobbered by simultaneous calls by Windows threads.
>>>
>>> I could just make it auto and set it up on each call, but that could 
>>> result in a non-trivial memory leak ... it's probably called a great 
>>> many times. Or I could provide a parallel version where we pass in a 
>>> PQExpBuffer that we create, one per thread, and is used by anything 
>>> called by the parallel code. That seems like a bit of a potential 
>>> footgun, though.
>>>     
>>
>> Could you use a different static PQExpBuffer on each thread?
>> pthread_getspecific sort of thing, only to be used on Windows.
>>   
> 
> For MSVC we could declare it with "_declspec(thread)" and it would be
> allocated in thread-local storage, but it looks like this isn't
> supported on Mingw.

Yeah, _declspec(thread) would be the easy way to do it. But you can also
use the TlsAlloc(), TlsSetValue() and friends functions directly. We do
this to use TLS in ecpg.

It requires some coding around, but for ecpg we did a macro that makes
it behave like the posix functions (see
src/interfaces/ecpg/include/ecpg-pthread-win32.h)

//Magnus

Re: small parallel restore optimization

From

Andrew Dunstan

Date:

09 March 2009, 12:47:52


Magnus Hagander wrote:
> Andrew Dunstan wrote:
>   
>> Alvaro Herrera wrote:
>>     
>>> Andrew Dunstan wrote:
>>>
>>>  
>>>       
>>>> I have found the source of the problem I saw. dumputils.c:fmtId()
>>>> uses a  static PQExpBuffer which it initialises the first time it's
>>>> called. This  gets clobbered by simultaneous calls by Windows threads.
>>>>
>>>> I could just make it auto and set it up on each call, but that could 
>>>> result in a non-trivial memory leak ... it's probably called a great 
>>>> many times. Or I could provide a parallel version where we pass in a 
>>>> PQExpBuffer that we create, one per thread, and is used by anything 
>>>> called by the parallel code. That seems like a bit of a potential 
>>>> footgun, though.
>>>>     
>>>>         
>>> Could you use a different static PQExpBuffer on each thread?
>>> pthread_getspecific sort of thing, only to be used on Windows.
>>>   
>>>       
>> For MSVC we could declare it with "_declspec(thread)" and it would be
>> allocated in thread-local storage, but it looks like this isn't
>> supported on Mingw.
>>     
>
> Yeah, _declspec(thread) would be the easy way to do it. But you can also
> use the TlsAlloc(), TlsSetValue() and friends functions directly. We do
> this to use TLS in ecpg.
>
> It requires some coding around, but for ecpg we did a macro that makes
> it behave like the posix functions (see
> src/interfaces/ecpg/include/ecpg-pthread-win32.h)
>
>
>   


Yeah, we'll just have to call TlsAlloc to set the thread handle 
somewhere near program start, but that shouldn't be too intrusive.

cheers

andrew