Re: parallel restore vs. windows - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: parallel restore vs. windows
Date
Msg-id 4948F65E.4020504@dunslane.net
In response to Re: parallel restore vs. windows  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
List pgsql-hackers

ITAGAKI Takahiro wrote:
> Andrew Dunstan <andrew@dunslane.net> wrote:
>
>   
>> I did this, but it turned out that the problem was a logic error that I 
>> found once I had managed to get a working debugger. However, the Windows 
>> thread code should now be more robust, so thanks to Andrew and Magnus 
>> for the suggestions.
>>     
>
> Hello, I tested parallel restore on Windows.
> I have some random comments about it:
>   


Thanks for this.

> * Two compiler warnings.
> pg_backup_custom.c: In function `_PrintTocData':
> pg_backup_custom.c:437: warning: unused variable `ctx'
> pg_backup_custom.c: In function `_ReopenArchive':
> pg_backup_custom.c:849: warning: unused variable `ctx'
>   


Will be fixed in code cleanup.

> * No description about new options in pg_restore --help.
> There are no help messages about multi-thread (-m) and
> truncate-before-load options.
>   

Will fix.

> * multi-thread option is ignored if --data-only is on.
> Is it an intended behavior? Even if so, we'd better to have
> warning messages here.
>   

Not intended, unless my memory is fading. I will check.

> * Threads, forked processes and connections are disposed per entry.
> I think it's a designed behavior, but there might be room for
> improvement. The present implementation is slower when there
> are many small objects. If we can specialize in thread-based
> implementation, thread pooling and connections pooling are
> typically used in the context. -- it might be a ToDo item in 8.5.
>   


Yes. I only got threading working at all a few days ago. I think 
your suggestion is a good one, and we should probably converge on a 
threaded implementation and then look at using pooling. However, as you 
say, that would be work for the 8.5 timeframe.
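
To make the pooling idea concrete, here is a minimal sketch of a fixed pool of workers pulling entries from a shared queue, instead of creating and destroying a thread per entry. It is not the pg_restore code: it assumes POSIX threads rather than the Windows thread API, ignores dependencies between entries, and the names RestoreItem, WorkQueue, restore_item and run_pool are hypothetical. In a pooled design each worker would also open its database connection once, at the point marked in worker_main.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one archive entry to be restored. */
typedef struct {
    int id;
    int done;               /* set to 1 once a worker has processed it */
} RestoreItem;

/* Shared queue: workers claim the next unprocessed index under a lock. */
typedef struct {
    RestoreItem *items;
    int nitems;
    int next;               /* index of the next unclaimed item */
    pthread_mutex_t lock;
} WorkQueue;

/* Placeholder for the real per-entry restore work. */
static void restore_item(RestoreItem *item)
{
    item->done = 1;
}

static void *worker_main(void *arg)
{
    WorkQueue *q = arg;

    /* Assumption: a pooled worker would open its connection once here
     * and reuse it for every item it claims. */
    for (;;) {
        int idx;

        pthread_mutex_lock(&q->lock);
        idx = (q->next < q->nitems) ? q->next++ : -1;
        pthread_mutex_unlock(&q->lock);

        if (idx < 0)
            break;          /* queue drained, worker exits */
        restore_item(&q->items[idx]);
    }
    return NULL;
}

/* Runs all items on a pool of nworkers threads.
 * Returns the number of items processed, or -1 if allocation fails. */
int run_pool(RestoreItem *items, int nitems, int nworkers)
{
    WorkQueue q = { items, nitems, 0 };
    pthread_t *threads;
    int started = 0;
    int count = 0;
    int i;

    assert(nworkers > 0);
    threads = malloc(sizeof(pthread_t) * nworkers);
    if (threads == NULL)
        return -1;
    pthread_mutex_init(&q.lock, NULL);

    /* Threads are created once, not once per item. */
    for (i = 0; i < nworkers; i++) {
        if (pthread_create(&threads[i], NULL, worker_main, &q) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&q.lock);
    free(threads);

    for (i = 0; i < nitems; i++)
        count += items[i].done;
    return count;
}
```

With many small objects, the setup and teardown cost is paid nworkers times in this shape rather than once per entry, which is the saving the suggestion is after.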

> ----
> I have no idea about performance because I don't have multi-core
> machine for windows. Parallel restore seems to be slower than
> serial restore on single-cpu machine.
>   

Not surprising. There is extra connection and worker setup/teardown, 
dependency housekeeping and context switching involved. However, I'd be 
surprised if the overhead were huge.


cheers

andrew

