Thread: Testing of parallel restore with current snapshot

Testing of parallel restore with current snapshot

From
Josh Berkus
Date:
Andrew, Tom,

Just wanted to remark on my tests of this feature since the last set of 
patches.

First of all, no locking.  Yay.

Andrew's latest algorithm tends to result in building indexes on the 
same table at the same time.  This is excellent for most users; I'm on a 
client's site which is I/O bound and that approach is speeding up 
parallel load about 20% compared to the beta1 version.

In other words, don't mess with it now.  I think it's perfect.  ;-)

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


Re: Testing of parallel restore with current snapshot

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Andrew's latest algorithm tends to result in building indexes on the 
> same table at the same time.  This is excellent for most users; I'm on a 
> client's site which is I/O bound and that approach is speeding up 
> parallel load about 20% compared to the beta1 version.

Hmph ... that seems like a happenstance, because there isn't anything in
there that is specifically trying to organize things that way.  AFAIK
it's only accounting for required dependencies, not for possible
performance implications of scheduling various tasks together.

> In other words, don't mess with it now.  I think it's perfect.  ;-)

I don't want to mess with it right now either, but perhaps we should
have a TODO item to improve the intelligence of parallel restore so that
it really does try to do things this way.
        regards, tom lane


Re: Testing of parallel restore with current snapshot

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>   
>> Andrew's latest algorithm tends to result in building indexes on the 
>> same table at the same time.  This is excellent for most users; I'm on a 
>> client's site which is I/O bound and that approach is speeding up 
>> parallel load about 20% compared to the beta1 version.
>>     
>
> Hmph ... that seems like a happenstance, because there isn't anything in
> there that is specifically trying to organize things that way.  AFAIK
> it's only accounting for required dependencies, not for possible
> performance implications of scheduling various tasks together.
>
>   
>> In other words, don't mess with it now.  I think it's perfect.  ;-)
>>     
>
> I don't want to mess with it right now either, but perhaps we should
> have a TODO item to improve the intelligence of parallel restore so that
> it really does try to do things this way.
>
>             
>   

Other things being equal it schedules things in TOC order, which often 
works as we want anyway. I think there's a good case for altering the 
name sort order of pg_dump to group sub-objects of a table (indexes, 
constraints etc.) together, ie. instead of sorting by <objectname>, we'd 
sort by <tablename, objectname>. This would possibly improve the effect 
seen in parallel restore without requiring any extra intelligence there.

But I agree it's worth further study. I suspect we can probably beef up 
parallel restore quite a bit. My object for this release was to get the 
basics working, especially since I started quite late in the development 
cycle, and it was a struggle just to make the cut.

cheers

andrew





Re: Testing of parallel restore with current snapshot

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> I don't want to mess with it right now either, but perhaps we should
>> have a TODO item to improve the intelligence of parallel restore so that
>> it really does try to do things this way.

> Other things being equal it schedules things in TOC order, which often 
> works as we want anyway. I think there's a good case for altering the 
> name sort order of pg_dump to group sub-objects of a table (indexes, 
> constraints etc.) together, ie. instead of sorting by <objectname>, we'd 
> sort by <tablename, objectname>. This would possibly improve the effect 
> seen in parallel restore without requiring any extra intelligence there.

I'm not at all excited about substituting one arbitrary ordering rule
for another one ...

What is probably happening that accounts for Josh's positive experience
is that all the indexes of a particular table "come free" from the
dependency restrictions at the same instant, namely when the data load
for that table ends.  So if there's nothing else to do they'll get
scheduled together.  However, if the data load for table B finishes
before all the indexes of table A have been scheduled, then B's indexes
will start competing with A's for scheduling slots.  The performance
considerations suggest that we'd be best advised to finish out all of
A's indexes before scheduling any of B's, but I'm not sure that that's
what it will actually do.

Based on this thought, what seems to make sense as a quick-and-dirty
answer is to make sure that items get scheduled in the same order they
came free from dependency restrictions.  I don't recall whether that
is true at the moment, or how hard it might be to make it true if it
isn't already.
        regards, tom lane


Re: Testing of parallel restore with current snapshot

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> Tom Lane wrote:
>>     
>>> I don't want to mess with it right now either, but perhaps we should
>>> have a TODO item to improve the intelligence of parallel restore so that
>>> it really does try to do things this way.
>>>       
>
>   
>> Other things being equal it schedules things in TOC order, which often 
>> works as we want anyway. I think there's a good case for altering the 
>> name sort order of pg_dump to group sub-objects of a table (indexes, 
>> constraints etc.) together, ie. instead of sorting by <objectname>, we'd 
>> sort by <tablename, objectname>. This would possibly improve the effect 
>> seen in parallel restore without requiring any extra intelligence there.
>>     
>
> I'm not at all excited about substituting one arbitrary ordering rule
> for another one ...
>
> What is probably happening that accounts for Josh's positive experience
> is that all the indexes of a particular table "come free" from the
> dependency restrictions at the same instant, namely when the data load
> for that table ends.  So if there's nothing else to do they'll get
> scheduled together.  However, if the data load for table B finishes
> before all the indexes of table A have been scheduled, then B's indexes
> will start competing with A's for scheduling slots.  The performance
> considerations suggest that we'd be best advised to finish out all of
> A's indexes before scheduling any of B's, but I'm not sure that that's
> what it will actually do.
>
> Based on this thought, what seems to make sense as a quick-and-dirty
> answer is to make sure that items get scheduled in the same order they
> came free from dependency restrictions.  I don't recall whether that
> is true at the moment, or how hard it might be to make it true if it
> isn't already.
>
>             
>   

AIUI, pg_dump sorts items by <object-type, schema, objectname> and then 
does a topological sort to permute this order to reflect dependencies. 
This is the TOC order parallel restore starts with (unless the order is 
mucked with by the user via the --use-list option).  Each time it needs 
to schedule an item from the list, it chooses the first one yet to run 
that meets both these criteria:
   * all its dependencies have already been restored   * it has no locking conflicts with a currently running item.

Now, it is common practice to use the table name as a prefix of an index 
name, and this will actually cause indexes for a table to be grouped 
together in the TOC list. I think that's why Josh is seeing what he's 
seeing. If this holds, then all of the index creations for A will be 
started before any of the indexes for B, even if B's table data finishes 
restoring half way through restoring A's indexes. So your speculation 
about B's indexes contending with A's is incorrect unless their names 
sort intermingled.

During development, I did play with changing the TOC order some, but 
abandoned it, as testing didn't show any obvious gain - if anything the 
reverse. There are some remnants of this in the code.

cheers

andrew


Re: Testing of parallel restore with current snapshot

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> Based on this thought, what seems to make sense as a quick-and-dirty
>> answer is to make sure that items get scheduled in the same order they
>> came free from dependency restrictions.  I don't recall whether that
>> is true at the moment, or how hard it might be to make it true if it
>> isn't already.

> AIUI, pg_dump sorts items by <object-type, schema, objectname> and then 
> does a topological sort to permute this order to reflect dependencies. 
> This is the TOC order parallel restore starts with (unless the order is 
> mucked with by the user via the --use-list option).  Each time it needs 
> to schedule an item from the list, it chooses the first one yet to run 
> that meets both these criteria:

>     * all its dependencies have already been restored
>     * it has no locking conflicts with a currently running item.

Hmm, the locking-conflicts point puts a bit of a hole in my thinking,
because that's non-monotonic --- an item that was ready to run a moment
ago might no longer be ready to run after you start some other task that
has a lock conflict with it.

Anyway, the idea I'd had was to use two lists: a list of jobs that are
still blocked by dependencies, and a list of jobs that have no remaining
dependencies.  Whenever a job finishes, run through the former list and
transfer any no-longer-blocked items to the end of the latter list.
Then the selection rule is to run the first element of the latter list
that hasn't got any locking conflicts.  This would tend to preserve the
behavior of executing things in the order they come free, though it
wouldn't be perfect because of the locking issue.  It would be easy
enough to code up, and probably cleaner/faster than what's there now
(which has a weird hack to avoid O(N^2) behavior due to examining the
*entire* TOC list every time).  I'm unconvinced though that it would
make any meaningful difference, and I'm not in a position to do serious
testing.  Is anyone interested enough to try it if I code it?
        regards, tom lane


Re: Testing of parallel restore with current snapshot

From
Josh Berkus
Date:
Tom,

>   Is anyone interested enough to try it if I code it?

If you're patient for results, sure.  I seem to be doing a customer 
migration or upgrade every week now, so it wouldn't take me long to have 
a test subject with a fairly complex database.


-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


Re: Testing of parallel restore with current snapshot

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Tom,
>> Is anyone interested enough to try it if I code it?

> If you're patient for results, sure.  I seem to be doing a customer
> migration or upgrade every week now, so it wouldn't take me long to have
> a test subject with a fairly complex database.

Here's a draft patch that does ordering using two lists, as I proposed.
Please test to see if it's any faster or slower than the original logic.

Note: since this changes struct TocEntry, be sure to recompile all files
in src/bin/pg_dump/ after patching.

            regards, tom lane


Attachment

Re: Testing of parallel restore with current snapshot

From
Josh Berkus
Date:
Tom,

> Here's a draft patch that does ordering using two lists, as I proposed.
> Please test to see if it's any faster or slower than the original logic.

Great.  I'll need to get permission from a client; I can't host large 
enough/complex enough databases on my own system.  :-(

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


Re: Testing of parallel restore with current snapshot

From
Bruce Momjian
Date:
I don't see this as every having been applied.  What should we do with
it?

---------------------------------------------------------------------------

Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > Tom,
> >> Is anyone interested enough to try it if I code it?
> 
> > If you're patient for results, sure.  I seem to be doing a customer 
> > migration or upgrade every week now, so it wouldn't take me long to have 
> > a test subject with a fairly complex database.
> 
> Here's a draft patch that does ordering using two lists, as I proposed.
> Please test to see if it's any faster or slower than the original logic.
> 
> Note: since this changes struct TocEntry, be sure to recompile all files
> in src/bin/pg_dump/ after patching.
> 
>             regards, tom lane
> 

Content-Description: alternate-parallel-restore-1.patch.gz

[ Type application/octet-stream treated as attachment, skipping... ]

> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.comPG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard
drive,Christ can be your backup. +
 


Re: Testing of parallel restore with current snapshot

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> I don't see this as every having been applied.  What should we do with
> it?

I believe we decided that there wasn't any measurable win.
        regards, tom lane


Re: Testing of parallel restore with current snapshot

From
Gokulakannan Somasundaram
Date:
Tom,<br />    I just took the patch, but it seems to be in binary format. Can you send me the patch to me?<br /><br
/>Thanks,<br/>Gokul.<br /><br /><div class="gmail_quote">On Sat, May 30, 2009 at 3:12 AM, Tom Lane <span
dir="ltr"><<ahref="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>></span> wrote:<br /><blockquote
class="gmail_quote"style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left:
1ex;"><divclass="im">Josh Berkus <<a href="mailto:josh@agliodbs.com">josh@agliodbs.com</a>> writes:<br
/></div><divclass="im">> Tom,<br /> >> Is anyone interested enough to try it if I code it?<br /><br /> > If
you'repatient for results, sure.  I seem to be doing a customer<br /> > migration or upgrade every week now, so it
wouldn'ttake me long to have<br /> > a test subject with a fairly complex database.<br /><br /></div>Here's a draft
patchthat does ordering using two lists, as I proposed.<br /> Please test to see if it's any faster or slower than the
originallogic.<br /><br /> Note: since this changes struct TocEntry, be sure to recompile all files<br /> in
src/bin/pg_dump/after patching.<br /><br />                        regards, tom lane<br /><br /><br /><br /> --<br />
Sentvia pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br
/>To make changes to your subscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-hackers"
target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/><br /></blockquote></div><br /> 

Re: Testing of parallel restore with current snapshot

From
Andrea Suisani
Date:
On 27/02/2010 07:52, Gokulakannan Somasundaram wrote:
> Tom,
>      I just took the patch, but it seems to be in binary format. Can you send
> me the patch to me?

gunzip shuould do the trick

> Thanks,
> Gokul.
>
> On Sat, May 30, 2009 at 3:12 AM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>
>> Josh Berkus<josh@agliodbs.com>  writes:
>>> Tom,
>>>> Is anyone interested enough to try it if I code it?
>>
>>> If you're patient for results, sure.  I seem to be doing a customer
>>> migration or upgrade every week now, so it wouldn't take me long to have
>>> a test subject with a fairly complex database.