Thread: Testing of parallel restore with current snapshot
Andrew, Tom, Just wanted to remark on my tests of this feature since the last set of patches. First of all, no locking. Yay. Andrew's latest algorithm tends to result in building indexes on the same table at the same time. This is excellent for most users; I'm on a client's site which is I/O bound and that approach is speeding up parallel load about 20% compared to the beta1 version. In other words, don't mess with it now. I think it's perfect. ;-) -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > Andrew's latest algorithm tends to result in building indexes on the > same table at the same time. This is excellent for most users; I'm on a > client's site which is I/O bound and that approach is speeding up > parallel load about 20% compared to the beta1 version. Hmph ... that seems like a happenstance, because there isn't anything in there that is specifically trying to organize things that way. AFAIK it's only accounting for required dependencies, not for possible performance implications of scheduling various tasks together. > In other words, don't mess with it now. I think it's perfect. ;-) I don't want to mess with it right now either, but perhaps we should have a TODO item to improve the intelligence of parallel restore so that it really does try to do things this way. regards, tom lane
Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: > >> Andrew's latest algorithm tends to result in building indexes on the >> same table at the same time. This is excellent for most users; I'm on a >> client's site which is I/O bound and that approach is speeding up >> parallel load about 20% compared to the beta1 version. >> > > Hmph ... that seems like a happenstance, because there isn't anything in > there that is specifically trying to organize things that way. AFAIK > it's only accounting for required dependencies, not for possible > performance implications of scheduling various tasks together. > > >> In other words, don't mess with it now. I think it's perfect. ;-) >> > > I don't want to mess with it right now either, but perhaps we should > have a TODO item to improve the intelligence of parallel restore so that > it really does try to do things this way. > > > Other things being equal it schedules things in TOC order, which often works as we want anyway. I think there's a good case for altering the name sort order of pg_dump to group sub-objects of a table (indexes, constraints etc.) together, ie. instead of sorting by <objectname>, we'd sort by <tablename, objectname>. This would possibly improve the effect seen in parallel restore without requiring any extra intelligence there. But I agree it's worth further study. I suspect we can probably beef up parallel restore quite a bit. My object for this release was to get the basics working, especially since I started quite late in the development cycle, and it was a struggle just to make the cut. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> I don't want to mess with it right now either, but perhaps we should >> have a TODO item to improve the intelligence of parallel restore so that >> it really does try to do things this way. > Other things being equal it schedules things in TOC order, which often > works as we want anyway. I think there's a good case for altering the > name sort order of pg_dump to group sub-objects of a table (indexes, > constraints etc.) together, ie. instead of sorting by <objectname>, we'd > sort by <tablename, objectname>. This would possibly improve the effect > seen in parallel restore without requiring any extra intelligence there. I'm not at all excited about substituting one arbitrary ordering rule for another one ... What is probably happening that accounts for Josh's positive experience is that all the indexes of a particular table "come free" from the dependency restrictions at the same instant, namely when the data load for that table ends. So if there's nothing else to do they'll get scheduled together. However, if the data load for table B finishes before all the indexes of table A have been scheduled, then B's indexes will start competing with A's for scheduling slots. The performance considerations suggest that we'd be best advised to finish out all of A's indexes before scheduling any of B's, but I'm not sure that that's what it will actually do. Based on this thought, what seems to make sense as a quick-and-dirty answer is to make sure that items get scheduled in the same order they came free from dependency restrictions. I don't recall whether that is true at the moment, or how hard it might be to make it true if it isn't already. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> Tom Lane wrote: >> >>> I don't want to mess with it right now either, but perhaps we should >>> have a TODO item to improve the intelligence of parallel restore so that >>> it really does try to do things this way. >>> > > >> Other things being equal it schedules things in TOC order, which often >> works as we want anyway. I think there's a good case for altering the >> name sort order of pg_dump to group sub-objects of a table (indexes, >> constraints etc.) together, ie. instead of sorting by <objectname>, we'd >> sort by <tablename, objectname>. This would possibly improve the effect >> seen in parallel restore without requiring any extra intelligence there. >> > > I'm not at all excited about substituting one arbitrary ordering rule > for another one ... > > What is probably happening that accounts for Josh's positive experience > is that all the indexes of a particular table "come free" from the > dependency restrictions at the same instant, namely when the data load > for that table ends. So if there's nothing else to do they'll get > scheduled together. However, if the data load for table B finishes > before all the indexes of table A have been scheduled, then B's indexes > will start competing with A's for scheduling slots. The performance > considerations suggest that we'd be best advised to finish out all of > A's indexes before scheduling any of B's, but I'm not sure that that's > what it will actually do. > > Based on this thought, what seems to make sense as a quick-and-dirty > answer is to make sure that items get scheduled in the same order they > came free from dependency restrictions. I don't recall whether that > is true at the moment, or how hard it might be to make it true if it > isn't already. > > > AIUI, pg_dump sorts items by <object-type, schema, objectname> and then does a topological sort to permute this order to reflect dependencies. This is the TOC order parallel restore starts with (unless the order is mucked with by the user via the --use-list option). Each time it needs to schedule an item from the list, it chooses the first one yet to run that meets both these criteria: * all its dependencies have already been restored * it has no locking conflicts with a currently running item. Now, it is common practice to use the table name as a prefix of an index name, and this will actually cause indexes for a table to be grouped together in the TOC list. I think that's why Josh is seeing what he's seeing. If this holds, then all of the index creations for A will be started before any of the indexes for B, even if B's table data finishes restoring half way through restoring A's indexes. So your speculation about B's indexes contending with A's is incorrect unless their names sort intermingled. During development, I did play with changing the TOC order some, but abandoned it, as testing didn't show any obvious gain - if anything the reverse. There are some remnants of this in the code. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> Based on this thought, what seems to make sense as a quick-and-dirty >> answer is to make sure that items get scheduled in the same order they >> came free from dependency restrictions. I don't recall whether that >> is true at the moment, or how hard it might be to make it true if it >> isn't already. > AIUI, pg_dump sorts items by <object-type, schema, objectname> and then > does a topological sort to permute this order to reflect dependencies. > This is the TOC order parallel restore starts with (unless the order is > mucked with by the user via the --use-list option). Each time it needs > to schedule an item from the list, it chooses the first one yet to run > that meets both these criteria: > * all its dependencies have already been restored > * it has no locking conflicts with a currently running item. Hmm, the locking-conflicts point puts a bit of a hole in my thinking, because that's non-monotonic --- an item that was ready to run a moment ago might no longer be ready to run after you start some other task that has a lock conflict with it. Anyway, the idea I'd had was to use two lists: a list of jobs that are still blocked by dependencies, and a list of jobs that have no remaining dependencies. Whenever a job finishes, run through the former list and transfer any no-longer-blocked items to the end of the latter list. Then the selection rule is to run the first element of the latter list that hasn't got any locking conflicts. This would tend to preserve the behavior of executing things in the order they come free, though it wouldn't be perfect because of the locking issue. It would be easy enough to code up, and probably cleaner/faster than what's there now (which has a weird hack to avoid O(N^2) behavior due to examining the *entire* TOC list every time). I'm unconvinced though that it would make any meaningful difference, and I'm not in a position to do serious testing. Is anyone interested enough to try it if I code it? regards, tom lane
Tom, > Is anyone interested enough to try it if I code it? If you're patient for results, sure. I seem to be doing a customer migration or upgrade every week now, so it wouldn't take me long to have a test subject with a fairly complex database. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > Tom, >> Is anyone interested enough to try it if I code it? > If you're patient for results, sure. I seem to be doing a customer > migration or upgrade every week now, so it wouldn't take me long to have > a test subject with a fairly complex database. Here's a draft patch that does ordering using two lists, as I proposed. Please test to see if it's any faster or slower than the original logic. Note: since this changes struct TocEntry, be sure to recompile all files in src/bin/pg_dump/ after patching. regards, tom lane
Attachment
Tom, > Here's a draft patch that does ordering using two lists, as I proposed. > Please test to see if it's any faster or slower than the original logic. Great. I'll need to get permission from a client; I can't host large enough/complex enough databases on my own system. :-( -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
I don't see this as every having been applied. What should we do with it? --------------------------------------------------------------------------- Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: > > Tom, > >> Is anyone interested enough to try it if I code it? > > > If you're patient for results, sure. I seem to be doing a customer > > migration or upgrade every week now, so it wouldn't take me long to have > > a test subject with a fairly complex database. > > Here's a draft patch that does ordering using two lists, as I proposed. > Please test to see if it's any faster or slower than the original logic. > > Note: since this changes struct TocEntry, be sure to recompile all files > in src/bin/pg_dump/ after patching. > > regards, tom lane > Content-Description: alternate-parallel-restore-1.patch.gz [ Type application/octet-stream treated as attachment, skipping... ] > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.comPG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard drive,Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > I don't see this as every having been applied. What should we do with > it? I believe we decided that there wasn't any measurable win. regards, tom lane
Tom,<br /> I just took the patch, but it seems to be in binary format. Can you send me the patch to me?<br /><br />Thanks,<br/>Gokul.<br /><br /><div class="gmail_quote">On Sat, May 30, 2009 at 3:12 AM, Tom Lane <span dir="ltr"><<ahref="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>></span> wrote:<br /><blockquote class="gmail_quote"style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><divclass="im">Josh Berkus <<a href="mailto:josh@agliodbs.com">josh@agliodbs.com</a>> writes:<br /></div><divclass="im">> Tom,<br /> >> Is anyone interested enough to try it if I code it?<br /><br /> > If you'repatient for results, sure. I seem to be doing a customer<br /> > migration or upgrade every week now, so it wouldn'ttake me long to have<br /> > a test subject with a fairly complex database.<br /><br /></div>Here's a draft patchthat does ordering using two lists, as I proposed.<br /> Please test to see if it's any faster or slower than the originallogic.<br /><br /> Note: since this changes struct TocEntry, be sure to recompile all files<br /> in src/bin/pg_dump/after patching.<br /><br /> regards, tom lane<br /><br /><br /><br /> --<br /> Sentvia pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br />To make changes to your subscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-hackers" target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/><br /></blockquote></div><br />
On 27/02/2010 07:52, Gokulakannan Somasundaram wrote: > Tom, > I just took the patch, but it seems to be in binary format. Can you send > me the patch to me? gunzip shuould do the trick > Thanks, > Gokul. > > On Sat, May 30, 2009 at 3:12 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > >> Josh Berkus<josh@agliodbs.com> writes: >>> Tom, >>>> Is anyone interested enough to try it if I code it? >> >>> If you're patient for results, sure. I seem to be doing a customer >>> migration or upgrade every week now, so it wouldn't take me long to have >>> a test subject with a fairly complex database.