Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>
>> Andrew's latest algorithm tends to result in building indexes on the
>> same table at the same time. This is excellent for most users; I'm on a
>> client's site which is I/O bound and that approach is speeding up
>> parallel load about 20% compared to the beta1 version.
>>
>
> Hmph ... that seems like a happenstance, because there isn't anything in
> there that is specifically trying to organize things that way. AFAIK
> it's only accounting for required dependencies, not for possible
> performance implications of scheduling various tasks together.
>
>
>> In other words, don't mess with it now. I think it's perfect. ;-)
>>
>
> I don't want to mess with it right now either, but perhaps we should
> have a TODO item to improve the intelligence of parallel restore so that
> it really does try to do things this way.
>
>
>
Other things being equal, it schedules items in TOC order, which often
works as we want anyway. I think there's a good case for altering the
name sort order of pg_dump to group sub-objects of a table (indexes,
constraints, etc.) together, i.e. instead of sorting by <objectname>, we'd
sort by <tablename, objectname>. This would possibly improve the effect
seen in parallel restore without requiring any extra intelligence there.
But I agree it's worth further study. I suspect we can probably beef up
parallel restore quite a bit. My objective for this release was to get the
basics working, especially since I started quite late in the development
cycle, and it was a struggle just to make the cut.
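
To make the sort-order idea concrete, here is a rough Python sketch. It is
not pg_dump code: the DumpObject tuple, its fields, and the object names are
all invented for illustration. It only shows how sorting by
<tablename, objectname> keeps a table's sub-objects adjacent, where sorting
by <objectname> alone interleaves them.

```python
from typing import NamedTuple

class DumpObject(NamedTuple):
    # Hypothetical stand-in for a dump entry; not a pg_dump structure.
    name: str   # the object's own name
    table: str  # the table the object belongs to
    kind: str   # "index", "constraint", ...

# Made-up sub-objects of two tables, in no particular order.
objects = [
    DumpObject("date_idx", "orders", "index"),
    DumpObject("email_idx", "customers", "index"),
    DumpObject("customers_pkey", "customers", "constraint"),
    DumpObject("orders_pkey", "orders", "constraint"),
    DumpObject("name_idx", "customers", "index"),
]

def sort_by_name(objs):
    """Sort by <objectname> only."""
    return sorted(objs, key=lambda o: o.name)

def sort_by_table_then_name(objs):
    """Sort by <tablename, objectname> so sub-objects of a table group together."""
    return sorted(objs, key=lambda o: (o.table, o.name))

if __name__ == "__main__":
    print("by name:       ", [(o.table, o.name) for o in sort_by_name(objects)])
    print("by table, name:", [(o.table, o.name) for o in sort_by_table_then_name(objects)])
```

With the name-only sort, the two tables' objects alternate; with the
two-part key, everything for customers comes before everything for orders,
so a scheduler that simply follows that order would tend to pick up one
table's indexes together.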
cheers
andrew