I've rounded this patch out with a test and I've set up the commitfest
website for this thread. The latest patches are attached and I think
they are ready for review.
On Wed, May 20, 2020 at 11:05 PM David Gilman <davidgilman1@gmail.com> wrote:
>
> I did some more digging. To keep everyone on the same page there are
> four different ways to order TOCs:
>
> 1. topological order,
> 2. dataLength order, size of the table, is always zero when pg_dump can't seek,
> 3. dumpId order, which should be thought as random but roughly
> correlates to topological order to make things fun,
> 4. file order, the order that tables are physically stored in the
> custom dump file.
>
> Without being able to seek backwards a parallel restore of the custom
> dump archive format has to be ordered by #1 and #4. The reference
> counting that reduce_dependencies does inside of the parallel restore
> logic upholds ordering #1. Unfortunately, 548e50976ce changed
> TocEntrySizeCompare (which is used to break ties within #1) to order
> by #2, then by #3. This most often breaks on dumps written by pg_dump
> without seeks (everything has a dataLength of zero) as it then falls
> back to #3 ordering every time. But, because nothing in pg_restore
> does any ordering by #4 you could potentially run into this with any
> custom dump so I think it's a regression.
>
> For some troubleshooting I changed ready_list_sort to never call
> qsort. This fixes the problem by never ordering by #3, leaving things
> in #4 order, but breaks the new algorithm introduced in 548e50976ce.
>
> I did what Justin suggested earlier and it works great. Parallel
> restore requires seekable input (enforced elsewhere) so everyone's
> parallel restores should work again.
>
> On Wed, May 20, 2020 at 10:48 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > David Gilman <davidgilman1@gmail.com> writes:
> > >> I think the PG11
> > >> commit you mentioned (548e5097) happens to make some databases fail in
> > >> parallel restore that previously worked (I didn't check).
> >
> > > Correct, if you do the bisect around that yourself you'll see
> > > pg_restore start failing with the expected "possibly due to
> > > out-of-order restore request" on offset-less dumps.
> >
> > Yeah. Now, the whole point of that patch was to decouple the restore
> > order from the dump order ... but with an offset-less dump file, we
> > can't do that, or at least the restore order is greatly constrained.
> > I wonder if it'd be sensible for pg_restore to use a different parallel
> > scheduling algorithm if it notices that the input lacks offsets.
> > (There could still be some benefit from parallelism, just not as much.)
> > No idea if this is going to be worth the trouble, but it probably
> > is worth looking into.
> >
> > regards, tom lane
>
>
>
> --
> David Gilman
> :DG<
--
David Gilman
:DG<