Re: pg_dump test instability - Mailing list pgsql-hackers

From	Stephen Frost
Subject	Re: pg_dump test instability
Date	August 27, 2018 20:41:38
Msg-id	20180827144138.GO3326@tamriel.snowman.net
In response to	Re: pg_dump test instability (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: pg_dump test instability
List	pgsql-hackers

Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> > In a non-data_only dump, the order of the tables doesn't matter, because
> > the foreign keys are added at the very end.  In parallel dumps, the
> > tables are in addition sorted by size, so the resultant order is
> > different from a single-threaded dump.  This can be seen by comparing
> > the dumped TOCs of the defaults_dir_format and defaults_parallel cases.
> > But it all happens to pass the tests right now.
>
> I noticed that business about sorting the TOC by size yesterday.
> I think that's a completely bletcherous hack, and we ought to get
> rid of it in favor of keeping the TOC order the same between parallel
> and non-parallel cases, and instead doing size comparisons during
> parallel worker dispatch.

So instead of dumping things in TOC order, we'll perform the sorting
later on, just before handing out jobs to workers?  That seems alright
to me for the most part.  One thing I do wonder about is whether we
should also be sorting by tablespace and not just size, to try to
maximize throughput.  That is, assign a parallel worker to each
tablespace, each going after the largest table in that tablespace,
before coming back around to assign the next-largest file to a second
worker on a given tablespace (presuming we have more workers than
tablespaces).  That's what we've seen work rather well in pgbackrest.
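
To make the tablespace-aware ordering concrete, here is a rough sketch of
how the dispatch order could be computed.  This is only an illustration,
not pg_dump or pgbackrest code: the function name dispatch_order and the
(name, tablespace, size) tuples are made up for the example, and it assumes
sizes are known up front.

```python
from collections import defaultdict
from itertools import zip_longest


def dispatch_order(tables):
    """Return table names in the order they would be handed to workers.

    tables: iterable of (name, tablespace, size_bytes) tuples
            (a hypothetical input shape, chosen for this sketch).
    """
    # Group tables by tablespace.
    by_tablespace = defaultdict(list)
    for name, tablespace, size in tables:
        by_tablespace[tablespace].append((size, name))

    # Within each tablespace, largest table first (name breaks ties).
    queues = [
        sorted(items, key=lambda item: (-item[0], item[1]))
        for _, items in sorted(by_tablespace.items())
    ]

    # Visit tablespaces starting with the one holding the largest table.
    queues.sort(key=lambda queue: -queue[0][0])

    # Round-robin: one table per tablespace per round, so the first N jobs
    # land on N different tablespaces whenever that is possible.
    order = []
    for round_items in zip_longest(*queues):
        order.extend(item[1] for item in round_items if item is not None)
    return order
```

With tables a (100) and b (90) on one tablespace and c (50) on another, a
plain size sort would start a and b together on the same tablespace, while
this ordering starts a and c first and only then comes back for b.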

> However, at least for the directory-format case (which I think is the
> only one supported for parallel restore), we could make it compare the
> file sizes of the TABLE DATA items.  That'd work pretty well as a proxy
> for both the amount of effort needed for table restore, and the amount
> of effort needed to build indexes on the tables afterwards.

Parallel restore also works w/ custom-format dumps.
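
For the file-size proxy Tom describes, a minimal sketch might look like the
following.  It assumes each TABLE DATA item maps to one file on disk, as in
a directory-format dump; the function name order_by_file_size and the
label-to-path mapping are hypothetical, and a custom-format archive would
need some other way to get at per-item sizes.

```python
import os


def order_by_file_size(data_files):
    """Return item labels sorted by on-disk size, largest first.

    data_files: dict mapping an item label to the path of its data file
                (a hypothetical input shape, chosen for this sketch).
    """
    sized = [(os.path.getsize(path), label) for label, path in data_files.items()]
    # Largest file first; the label breaks ties so the order is deterministic.
    sized.sort(key=lambda entry: (-entry[0], entry[1]))
    return [label for _, label in sized]
```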

Thanks!

Stephen
