Re: Warn when parallel restoring a custom dump without data offsets - Mailing list pgsql-hackers

From	David Gilman
Subject	Re: Warn when parallel restoring a custom dump without data offsets
Date	May 20, 2020 12:55:23
Msg-id	CALBH9DAmhfBw=VsKCcmig9jP2pBtuyLU5U5Tun_G=3ULs+q-dg@mail.gmail.com
In response to	Re: Warn when parallel restoring a custom dump without data offsets (Justin Pryzby <pryzby@telsasoft.com>)
Responses	Re: Warn when parallel restoring a custom dump without data offsets
List	pgsql-hackers

Your understanding of the issue is mostly correct:

> I think the PG11
> commit you mentioned (548e5097) happens to make some databases fail in
> parallel restore that previously worked (I didn't check).

Correct. If you bisect around that commit yourself, you'll see
pg_restore start failing with the expected "possibly due to
out-of-order restore request" error on offset-less dumps. It is a
known issue, but it is documented only in code comments and nowhere
user-facing, which is sending people to StackOverflow.

> If the input is unseekable, then we can
> never do a parallel restore at all.

I don't know if this is strictly true. Imagine a database dump of a
single large table with a few indexes, simple enough that everything
in the file is going to be in restore order. It might seem silly to
parallel restore a single table, but remember that pg_restore also
creates indexes in parallel. On a typical development workstation
with a few CPU cores and an SSD, that is a substantial improvement.
There are probably other corner cases where you get lucky with an
offset-less dump and the restore works. That's why my gut instinct
was to warn instead of fail.
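
To make the "restore order" point concrete, here is a toy model rather
than actual pg_restore code. It assumes a reader that can only skip
forward through the data blocks, which is roughly the situation
without data offsets. The function name and the integer IDs are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model, not pg_restore code.
 *
 * blocks[]   : IDs of the data blocks in the order they sit in the file
 * requests[] : IDs in the order the restore asks for them
 *
 * The reader may only move forward, so a request succeeds only if its
 * block lies at or after the current position.
 */
bool
forward_only_restore_ok(const int *blocks, size_t nblocks,
                        const int *requests, size_t nrequests)
{
    size_t pos = 0;

    assert(blocks != NULL || nblocks == 0);
    assert(requests != NULL || nrequests == 0);

    for (size_t r = 0; r < nrequests; r++)
    {
        /* Skip forward until the requested block turns up. */
        while (pos < nblocks && blocks[pos] != requests[r])
            pos++;

        /* Ran off the end: the block was behind us (or missing). */
        if (pos == nblocks)
            return false;

        pos++;                  /* consume the block we just found */
    }
    return true;
}
```

With blocks {1, 2, 3}, requests in file order (including ones that
skip a block) succeed, while a request for 2 followed by 1 fails. The
second case is the one that produces the out-of-order error.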

> If it *is* seekable, could we
> make _PrintTocData rewind if it gets to EOF using ftello(SEEK_SET, 0)
> and re-scan again from the beginning?  Would you want to try that ?

I will try this and report back. I will also see if I can get an strace.
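
For reference, a minimal sketch of the rewind-and-rescan idea, not the
eventual patch. It assumes a made-up file layout (a flat sequence of
int-sized block IDs) and a hypothetical function name; the real custom
format and _PrintTocData are more involved. Note that the POSIX call
for repositioning is fseeko(fh, 0, SEEK_SET); ftello only reports the
current position.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical layout for illustration only: the file is a flat
 * sequence of int-sized block IDs.
 *
 * Scan forward from the current position for `wanted`.  On EOF, rewind
 * once and keep scanning until we are back where we started.  Returns
 * the offset of the matching ID, or -1 if it is absent or on I/O error.
 */
off_t
find_block_with_rewind(FILE *fh, int wanted)
{
    off_t   start;
    int     wrapped = 0;
    int     id;

    assert(fh != NULL);

    start = ftello(fh);
    if (start < 0)
        return -1;              /* not seekable, cannot rewind */

    for (;;)
    {
        off_t   here = ftello(fh);

        if (here < 0)
            return -1;
        if (wrapped && here >= start)
            return -1;          /* full cycle without a match */

        if (fread(&id, sizeof(id), 1, fh) != 1)
        {
            if (ferror(fh) || wrapped)
                return -1;
            /* First EOF: go back to the beginning and re-scan. */
            if (fseeko(fh, 0, SEEK_SET) != 0)
                return -1;
            wrapped = 1;
            continue;
        }

        if (id == wanted)
            return here;
    }
}
```

The scan stops after one full cycle so a missing block still ends in
an error instead of looping forever. On unseekable input ftello fails
and the function gives up right away.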

-- 
David Gilman
:DG<
