From 130b7614b4592ea2f0fee44b7a7b59703e80806e Mon Sep 17 00:00:00 2001
From: David Gilman
Date: Wed, 20 May 2020 22:49:28 -0400
Subject: [PATCH 2/3] Scan all TOCs when restoring a custom dump file without
 offsets

TOC requests are not guaranteed to come in disk order. If the custom
dump file was written with data offsets pg_restore can seek directly to
the data, making request order irrelevant. If there are no data offsets
pg_restore would only find the TOC if it happened to be immediately
after the current one on disk. 548e50976 changed how pg_restore's
parallel algorithm worked at the cost of greatly increasing out-of-order
TOC requests.

This patch changes pg_restore to scan through all TOCs to service a TOC
read request only when restoring a custom dump file without data
offsets. The odds of getting a successful parallel restore go way up at
the cost of a bunch of extra tiny reads when pg_restore starts up.

The pg_restore manpage now warns against running pg_dump with an
unseekable output file and suggests that if you plan on doing a parallel
restore of a custom dump you should run pg_dump with --file.
---
 doc/src/sgml/ref/pg_restore.sgml   |  9 +++++++++
 src/bin/pg_dump/pg_backup_custom.c | 20 ++++++++++++++++++--
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git doc/src/sgml/ref/pg_restore.sgml doc/src/sgml/ref/pg_restore.sgml
index 232f88024f..fd23d6720c 100644
--- doc/src/sgml/ref/pg_restore.sgml
+++ doc/src/sgml/ref/pg_restore.sgml
@@ -279,6 +279,15 @@ PostgreSQL documentation
         jobs cannot be used together with the option .
+
+
+       The custom archive format may not work with the
+       option if the archive was originally created by writing the archive
+       to an unseekable output file. For the best concurrent restoration
+       performance with the custom archive format use
+       pg_dump's option
+       to specify an output file.
+
diff --git src/bin/pg_dump/pg_backup_custom.c src/bin/pg_dump/pg_backup_custom.c
index 369dcea429..8dfb6581d1 100644
--- src/bin/pg_dump/pg_backup_custom.c
+++ src/bin/pg_dump/pg_backup_custom.c
@@ -423,9 +423,25 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
 	{
 		/*
 		 * We cannot seek directly to the desired block. Instead, skip over
-		 * block headers until we find the one we want. This could fail if we
-		 * are asked to restore items out-of-order.
+		 * block headers until we find the one we want.
 		 */
+
+		if (ctx->hasSeek)
+		{
+			/*
+			 * Start searching from the first block. If this is possible, we're
+			 * all but guaranteed to find the block, although at the cost of a bunch
+			 * of redundant, tiny reads. TOC requests aren't guaranteed to come in
+			 * disk order so this is a necessary evil.
+			 *
+			 * If the input file can't be seeked we're at the mercy of the
+			 * file's TOC layout on disk. An out-of-order restore request will
+			 * halt the restore.
+			 */
+			if (fseeko(AH->FH, ctx->dataStart, SEEK_SET) != 0)
+				fatal("error during file seek: %m");
+		}
+
 		_readBlockHeader(AH, &blkType, &id);
 
 		while (blkType != EOF && id != te->dumpId)
-- 
2.26.2
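
Illustrative note (not part of the patch): the effect of rewinding to the
start of the data area before scanning block headers can be sketched in a
small standalone program. Everything below is hypothetical and only loosely
modelled on the hunk above: BlockHeader, write_block, find_block and demo are
invented names, the on-disk layout is not pg_dump's custom format, and plain
fseek stands in for the fseeko call used in the patch.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical block header: an id followed by the payload length. */
typedef struct
{
	int32_t		id;
	int32_t		len;
} BlockHeader;

/* Append one block (header plus len filler bytes) at the current position. */
int
write_block(FILE *fp, int32_t id, int32_t len)
{
	BlockHeader h = {id, len};

	if (fwrite(&h, sizeof h, 1, fp) != 1)
		return -1;
	for (int32_t i = 0; i < len; i++)
		if (fputc('x', fp) == EOF)
			return -1;
	return 0;
}

/*
 * Look for the block with the wanted id.  If rewind_first is nonzero, seek
 * back to data_start before scanning (the idea the patch applies to seekable
 * input); otherwise scan forward from the current position only.
 * Returns the payload length, or -1 if the id was not found.  On success the
 * file position is left just past the block's payload, as if it had been read.
 */
long
find_block(FILE *fp, long data_start, int rewind_first, int32_t want)
{
	BlockHeader h;

	if (rewind_first && fseek(fp, data_start, SEEK_SET) != 0)
		return -1;

	while (fread(&h, sizeof h, 1, fp) == 1)
	{
		/* Step over the payload whether or not this is the wanted block. */
		if (fseek(fp, h.len, SEEK_CUR) != 0)
			return -1;
		if (h.id == want)
			return h.len;
	}
	return -1;
}

/*
 * Write blocks 1, 2, 3 to a temporary file, then request ids 3 and 1 in that
 * (out-of-disk-order) sequence.  Returns how many requests were satisfied,
 * or -1 if the temporary file could not be set up.
 */
int
demo(int rewind_first)
{
	FILE	   *fp = tmpfile();
	long		data_start;
	int			found = 0;

	if (fp == NULL)
		return -1;

	fputs("HDR", fp);			/* stand-in for an archive header */
	data_start = ftell(fp);
	if (write_block(fp, 1, 4) != 0 ||
		write_block(fp, 2, 0) != 0 ||
		write_block(fp, 3, 7) != 0 ||
		fseek(fp, data_start, SEEK_SET) != 0)
	{
		fclose(fp);
		return -1;
	}

	if (find_block(fp, data_start, rewind_first, 3) >= 0)
		found++;
	if (find_block(fp, data_start, rewind_first, 1) >= 0)
		found++;

	fclose(fp);
	return found;
}
```

In this sketch, demo(1) satisfies both requests because each scan restarts at
data_start, while demo(0) satisfies only the first one: after block 3 has been
consumed the scan is already at end of file, so the later request for block 1
cannot be found. That mirrors the unseekable-input case described in the code
comment, where an out-of-order request halts the restore.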