Re: pg_verifybackup: TAR format backup verification - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: pg_verifybackup: TAR format backup verification |
Date | |
Msg-id | CA+Tgmobs-gbRUMGPpzVZnPPP=cgeEO7hnzV2yKmM0BQ=MGQSMg@mail.gmail.com Whole thread Raw |
In response to | Re: pg_verifybackup: TAR format backup verification (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
+ pg_log_error("pg_waldump cannot read from a tar"); "tar" isn't normally used as a noun as you do here, so I think this should say "pg_waldump cannot read tar files". Technically, the position of this check could lead to an unnecessary failure, if -n wasn't specified but pg_wal.tar(.whatever) also doesn't exist on disk. But I think it's OK to ignore that case. However, I also notice this change to the TAP tests in a few places: - [ 'pg_verifybackup', $tempdir ], + [ 'pg_verifybackup', '-Fp', $tempdir ], It's not the end of the world to have to make a change like this, but it seems easy to do better. Just postpone the call to find_backup_format() until right after we call parse_manifest_file(). That also means postponing the check mentioned above until right after that, but that's fine: after parse_manifest_file() and then find_backup_format(), you can do if (!no_parse_wal && context.format == 't') { bail out }. + if (stat(path, &sb) == 0) + result = 'p'; + else + { + if (errno != ENOENT) + { + pg_log_error("cannot determine backup format : %m"); + pg_log_error_hint("Try \"%s --help\" for more information.", progname); + exit(1); + } + + /* Otherwise, it is assumed to be a tar format backup. */ + result = 't'; + } This doesn't look good, for a few reasons: 1. It would be clearer to structure this as if (stat(...) == 0) result = 'p'; else if (errno == ENOENT) result = 't'; else { report an error; } instead of the way you have it. 2. "cannot determine backup format" is not an appropriate way to report the failure of stat(). The appropriate message is "could not stat file \"%s\"". 3. It is almost never correct to put a space before a colon in an error message. 4. The hint doesn't look helpful, or necessary. I think you can just delete that. Regarding both point #2 and point #4, I think we should ask ourselves how stat() could realistically fail here. On my system (macOS), the document failed modes for stat() are EACCES (i.e. permission denied), EFAULT (i.e. we have a bug in pg_verifybackup), EIO (I/O Error), ELOOP (symlink loop), ENAMETOOLONG, ENOENT, ENOTDIR, and EOVERFLOW. In none of those cases does it seem likely that specifying the format manually will help anything. Thus, suggesting that the user look at the help, presumably to find --format, is unlikely to solve anything, and telling them that the error happened while trying to determine the backup format isn't really helping anything, either. What the user needs to know is that it was stat() that failed, and the pathname for which it failed. Then they need to sort out whatever problem is causing them to get one of those really weird errors. Aside from the above, 0010 looks pretty reasonable, although I'll probably want to do some wordsmithing on some of the comments at some point. - of the backup. The backup must be stored in the "plain" - format; a "tar" format backup can be checked after extracting it. + of the backup. The backup must be stored in the "plain" or "tar" + format. Verification is supported for <literal>gzip</literal>, + <literal>lz4</literal>, and <literal>zstd</literal> compressed tar backup; + any other compressed format backups can be checked after decompressing them. I don't think that we need to say that the backup must be stored in the plain or tar format, because those are the only backup formats pg_basebackup knows about. Similarly, it doesn't seem help to me to enumerate all the compression algorithms that pg_basebackup supports and say we only support those; what else would a user expect? What I would do is replace the original sentence ("The backup must be stored...") with: The backup may be stored either in the "plain" or the "tar" format; this includes "tar" backups compressed with any algorithm supported by pg_basebackup. However, at present, WAL verification is supported only for plain-format backups. Therefore, if the backup is stored in "tar" format, the <literal>-n, --no-parse-wal<literal> option should be used. + # Add tar backup format option + push @backup, ('-F', 't'); + # Add switch to skip WAL verification. + push @verify, ('-n'); Say why, not what. The second comment should say something like "WAL verification not yet supported for tar-format backups". + "$format backup fails with algorithm \"$algorithm\""); + $primary->command_ok(\@backup, "$format backup ok with algorithm \"$algorithm\""); + ok(-f "$backup_path/backup_manifest", "$format backup manifest exists"); + "verify $format backup with algorithm \"$algorithm\""); Personally I would change "$format" to "$format format" in all of these places, so that we talk about a "tar format backup" or a "plain format backup" instead of a "tar backup" or a "plain backup". + 'skip_on_windows' => 1 I don't understand why 4 of the 7 new tests are skipped on Windows. The existing "skip" message for this parameter says "unix-style permissions not supported on Windows" but that doesn't seem applicable for any of the new cases, and I couldn't find a comment about it, either. + my @files = glob("*"); + system_or_bail($tar, '-cf', "$backup_path/$archive", @files); Why not just system_or_bail($tar, '-cf', "$backup_path/$archive", '.')? Also, instead of having separate entries in the test array to do basically the same thing on Windows, could we just iterate through the test array twice and do everything once for plain format and then a second time for tar format, and do the tests once for each? I don't think that idea QUITE works, because the open_file_fails, open_directory_fails, and search_directory_fails tests are really not applicable to tar format. But we could rename skip_on_windows to tests_file_permissions and skip those both on Windows and for tar format. But aside from that, I don't quite see why it makes sense to, for example, test extra_file for both formats but not extra_tablespace_file, and indeed it seems like an important bit of test coverage. I also feel like we should have tests someplace that add extra files to a tar-format backup in the backup directory (e.g. 1234567.tar, or wug.tar, or 123456.wug) or remove entire files. ...Robert
pgsql-hackers by date: