Re: pg_verifybackup: TAR format backup verification - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: pg_verifybackup: TAR format backup verification |
Date | |
Msg-id | CA+TgmoZ-fwQ0aWUs_40oXkEpF7JAn4D9SnKj8dj=qGLtO0VSRw@mail.gmail.com Whole thread Raw |
In response to | pg_verifybackup: TAR format backup verification (Amul Sul <sulamul@gmail.com>) |
Responses |
Re: pg_verifybackup: TAR format backup verification
|
List | pgsql-hackers |
I would rather that you didn't add simple_ptr_list_destroy_deep() given that you don't need it for this patch series. + "\"%s\" unexpected file in the tar format backup", This doesn't seem grammatical to me. Perhaps change this to: file \"%s\" is not expected in a tar format backup + /* We are only interested in files that are not in the ignore list. */ + if (member->is_directory || member->is_link || + should_ignore_relpath(mystreamer->context, member->pathname)) + return; Doesn't this need to happen after we add pg_tblspc/$OID to the path, rather than before? I bet this doesn't work correctly for files in user-defined tablespaces, compared to the way it work for a directory-format backup. + char temp[MAXPGPATH]; + + /* Copy original name at temporary space */ + memcpy(temp, member->pathname, MAXPGPATH); + + snprintf(member->pathname, MAXPGPATH, "%s/%d/%s", + "pg_tblspc", mystreamer->tblspc_oid, temp); I don't like this at all. This function doesn't have any business modifying the astreamer_member, and it doesn't need to. It can just do char *pathname; char tmppathbuf[MAXPGPATH] and then set pathname to either member->pathname or tmppathbuf depending on OidIsValid(tblspcoid). Also, shouldn't this be using %u, not %d? + mystreamer->mfile = (void *) m; Either the cast to void * isn't necessary, or it indicates that there's a type mismatch that should be fixed. + * We could have these checks while receiving contents. However, since + * contents are received in multiple iterations, this would result in + * these lengthy checks being performed multiple times. Instead, setting + * flags here and using them before proceeding with verification will be + * more efficient. Seems unnecessary to explain this. + Assert(mystreamer->verify_checksum); + + /* Should have came for the right file */ + Assert(strcmp(member->pathname, m->pathname) == 0); + + /* + * The checksum context should match the type noted in the backup + * manifest. + */ + Assert(checksum_ctx->type == m->checksum_type); What do you think about: Assert(m != NULL && !m->bad); Assert(checksum_ctx->type == m->checksum_type); Assert(strcmp(member->pathname, m->pathname) == 0); Or possibly change the first one to Assert(should_verify_checksum(m))? + memcpy(&control_file, streamer->bbs_buffer.data, sizeof(ControlFileData)); This probably doesn't really hurt anything, but it's a bit ugly. You first use astreamer_buffer_until() to force the entire file into a buffer. And then here, you copy the entire file into a second buffer which is exactly the same except that it's guaranteed to be properly aligned. It would be possible to include a ControlFileData in astreamer_verify and copy the bytes directly into it (you'd need a second astreamer_verify field for the number of bytes already copied into that structure). I'm not 100% sure that's worth the code, but it seems like it wouldn't take more than a few lines, so perhaps it is. +/* + * Verify plain backup. + */ +static void +verify_plain_backup(verifier_context *context) +{ + Assert(context->format == 'p'); + verify_backup_directory(context, NULL, context->backup_directory); +} + This seems like a waste of space. +verify_tar_backup(verifier_context *context) I like this a lot better now! I'm still not quite sure about the decision to have the ignore list apply to both the backup directory and the tar file contents -- but given the low participation on this thread, I don't think we have much chance of getting feedback from anyone else right now, so let's just do it the way you have it and we can change it later if someone shows up to complain. +verify_all_tar_files(verifier_context *context, SimplePtrList *tarfiles) I think this code could be moved into its only caller instead of having a separate function. And then if you do that, maybe verify_one_tar_file could be renamed to just verify_tar_file. Or perhaps that function could also be removed and just move the code into the caller. It's not much code and not very deeply nested. Similarly create_archive_verifier() could be considered for this treatment. Maybe inlining all of these is too much and the result will look messy, but I think we should at least try to combine some of them. ...Robert
pgsql-hackers by date: