Re: pgsql: Add TAP tests for pg_verify_checksums - Mailing list pgsql-hackers

From Andres Freund
Subject Re: pgsql: Add TAP tests for pg_verify_checksums
Date
Msg-id 20181019200723.4byzaw5km5jvsxlc@alap3.anarazel.de
Whole thread Raw
In response to Re: pgsql: Add TAP tests for pg_verify_checksums  (Stephen Frost <sfrost@snowman.net>)
Responses Re: pgsql: Add TAP tests for pg_verify_checksums  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On 2018-10-19 15:57:28 -0400, Stephen Frost wrote:
> > > > > Of course- the same is true with a crash/restart case, so I'm not sure
> > > > > what you're getting at here.
> > > > 
> > > > pg_verify_checksum doesn't support running on a crashed cluster, and I'm
> > > > not sure it'd make sense to teach it to so (there's not really much it
> > > > could do to discern whether a block is a torn page that replay will fix,
> > > > or corrupted).
> > > 
> > > ... and this isn't at all relevant, because pg_basebackup does run on a
> > > running cluster.
> > 
> > I wasn't ever denying that or anything close to it?  My point is that
> > pg_verify_checksum needs much more filtering than it has now to deal
> > with that, because it needs to handle all files that could be present,
> > not just files that could be present after a graceful shutdown.
> 
> Perhaps it doesn't today but surely one goal of pg_verify_checksum is to
> be able to run it on a running cluster eventually.

I was saying *precisely* that above.  I give up.


> > pg_basebackup's case is *not* really comparable because basebackup.c
> > already performed a lot of filtering before noChecksumFiles is applied.
> 
> This all really just points out that we should have the code for
> handling this somewhere common that both pg_verify_checksum and
> pg_basebackup can leverage without duplicating all of it.

I never argued against that.  My point is that your argument that they
started out the same isn't true.


> The specific case that started all of this certainly looks pretty
> clearly like a case that both need to deal with.

Yep.


> > > As pointed out elsewhere on this thread, the logic was the same between
> > > the two before this commit...  The code in pg_basebackup came from the
> > > prior pg_verify_checksums code.  Certainly, some mention of the code
> > > existing in two places, at least, should have been in the comments.
> > 
> > But the filter for basebackup only comes after the pre-existing
> > filtering that the basebackup.c code already does. All of:
> > 
> > /*
> >  * List of files excluded from backups.
> >  */
> > static const char *excludeFiles[] =
> > {
> >     /* Skip auto conf temporary file. */
> >     PG_AUTOCONF_FILENAME ".tmp",
> > 
> >     /* Skip current log file temporary file */
> >     LOG_METAINFO_DATAFILE_TMP,
> > 
> >     /* Skip relation cache because it is rebuilt on startup */
> >     RELCACHE_INIT_FILENAME,
> > 
> >     /*
> >      * If there's a backup_label or tablespace_map file, it belongs to a
> >      * backup started by the user with pg_start_backup().  It is *not* correct
> >      * for this backup.  Our backup_label/tablespace_map is injected into the
> >      * tar separately.
> >      */
> >     BACKUP_LABEL_FILE,
> >     TABLESPACE_MAP,
> > 
> >     "postmaster.pid",
> >     "postmaster.opts",
> > 
> >     /* end of list */
> >     NULL
> > };
> > 
> > is not applied, for example.  Nor is:
> > 
> >         /* Skip temporary files */
> >         if (strncmp(de->d_name,
> >                     PG_TEMP_FILE_PREFIX,
> >                     strlen(PG_TEMP_FILE_PREFIX)) == 0)
> >             continue;
> > 
> > Nor is
> >         /* Exclude temporary relations */
> >         if (isDbDir && looks_like_temp_rel_name(de->d_name))
> >         {
> >             elog(DEBUG2,
> >                  "temporary relation file \"%s\" excluded from backup",
> >                  de->d_name);
> > 
> >             continue;
> >         }
> > 
> > So, yea, they had:
> > 
> > static const char *const skip[] = {
> >     "pg_control",
> >     "pg_filenode.map",
> >     "pg_internal.init",
> >     "PG_VERSION",
> >     NULL,
> > };
> > 
> > in common, but not more.  All the above exclusions are strictly
> > necessary.
> 
> ... but all of that code doesn't change that pg_basebackup also needs to
> ignore the EXEC_BACKEND created config_exec_params/.new files.

Right.


> You're right, pg_verify_checksums, with the assumption that it only runs
> on a cleanly shut-down cluster, doesn't need the temp-file-skipping
> logic, today, but it's going to need it tomorrow, isn't it?

No, it needs it today, as explain below in the email you're replaying
to.


> > FWIW, as far as I can tell, pg_verify_checksum appears to be broken in a
> > lot of (unfortunately) pretty normal scenarios right now. Not just on
> > windows. Besides the narrow window of crashing while a .tmp file is
> > present (and then shutting down normally after a crash restart), it also
> > has the much of wider window of crashing while *any* backend has
> > temporary files/relations.  As crash-restarts don't release temp files,
> > they'll still be present after the crash, and a single graceful shutdown
> > afterwards will leave them in place. No?
> 
> We do go through and do some cleanup at crash/restart

We don't clean up temp files tho.

 * NOTE: we could, but don't, call this during a post-backend-crash restart
 * cycle.  The argument for not doing it is that someone might want to examine
 * the temp files for debugging purposes.  This does however mean that
 * OpenTemporaryFile had better allow for collision with an existing temp
 * file name.


> As you say, some of those are likely to have checksums, which should be
> handled by pg_basebackup and pg_verify_checksums, and that goes back to
> the point I was making up-thread that we want to make sure an account
> for everything.  With this pattern-based approach, we could easily end
> up forgetting to add the correct new pattern into pg_verify_checksums.

If you're adding checksums for something, you better test it I don't buy
this.  In contrast it's much more likely that there's a file that's
short-lived that you won't easily test against pg_verify_checksum
running in that moment.


> Playing this further and assuming that extensions dropping files into
> tablespace directories is acceptable, what are we supposed to do when
> there's some direct conflict between a file that we want to create in a
> PG tablespace directory and a file that some extension dropped there?
> Create yet another subdirectory which we call
> "THIS_IS_REALLY_ONLY_FOR_PG"?

Then it's a buggy extension. And we error out.


> What about two different extensions wanting to create files with the
> same names in the tablespace directories..?
> 
> Experimentation is fine, of course, this is open source code and people
> should feel free to play with it, but we are not obligated to avoid
> breaking things when an extension author, through their experimentation,
> does something which is clearly not supported, like dropping files into
> PG's tablespace directories.  Further, when it comes to our user's data,
> we should be taking a strict approach and accounting for everything,
> something that this whitelist-of-patterns-based approach to finding
> files to verify the checksums on doesn't do.

It's not economically feasible to work on extensions that will only be
usable a year down the road.

Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: pgsql: Add TAP tests for pg_verify_checksums
Next
From: Stephen Frost
Date:
Subject: Re: pgsql: Add TAP tests for pg_verify_checksums