Thread: Re: pgsql: Add TAP tests for pg_verify_checksums
On Fri, Oct 12, 2018 at 12:17:57AM +0000, Michael Paquier wrote:
> Add TAP tests for pg_verify_checksums
>
> All options available in the utility get coverage:
> - Tests with disabled page checksums.
> - Tests with enabled page checksums.
> - Emulation of corruption and broken checksums with a full scan and
>   single relfilenode scan.

culicidae is failing after this commit in an interesting way, so we really needed some tests for this utility:
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=culicidae&br=HEAD

ok 7 - fails with corrupted data status (got 1 vs expected 1)
not ok 8 - fails with corrupted data stdout /(?^:Bad checksums:.*1)/

#   Failed test 'fails with corrupted data stdout /(?^:Bad checksums:.*1)/'
#   at t/002_actions.pl line 57.
#                   ''
#     doesn't match '(?^:Bad checksums:.*1)'
not ok 9 - fails with corrupted data stderr /(?^:checksum verification failed)/

So the checksum failure is happening, but the output is not present. What's not clear to me is that all the other hosts running Debian SID don't complain, and that the follow-up test on a single relfilenode is able to pass with a test much similar to the previous one. Is the issue that we should just call fflush() on stderr and stdout at the end of main() in pg_verify_checksums.c? Shouldn't we back-patch that?
--
Michael
Hi,

On 2018-10-12 09:56:14 +0900, Michael Paquier wrote:
> On Fri, Oct 12, 2018 at 12:17:57AM +0000, Michael Paquier wrote:
>> Add TAP tests for pg_verify_checksums
>> [snip]
>
> culicidae is failing after this commit in an interesting way, so we
> really needed some tests for this utility:
> https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=culicidae&br=HEAD
>
> ok 7 - fails with corrupted data status (got 1 vs expected 1)
> not ok 8 - fails with corrupted data stdout /(?^:Bad checksums:.*1)/
>
> #   Failed test 'fails with corrupted data stdout /(?^:Bad checksums:.*1)/'
> #   at t/002_actions.pl line 57.
> #                   ''
> #     doesn't match '(?^:Bad checksums:.*1)'
> not ok 9 - fails with corrupted data stderr /(?^:checksum verification failed)/
>
> So the checksum failure is happening, but the output is not present.
> What's not clear to me is that all the other hosts running Debian SID
> don't complain, and that the follow-up test on a single relfilenode is
> able to pass with a test much similar to the previous one.

culicidae tests EXEC_BACKEND, so there's an explanation as to why it sometimes behaves differently. But here I don't immediately see how that'd matter. Probably still worth verifying that it's not somehow caused by that.

> Is the issue that we should just call fflush() on stderr and stdout at
> the end of main() in pg_verify_checksums.c? Shouldn't we back-patch that?

I don't see how that would change anything. A "missing" fflush() isn't a licence to simply throw away buffer contents, so I doubt it's just that.

Greetings,

Andres Freund
On Thu, Oct 11, 2018 at 06:04:11PM -0700, Andres Freund wrote:
> culicidae tests EXEC_BACKEND, so there's an explanation as to why it
> sometimes behaves differently. But here I don't immediately see how
> that'd matter. Probably still worth verifying that it's not somehow
> caused by that.

Thanks, that's the point of detail I needed about culicidae (you will need to explain to me one day, face-to-face, how you pronounce it). I have been able to reproduce the problem, and that's a bug within pg_verify_checksums, as it fails to consider that config_exec_params is not a file it should scan when using EXEC_BACKEND. The same can happen with config_exec_params.new. The attached, which fixes the issue for me, needs to be back-patched to v11.
--
Michael
Hi,

On 2018-10-12 10:39:18 +0900, Michael Paquier wrote:
> On Thu, Oct 11, 2018 at 06:04:11PM -0700, Andres Freund wrote:
>> culicidae tests EXEC_BACKEND, so there's an explanation as to why it
>> sometimes behaves differently. But here I don't immediately see how
>> that'd matter. Probably still worth verifying that it's not somehow
>> caused by that.
>
> Thanks, that's the point of detail I needed about culicidae

Note that that's also mentioned on the BF page when you hover over the yellow stick-it symbol.

> (you will need to explain to me one day, face-to-face, how you
> pronounce it).

I didn't choose it ;)

> I have been able to reproduce the problem, and that's a bug within
> pg_verify_checksums, as it fails to consider that config_exec_params is
> not a file it should scan when using EXEC_BACKEND. The same can
> happen with config_exec_params.new.

Ugh, so it's actively borked on Windows right now?

> The attached, which fixes the issue for me, needs to be back-patched to
> v11.

Looks consistent with what we have right now. But:

> diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c b/src/bin/pg_verify_checksums/pg_verify_checksums.c
> index 1bc020ab6c..a6c884f149 100644
> --- a/src/bin/pg_verify_checksums/pg_verify_checksums.c
> +++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c
> @@ -54,6 +54,10 @@ static const char *const skip[] = {
>  	"pg_filenode.map",
>  	"pg_internal.init",
>  	"PG_VERSION",
> +#ifdef EXEC_BACKEND
> +	"config_exec_params",
> +	"config_exec_params.new",
> +#endif
>  	NULL,
>  };

Hm. Maybe I'm confused, but how is it correct to use a blacklisting approach here? It's far from absurd to have files inside a tablespace that aren't written through the buffer manager. Like, uh, temp files, when using temp_tablespaces. But even leaving that aside, there are a few extensions that place additional files into the tablespace directories (and we don't give them an alternative solution).

I think at the very least you're going to need to pattern-match to relation-looking files.

Greetings,

Andres Freund
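[Editor's note: for context, a minimal standalone sketch of how a skip list like the `skip[]` array in the diff above is typically consulted. The array entries come from the patch; the lookup loop itself is an assumption for illustration, not the actual pg_verify_checksums code.]

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Entries taken from the patch hunk above (non-EXEC_BACKEND case). */
static const char *const skip[] = {
	"pg_filenode.map",
	"pg_internal.init",
	"PG_VERSION",
	NULL,
};

/*
 * Hypothetical lookup: return true if "fn" matches any blacklist entry,
 * meaning the file should not be scanned for checksums.
 */
static bool
skipfile(const char *fn)
{
	const char *const *f;

	for (f = skip; *f != NULL; f++)
		if (strcmp(*f, fn) == 0)
			return true;
	return false;
}
```

The weakness Andres points out follows directly from this shape: any file name not explicitly listed (a leaked temp file, an extension's file) falls through to `return false` and gets scanned as if it were relation data.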
On Thu, Oct 11, 2018 at 06:53:19PM -0700, Andres Freund wrote:
> On 2018-10-12 10:39:18 +0900, Michael Paquier wrote:
>> I have been able to reproduce the problem, and that's a bug within
>> pg_verify_checksums, as it fails to consider that config_exec_params is
>> not a file it should scan when using EXEC_BACKEND. The same can
>> happen with config_exec_params.new.
>
> Ugh, so it's actively borked on Windows right now?

It looks so. Automated testing proves to be useful sometimes.

> Hm. Maybe I'm confused, but how is it correct to use a blacklisting
> approach here? It's far from absurd to have files inside a tablespace
> that aren't written through the buffer manager. Like, uh, temp files,
> when using temp_tablespaces.

pg_verify_checksums assumes that the cluster has been cleanly shut down, so temp files are not an issue, no? However...

> But even leaving that aside, there are a few extensions that place
> additional files into the tablespace directories (and we don't give
> them an alternative solution).

This is a good argument. I have not played much with such extensions myself though.

> I think at the very least you're going to need to pattern-match to
> relation-looking files.

Yes, it seems so. A whitelist approach would consist of checking for relfilenodes, VMs and FSMs as files authorized for scanning, in a way rather similar to what looks_like_temp_rel_name() does for temp files...
--
Michael
Hi,

On 2018-10-12 11:07:58 +0900, Michael Paquier wrote:
> On Thu, Oct 11, 2018 at 06:53:19PM -0700, Andres Freund wrote:
>> Hm. Maybe I'm confused, but how is it correct to use a blacklisting
>> approach here? It's far from absurd to have files inside a tablespace
>> that aren't written through the buffer manager. Like, uh, temp files,
>> when using temp_tablespaces.
>
> pg_verify_checksums assumes that the cluster has been cleanly shut
> down, so temp files are not an issue, no? However...

We only remove temp dirs on startup, and I'm pretty sure that, at least until not too long ago, there were a few error cases where we could leak temp files in individual sessions. And there's talk about allowing pg_verify_checksums to run while the cluster is online. So this seems to be an angle that needs to be thought about...

Greetings,

Andres Freund
On Thu, Oct 11, 2018 at 07:41:18PM -0700, Andres Freund wrote:
> We only remove temp dirs on startup, and I'm pretty sure that, at least
> until not too long ago, there were a few error cases where we could leak
> temp files in individual sessions. And there's talk about allowing
> pg_verify_checksums to run while the cluster is online. So this seems
> to be an angle that needs to be thought about...

Agreed. I am just working on a patch for v11- which uses a whitelist-based method instead of what is present now. Reverting the tests to put the buildfarm back to green could be done, but that's not the root of the problem.
--
Michael
On Fri, Oct 12, 2018 at 12:11:58PM +0900, Michael Paquier wrote:
> Agreed. I am just working on a patch for v11- which uses a
> whitelist-based method instead of what is present now. Reverting the
> tests to put the buildfarm back to green could be done, but that's not
> the root of the problem.

So, I have coded this thing, and finished with the attached. The following file patterns are accepted for the checksum scan:
- <digits>.<segment>
- <digits>_<forkname>
- <digits>_<forkname>.<segment>

I have added some tests on the way to make sure that all the patterns get covered. Please note that this skips the temporary files. Thoughts?
--
Michael
Hi,

On Fri, Oct 12, 2018 at 03:05:43PM +0900, Michael Paquier wrote:
> On Fri, Oct 12, 2018 at 12:11:58PM +0900, Michael Paquier wrote:
>> Agreed. I am just working on a patch for v11- which uses a
>> whitelist-based method instead of what is present now. Reverting the
>> tests to put the buildfarm back to green could be done, but that's not
>> the root of the problem.

I think that's the right solution; the online verification patch adds even more logic to the blacklist, so getting rid of it in favor of a whitelist is +1 with me.

> So, I have coded this thing, and finished with the attached. The
> following file patterns are accepted for the checksum scan:
> - <digits>.<segment>
> - <digits>_<forkname>
> - <digits>_<forkname>.<segment>
>
> I have added some tests on the way to make sure that all the patterns
> get covered. Please note that this skips the temporary files.

Maybe also add some correct (to be scanned) but non-empty garbage files later on that it should barf on?


Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
On Fri, Oct 12, 2018 at 12:14:48PM +0200, Michael Banck wrote:
> I think that's the right solution; the online verification patch adds
> even more logic to the blacklist, so getting rid of it in favor of a
> whitelist is +1 with me.

Thanks Michael for the input!

> Maybe also add some correct (to be scanned) but non-empty garbage files
> later on that it should barf on?

I was not sure about doing that in the main patch, so I tweaked the test manually to make sure that the tool still complained with "could not read block" as it should. That's easy enough to add, so I'll add them with multiple file patterns. Those are cheap checks as well if they are placed in global/.

Another problem the patch has is that it is not using forkname_to_number() to scan for all the fork types, and I forgot init forks in the previous version. Using forkname_to_number() also makes the tool more bug-proof, and it is not complicated to plug into the existing patch.

Anyway, I have a bit of a problem here, which prevents me from staying in front of a computer or looking at a screen for more than a couple of minutes in a row for a couple of days at least, and I don't like keeping the buildfarm unhappy in the meantime. There are three options:
1) Revert the TAP tests of pg_verify_checksums.
2) Push the patch which adds new entries for EXEC_BACKEND files in the
   skip list. That's a short-term workaround, and it would allow default
   deployments of Postgres to use the tool.
3) Finish the patch with the whitelist approach.

I can do 1) or 2) in my condition. 3) requires more work than I can do now, though the patch is getting into shape, so the buildfarm would stay unhappy. Any preference on the course of action to take?
--
Michael
On 10/13/2018 04:30 AM, Michael Paquier wrote:
> [snip]
>
> Anyway, I have a bit of a problem here, which prevents me from staying
> in front of a computer or looking at a screen for more than a couple of
> minutes in a row for a couple of days at least, and I don't like keeping
> the buildfarm unhappy in the meantime. There are three options:
> 1) Revert the TAP tests of pg_verify_checksums.
> 2) Push the patch which adds new entries for EXEC_BACKEND files in the
>    skip list. That's a short-term workaround, and it would allow default
>    deployments of Postgres to use the tool.
> 3) Finish the patch with the whitelist approach.
>
> I can do 1) or 2) in my condition. 3) requires more work than I can do
> now, though the patch is getting into shape, so the buildfarm would
> stay unhappy. Any preference on the course of action to take?

I have disabled the test temporarily on my two animals, since I want to make sure they are working OK with other changes, and we know what the problem is. Andres might want to do that with his animal also; just add "--skip-steps=pg_verify_checksums-check" to the command line.

If you want to throw what you have for 3) over the wall to me, I can see if I can finish it.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/13/2018 10:00 AM, Andrew Dunstan wrote:
> [snip]
>
> I have disabled the test temporarily on my two animals, since I want to
> make sure they are working OK with other changes, and we know what the
> problem is. Andres might want to do that with his animal also; just add
> "--skip-steps=pg_verify_checksums-check" to the command line.
>
> If you want to throw what you have for 3) over the wall to me, I can see
> if I can finish it.

It occurred to me that a pretty simple fix could just be to blacklist everything that doesn't start with a digit. The whitelist approach is probably preferable ... depends how urgent we see this as.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Oct 13, 2018 at 05:53:00PM -0400, Andrew Dunstan wrote:
> It occurred to me that a pretty simple fix could just be to blacklist
> everything that doesn't start with a digit. The whitelist approach is
> probably preferable ... depends how urgent we see this as.

Yeah, possibly, but that's still not a correct long-term fix. So attached is what I think we should do for HEAD and REL_11_STABLE with an approach using a whitelist. I have added positive and negative tests on top of the existing TAP tests, as suggested by Michael B, and I made the code use relpath.h to make sure that we don't miss any fork types.

Any objections to this patch?
--
Michael
On 10/13/2018 09:59 PM, Michael Paquier wrote:
> Yeah, possibly, but that's still not a correct long-term fix. So
> attached is what I think we should do for HEAD and REL_11_STABLE with
> an approach using a whitelist. I have added positive and negative tests
> on top of the existing TAP tests, as suggested by Michael B, and I made
> the code use relpath.h to make sure that we don't miss any fork types.
>
> Any objections to this patch?

This code now seems redundant:

	if (strcmp(fn, ".") == 0 ||
	    strcmp(fn, "..") == 0)
		return true;

I would probably reverse the order of these two tests. It might not make any difference, assuming fn is never an empty string, but it seems more logical to me:

+	/* good to go if only digits */
+	if (fn[pos] == '\0')
+		return false;
+	/* leave if no digits */
+	if (pos == 0)
+		return true;

It also looks to me like the check for a segment number doesn't ensure there is at least one digit, so "123." might pass, but I could be wrong. In any case, there isn't a test for that, and there probably should be.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Oct 14, 2018 at 10:24:55AM -0400, Andrew Dunstan wrote:
> [snip]

Thanks a lot for the review, Andrew!

> This code now seems redundant:
>
>	if (strcmp(fn, ".") == 0 ||
>	    strcmp(fn, "..") == 0)
>		return true;

Indeed. I have renamed skipfile() to isRelFileName() on the way, which looks better in the context.

> I would probably reverse the order of these two tests. It might not
> make any difference, assuming fn is never an empty string, but it seems
> more logical to me.

No objections to that. Done.

> It also looks to me like the check for a segment number doesn't ensure
> there is at least one digit, so "123." might pass, but I could be
> wrong. In any case, there isn't a test for that, and there probably
> should be.

You are right here. This name pattern has been failing. Fixed in the attached. I have added "123_" and "123_." as extra patterns to check.

What do you think about the updated version attached?
--
Michael
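[Editor's note: the whitelist semantics settled on in this exchange — leading digits, an optional `_<forkname>` suffix, and an optional `.<segment>` suffix that must contain at least one digit — can be sketched as a standalone function. This is a hypothetical illustration written for this note, not the code from the committed patch; the fork names are assumed to be the standard fsm/vm/init ones.]

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true only if "fn" looks like a relation data file name:
 *   <digits>
 *   <digits>.<segment>
 *   <digits>_<forkname>
 *   <digits>_<forkname>.<segment>
 */
static bool
looks_like_rel_name(const char *fn)
{
	int			pos = 0;
	int			i;
	static const char *const forks[] = {"fsm", "vm", "init", NULL};

	/* leading digits: the relfilenode */
	while (isdigit((unsigned char) fn[pos]))
		pos++;
	if (pos == 0)
		return false;			/* must start with at least one digit */

	/* optional fork suffix: _<forkname> */
	if (fn[pos] == '_')
	{
		pos++;
		for (i = 0; forks[i] != NULL; i++)
		{
			size_t		len = strlen(forks[i]);

			if (strncmp(fn + pos, forks[i], len) == 0 &&
				(fn[pos + len] == '\0' || fn[pos + len] == '.'))
			{
				pos += len;
				break;
			}
		}
		if (forks[i] == NULL)
			return false;		/* unknown fork name, e.g. "123_" */
	}

	/* optional segment suffix: .<digits>, with at least one digit */
	if (fn[pos] == '.')
	{
		int			segstart = ++pos;

		while (isdigit((unsigned char) fn[pos]))
			pos++;
		if (pos == segstart)
			return false;		/* "123." has no segment digits */
	}

	return fn[pos] == '\0';
}
```

Note how the segment check rejects exactly the "123." case Andrew raised, and the fork-name check rejects the "123_" and "123_." patterns added to the tests.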
On 10/14/2018 06:41 PM, Michael Paquier wrote:
> [snip]
>
> What do you think about the updated version attached?

Looks good to me.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/14/2018 09:11 PM, Andrew Dunstan wrote:
>> What do you think about the updated version attached?
>
> Looks good to me.

We should fix this in PG11 rather than ship a broken utility. If everyone is happy, I can apply this.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> We should fix this in PG11 rather than ship a broken utility. If
> everyone is happy, I can apply this.

At this point I'd be happier if you waited till after 11.0 wraps ... there is no margin for error today.

			regards, tom lane
On 10/15/2018 11:05 AM, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> We should fix this in PG11 rather than ship a broken utility. If
>> everyone is happy, I can apply this.
>
> At this point I'd be happier if you waited till after 11.0 wraps ...
> there is no margin for error today.

OK. In that case should we note that the utility is broken on Windows?

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Oct 15, 2018 at 11:20:28AM -0400, Andrew Dunstan wrote:
> On 10/15/2018 11:05 AM, Tom Lane wrote:
>> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>>> We should fix this in PG11 rather than ship a broken utility. If
>>> everyone is happy, I can apply this.
>>
>> At this point I'd be happier if you waited till after 11.0 wraps ...
>> there is no margin for error today.

If there is no urgency for this week, could you let me wrap this patch then? I cannot do that this week, but the beginning of next week will be fine per the current odds.

> OK. In that case should we note that the utility is broken on Windows?

Not sure if that's worth adding in the documentation; something in the press release would be more adapted?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
> On Mon, Oct 15, 2018 at 11:20:28AM -0400, Andrew Dunstan wrote:
>> OK. In that case should we note that the utility is broken on Windows?
>
> Not sure if that's worth adding in the documentation; something in the
> press release would be more adapted?

I can't get too excited about it. I can guarantee you that there are worse bugs than this one in 11.0. I would not be too surprised if we know of some by next week. So I don't see much point in loudly advertising this particular bug ...

			regards, tom lane
On October 16, 2018 3:47:55 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> On Mon, Oct 15, 2018 at 11:20:28AM -0400, Andrew Dunstan wrote:
>>> OK. In that case should we note that the utility is broken on Windows?
>>
>> Not sure if that's worth adding in the documentation; something in the
>> press release would be more adapted?
>
> I can't get too excited about it. I can guarantee you that there are
> worse bugs than this one in 11.0. I would not be too surprised if we
> know of some by next week. So I don't see much point in loudly
> advertising this particular bug ...

Same.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On 10/16/2018 06:09 PM, Michael Paquier wrote:
> If there is no urgency for this week, could you let me wrap this patch
> then? I cannot do that this week, but the beginning of next week will
> be fine per the current odds.

Fine by me.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote:
> Fine by me.

Thanks. This is now committed after some tweaks to the comments, a bit earlier than I first thought.
--
Michael
Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:
> On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote:
>> Fine by me.
>
> Thanks. This is now committed after some tweaks to the comments, a bit
> earlier than I first thought.

I just saw this committed, and I'm trying to figure out why we are creating yet another list when it comes to deciding what should be checksummed and what shouldn't be.

Specifically, pg_basebackup (or, really, src/backend/replication/basebackup.c) has 'is_checksummed_file', which operates differently from pg_verify_checksums with this change, and that seems rather wrong. Maybe we need to fix both places, but I *really* don't like this approach of "well, we'll just guess if the file should have a checksum or not", and it certainly seems likely that we'll end up forgetting to update this when we introduce things in the future which have checksums (which seems pretty darn likely to happen).

I also categorically disagree with the notion that it's ok for extensions to dump files into our tablespace directories, or that we should try to work around random code dumping extra files into the directories which we maintain; it's not like this additional code will actually protect us from that, after all, and it's foolish to think we really could protect against that.

Basically, I think this commit should be reverted, we should go back to having a blacklist, update it in both places (and add comments to both places to make it clear that this list exists in two different places), and add code to handle the temp tablespace case explicitly. Even better would be to find a way to expose the list and the code for walking the directories and identifying the files which contain checksums, so that we're only doing that in one place instead of multiple different places.

Thanks!

Stephen
Hi,

On Friday, 2018-10-19 at 10:36 -0400, Stephen Frost wrote:
> I just saw this committed, and I'm trying to figure out why we are
> creating yet another list when it comes to deciding what should be
> checksummed and what shouldn't be.
>
> Specifically, pg_basebackup (or, really,
> src/backend/replication/basebackup.c) has 'is_checksummed_file', which
> operates differently from pg_verify_checksums with this change, and
> that seems rather wrong.

To be fair, the list in src/backend/replication/basebackup.c was a copy-paste from the one in pg_verify_checksums (or from other parts of the online activation patch). I agree it makes sense to have both in sync or, better yet, factored out in a central place, but I don't currently have further opinions on whether it should be a black- or whitelist.


Michael

--
Michael Banck
Projektleiter / Senior Berater
Michael Banck <michael.banck@credativ.de> writes:
> On Friday, 2018-10-19 at 10:36 -0400, Stephen Frost wrote:
>> I just saw this committed, and I'm trying to figure out why we are
>> creating yet another list when it comes to deciding what should be
>> checksummed and what shouldn't be.
>
> To be fair, the list in src/backend/replication/basebackup.c was a
> copy-paste from the one in pg_verify_checksums (or from other parts of
> the online activation patch).

Seems like a job for a new(?) module in src/common/.

			regards, tom lane
Greetings, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > Michael Banck <michael.banck@credativ.de> writes: > > Am Freitag, den 19.10.2018, 10:36 -0400 schrieb Stephen Frost: > >> I just saw this committed and I'm trying to figure out why we are > >> creating yet-another-list when it comes to deciding what should be > >> checksum'd and what shouldn't be. > > > To be fair, the list in src/backend/replication/basebackup.c was a copy- > > paste from the one in pg_verify_checksums (or from other parts of the > > online activation patch). > > Seems like a job for a new(?) module in src/common/. Yeah, that would be nice... Even nicer would be a way for non-core PG tools to be able to use it (we have basically the same thing in pgbackrest, unsurprisingly), though I realize that's a much larger step. Thanks! Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> Seems like a job for a new(?) module in src/common/. > Yeah, that would be nice... Even nicer would be a way for non-core PG > tools to be able to use it (we have basically the same thing in > pgbackrest, unsurprisingly), though I realize that's a much larger step. We install libpgcommon.a (and, as of HEAD, libpgcommon_shlib.a) for precisely that sort of reason. regards, tom lane
Greetings, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > * Tom Lane (tgl@sss.pgh.pa.us) wrote: > >> Seems like a job for a new(?) module in src/common/. > > > Yeah, that would be nice... Even nicer would be a way for non-core PG > > tools to be able to use it (we have basically the same thing in > > pgbackrest, unsurprisingly), though I realize that's a much larger step. > > We install libpgcommon.a (and, as of HEAD, libpgcommon_shlib.a) for > precisely that sort of reason. Hmm, yeah, we might be able to use that for this.. One challenge we've been thinking about has been dealing with multiple major versions of PG with one build of pgbackrest since we'd really like to do things like inspect the control file, but that changes between versions. We do have multi-version code in some places (quite a bit in pg_dump..) so perhaps we could have libpgcommon support also. Probably a topic for another thread. Thanks! Stephen
Hi, On 2018-10-19 10:36:59 -0400, Stephen Frost wrote: > * Michael Paquier (michael@paquier.xyz) wrote: > > On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote: > > > Fine by me. > > > > Thanks. This is now committed after some tweaks to the comments, a bit > > earlier than I thought first. > > I just saw this committed and I'm trying to figure out why we are > creating yet-another-list when it comes to deciding what should be > checksum'd and what shouldn't be. I'm not sure it's fair to blame Michael here. He's picking up slack, because the original authors of the tool didn't even bother to reply to the issue for quite a while. pg_verify_checksum was obviously never tested on Windows, it just didn't have tests showing that. > I also categorically disagree with the notion that it's ok for > extensions to dump files into our tablespace diretories or that we > should try to work around random code dumping extra files in the > directories which we maintain- it's not like this additional code will > actually protect us from that, after all, and it's foolish to think we > really could protect against that. I'm unconvinced. There already are extensions doing so, and it's not like we've given them any sort of reasonable alternative. You can't just create relations in a "registered" / "supported" kinda way right now, so uh, yea, extension gotta make do. And that has worked for many years. Also, as made obvious here, it's pretty clear that there are platform differences about which files exist and which don't, so it's not that a blacklist approach automatically is more maintainable. And I fail to see why a blacklist is architecturally better. There are plenty of cases where we might want to create temporary files, non-checksummed data or such that we'd need to teach a blacklist about, but there's far fewer cases where we add new checksummed files. Basically never since checksums have been introduced. 
And if checksums were introduced for something new, say SLRUs, we'd certainly use pg_verify_checksum during development. Greetings, Andres Freund
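For concreteness, a whitelist in this spirit only has to describe the one shape of name that ever carries page checksums: a relation segment file. Here is a rough sketch of such a check; it is not the committed code, and the actual check may differ in detail.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Illustrative whitelist test (not the committed code): accept only names
 * shaped like relation segments -- "<digits>", optionally followed by a
 * fork suffix ("_fsm", "_vm", "_init") and/or a segment number
 * (".<digits>").  Everything else is simply not checksummed.
 */
bool
looks_like_rel_filename(const char *fn)
{
	const char *p = fn;

	/* base relfilenode: one or more digits */
	if (!isdigit((unsigned char) *p))
		return false;
	while (isdigit((unsigned char) *p))
		p++;

	/* optional fork suffix */
	if (strncmp(p, "_fsm", 4) == 0)
		p += 4;
	else if (strncmp(p, "_vm", 3) == 0)
		p += 3;
	else if (strncmp(p, "_init", 5) == 0)
		p += 5;

	/* optional segment number */
	if (*p == '.')
	{
		p++;
		if (!isdigit((unsigned char) *p))
			return false;
		while (isdigit((unsigned char) *p))
			p++;
	}

	return *p == '\0';
}
```

The appeal of this shape is that an unexpected file, say extension metadata dropped into a tablespace directory, simply fails the match and is skipped instead of being reported as a checksum failure.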
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2018-10-19 10:36:59 -0400, Stephen Frost wrote: > > * Michael Paquier (michael@paquier.xyz) wrote: > > > On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote: > > > > Fine by me. > > > > > > Thanks. This is now committed after some tweaks to the comments, a bit > > > earlier than I thought first. > > > > I just saw this committed and I'm trying to figure out why we are > > creating yet-another-list when it comes to deciding what should be > > checksum'd and what shouldn't be. > > I'm not sure it's fair to blame Michael here. He's picking up slack, > because the original authors of the tool didn't even bother to reply to > the issue for quite a while. pg_verify_checksum was obviously never > tested on windows, it just didn't have tests showing that. I wasn't really going for 'blame' here, just that we've now got two different methods for doing the same thing- and those will give different answers; presumably this issue impacts pg_basebackup too, since the logic there wasn't changed. > > I also categorically disagree with the notion that it's ok for > > extensions to dump files into our tablespace diretories or that we > > should try to work around random code dumping extra files in the > > directories which we maintain- it's not like this additional code will > > actually protect us from that, after all, and it's foolish to think we > > really could protect against that. > > I'm unconvinced. There already are extensions doing so, and it's not > like we've given them any sort of reasonable alternative. You can't just > create relations in a "registered" / "supported" kinda way right now, so > uh, yea, extension gotta make do. And that has worked for many years. I don't agree that any of this is an argument for accepting random code writing into arbitrary places in the data directory or tablespace directories as being something which we support or which we write code to work-around. 
If this is a capability that extension authors need then I'm all for having a way for them to have that capability in a clearly defined and documented way- and one which we could actually write code to handle and work with. > Also, as made obvious here, it's pretty clear that there's platform > differences about which files exists and which don't, so it's not that a > blacklist approach automatically is more maintainable. Platform differences are certainly something we can manage and we have code in a few different places for doing exactly that. A platform difference is a heck of a lot more reasonable than trying to guess at what random code that we've never seen before has done and try to work around that. > And I fail to see why a blacklist is architecturally better. There's > plenty cases where we might want to create temporary files, > non-checksummed data or such that we'd need a teach a blacklist about, > but there's far fewer cases where add new checksummed files. Basically > never since checksums have been introduced. And if checksums were > introduced for something new, say slrus, we'd ceertainly use > pg_verify_checksum during development. In cases where we need something temporary, we're going to need to be prepared to clean those up and we had better make it very plain that they're temporary and easy to write code for. Anything we aren't prepared to blow away on a crash-restart should be checksum'd and in an ideal world, we'd be looking to reduce the blacklist to only those things which are temporary. Of course, we may need different code for calculating the checksum of different types of files, which moves it from really being a 'blacklist' to a list of file-types and their associated checksum'ing mechanism. Thanks! Stephen
Hi, On 2018-10-19 13:52:28 -0400, Stephen Frost wrote: > > > I also categorically disagree with the notion that it's ok for > > > extensions to dump files into our tablespace diretories or that we > > > should try to work around random code dumping extra files in the > > > directories which we maintain- it's not like this additional code will > > > actually protect us from that, after all, and it's foolish to think we > > > really could protect against that. > > > > I'm unconvinced. There already are extensions doing so, and it's not > > like we've given them any sort of reasonable alternative. You can't just > > create relations in a "registered" / "supported" kinda way right now, so > > uh, yea, extension gotta make do. And that has worked for many years. > > I don't agree that any of this is an argument for accepting random code > writing into arbitrary places in the data directory or tablespace > directories as being something which we support or which we write code > to work-around. > > If this is a capability that extension authors need then I'm all for > having a way for them to have that capability in a clearly defined and > documented way- and one which we could actually write code to handle and > work with. cstore e.g. does this, and it's already out there. So yes, we should provide better infrastructure. But for now, we gotta deal with reality, unless we just want to gratuitously break things. > > And I fail to see why a blacklist is architecturally better. There's > > plenty cases where we might want to create temporary files, > > non-checksummed data or such that we'd need a teach a blacklist about, > > but there's far fewer cases where add new checksummed files. Basically > > never since checksums have been introduced. And if checksums were > > introduced for something new, say slrus, we'd ceertainly use > > pg_verify_checksum during development. 
> > In cases where we need something temporary, we're going to need to be > prepared to clean those up and we had better make it very plain that > they're temporary and easy to write code for. Anything we aren't > prepared to blow away on a crash-restart should be checksum'd and in an > ideal world, we'd be looking to reduce the blacklist to only those > things which are temporary. There's pending patches that add support for pg_verify_checksums running against a running cluster. We'll not just need to recognize files that are there after a graceful shutdown, but with anything that a cluster can have there while running. > Of course, we may need different code for > calculating the checksum of different types of files, which moves it > from really being a 'blacklist' to a list of file-types and their > associated checksum'ing mechansim. Which pretty much is a whitelist. Given that we need a filetype->checksumtype mapping or something, how is a blacklist a reasonable approach for this? Greetings, Andres Freund
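To illustrate the filetype-to-checksumtype mapping being discussed, a purely hypothetical classifier could look like the following. None of these names exist in the PostgreSQL tree; the file names are taken from the lists quoted in this thread, and the final digit check is a crude stand-in for a real relation-name test.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Entirely hypothetical classifier, invented to illustrate mapping each
 * kind of file to how (or whether) it should be checksummed.
 */
typedef enum
{
	FILE_UNKNOWN,				/* not something postgres knows about */
	FILE_TEMPORARY,				/* e.g. pgsql_tmp* -- safe to ignore */
	FILE_NO_CHECKSUM,			/* known file, never page-checksummed */
	FILE_PAGE_CHECKSUM			/* relation segment: verify page checksums */
} file_class;

file_class
classify_file(const char *fn)
{
	/* temporary files carry the well-known prefix */
	if (strncmp(fn, "pgsql_tmp", 9) == 0)
		return FILE_TEMPORARY;

	/* known, deliberately non-checksummed files */
	if (strcmp(fn, "pg_control") == 0 ||
		strcmp(fn, "pg_filenode.map") == 0 ||
		strcmp(fn, "pg_internal.init") == 0 ||
		strcmp(fn, "PG_VERSION") == 0)
		return FILE_NO_CHECKSUM;

	/* crude stand-in for a proper relation-segment name check */
	if (isdigit((unsigned char) fn[0]))
		return FILE_PAGE_CHECKSUM;

	return FILE_UNKNOWN;
}
```

Whether FILE_UNKNOWN should be a warning, an error, or silently skipped is exactly the whitelist-versus-blacklist question at issue in this thread.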
On 10/19/2018 01:17 PM, Andres Freund wrote: > On 2018-10-19 10:36:59 -0400, Stephen Frost wrote: > >> I also categorically disagree with the notion that it's ok for >> extensions to dump files into our tablespace diretories .... > > I'm unconvinced. There already are extensions doing so, and it's not > like we've given them any sort of reasonable alternative. You can't just > create relations in a "registered" / "supported" kinda way right now, so > uh, yea, extension gotta make do. And that has worked for many years. For some of us following along at home, this got interesting right here. Could someone elaborate on what extensions do this, for what purpose? And what would it mean to create relations in a "registered" / "supported" kinda way? Has that been the topic of a past discussion I could catch up with? -Chap
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2018-10-19 13:52:28 -0400, Stephen Frost wrote: > > > > I also categorically disagree with the notion that it's ok for > > > > extensions to dump files into our tablespace diretories or that we > > > > should try to work around random code dumping extra files in the > > > > directories which we maintain- it's not like this additional code will > > > > actually protect us from that, after all, and it's foolish to think we > > > > really could protect against that. > > > > > > I'm unconvinced. There already are extensions doing so, and it's not > > > like we've given them any sort of reasonable alternative. You can't just > > > create relations in a "registered" / "supported" kinda way right now, so > > > uh, yea, extension gotta make do. And that has worked for many years. > > > > I don't agree that any of this is an argument for accepting random code > > writing into arbitrary places in the data directory or tablespace > > directories as being something which we support or which we write code > > to work-around. > > > > If this is a capability that extension authors need then I'm all for > > having a way for them to have that capability in a clearly defined and > > documented way- and one which we could actually write code to handle and > > work with. > > cstore e.g. does this, and it's already out there. So yes, we should > provide better infrastructure. But for now, we gotta deal with reality, > unless we just want to gratuituously break things. No, I don't agree that we are beholden to an external extension that decided to start writing into directories that clearly belong to PG. How do we even know that what cstore does in the tablespace directory wouldn't get caught up in the checksum file pattern-matching that this commit put in? What if there was an extension which did create files that would match, what would we do then? 
I'm happy to go create one, if that'd help make the point that this "pattern whitelist" isn't actually a solution but is really rather just a hack that'll break in the future. > > > And I fail to see why a blacklist is architecturally better. There's > > > plenty cases where we might want to create temporary files, > > > non-checksummed data or such that we'd need a teach a blacklist about, > > > but there's far fewer cases where add new checksummed files. Basically > > > never since checksums have been introduced. And if checksums were > > > introduced for something new, say slrus, we'd ceertainly use > > > pg_verify_checksum during development. > > > > In cases where we need something temporary, we're going to need to be > > prepared to clean those up and we had better make it very plain that > > they're temporary and easy to write code for. Anything we aren't > > prepared to blow away on a crash-restart should be checksum'd and in an > > ideal world, we'd be looking to reduce the blacklist to only those > > things which are temporary. > > There's pending patches that add support for pg_verify_checksums running > against a running cluster. We'll not just need to recognize files that > are there after a graceful shutdown, but with anything that a cluster > can have there while running. Of course- the same is true with a crash/restart case, so I'm not sure what you're getting at here. I wasn't ever thinking of only the graceful shutdown case- certainly pg_basebackup operates when the cluster is running also and that's more what I was thinking about from the start and which is part of what started the discussion about this commit as that was entirely ignored in this change. > > Of course, we may need different code for > > calculating the checksum of different types of files, which moves it > > from really being a 'blacklist' to a list of file-types and their > > associated checksum'ing mechansim. > > Which pretty much is a whitelist. 
Given that we need a > filetype->checksumtype mapping or something, how is a blacklist a > reasonable approach for this? We only need the mapping for special files- call that list of special files a whitelist or a blacklist or a bluelist, it doesn't matter, what's relevant here is that we don't skip over anything- we account for everything and make sure that everything that would survive a crash/restart is checked or at least accounted for if we can't, yet, check it. Thanks! Stephen
On 2018-10-19 14:08:19 -0400, J Chapman Flack wrote: > On 10/19/2018 01:17 PM, Andres Freund wrote: > > On 2018-10-19 10:36:59 -0400, Stephen Frost wrote: > > > >> I also categorically disagree with the notion that it's ok for > >> extensions to dump files into our tablespace diretories .... > > > > I'm unconvinced. There already are extensions doing so, and it's not > > like we've given them any sort of reasonable alternative. You can't just > > create relations in a "registered" / "supported" kinda way right now, so > > uh, yea, extension gotta make do. And that has worked for many years. > > For some of us following along at home, this got interesting right here. > Could someone elaborate on what extensions do this, for what purpose? > And what would it mean to create relations in a "registered" / > "supported" kinda way? Has that been the topic of a past discussion > I could catch up with? An fdw, like cstore, might be doing it, to store data, for example. The registered / supported thing would be to have a catalog representation of on-disk files that is represented in the catalogs somehow. Right now you can't really do that because you need a proper relation (table, index, ...) to get a relfilenode. Greetings, Andres Freund
On 2018-10-19 14:11:03 -0400, Stephen Frost wrote: > Greetings, > > * Andres Freund (andres@anarazel.de) wrote: > > On 2018-10-19 13:52:28 -0400, Stephen Frost wrote: > > > > > I also categorically disagree with the notion that it's ok for > > > > > extensions to dump files into our tablespace diretories or that we > > > > > should try to work around random code dumping extra files in the > > > > > directories which we maintain- it's not like this additional code will > > > > > actually protect us from that, after all, and it's foolish to think we > > > > > really could protect against that. > > > > > > > > I'm unconvinced. There already are extensions doing so, and it's not > > > > like we've given them any sort of reasonable alternative. You can't just > > > > create relations in a "registered" / "supported" kinda way right now, so > > > > uh, yea, extension gotta make do. And that has worked for many years. > > > > > > I don't agree that any of this is an argument for accepting random code > > > writing into arbitrary places in the data directory or tablespace > > > directories as being something which we support or which we write code > > > to work-around. > > > > > > If this is a capability that extension authors need then I'm all for > > > having a way for them to have that capability in a clearly defined and > > > documented way- and one which we could actually write code to handle and > > > work with. > > > > cstore e.g. does this, and it's already out there. So yes, we should > > provide better infrastructure. But for now, we gotta deal with reality, > > unless we just want to gratuituously break things. > > No, I don't agree that we are beholden to an external extension that > decided to start writing into directories that clearly belong to PG. Did it have an alternative? > How do we even know that what cstore does in the tablespace directory > wouldn't get caught up in the checksum file pattern-matching that this > commit put in? You listen to people? 
> What if there was an extension which did create files that would > match, what would we do then? I'm happy to go create one, if that'd > help make the point that this "pattern whitelist" isn't actually a > solution but is really rather just a hack that'll break in the future. Oh, for crying out loud. Yes, obviously you can create conflicts, nobody ever doubted that. How on earth is that a useful point for anything? The point isn't that it's a good idea for extensions to do so, but that it has happened because we didn't provide a better alternative. > > > > And I fail to see why a blacklist is architecturally better. There's > > > > plenty cases where we might want to create temporary files, > > > > non-checksummed data or such that we'd need a teach a blacklist about, > > > > but there's far fewer cases where add new checksummed files. Basically > > > > never since checksums have been introduced. And if checksums were > > > > introduced for something new, say slrus, we'd ceertainly use > > > > pg_verify_checksum during development. > > > > > > In cases where we need something temporary, we're going to need to be > > > prepared to clean those up and we had better make it very plain that > > > they're temporary and easy to write code for. Anything we aren't > > > prepared to blow away on a crash-restart should be checksum'd and in an > > > ideal world, we'd be looking to reduce the blacklist to only those > > > things which are temporary. > > > > There's pending patches that add support for pg_verify_checksums running > > against a running cluster. We'll not just need to recognize files that > > are there after a graceful shutdown, but with anything that a cluster > > can have there while running. > > Of course- the same is true with a crash/restart case, so I'm not sure > what you're getting at here. 
pg_verify_checksum doesn't support running on a crashed cluster, and I'm not sure it'd make sense to teach it to do so (there's not really much it could do to discern whether a block is a torn page that replay will fix, or corrupted). > I wasn't ever thinking of only the graceful shutdown case- certainly > pg_basebackup operates when the cluster is running also and that's > more what I was thinking about from the start and which is part of > what started the discussion about this commit as that was entirely > ignored in this change. It was mainly ignored in the original pg_verify_checksums change, not in the commit discussed here. That was broken. That built separate logic. Both basebackup and verify checksums already only scan certain directories. They *already* are encoding directory structure information.
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2018-10-19 14:11:03 -0400, Stephen Frost wrote: > > * Andres Freund (andres@anarazel.de) wrote: > > > cstore e.g. does this, and it's already out there. So yes, we should > > > provide better infrastructure. But for now, we gotta deal with reality, > > > unless we just want to gratuituously break things. > > > > No, I don't agree that we are beholden to an external extension that > > decided to start writing into directories that clearly belong to PG. > > Did it have an alternative? Yes, work with us to come up with a way to accomplish what they wanted without creating unknown/untracked files in our tablespaces. Or, have their own mechanism for storing files on other filesystems. And likely other options. Saying that we should account for their putting arbitrary files into the PG tablespace directories because they "didn't have any other choice" seems pretty ridiculous, imv. > > How do we even know that what cstore does in the tablespace directory > > wouldn't get caught up in the checksum file pattern-matching that this > > commit put in? > > You listen to people? > > > What if there was an extension which did create files that would > > match, what would we do then? I'm happy to go create one, if that'd > > help make the point that this "pattern whitelist" isn't actually a > > solution but is really rather just a hack that'll break in the future. > > Oh, for crying out loud. Yes, obviously you can create conflicts, nobody > ever doubted that. How on earth is that a useful point for anything? The > point isn't that it's a good idea for extensions to do so, but that it > has happened because we didn't provide better alternative. 
The point is that you're making a case because you happen to know of an extension which does this, but neither of us know about every extension out there and I, at least, don't believe we should be coming up with hacks to try and avoid looking at files that some extension has dumped into the PG tablespace directories because that's never been a documented or supported thing to do. > > > > > And I fail to see why a blacklist is architecturally better. There's > > > > > plenty cases where we might want to create temporary files, > > > > > non-checksummed data or such that we'd need a teach a blacklist about, > > > > > but there's far fewer cases where add new checksummed files. Basically > > > > > never since checksums have been introduced. And if checksums were > > > > > introduced for something new, say slrus, we'd ceertainly use > > > > > pg_verify_checksum during development. > > > > > > > > In cases where we need something temporary, we're going to need to be > > > > prepared to clean those up and we had better make it very plain that > > > > they're temporary and easy to write code for. Anything we aren't > > > > prepared to blow away on a crash-restart should be checksum'd and in an > > > > ideal world, we'd be looking to reduce the blacklist to only those > > > > things which are temporary. > > > > > > There's pending patches that add support for pg_verify_checksums running > > > against a running cluster. We'll not just need to recognize files that > > > are there after a graceful shutdown, but with anything that a cluster > > > can have there while running. > > > > Of course- the same is true with a crash/restart case, so I'm not sure > > what you're getting at here. > > pg_verify_checksum doesn't support running on a crashed cluster, and I'm > not sure it'd make sense to teach it to so (there's not really much it > could do to discern whether a block is a torn page that replay will fix, > or corrupted). ... 
and this isn't at all relevant, because pg_basebackup does run on a running cluster. > > I wasn't ever thinking of only the graceful shutdown case- certainly > > pg_basebackup operates when the cluster is running also and that's > > more what I was thinking about from the start and which is part of > > what started the discussion about this commit as that was entirely > > ignored in this change. > > It was mainly ignored in the original pg_verify_checksums change, not in > the commit discussed here. That was broken. That built separate logic. As pointed out elsewhere on this thread, the logic was the same between the two before this commit... The code in pg_basebackup came from the prior pg_verify_checksums code. Certainly, some mention of the code existing in two places, at least, should have been in the comments. Also, just to be clear, as I tried to say up-thread, I'm not trying to lay blame here or suggest that Michael shouldn't have been trying to fix the issue that came up, but I don't agree that this is the way to fix it, and, in any case, we need to make sure that pg_basebackup behaves correctly as well. As mentioned elsewhere, that'd be best done by adding something to libpgcommon and then changing both pg_basebackup and pg_verify_checksums to use that. > Both basebackup an verify checksums already only scan certain > directories. They *already* are encoding directory structure > information. Yes, they do. I'd like to see that expanded in the future as well, of course. Sadly, we've already given up any sense of control over what exists in the root of the data directory because of all the random config files and directories like 'pg_log' that end up there, but let's try to avoid making that same mistake about the directories which we create and maintain. Also, I feel like maybe you hit send before you intended to..? Thanks! Stephen
Hi, On 2018-10-19 14:47:51 -0400, Stephen Frost wrote: > * Andres Freund (andres@anarazel.de) wrote: > > On 2018-10-19 14:11:03 -0400, Stephen Frost wrote: > > > * Andres Freund (andres@anarazel.de) wrote: > > > > cstore e.g. does this, and it's already out there. So yes, we should > > > > provide better infrastructure. But for now, we gotta deal with reality, > > > > unless we just want to gratuituously break things. > > > > > > No, I don't agree that we are beholden to an external extension that > > > decided to start writing into directories that clearly belong to PG. > > > > Did it have an alternative? > > Yes, work with us to come up with a way to accomplish what they wanted > without creating unknown/untracked files in our tablespaces. > > Or, have their own mechanism for storing files on other filesystems. > > And likely other options. Saying that we should account for their > putting arbitrary files into the PG tablespace directories because they > "didn't have any other choice" seems pretty ridiculous, imv. > > > > How do we even know that what cstore does in the tablespace directory > > > wouldn't get caught up in the checksum file pattern-matching that this > > > commit put in? > > > > You listen to people? > > > > > What if there was an extension which did create files that would > > > match, what would we do then? I'm happy to go create one, if that'd > > > help make the point that this "pattern whitelist" isn't actually a > > > solution but is really rather just a hack that'll break in the future. > > > > Oh, for crying out loud. Yes, obviously you can create conflicts, nobody > > ever doubted that. How on earth is that a useful point for anything? The > > point isn't that it's a good idea for extensions to do so, but that it > > has happened because we didn't provide better alternative. 
> The point is that you're making a case because you happen to know of an extension which does this, but neither of us know about every extension out there and I, at least, don't believe we should be coming up with hacks to try and avoid looking at files that some extension has dumped into the PG tablespace directories because that's never been a documented or supported thing to do. It's not a hack if it's a quite defendable choice on its own, and the presence of such extensions is just one further argument in one direction (for explicit whitelisting here). > > > > > > And I fail to see why a blacklist is architecturally better. There's > > > > > > plenty cases where we might want to create temporary files, > > > > > > non-checksummed data or such that we'd need a teach a blacklist about, > > > > > > but there's far fewer cases where add new checksummed files. Basically > > > > > > never since checksums have been introduced. And if checksums were > > > > > > introduced for something new, say slrus, we'd ceertainly use > > > > > > pg_verify_checksum during development. > > > > > > > > > > In cases where we need something temporary, we're going to need to be > > > > > prepared to clean those up and we had better make it very plain that > > > > > they're temporary and easy to write code for. Anything we aren't > > > > > prepared to blow away on a crash-restart should be checksum'd and in an > > > > > ideal world, we'd be looking to reduce the blacklist to only those > > > > > things which are temporary. > > > > > > > > There's pending patches that add support for pg_verify_checksums running > > > > against a running cluster. We'll not just need to recognize files that > > > > are there after a graceful shutdown, but with anything that a cluster > > > > can have there while running. > > > > > > Of course- the same is true with a crash/restart case, so I'm not sure > > > what you're getting at here. 
> > > > pg_verify_checksum doesn't support running on a crashed cluster, and I'm > > not sure it'd make sense to teach it to so (there's not really much it > > could do to discern whether a block is a torn page that replay will fix, > > or corrupted). > > ... and this isn't at all relevant, because pg_basebackup does run on a > running cluster. I wasn't ever denying that or anything close to it? My point is that pg_verify_checksum needs much more filtering than it has now to deal with that, because it needs to handle all files that could be present, not just files that could be present after a graceful shutdown. pg_basebackup's case is *not* really comparable because basebackup.c already performed a lot of filtering before noChecksumFiles is applied. > > > I wasn't ever thinking of only the graceful shutdown case- certainly > > > pg_basebackup operates when the cluster is running also and that's > > > more what I was thinking about from the start and which is part of > > > what started the discussion about this commit as that was entirely > > > ignored in this change. > > > > It was mainly ignored in the original pg_verify_checksums change, not in > > the commit discussed here. That was broken. That built separate logic. > > As pointed out elsewhere on this thread, the logic was the same between > the two before this commit... The code in pg_basebackup came from the > prior pg_verify_checksums code. Certainly, some mention of the code > existing in two places, at least, should have been in the comments. But the filter for basebackup only comes after the pre-existing filtering that the basebackup.c code already does. All of: /* * List of files excluded from backups. */ static const char *excludeFiles[] = { /* Skip auto conf temporary file. 
*/ PG_AUTOCONF_FILENAME ".tmp", /* Skip current log file temporary file */ LOG_METAINFO_DATAFILE_TMP, /* Skip relation cache because it is rebuilt on startup */ RELCACHE_INIT_FILENAME, /* * If there's a backup_label or tablespace_map file, it belongs to a * backup started by the user with pg_start_backup(). It is *not* correct * for this backup. Our backup_label/tablespace_map is injected into the * tar separately. */ BACKUP_LABEL_FILE, TABLESPACE_MAP, "postmaster.pid", "postmaster.opts", /* end of list */ NULL }; is not applied, for example. Nor is: /* Skip temporary files */ if (strncmp(de->d_name, PG_TEMP_FILE_PREFIX, strlen(PG_TEMP_FILE_PREFIX)) == 0) continue; Nor is /* Exclude temporary relations */ if (isDbDir && looks_like_temp_rel_name(de->d_name)) { elog(DEBUG2, "temporary relation file \"%s\" excluded from backup", de->d_name); continue; } So, yea, they had: static const char *const skip[] = { "pg_control", "pg_filenode.map", "pg_internal.init", "PG_VERSION", NULL, }; in common, but not more. All the above exclusions are strictly necessary. FWIW, as far as I can tell, pg_verify_checksum appears to be broken in a lot of (unfortunately) pretty normal scenarios right now. Not just on Windows. Besides the narrow window of crashing while a .tmp file is present (and then shutting down normally after a crash restart), it also has the much wider window of crashing while *any* backend has temporary files/relations. As crash-restarts don't release temp files, they'll still be present after the crash, and a single graceful shutdown afterwards will leave them in place. No? > > Both basebackup an verify checksums already only scan certain > > directories. They *already* are encoding directory structure > > information. > > Yes, they do. I'd like to see that expanded in the future as well, of
course. Sadly, we've already given up any sense of control over what > exists in the root of the data directory because of all the random > config files and directories like 'pg_log' that end up there, but let's > try to avoid making that same mistake about the directories which we > create and maintain. I actually don't think that's a useful goal. ISTM it's going to be very likely that we'll *add* more different files that can be in data directories / subdirectories. Various forms of pluggable storage will need to have various forms of storing data, and we'll definitely want to have their data inside the database directory. Some will want to be checksummed, for others that won't make sense. We should always know which types of files postgres knows about, and treat those in the way we want. We shouldn't worry about conflicts against things we don't know in a new major version, but we also shouldn't break things if they're there. Realistically extensions *have* to experiment before PG has the support for whatever they're doing, our release schedule is way too long for anything to just wait for PG to support it nice and proper. > Also, I feel like maybe you hit send before you intended to..? Nope. Greetings, Andres Freund
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2018-10-19 14:47:51 -0400, Stephen Frost wrote: > > * Andres Freund (andres@anarazel.de) wrote: > > > Oh, for crying out loud. Yes, obviously you can create conflicts, nobody > > > ever doubted that. How on earth is that a useful point for anything? The > > > point isn't that it's a good idea for extensions to do so, but that it > > > has happened because we didn't provide better alternative. > > > > The point is that you're making a case because you happen to know of an > > extension which does this, but neither of us know about every extension > > out there and I, at least, don't believe we should be coming up with > > hacks to try and avoid looking at files that some extension has dumped > > into the PG tablespace directories because that's never been a > > documented or supported thing to do. > > It's not a hack if it's a quite defendable choice on its own, and the > presense of such extensions is just one further argument in one > direction (for explicit whitelisting here). You've not convinced me that an extension dropping files into our tablespace directories is either something we should accept or that we should code around such cases, nor that we are doing anything but putting a hack in place to deal with what is pretty clearly a hack in its own right, imv. > > > > Of course- the same is true with a crash/restart case, so I'm not sure > > > > what you're getting at here. > > > > > > pg_verify_checksum doesn't support running on a crashed cluster, and I'm > > > not sure it'd make sense to teach it to so (there's not really much it > > > could do to discern whether a block is a torn page that replay will fix, > > > or corrupted). > > > > ... and this isn't at all relevant, because pg_basebackup does run on a > > running cluster. > > I wasn't ever denying that or anything close to it? 
My point is that > pg_verify_checksum needs much more filtering than it has now to deal > with that, because it needs to handle all files that could be present, > not just files that could be present after a graceful shutdown. Perhaps it doesn't today but surely one goal of pg_verify_checksum is to be able to run it on a running cluster eventually. > pg_basebackup's case is *not* really comparable because basebackup.c > already performed a lot of filtering before noChecksumFiles is applied. This all really just points out that we should have the code for handling this somewhere common that both pg_verify_checksum and pg_basebackup can leverage without duplicating all of it. The specific case that started all of this certainly looks pretty clearly like a case that both need to deal with. > > As pointed out elsewhere on this thread, the logic was the same between > > the two before this commit... The code in pg_basebackup came from the > > prior pg_verify_checksums code. Certainly, some mention of the code > > existing in two places, at least, should have been in the comments. > > But the filter for basebackup only comes after the pre-existing > filtering that the basebackup.c code already does. All of: > > /* > * List of files excluded from backups. > */ > static const char *excludeFiles[] = > { > /* Skip auto conf temporary file. */ > PG_AUTOCONF_FILENAME ".tmp", > > /* Skip current log file temporary file */ > LOG_METAINFO_DATAFILE_TMP, > > /* Skip relation cache because it is rebuilt on startup */ > RELCACHE_INIT_FILENAME, > > /* > * If there's a backup_label or tablespace_map file, it belongs to a > * backup started by the user with pg_start_backup(). It is *not* correct > * for this backup. Our backup_label/tablespace_map is injected into the > * tar separately. > */ > BACKUP_LABEL_FILE, > TABLESPACE_MAP, > > "postmaster.pid", > "postmaster.opts", > > /* end of list */ > NULL > }; > > is not applied, for example. 
Nor is: > > /* Skip temporary files */ > if (strncmp(de->d_name, > PG_TEMP_FILE_PREFIX, > strlen(PG_TEMP_FILE_PREFIX)) == 0) > continue; > > Nor is > /* Exclude temporary relations */ > if (isDbDir && looks_like_temp_rel_name(de->d_name)) > { > elog(DEBUG2, > "temporary relation file \"%s\" excluded from backup", > de->d_name); > > continue; > } > > So, yea, they had: > > static const char *const skip[] = { > "pg_control", > "pg_filenode.map", > "pg_internal.init", > "PG_VERSION", > NULL, > }; > > in common, but not more. All the above exclusions are strictly > necessary. ... but all of that code doesn't change that pg_basebackup also needs to ignore the EXEC_BACKEND created config_exec_params/.new files. You're right, pg_verify_checksums, with the assumption that it only runs on a cleanly shut-down cluster, doesn't need the temp-file-skipping logic, today, but it's going to need it tomorrow, isn't it? It seems like a good idea to avoid duplicating all of that code between the two, hence the suggestion to put it into libpgcommon, and this all seems to be getting pretty far away from the main point of disagreement we've been having around how to handle random files that show up in our tablespace directories. > FWIW, as far as I can tell, pg_verify_checksum appears to be broken in a > lot of (unfortunately) pretty normal scenarios right now. Not just on > windows. Besides the narrow window of crashing while a .tmp file is > present (and then shutting down normally after a crash restart), it also > has the much of wider window of crashing while *any* backend has > temporary files/relations. As crash-restarts don't release temp files, > they'll still be present after the crash, and a single graceful shutdown > afterwards will leave them in place. No? 
We do go through and do some cleanup at crash/restart, but I generally agree that we should really be looking to have pg_verify_checksums work on a running cluster and should therefore make it able to identify temp files and skip over things which we know don't have checksums (and that's ok) in a similar way to what pg_basebackup does. > > > Both basebackup an verify checksums already only scan certain > > > directories. They *already* are encoding directory structure > > > information. > > > > Yes, they do. I'd like to see that expanded in the future as well, of > > course. Sadly, we've already given up any sense of control over what > > exists in the root of the data directory because of all the random > > config files and directories like 'pg_log' that end up there, but let's > > try to avoid making that same mistake about the directories which we > > create and maintain. > > I actually don't think that's a useful goal. ISTM it's going to be very > likely that we'll *add* more different files that can be in data > directories / subdirectories. Various forms of pluggable storage will > need to have various forms of storing data, and we'll definitely want to > have their data inside the database directory. Some will want to be > checksummed, for others that won't make sense. We should always know > which types of files postgres knows about, and treat those in the way we > want. We shouldn't worry about conflicts against things we don't know in > a new major version, but we also shouldn't break things if they're > there. Realisticaly extensions *have* to experiment before PG has the > support for whatever they're doing, our release schedule is way too long > for anything to just wait for PG to support it nice and proper. I'm all for adding more different files- I didn't intend to suggest otherwise, but those are things we're adding and can/should account for. 
As you say, some of those are likely to have checksums, which should be handled by pg_basebackup and pg_verify_checksums, and that goes back to the point I was making up-thread that we want to make sure we account for everything. With this pattern-based approach, we could easily end up forgetting to add the correct new pattern into pg_verify_checksums. By making sure we account for everything and complain if we don't, we aren't going to miss anything. That kind of fail-safe is the kind of design that I think we are generally trying to go for and is part of why PG is as robust as it is. Playing this further and assuming that extensions dropping files into tablespace directories is acceptable, what are we supposed to do when there's some direct conflict between a file that we want to create in a PG tablespace directory and a file that some extension dropped there? Create yet another subdirectory which we call "THIS_IS_REALLY_ONLY_FOR_PG"? What about two different extensions wanting to create files with the same names in the tablespace directories? Experimentation is fine, of course; this is open source code and people should feel free to play with it, but we are not obligated to avoid breaking things when an extension author, through their experimentation, does something which is clearly not supported, like dropping files into PG's tablespace directories. Further, when it comes to our users' data, we should be taking a strict approach and accounting for everything, something that this whitelist-of-patterns-based approach to finding files to verify the checksums on doesn't do. Thanks! Stephen
On 2018-10-19 15:57:28 -0400, Stephen Frost wrote: > > > > > Of course- the same is true with a crash/restart case, so I'm not sure > > > > > what you're getting at here. > > > > > > > > pg_verify_checksum doesn't support running on a crashed cluster, and I'm > > > > not sure it'd make sense to teach it to so (there's not really much it > > > > could do to discern whether a block is a torn page that replay will fix, > > > > or corrupted). > > > > > > ... and this isn't at all relevant, because pg_basebackup does run on a > > > running cluster. > > > > I wasn't ever denying that or anything close to it? My point is that > > pg_verify_checksum needs much more filtering than it has now to deal > > with that, because it needs to handle all files that could be present, > > not just files that could be present after a graceful shutdown. > > Perhaps it doesn't today but surely one goal of pg_verify_checksum is to > be able to run it on a running cluster eventually. I was saying *precisely* that above. I give up. > > pg_basebackup's case is *not* really comparable because basebackup.c > > already performed a lot of filtering before noChecksumFiles is applied. > > This all really just points out that we should have the code for > handling this somewhere common that both pg_verify_checksum and > pg_basebackup can leverage without duplicating all of it. I never argued against that. My point is that your argument that they started out the same isn't true. > The specific case that started all of this certainly looks pretty > clearly like a case that both need to deal with. Yep. > > > As pointed out elsewhere on this thread, the logic was the same between > > > the two before this commit... The code in pg_basebackup came from the > > > prior pg_verify_checksums code. Certainly, some mention of the code > > > existing in two places, at least, should have been in the comments. 
> > > > But the filter for basebackup only comes after the pre-existing > > filtering that the basebackup.c code already does. All of: > > > > /* > > * List of files excluded from backups. > > */ > > static const char *excludeFiles[] = > > { > > /* Skip auto conf temporary file. */ > > PG_AUTOCONF_FILENAME ".tmp", > > > > /* Skip current log file temporary file */ > > LOG_METAINFO_DATAFILE_TMP, > > > > /* Skip relation cache because it is rebuilt on startup */ > > RELCACHE_INIT_FILENAME, > > > > /* > > * If there's a backup_label or tablespace_map file, it belongs to a > > * backup started by the user with pg_start_backup(). It is *not* correct > > * for this backup. Our backup_label/tablespace_map is injected into the > > * tar separately. > > */ > > BACKUP_LABEL_FILE, > > TABLESPACE_MAP, > > > > "postmaster.pid", > > "postmaster.opts", > > > > /* end of list */ > > NULL > > }; > > > > is not applied, for example. Nor is: > > > > /* Skip temporary files */ > > if (strncmp(de->d_name, > > PG_TEMP_FILE_PREFIX, > > strlen(PG_TEMP_FILE_PREFIX)) == 0) > > continue; > > > > Nor is > > /* Exclude temporary relations */ > > if (isDbDir && looks_like_temp_rel_name(de->d_name)) > > { > > elog(DEBUG2, > > "temporary relation file \"%s\" excluded from backup", > > de->d_name); > > > > continue; > > } > > > > So, yea, they had: > > > > static const char *const skip[] = { > > "pg_control", > > "pg_filenode.map", > > "pg_internal.init", > > "PG_VERSION", > > NULL, > > }; > > > > in common, but not more. All the above exclusions are strictly > > necessary. > > ... but all of that code doesn't change that pg_basebackup also needs to > ignore the EXEC_BACKEND created config_exec_params/.new files. Right. > You're right, pg_verify_checksums, with the assumption that it only runs > on a cleanly shut-down cluster, doesn't need the temp-file-skipping > logic, today, but it's going to need it tomorrow, isn't it? No, it needs it today, as explained below in the email you're replying to.
> > FWIW, as far as I can tell, pg_verify_checksum appears to be broken in a > > lot of (unfortunately) pretty normal scenarios right now. Not just on > > windows. Besides the narrow window of crashing while a .tmp file is > > present (and then shutting down normally after a crash restart), it also > > has the much of wider window of crashing while *any* backend has > > temporary files/relations. As crash-restarts don't release temp files, > > they'll still be present after the crash, and a single graceful shutdown > > afterwards will leave them in place. No? > > We do go through and do some cleanup at crash/restart We don't clean up temp files tho. * NOTE: we could, but don't, call this during a post-backend-crash restart * cycle. The argument for not doing it is that someone might want to examine * the temp files for debugging purposes. This does however mean that * OpenTemporaryFile had better allow for collision with an existing temp * file name. > As you say, some of those are likely to have checksums, which should be > handled by pg_basebackup and pg_verify_checksums, and that goes back to > the point I was making up-thread that we want to make sure an account > for everything. With this pattern-based approach, we could easily end > up forgetting to add the correct new pattern into pg_verify_checksums. If you're adding checksums for something, you better test it. I don't buy this. In contrast, it's much more likely that there's a file that's short-lived that you won't easily test against pg_verify_checksum running in that moment. > Playing this further and assuming that extensions dropping files into > tablespace directories is acceptable, what are we supposed to do when > there's some direct conflict between a file that we want to create in a > PG tablespace directory and a file that some extension dropped there? > Create yet another subdirectory which we call > "THIS_IS_REALLY_ONLY_FOR_PG"? Then it's a buggy extension. And we error out.
> What about two different extensions wanting to create files with the > same names in the tablespace directories..? > > Experimentation is fine, of course, this is open source code and people > should feel free to play with it, but we are not obligated to avoid > breaking things when an extension author, through their experimentation, > does something which is clearly not supported, like dropping files into > PG's tablespace directories. Further, when it comes to our user's data, > we should be taking a strict approach and accounting for everything, > something that this whitelist-of-patterns-based approach to finding > files to verify the checksums on doesn't do. It's not economically feasible to work on extensions that will only be usable a year down the road. Greetings, Andres Freund
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2018-10-19 15:57:28 -0400, Stephen Frost wrote: > > Perhaps it doesn't today but surely one goal of pg_verify_checksum is to > > be able to run it on a running cluster eventually. > > I was saying *precisely* that above. I give up. I'm glad we agree on that (I thought we did before, in fact, so it was odd to see it coming up again). > > > pg_basebackup's case is *not* really comparable because basebackup.c > > > already performed a lot of filtering before noChecksumFiles is applied. > > > > This all really just points out that we should have the code for > > handling this somewhere common that both pg_verify_checksum and > > pg_basebackup can leverage without duplicating all of it. > > I never argued against that. My point is that your argument that they > started out the same isn't true. I overstated the relationship, clearly, but it hardly matters, as you say.. > > The specific case that started all of this certainly looks pretty > > clearly like a case that both need to deal with. > > Yep. ... and it's in the part of the code that was actually copied and was the same. > > > FWIW, as far as I can tell, pg_verify_checksum appears to be broken in a > > > lot of (unfortunately) pretty normal scenarios right now. Not just on > > > windows. Besides the narrow window of crashing while a .tmp file is > > > present (and then shutting down normally after a crash restart), it also > > > has the much of wider window of crashing while *any* backend has > > > temporary files/relations. As crash-restarts don't release temp files, > > > they'll still be present after the crash, and a single graceful shutdown > > > afterwards will leave them in place. No? > > > > We do go through and do some cleanup at crash/restart > > We don't clean up temp files tho. > > * NOTE: we could, but don't, call this during a post-backend-crash restart > * cycle. 
The argument for not doing it is that someone might want to examine > * the temp files for debugging purposes. This does however mean that > * OpenTemporaryFile had better allow for collision with an existing temp > * file name. Then, yes, we should go through and fix pg_verify_checksums to work correctly in this case, likely using more-or-less the same code that pg_basebackup has. After that, we can add the remaining code to check the last checkpoint and skip pages which have a more recent LSN... > > As you say, some of those are likely to have checksums, which should be > > handled by pg_basebackup and pg_verify_checksums, and that goes back to > > the point I was making up-thread that we want to make sure an account > > for everything. With this pattern-based approach, we could easily end > > up forgetting to add the correct new pattern into pg_verify_checksums. > > If you're adding checksums for something, you better test it I don't buy > this. In contrast it's much more likely that there's a file that's > short-lived that you won't easily test against pg_verify_checksum > running in that moment. The only justification for *not* doing this is that some extension author might have dumped files into our tablespace directory, something we've never claimed to support nor generally worried about in all the time that I can recall before this. As for saying that someone is obviously going to use pg_verify_checksums to check their checksum code- I simply don't agree that we should be happy to rely on that assumption when the only reason for any of this is that some extension decided to do something that wasn't supported and likely has issues in other parts of the code anyway (pg_basebackup would happily copy those files too, even though there's obviously no code for making sure it's a consistent copy or any such in pg_basebackup or core). 
> > Playing this further and assuming that extensions dropping files into > > tablespace directories is acceptable, what are we supposed to do when > > there's some direct conflict between a file that we want to create in a > > PG tablespace directory and a file that some extension dropped there? > > Create yet another subdirectory which we call > > "THIS_IS_REALLY_ONLY_FOR_PG"? > > Then it's a buggy extension. And we error out. Extensions writing into directories they shouldn't be makes them buggy even if the core code isn't likely to write to a particular filename, imv. > > What about two different extensions wanting to create files with the > > same names in the tablespace directories..? > > > > Experimentation is fine, of course, this is open source code and people > > should feel free to play with it, but we are not obligated to avoid > > breaking things when an extension author, through their experimentation, > > does something which is clearly not supported, like dropping files into > > PG's tablespace directories. Further, when it comes to our user's data, > > we should be taking a strict approach and accounting for everything, > > something that this whitelist-of-patterns-based approach to finding > > files to verify the checksums on doesn't do. > > It's not economically feasible to work on extensions that will only be > usable a year down the road. I listed out multiple other solutions to this problem which you summarily ignored. It's unfortunate that those other solutions weren't used and that, instead, this extension decided to drop files into PG's tablespace directories, but that doesn't mean we should condone or implicitly support that action. 
I stand by my position that this patch should be reverted (with no offense or ill-will towards Michael, of course, I certainly appreciate his efforts to address the issues with pg_verify_checksums) and that we should move more of this code to identify files to verify the checksums on into libpgcommon and then use it from both pg_basebackup and pg_verify_checksums, to the extent possible, but that we make sure to account for all of the files in our tablespace and database directories. To the extent that this is an issue for extension authors, perhaps it would encourage them to work with us to have supported mechanisms instead of using hacks like dropping files into our tablespace directories and such instead of using another approach to manage files across multiple filesystems. I'd be kind of surprised if they really had an issue doing that and hopefully everyone would feel better about what these extensions are doing once they start using a mechanism that we actually support. Thanks! Stephen
Hi, On 2018-10-19 16:35:46 -0400, Stephen Frost wrote: > The only justification for *not* doing this is that some extension > author might have dumped files into our tablespace directory, something > we've never claimed to support nor generally worried about in all the > time that I can recall before this. No, it's really not the only reason. As I said, testing is much less likely to find cases where we're checksumming a short-lived file even though we shouldn't, than making sure that checksummed files are actually checksummed. > > > Playing this further and assuming that extensions dropping files into > > > tablespace directories is acceptable, what are we supposed to do when > > > there's some direct conflict between a file that we want to create in a > > > PG tablespace directory and a file that some extension dropped there? > > > Create yet another subdirectory which we call > > > "THIS_IS_REALLY_ONLY_FOR_PG"? > > > > Then it's a buggy extension. And we error out. > > Extensions writing into directories they shouldn't be makes them buggy > even if the core code isn't likely to write to a particular filename, > imv. I'll just stop talking to you about this for now. > > > What about two different extensions wanting to create files with the > > > same names in the tablespace directories..? > > > > > > Experimentation is fine, of course, this is open source code and people > > > should feel free to play with it, but we are not obligated to avoid > > > breaking things when an extension author, through their experimentation, > > > does something which is clearly not supported, like dropping files into > > > PG's tablespace directories. Further, when it comes to our user's data, > > > we should be taking a strict approach and accounting for everything, > > > something that this whitelist-of-patterns-based approach to finding > > > files to verify the checksums on doesn't do. 
> > > > It's not economically feasible to work on extensions that will only be > > usable a year down the road. > > I listed out multiple other solutions to this problem which you > summarily ignored. Except that none of them really achieves the goals you can achieve by having the files in the database (like DROP DATABASE working, for starters). > It's unfortunate that those other solutions weren't used and that, > instead, this extension decided to drop files into PG's tablespace > directories, but that doesn't mean we should condone or implicitly > support that action. This just seems pointlessly rigid. Our extension related APIs aren't generally very good or well documented. Most non-trivial extensions that add interesting things to the postgres ecosystem are going to need a few not so pretty things to get going. > I stand by my position that this patch should be reverted (with no > offense or ill-will towards Michael, of course, I certainly appreciate > his efforts to address the issues with pg_verify_checksums) and that we > should move more of this code to identify files to verify the checksums > on into libpgcommon and then use it from both pg_basebackup and > pg_verify_checksums, to the extent possible, but that we make sure to > account for all of the files in our tablespace and database directories. What does that do, except break things that currently work? Sure, work on a patch that fixes the architectural concerns, but what's the point in reverting it until that's done? > To the extent that this is an issue for extension authors, perhaps it > would encourage them to work with us to have supported mechanisms > instead of using hacks like dropping files into our tablespace > directories and such instead of using another approach to manage files > across multiple filesystems. 
I'd be kind of surprised if they really > had an issue doing that and hopefully everyone would feel better about > what these extensions are doing once they start using a mechanism that > we actually support. Haribabu has a patch, but it's on top of pluggable storage, so not exactly a small prerequisite. Greetings, Andres Freund
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2018-10-19 16:35:46 -0400, Stephen Frost wrote:
> > The only justification for *not* doing this is that some extension
> > author might have dumped files into our tablespace directory, something
> > we've never claimed to support nor generally worried about in all the
> > time that I can recall before this.
>
> No, it's really not the only reason. As I said, testing is much less
> likely to find cases where we're checksumming a short-lived file even
> though we shouldn't, than making sure that checksummed files are
> actually checksummed.

While I agree that we might possibly end up trying to checksum a short-lived and poorly-named file, I'd rather that than risk missing the checking of a file which does have checksums that should have been checked and claiming that we checked all of the checksums.

> > > > What about two different extensions wanting to create files with the
> > > > same names in the tablespace directories..?
> > > >
> > > > Experimentation is fine, of course, this is open source code and people
> > > > should feel free to play with it, but we are not obligated to avoid
> > > > breaking things when an extension author, through their experimentation,
> > > > does something which is clearly not supported, like dropping files into
> > > > PG's tablespace directories. Further, when it comes to our user's data,
> > > > we should be taking a strict approach and accounting for everything,
> > > > something that this whitelist-of-patterns-based approach to finding
> > > > files to verify the checksums on doesn't do.
> > >
> > > It's not economically feasible to work on extensions that will only be
> > > usable a year down the road.
> >
> > I listed out multiple other solutions to this problem which you
> > summarily ignored.
>
> Except that none of them really achieves the goals you can achieve by
> having the files in the database (like DROP DATABASE working, for
> starters).

The only reason things like DROP DATABASE "work" in this example case is exactly that it assumes that all of the files which exist are ones which we put there. Or, to put it another way, if we're only going to checksum files based on some whitelist of files we expect to be there, shouldn't we go back and make things like DROP DATABASE only remove those files that we expect to be there and not random other files that we have no knowledge of..?

> > It's unfortunate that those other solutions weren't used and that,
> > instead, this extension decided to drop files into PG's tablespace
> > directories, but that doesn't mean we should condone or implicitly
> > support that action.
>
> This just seems pointlessly rigid. Our extension related APIs aren't
> generally very good or well documented. Most non-trivial extensions that
> add interesting things to the postgres ecosystem are going to need a few
> not so pretty things to get going.

PostGIS is a fantastic example of an extension that is far from trivial, extends PG in ways which are clearly supported, and has worked with PG to improve things in core to be able to then use in the extension.

> > I stand by my position that this patch should be reverted (with no
> > offense or ill-will towards Michael, of course, I certainly appreciate
> > his efforts to address the issues with pg_verify_checksums) and that we
> > should move more of this code to identify files to verify the checksums
> > on into libpgcommon and then use it from both pg_basebackup and
> > pg_verify_checksums, to the extent possible, but that we make sure to
> > account for all of the files in our tablespace and database directories.
>
> What does that do, except break things that currently work? Sure, work
> on a patch that fixes the architectural concerns, but what's the point
> in reverting it until that's done?

As you pointed out previously, the current code *doesn't* work, before or after this change, and we clearly need to rework this to move things into libpgcommon and also fix pg_basebackup. Reverting this would at least get us back to having similar code between this and pg_basebackup, and then it'll be cleaner and clearer to have one patch which moves that similar logic into libpgcommon and fixes the missing exceptions for the EXEC_BACKEND case.

Keeping the patch doesn't do anything for the pg_basebackup case, and confuses the issue by having these filename-pattern-whitelists which weren't there before and that should be removed, imv.

Thanks!

Stephen
Hi,

On 2018-10-19 17:32:58 -0400, Stephen Frost wrote:
> As you pointed out previously, the current code *doesn't* work, before
> or after this change, and we clearly need to rework this to move things
> into libpgcommon and also fix pg_basebackup. Reverting this would at
> least get us back to having similar code between this and pg_basebackup,
> and then it'll be cleaner and clearer to have one patch which moves that
> similar logic into libpgcommon and fixes the missing exceptions for the
> EXEC_BACKEND case.
>
> Keeping the patch doesn't do anything for the pg_basebackup case, and
> confuses the issue by having these filename-pattern-whitelists which
> weren't there before and that should be removed, imv.

The current state has pg_verify_checksum work on windows for casual usage, which includes the regression tests that previously were broken. Whereas reverting it has the advantage that a fixup diff would be a few lines smaller. If you want to push a revert together with a fix, be my guest, but until that time it seems unhelpful to revert.

Greetings,

Andres Freund
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2018-10-19 17:32:58 -0400, Stephen Frost wrote:
> > As you pointed out previously, the current code *doesn't* work, before
> > or after this change, and we clearly need to rework this to move things
> > into libpgcommon and also fix pg_basebackup. Reverting this would at
> > least get us back to having similar code between this and pg_basebackup,
> > and then it'll be cleaner and clearer to have one patch which moves that
> > similar logic into libpgcommon and fixes the missing exceptions for the
> > EXEC_BACKEND case.
> >
> > Keeping the patch doesn't do anything for the pg_basebackup case, and
> > confuses the issue by having these filename-pattern-whitelists which
> > weren't there before and that should be removed, imv.
>
> The current state has pg_verify_checksum work on windows for casual
> usage, which includes the regression tests that previously were broken.
> Whereas reverting it has the advantage that a fixup diff would be a few
> lines smaller. If you want to push a revert together with a fix, be my
> guest, but until that time it seems unhelpful to revert.

Clearly, pg_basebackup *doesn't* work though, so it's at best only half a fix, and only because our regression tests don't cover the pg_basebackup case (which is a sad state of affairs, to say the least, but an independent issue).

I'm not in any specific rush on this and I hope to hear Michael's thoughts once he's had a chance to review our discussion; it's his commit, after all.

Michael's initial patch is in the archives also, so my suggestion would be to revert this one and then take Michael's prior patch and add to it the fix of pg_basebackup, in the same manner, and commit those changes together with additional comments to note that the code exists in both places. We could then work on moving the logic into libpgcommon, in master.

Thanks!

Stephen
On 10/19/2018 05:32 PM, Stephen Frost wrote:
>
> As you pointed out previously, the current code *doesn't* work, before
> or after this change, and we clearly need to rework this to move things
> into libpgcommon and also fix pg_basebackup. Reverting this would at
> least get us back to having similar code between this and pg_basebackup,
> and then it'll be cleaner and clearer to have one patch which moves that
> similar logic into libpgcommon and fixes the missing exceptions for the
> EXEC_BACKEND case.
>
> Keeping the patch doesn't do anything for the pg_basebackup case, and
> confuses the issue by having these filename-pattern-whitelists which
> weren't there before and that should be removed, imv.
>

I don't think just reverting it is really acceptable. The patch was a response to buildfarm breakage, and moreover was published and discussed before it was applied. If you don't like it I think you need to publish a better solution that will not involve reintroducing the buildfarm error.

I don't have a strong opinion about the mechanism. The current conversation does seem to me to be generating more heat than light, TBH. But I do have a strong opinion about not having to enable/disable the TAP test in question constantly.

cheers

andrew

--
Andrew Dunstan
https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> I don't think just reverting it is really acceptable.

+several. I do not mind somebody writing and installing a better fix. I do object to turning the buildfarm red again.

regards, tom lane
On Fri, Oct 19, 2018 at 06:43:18PM -0400, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> I don't think just reverting it is really acceptable.
>
> +several. I do not mind somebody writing and installing a better fix.
> I do object to turning the buildfarm red again.

I did not expect this thread to turn into a war zone. Anyway, there are a couple of things I agree with on this thread:

- I agree with Andres point here:
https://postgr.es/m/20181019171747.4uithw2sjkt6msne@alap3.anarazel.de
A blacklist is fundamentally more difficult to maintain as there are way more things added in a data folder which do not have data checksums than things which have checksums. So using a blacklist approach looks unmaintainable in the long term. Future patches related to enabling online checksum verification make me worry if we keep the code like that. I can also easily imagine that anybody willing to use the pluggable storage API would like to put new files in tablespace-related data folders, relying on "base/" being the default system tablespace.

- I agree with Stephen's point that we should decide if a file has checksums or not in a single place, and that we should use the same logic for base backups and pg_verify_checksums.

- I agree with not doing a simple revert to not turn the buildfarm red again. This is annoying for animal maintainers. Andrew has done a very nice work in disabling manually those tests temporarily.

- The base backup logic deciding if a file has checksums looks broken to me: it misses files generated by EXEC_BACKEND, and any instance of Postgres using an extension with custom files and data checksums has its backups broken. cstore_fdw has been mentioned above, and I recall that Heroku for example enables data checksums. If you combine both, it basically means that such an instance cannot take base backups anymore while it was able to do so with pre-10 with default options. That's not cool.

So what I think we ought to do is the following:
- Start a new thread, this one about TAP tests is not adapted.
- Add in src/common/relpath.c the API from d55241af called isRelFileName(), make use of it in the base backup code, and basically remove is_checksummed_file() and the checksum skip list.
--
Michael
Hi,

On 2018-10-20 12:39:55 +0900, Michael Paquier wrote:
> - I agree with Stephen's point that we should decide if a file has
> checksums or not in a single place, and that we should use the same
> logic for base backups and pg_verify_checksums.

To be clear, I wholeheartedly agree on that. We probably only want to tackle the "is this a checksummable file" part in a back-patchable fashion, but we probably should move to having more common infrastructure like that.

> So what I think we ought to do is the following:
> - Start a new thread, this one about TAP tests is not adapted.
> - Add in src/common/relpath.c the API from d55241af called
> isRelFileName(), make use of it in the base backup code, and basically
> remove is_checksummed_file() and the checksum skip list.

I think it probably shouldn't quite be that as an API. The code should not just check whether the file matches a pattern, but also importantly needs to exclude files that are in a temp tablespace. isRelFileName() doesn't quite describe an API meaning that. I assume we should keep something like isRelFileName() but use an API ontop of that that also exclude temp files / relations.

Greetings,

Andres Freund

Date: Fri, 19 Oct 2018 20:48:02 -0700
Message-ID: <87in1xmg7h.fsf@alap4.lan>
Greetings,
On Fri, Oct 19, 2018 at 23:40 Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Oct 19, 2018 at 06:43:18PM -0400, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> I don't think just reverting it is really acceptable.
>
> +several. I do not mind somebody writing and installing a better fix.
> I do object to turning the buildfarm red again.
I did not expect this thread to turn into a war zone. Anyway, there are
a couple of things I agree with on this thread:
- I agree with Andres point here:
https://postgr.es/m/20181019171747.4uithw2sjkt6msne@alap3.anarazel.de
A blacklist is fundamentally more difficult to maintain as there are
way more things added in a data folder which do not have data checksums
than things which have checksums. So using a blacklist approach looks
unmaintainable in the long term. Future patches related to enabling
online checksum verification make me worry if we keep the code like
that. I can also easily imagine that anybody willing to use the
pluggable storage API would like to put new files in tablespace-related
data folders, relying on "base/" being the default system tablespace
If we are going to decide that we only deal with files in our directories matching these whitelisted patterns, then shouldn’t we apply similar logic in things like DROP DATABASE and any other cases where we perform actions in a recursive manner across our database and table space directories?
Should we really be removing arbitrary files that we know nothing about, after all?
What about pg_basebackup? Shall we update it to only stream through files matching these patterns as those are the only files we consider ourselves to be aware of?
I don’t buy off on any argument that presents pluggable storage as not including some way for us to track and be aware of what files are associated with that pluggable storage mechanism and which of those files have checksums and how to verify them if they do. In other words, I sure hope we don’t accept an approach like cstore *fdw* uses for pluggable storage where the core system has no idea whatsoever about what these random files dropped into our tablespace directories are.
- I agree with Stephen's point that we should decide if a file has
checksums or not in a single place, and that we should use the same
logic for base backups and pg_verify_checksums.
Despite it being a lot of the discussion, I don’t think there was ever disagreement on this point.
- I agree with not doing a simple revert to not turn the buildfarm red
again. This is annoying for animal maintainers. Andrew has done a very
nice work in disabling manually those tests temporarily.
This is a red herring, and always was, so I’m rather unimpressed at how it keeps coming up- no, I’m not advocating that we should just make the build farm red and just leave it that way. Yes, we should fix this case, and fix pg_basebackup, and maybe even try to add some regression tests which test this exact same case in pg_basebackup, but making the build farm green is *not* the only thing we should care about.
- The base backup logic deciding if a file has checksums looks broken to
me: it misses files generated by EXEC_BACKEND, and any instance of
Postgres using an extension with custom files and data checksums has its
backups broken. cstore_fdw has been mentioned above, and I recall that
Heroku for example enables data checksums. If you combine both, it
basically means that such an instance cannot take base backups anymore
while it was able to do so with pre-10 with default options. That's not
cool.
This is incorrect: pg_basebackup will still back up and keep files which fail checksum checks (logic which David Steele and I pushed for when the checksumming logic was added to pg_basebackup, as I recall), but it'll complain and warn about these files ...
As it *should*, even if we weren’t doing checksum checks, because these are random files that have been dropped into a PG directory that PG doesn’t know anything about, which aren’t handled through WAL and therefore there’s no way to know if they’ll be at all valid when they’re copied, or that backing them up is at all useful.
Backups are absolutely important and we wouldn’t want a backup to be aborted or failed due to checksum failures, in general, but the user should be made aware of these failures. This approach of only taking responsibility for the files we know of through these patterns could also imply that we shouldn’t back up in pg_basebackup files that don’t match- but that strikes me as another similarly risky approach as any mistake in this whitelist of known files could then result in an inconsistent backup that’s missing things that should have been included.
As for which approach is easier to maintain, I don’t see one as being meaningfully much easier to maintain than the other, code wise, while having these patterns looks to me as having a lot more risk, with the only rather nebulous gain being that we don’t complain about extensions dropping random files in places they shouldn’t have been- and which, if they had actually been implemented in a way we’d be accepting of (I.e. pluggable storage), we would have been tracking and handling appropriately.
So what I think we ought to do is the following:
- Start a new thread, this one about TAP tests is not adapted.
I don’t have any objection to a new thread. I don’t know that it’ll really get more people to comment, but perhaps it will.
Thanks!
Stephen
On Fri, Oct 19, 2018 at 08:50:04PM -0700, Andres Freund wrote:
> On 2018-10-20 12:39:55 +0900, Michael Paquier wrote:
>> So what I think we ought to do is the following:
>> - Start a new thread, this one about TAP tests is not adapted.
>> - Add in src/common/relpath.c the API from d55241af called
>> isRelFileName(), make use of it in the base backup code, and basically
>> remove is_checksummed_file() and the checksum skip list.
>
> I think it probably shouldn't quite be that as an API. The code should
> not just check whether the file matches a pattern, but also importantly
> needs to exclude files that are in a temp tablespace. isRelFileName()
> doesn't quite describe an API meaning that. I assume we should keep
> something like isRelFileName() but use an API ontop of that that also
> exclude temp files / relations.

From what I can see we would need to check for a couple of patterns if we go to this extent:
- Look for PG_TEMP_FILE_PREFIX and exclude those.
- looks_like_temp_rel_name() which checks for temp file names. This is similar to isRelFileName except that it works on temporary files. Moving it to relpath.c and renaming it IsTempRelFileName is an idea. But this one would not be necessary as isRelFileName discards temporary relations, no?
- parse_filename_for_nontemp_relation() is also too similar to isRelFileName().

At the end, do we really need to do anything more than adding some checks on PG_TEMP_FILE_PREFIX? I am not sure that there is much need for a global API like isChecksummedFile for only those two places. I already have a patch doing the work of moving isRelFileName() into src/common/. Adding one check on PG_TEMP_FILE_PREFIX is not much work on top of it.
--
Michael
Hi,

On 2018-10-20 13:30:46 +0900, Michael Paquier wrote:
> On Fri, Oct 19, 2018 at 08:50:04PM -0700, Andres Freund wrote:
> > On 2018-10-20 12:39:55 +0900, Michael Paquier wrote:
> >> So what I think we ought to do is the following:
> >> - Start a new thread, this one about TAP tests is not adapted.
> >> - Add in src/common/relpath.c the API from d55241af called
> >> isRelFileName(), make use of it in the base backup code, and basically
> >> remove is_checksummed_file() and the checksum skip list.
> >
> > I think it probably shouldn't quite be that as an API. The code should
> > not just check whether the file matches a pattern, but also importantly
> > needs to exclude files that are in a temp tablespace. isRelFileName()
> > doesn't quite describe an API meaning that. I assume we should keep
> > something like isRelFileName() but use an API ontop of that that also
> > exclude temp files / relations.
>
> From what I can see we would need to check for a couple of patterns if
> we go to this extent:
> - Look for PG_TEMP_FILE_PREFIX and exclude those.
> - looks_like_temp_rel_name() which checks for temp file names. This is
> similar to isRelFileName except that it works on temporary files.
> Moving it to relpath.c and renaming it IsTempRelFileName is an idea.
> But this one would not be necessary as isRelFileName discards temporary
> relations, no?

I think it's not good to have those necessarily intermingled in an exposed function.

> At the end, do we really need to do anything more than adding some
> checks on PG_TEMP_FILE_PREFIX?

I think we also need the exclusion list basebackup.c has that generally skips files. They might be excluded anyway, but I think it'd be safer to make sure.

> I am not sure that there is much need for a global API like
> isChecksummedFile for only those two places.

Seems likely that other tools would want to have access too.

Greetings,

Andres Freund
Greetings,
On Sat, Oct 20, 2018 at 00:31 Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Oct 19, 2018 at 08:50:04PM -0700, Andres Freund wrote:
> On 2018-10-20 12:39:55 +0900, Michael Paquier wrote:
>> So what I think we ought to do is the following:
>> - Start a new thread, this one about TAP tests is not adapted.
>> - Add in src/common/relpath.c the API from d55241af called
>> isRelFileName(), make use of it in the base backup code, and basically
>> remove is_checksummed_file() and the checksum skip list.
>
> I think it probably shouldn't quite be that as an API. The code should
> not just check whether the file matches a pattern, but also importantly
> needs to exclude files that are in a temp tablespace. isRelFileName()
> doesn't quite describe an API meaning that. I assume we should keep
> something like isRelFileName() but use an API ontop of that that also
> exclude temp files / relations.
From what I can see we would need to check for a couple of patterns if
we go to this extent:
- Look for PG_TEMP_FILE_PREFIX and exclude those.
- looks_like_temp_rel_name() which checks for temp files names. This is
similar to isRelFileName except that it works on temporary files.
Moving it to relpath.c and renaming it IsTempRelFileName is an idea.
But this one would not be necessary as isRelFileName discards temporary
relations, no?
- parse_filename_for_nontemp_relation() is also too similar to
isRelFileName().
At the end, do we really need to do anything more than adding some
checks on PG_TEMP_FILE_PREFIX? I am not sure that there is much need
for a global API like isChecksummedFile for only those two places. I
have already a patch doing the work of moving isRelFileName() into
src/common/. Adding one check on PG_TEMP_FILE_PREFIX is not much work
on top of it.
I’m not at my computer at the moment so may not be entirely following the question here but to be clear, whatever we do here will have downstream impact into other tools, such as pgbackrest, and therefore we definitely want to have the code in libpgcommon so that these external tools can leverage it and know that they’re doing what PG does.
I’d also like to give David Steele a chance to comment on the specific API, and any other backup tools authors, which I don’t think we should be rushing into anyway and I would think we’d only put into master..
Thanks!
Stephen
On Sat, Oct 20, 2018 at 12:25:19AM -0400, Stephen Frost wrote:
> On Fri, Oct 19, 2018 at 23:40 Michael Paquier <michael@paquier.xyz> wrote:
>> - I agree with not doing a simple revert to not turn the buildfarm red
>> again. This is annoying for animal maintainers. Andrew has done a very
>> nice work in disabling manually those tests temporarily.
>
> This is a red herring, and always was, so I’m rather unimpressed at how it
> keeps coming up- no, I’m not advocating that we should just make the build
> farm red and just leave it that way. Yes, we should fix this case, and fix
> pg_basebackup, and maybe even try to add some regression tests which test
> this exact same case in pg_basebackup, but making the build farm green is
> *not* the only thing we should care about.

Well, the root of the problem was that pg_verify_checksums was committed without any tests of its own. If we had those tests from the start, then we would not be having this discussion post-release, still trying to figure out whether whitelisting or blacklisting is appropriate.

Checksum validation in base backups was added in 4eb77d50, a couple of days before pg_verify_checksums was introduced. This added corruption-related tests in src/bin/pg_basebackup, which is a good thing. However, the feature was designed so that checksum mismatches are ignored after 5 failures, which actually *masked* the fact that EXEC_BACKEND files like CONFIG_EXEC_PARAMS should have been skipped instead of getting checksum failures. And that's a bad thing. So this gives, in my opinion, a good point in favor of using a whitelist.
--
Michael
On Sat, Oct 20, 2018 at 12:41:04AM -0400, Stephen Frost wrote:
> I’d also like to give David Steele a chance to comment on the specific API,
> and any other backup tools authors, which I don’t think we should be
> rushing into anyway and I would think we’d only put into master..

Getting David's input would be nice!
--
Michael
Greetings,
On Sat, Oct 20, 2018 at 00:43 Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Oct 20, 2018 at 12:25:19AM -0400, Stephen Frost wrote:
> On Fri, Oct 19, 2018 at 23:40 Michael Paquier <michael@paquier.xyz> wrote:
>> - I agree with not doing a simple revert to not turn the buildfarm red
>> again. This is annoying for animal maintainers. Andrew has done a very
>> nice work in disabling manually those tests temporarily.
>
> This is a red herring, and always was, so I’m rather unimpressed at how it
> keeps coming up- no, I’m not advocating that we should just make the build
> farm red and just leave it that way. Yes, we should fix this case, and fix
> pg_basebackup, and maybe even try to add some regression tests which test
> this exact same case in pg_basebackup, but making the build farm green is
> *not* the only thing we should care about.
Well, the root of the problem was that pg_verify_checksums has been
committed without any tests on its own. If we had those tests from the
start, then we would not be having this discussion post-release time,
still trying to figure out if whitelisting or blacklisting is
appropriate.
The validation checksums in base backups has been added in 4eb77d50, a
couple of days before pg_verify_checksums has been introduced. This has
added corruption-related tests in src/bin/pg_basebackup, which is a good
thing. However the feature has been designed so as checksum mismatches
are ignored after 5 failures, which actually *masked* the fact that
EXEC_BACKEND files like CONFIG_EXEC_PARAMS should have been skipped
instead of getting checksum failures. And that's a bad thing. So this
gives in my opinion a good point about using a whitelist.
That counter of checksum failures should have been getting reset between files.. that’s certainly what I had understood was intended. The idea of the counter is to not flood the log with errors when a file is discovered that’s full of checksum failures (as can happen if large chunks of the file got replaced with something else, for example), but it should be reporting on each file that does have failures in it.
I don’t see how having a whitelist changes that or would have impacted that logic to make it correct initially or not.
I’m also trying to understand how this checksum logging limit is getting hit in the tests but they’re passing..? If this was an intended failure check then surely there’s a test done that’s intended to be successful and it should have complained about this file too, or perhaps that’s what’s missing. I’m happy to look into this later this weekend. Certainly seems like something here isn’t really making sense tho.
Thanks!
Stephen
On Sat, Oct 20, 2018 at 12:41:04AM -0400, Stephen Frost wrote:
> I’d also like to give David Steele a chance to comment on the specific API,
> and any other backup tools authors, which I don’t think we should be
> rushing into anyway and I would think we’d only put into master..

By the way, we need to do something for the checksum verification code in base backups for v11 as well. If you enable checksums and take a base backup of a build with EXEC_BACKEND, then this creates spurious checksums failures. That's a bug. So while I agree that having a larger robust API is fine for HEAD, I would most likely not back-patch it. This is why I would suggest as a first step for HEAD and v11 to use a whitelist for base backups, to check for temporary tablespaces in pg_verify_checksums, to move isRelFileName into src/common/ and to keep the change minimalistic.
--
Michael
Greetings,
On Sat, Oct 20, 2018 at 00:58 Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Oct 20, 2018 at 12:41:04AM -0400, Stephen Frost wrote:
> I’d also like to give David Steele a chance to comment on the specific API,
> and any other backup tools authors, which I don’t think we should be
> rushing into anyway and I would think we’d only put into master..
By the way, we need to do something for the checksum verification code
in base backups for v11 as well. If you enable checksums and take a
base backup of a build with EXEC_BACKEND, then this creates spurious
checksums failures. That's a bug. So while I agree that having a
larger robust API is fine for HEAD, I would most likely not back-patch
it. This is why I would suggest as a first step for HEAD and v11 to use
a whitelist for base backups, to check for temporary tablespaces in
pg_verify_checksums, to move isRelFileName into src/common/ and to keep
the change minimalistic.
I’m all for keeping back-patched changes minimal, and updating the blacklist, as your original patch on this thread did, would certainly be minimal. Even adding the logic to skip temp files, as pg_basebackup has, would be simpler and based on existing, well-tested and extensively used code, unlike this new pattern-based whitelist-of-files approach.
I have to say that I can’t recall hearing much in the way of complaints about pg_basebackup copying all the random cstore files, or the new checksum validation logic complaining about them, and such when doing backups and I wonder if that is because people simply don’t use the two together much, making me wonder how much of an issue this really is or would be with the account-for-everything approach I’ve been advocating for.
Thanks!
Stephen
Hi,

On 2018-10-20 01:07:43 -0400, Stephen Frost wrote:
> I have to say that I can’t recall hearing much in the way of complaints
> about pg_basebackup copying all the random cstore files

Why would somebody complain about that? It's usually desirable.

> or the new checksum validation logic complaining about them, and such
> when doing backups and I wonder if that is because people simply don’t
> use the two together much, making me wonder how much of an issue this
> really is or would be with the account-for-everything approach I’ve
> been advocating for.

I mean obviously pg_verify_checksum simply hasn't been actually tested much with plain postgres without extensions, given the all weaknesses identified in this thread.

Greetings,

Andres Freund
Hi,

On 2018-10-20 00:25:19 -0400, Stephen Frost wrote:
> If we are going to decide that we only deal with files in our directories
> matching these whitelisted patterns, then shouldn’t we apply similar logic
> in things like DROP DATABASE and any other cases where we perform actions
> in a recursive manner across our database and table space directories?

I'm honestly not sure if you're just trolling at this point. Why would anybody reasonable be concerned about DROP DATABASE dropping the directory for the database? You're not honestly suggesting that anybody would write an extension or anything like that that stores data in the wrong database's directory, right? Other iterations like fsyncing files, copying the entire template database directory, etc. are similarly harmless or positive.

Greetings,

Andres Freund
Greetings,
On Sat, Oct 20, 2018 at 01:16 Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-10-20 00:25:19 -0400, Stephen Frost wrote:
> If we are going to decide that we only deal with files in our directories
> matching these whitelisted patterns, then shouldn’t we apply similar logic
> in things like DROP DATABASE and any other cases where we perform actions
> in a recursive manner across our database and table space directories?
I'm honestly not sure if you're just trolling at this point. Why would
anybody reasonable be concerned about DROP DATABASE dropping the
directory for the database? You're not honestly suggesting that anybody
would write an extension or anything like that that stores data in the
wrong database's directory, right? Other iterations like fsyncing
files, copying the entire template database directory, etc are similarly
harmless or positive.
No, I’m not trolling. What I was trying to do is make the point that this moves us away from having a very clear idea of what is PG’s responsibility and what we feel comfortable operating on, toward a new half-and-half stance where we’ll happily nuke files in a directory that we didn’t create, and back them up even if we have no idea whether they’ll be consistent at all when restored, but we won’t try to check the checksums on them or perform some other set of operations on them.
I suspect it’s pretty clear already, but just to make it plain: I really don’t like the half-and-half approach, and it seems we’re being backed into it because it happens to fit some specific cases, not because any real thought or design was put into supporting these use cases or into being able to extend PG in this way. I do also see real risks with a whitelisting kind of approach, namely that we end up missing things; I get that you don’t see that risk, but just stating that doesn’t change my opinion on it. Based on what you said up-thread, this whole thing also seems like it may be just going away anyway, because there’s real design being done to do this properly and allow PG to be extended in a way where we will know which files are associated with which extensions or storage mechanisms, and that seems like just another reason why we shouldn’t be moving to explicitly support random files being dropped into PG directories.
Thanks,
Stephen
Greetings,
On Sat, Oct 20, 2018 at 01:11 Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-10-20 01:07:43 -0400, Stephen Frost wrote:
> I have to say that I can’t recall hearing much in the way of complaints
> about pg_basebackup copying all the random cstore files
Why would somebody complain about that? It's usually desirable.
Even though they’re as likely as not to be invalid or corrupted..? Maybe things have moved forward here, I know there’s been discussion about it, but last I heard those files weren’t WAL’d and therefore the result of copying them from a running server was indeterminate. Yes, sometimes they’ll be fine, but you could say the same about regular PG relations too and yet we certainly wouldn’t be accepting of that. It certainly seems reasonable that people would complain about pg_basebackup misperforming when a backup that it did results in an invalid restore, though it tends to be a lot rarer to get complaints about partial failures like a corrupt or partial file being copied during a backup- but then that’s part of why we stress so much about trying to make sure we don’t do that as it can be hard to detect.
People certainly did complain about unlogged tables being backed up and that was just because they took up space in the backup and time on the backup and restore just to be nuked when the server is started.
> or the new checksum validation logic complaining about them, and such
> when doing backups and I wonder if that is because people simply don’t
> use the two together much, making me wonder how much of an issue this
> really is or would be with the account-for-everything approach I’ve
> been advocating for.
I mean obviously pg_verify_checksums simply hasn't been actually tested
much with plain postgres without extensions, given all the weaknesses
identified in this thread.
No, it hasn’t, but pg_basebackup has been around quite a while and has always copied everything, as best as I can recall anyway.
Thanks,
Stephen
On Sat, Oct 20, 2018 at 02:03:32AM -0400, Stephen Frost wrote:
> On Sat, Oct 20, 2018 at 01:11 Andres Freund <andres@anarazel.de> wrote:
>>> or the new checksum validation logic complaining about them, and such
>>> when doing backups and I wonder if that is because people simply don’t
>>> use the two together much, making me wonder how much of an issue this
>>> really is or would be with the account-for-everything approach I’ve
>>> been advocating for.
>>
>> I mean obviously pg_verify_checksums simply hasn't been actually tested
>> much with plain postgres without extensions, given all the weaknesses
>> identified in this thread.
>
> No, it hasn’t, but pg_basebackup has been around quite a while and has
> always copied everything, as best as I can recall anyway.

At this point, let's create a new thread with a description of what has
been discussed and what we'd like to do for HEAD and v11.  I got
something in mind which would result in a minimal patch.  Let's start
from that.
--
Michael
Attachment
Hi,

On Sun, Oct 14, 2018 at 10:59:02AM +0900, Michael Paquier wrote:
> On Sat, Oct 13, 2018 at 05:53:00PM -0400, Andrew Dunstan wrote:
>> It occurred to me that a pretty simple fix could just be to blacklist
>> everything that didn't start with a digit. The whitelist approach is
>> probably preferable... depends how urgent we see this as.
>
> Yeah, possibly, still that's not a correct long-term fix.  So attached
> is what I think we should do for HEAD and REL_11_STABLE with an approach
> using a whitelist.  I have added positive and negative tests on top of
> the existing TAP tests, as suggested by Michael B, and I made the code
> use relpath.h to make sure that we don't miss any fork types.

I was looking at extending the testsuite for the online checksum
verification feature as requested by Fabien, and it seems some of the
additional tests are not working as intended:

> diff --git a/src/bin/pg_verify_checksums/t/002_actions.pl b/src/bin/pg_verify_checksums/t/002_actions.pl
> index dc29da09af..33408ced89 100644
> --- a/src/bin/pg_verify_checksums/t/002_actions.pl
> +++ b/src/bin/pg_verify_checksums/t/002_actions.pl
[...]
> +append_to_file "$pgdata/global/99999_vm.123", "";
> +
>  # Checksums pass on a newly-created cluster
>  command_ok(['pg_verify_checksums', '-D', $pgdata],
>             "succeeds with offline cluster");
> @@ -67,3 +89,30 @@ $node->command_checks_all([ 'pg_verify_checksums', '-D', $pgdata, '-r',
>      [qr/Bad checksums:.*1/],
>      [qr/checksum verification failed/],
>      'fails for corrupted data on single relfilenode');
> +
> +# Utility routine to check that pg_verify_checksums is correctly
> +# able to detect correctly-shaped relation files with some data.
> +sub fail_corrupt
> +{
> +    my $node = shift;
> +    my $file = shift;
> +    my $pgdata = $node->data_dir;
> +
> +    # Create the file with some data in it.
> +    append_to_file "$pgdata/global/$file", "foo";
> +
> +    $node->command_checks_all([ 'pg_verify_checksums', '-D', $pgdata],
> +        1,
> +        [qr/^$/],
> +        [qr/could not read block/],
> +        "fails for corrupted data in $file");
> +}
> +
> +fail_corrupt($node, "99999");
> +fail_corrupt($node, "99999.123");
> +fail_corrupt($node, "99999_fsm");
> +fail_corrupt($node, "99999_init");
> +fail_corrupt($node, "99999_vm");
> +fail_corrupt($node, "99999_init.123");
> +fail_corrupt($node, "99999_fsm.123");
> +fail_corrupt($node, "99999_vm.123");

First off, I think those fail_corrupt files should have different
filenames than the empty ones above (I left `99999_vm.123' as an
example of a duplicated file).

Second, and this is more severe, the subroutine never removes those
files, so in practice we are checking the first (or possibly some
random other) file instead of the one we think we are looking at (or
at least what I think the intention is).  This can be shown if you
expand the regexp from `qr/could not read block/' to `qr/could not
read block 0 in file.*$file\":/':

|t/002_actions.pl .. 6/38
|#   Failed test 'fails for corrupted data in 99999.123 stderr /(?^:could
|#   not read block 0 in file.*99999.123\":)/'
|#   at t/002_actions.pl line 109.
|#   'pg_verify_checksums: could not read block 0 in file
|# "[...]src/bin/pg_verify_checksums/tmp_check/t_002_actions_node_checksum_data/pgdata/global/99999":
|# read 3 of 8192
|# '
|#   doesn't match '(?^:could not read block 0 in file.*99999.123\":)'

So either we remove or truncate the files again after
$node->command_checks_all(), or the tests use the relfilenode feature.

However, in the latter case we would have to use a different main
relfilenode ID for each subtest, as otherwise pg_verify_checksums would
again fail on the first file it scans.

So I am proposing something like the attached.
Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt folgenden
Bestimmungen: https://www.credativ.de/datenschutz
Attachment
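[Editorial note: the masking effect described above can be simulated outside of PostgreSQL. The sketch below is plain Python with hypothetical file names, not the TAP test itself; it mimics a scanner that, like pg_verify_checksums in this scenario, reports the first file whose size is not a whole number of 8kB blocks, so a stale file left behind by an earlier subtest keeps being reported instead of the file the current subtest just corrupted.]

```python
import os
import tempfile
from pathlib import Path

BLOCK_SIZE = 8192

def first_corrupt(directory):
    """Return the first file (in sorted order) whose size is not a
    multiple of the block size, mimicking the point at which a scan
    would emit a 'could not read block' error."""
    for name in sorted(os.listdir(directory)):
        if os.path.getsize(os.path.join(directory, name)) % BLOCK_SIZE != 0:
            return name
    return None

with tempfile.TemporaryDirectory() as d:
    # First subtest creates a short file "99999" and never removes it.
    Path(d, "99999").write_text("foo")
    assert first_corrupt(d) == "99999"

    # Second subtest targets "99999.123", but the stale file from the
    # first subtest sorts earlier and is still the one reported.
    Path(d, "99999.123").write_text("foo")
    assert first_corrupt(d) == "99999"          # not "99999.123"

    # Removing each file after its subtest (the fix proposed above)
    # makes the error point at the intended file again.
    os.unlink(os.path.join(d, "99999"))
    assert first_corrupt(d) == "99999.123"
```

This is why expanding the regexp to include the file name exposes the problem: the error message names the leftover file, not the one the subtest claims to be testing.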
On Mon, Nov 19, 2018 at 07:11:19PM +0100, Michael Banck wrote:
> First off, I think those fail_corrupt files should have different
> filenames than the empty ones above (I left `99999_vm.123' as an
> example of a duplicated file).

A comment about that is a good idea.  So added.

> So either we remove or truncate the files again after
> $node->command_checks_all(), or the tests use the relfilenode
> feature.
>
> However, in the latter case we would have to use a different main
> relfilenode ID for each subtest, as otherwise pg_verify_checksums would
> again fail on the first file it scans.

The point here is to make sure that the correct file name shows up in
the produced error message, and removing the files makes future tests
more robust, so I have committed a version which removes the files
instead.
--
Michael
Attachment
Hi,

Am Freitag, den 19.10.2018, 22:50 +0900 schrieb Michael Paquier:
> On Wed, Oct 17, 2018 at 05:30:05PM -0400, Andrew Dunstan wrote:
> Thanks.  This is now committed after some tweaks to the comments, a bit
> earlier than I thought first.

I found an issue with this [d55241af7, "Use whitelist to choose files
scanned with pg_verify_checksums"] commit, namely, it makes
pg_verify_checksums no longer scan non-default tablespaces.  So if you
have all of your data in tablespaces, it will more-or-less immediately
return with success.

I've extended the test suite to induce corruption in a table located
in a non-default tablespace, see the attached patch.  It fails like
this, i.e. does not detect the corruption:

t/002_actions.pl .. 14/42
#   Failed test 'fails with corrupted data in non-default tablespace status (got 0 vs expected 1)'
#   at t/002_actions.pl line 87.
#   Failed test 'fails with corrupted data in non-default tablespace stdout /(?^:Bad checksums:.*1)/'
#   at t/002_actions.pl line 87.
#   'Checksum scan completed
# Data checksum version: 1
# Files scanned:  1102
# Blocks scanned: 2861
# Bad checksums:  0
# '
#   doesn't match '(?^:Bad checksums:.*1)'
#   Failed test 'fails with corrupted data in non-default tablespace stderr /(?^:checksum verification failed)/'
#   at t/002_actions.pl line 87.
#   ''
#   doesn't match '(?^:checksum verification failed)'
# Looks like you failed 3 tests of 42.
t/002_actions.pl .. Dubious, test returned 3 (wstat 768, 0x300)
Failed 3/42 subtests

The problem is that "PG_12_201811201" is not considered a valid
relation file by isRelFileName(), so it skips it and the rest of the
tablespace.

I had a quick look at fixing this but did not manage to immediately
come up with a solution, so posting here for now.
Michael
Attachment
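[Editorial note: the skip happens because every directory entry is filtered through a name whitelist. The following C sketch is NOT PostgreSQL's isRelFileName(), only a hedged reconstruction of the kind of check d55241af introduced (digits, optional _fsm/_vm/_init fork suffix, optional ".N" segment); it shows that a tablespace version directory name such as "PG_12_201811201" fails the test, so applying the same filter to directories prunes the whole tablespace.]

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Sketch of a relation-file-name whitelist: "<digits>", optionally
 * followed by one of the fork suffixes _fsm, _vm or _init, optionally
 * followed by a segment number ".<digits>".  Hypothetical code, not
 * the actual PostgreSQL implementation. */
static bool
is_rel_file_name(const char *fn)
{
    const char *p = fn;
    static const char *const forks[] = {"_fsm", "_vm", "_init"};

    /* Must start with the relfilenode OID, i.e. digits. */
    if (!isdigit((unsigned char) *p))
        return false;
    while (isdigit((unsigned char) *p))
        p++;

    /* Optional fork suffix. */
    if (*p == '_')
    {
        bool matched = false;

        for (int i = 0; i < 3; i++)
        {
            size_t len = strlen(forks[i]);

            if (strncmp(p, forks[i], len) == 0)
            {
                p += len;
                matched = true;
                break;
            }
        }
        if (!matched)
            return false;
    }

    /* Optional segment number. */
    if (*p == '.')
    {
        p++;
        if (!isdigit((unsigned char) *p))
            return false;
        while (isdigit((unsigned char) *p))
            p++;
    }

    /* Anything trailing (e.g. a temp-file marker) is rejected. */
    return *p == '\0';
}
```

With such a check, "99999_vm.123" passes while "PG_12_201811201" (and also temp files like "pg_internal.init") is rejected, which matches the behaviour reported above when the filter is applied to directory entries as well as files.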
On Tue, Nov 27, 2018 at 02:09:05PM +0100, Michael Banck wrote:
> I had a quick look at fixing this but did not manage to immediately come
> up with a solution, so posting here for now.

If you look at another thread, the patch posted on the top would
actually solve this issue:
https://www.postgresql.org/message-id/20181021134206.GA14282@paquier.xyz

Your problem could also be solved with the minimalistic patch attached,
which fixes on the way the problems with temporary files present in
PGDATA.  Based on the stale status of the other thread I am unsure what
should be done though.
--
Michael
Attachment
Hi,

Am Dienstag, den 27.11.2018, 22:52 +0900 schrieb Michael Paquier:
> On Tue, Nov 27, 2018 at 02:09:05PM +0100, Michael Banck wrote:
>> I had a quick look at fixing this but did not manage to immediately come
>> up with a solution, so posting here for now.
>
> If you look at another thread, the patch posted on the top would
> actually solve this issue:
> https://www.postgresql.org/message-id/20181021134206.GA14282@paquier.xyz

Oh, I kinda followed that thread a bit, but I think that patch fixes
things more by way of moving code around; at least I haven't noticed
that tablespaces were explicitly claimed to be fixed in that thread.

> Your problem could also be solved with the minimalistic patch attached,
> so fixing on the way the problems with temporary files present in PGDATA
> something like the attached could be used...

Thanks!

> Based on the stale status
> of the other thread I am unsure what should be done though.

As pg_verify_checksums appears to be broken with respect to tablespaces
in REL_11_STABLE (so I think 11.1, but not 11.0) as well, I think this
merits a short-term, minimally invasive fix (i.e. the patch you posted,
just that there's no TAP testsuite for pg_verify_checksums in v11
unfortunately) on its own, regardless of the wider changes proposed in
the other thread.

Michael
On Tue, Nov 27, 2018 at 04:26:45PM +0100, Michael Banck wrote:
> Oh, I kinda followed that thread a bit, but I think that patch fixes
> things more by matter of moving code around, at least I haven't noticed
> that tablespaces were explicitly claimed to be fixed in that thread.

That may have been an unconscious move ;)

It seems to me the root issue is that the original implementation of
pg_verify_checksums naively assumed that files can be skipped without
first checking their types, which I got confused by in d55241af.  You
should definitely be mentioned as the reporter anyway, and get author
credit for the additional test.

> As pg_verify_checksums appears to be broken with respect to tablespaces
> in REL_11_STABLE (so I think 11.1, but not 11.0) as well, I think this
> merits a short-term minimal invasive fix (i.e. the patch you posted,
> just that there's no TAP testsuite for pg_verify_checksums in v11
> unfortunately) on its own, regardless of the wider changes proposed in
> the other thread.

Yes, let's do so rather sooner than later if there are no objections
from others.  I am going to ping the other thread about what's
discussed here additionally in case folks missed what you reported.
Fixing the temp file issue which has been reported by Andres first is
an additional bonus.
--
Michael
Attachment
On Wed, Nov 28, 2018 at 06:36:25AM +0900, Michael Paquier wrote:
> Yes, let's do so rather sooner than later if there are no objections
> from others.  I am going to ping the other thread about what's
> discussed here additionally in case folks missed what you reported.
> Fixing the temp file issue which has been reported by Andres first is
> an additional bonus.

For the archive's sake, this is fixed by 5c99513 and a1c91dd.
--
Michael