Thread: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Thunder
Date:
The steps to reproduce this issue:
1. Create a table
create table gist_point_tbl(id int4, p point);
create index gist_pointidx on gist_point_tbl using gist(p);
2. Insert data
insert into gist_point_tbl (id, p) select g, point(g*10, g*10) from generate_series(1, 1000000) g;
3. Delete data
delete from gist_point_tbl;
4. Vacuum table
vacuum gist_point_tbl;
-- Send SIGINT to vacuum process after WAL-log of the truncation is flushed and the truncation is not finished
-- We will receive error message "ERROR: canceling statement due to user request"
5. Vacuum table again
vacuum gist_point_tbl;
-- The standby node crashed and the PANIC log is "PANIC: WAL contains references to invalid pages"
The standby node succeeds in replaying the truncate record, but the master node does not truncate the file. The standby then crashes when the master writes to blocks that have already been truncated on the standby.
I try to fix the issue by preventing query-cancel interrupts during the truncation.
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 5df4382b7e..04b696ae01 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -26,6 +26,7 @@
#include "access/xlogutils.h"
#include "catalog/storage.h"
#include "catalog/storage_xlog.h"
+#include "miscadmin.h"
#include "storage/freespace.h"
#include "storage/smgr.h"
#include "utils/memutils.h"
@@ -248,6 +249,14 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
if (vm)
visibilitymap_truncate(rel, nblocks);
+ /*
+ * If the master flushes the WAL record for the truncation and then receives a
+ * SIGINT that cancels the transaction before the physical truncation happens,
+ * the standby still replays the record and truncates the file. The standby
+ * will then crash when the master writes to blocks that were truncated on the
+ * standby. So prevent query-cancel interrupts until the truncation is done.
+ */
+ HOLD_CANCEL_INTERRUPTS();
+
/*
* We WAL-log the truncation before actually truncating, which means
* trouble if the truncation fails. If we then crash, the WAL replay
@@ -288,6 +297,8 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
/* Do the real work */
smgrtruncate(rel->rd_smgr, MAIN_FORKNUM, nblocks);
+
+ RESUME_CANCEL_INTERRUPTS();
}
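For reference, HOLD_CANCEL_INTERRUPTS() and RESUME_CANCEL_INTERRUPTS() only defer query-cancel processing; they do not hold off other interrupts such as die(). Their definitions in src/include/miscadmin.h are roughly the following (a simplified sketch, not a verbatim copy):

/* simplified sketch of the macros declared in miscadmin.h */
#define HOLD_CANCEL_INTERRUPTS()  (QueryCancelHoldoffCount++)

#define RESUME_CANCEL_INTERRUPTS() \
do { \
    Assert(QueryCancelHoldoffCount > 0); \
    QueryCancelHoldoffCount--; \
} while (0)

While QueryCancelHoldoffCount is non-zero, a pending query cancel is left pending instead of being raised as an ERROR, so the cancel fires at the next CHECK_FOR_INTERRUPTS() after RESUME_CANCEL_INTERRUPTS().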
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Thunder
Date:
Is this an issue?
Can we fix like this?
Thanks!
At 2019-09-22 00:38:03, "Thunder" <thunder1@126.com> wrote:
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Tomas Vondra
Date:
On Mon, Sep 23, 2019 at 03:48:50PM +0800, Thunder wrote:
>Is this an issue?
>Can we fix like this?
>Thanks!

I do think it is a valid issue. No opinion on the fix yet, though.
The report was sent on saturday, so patience ;-)

regards

--
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Mon, Sep 23, 2019 at 01:45:14PM +0200, Tomas Vondra wrote:
> On Mon, Sep 23, 2019 at 03:48:50PM +0800, Thunder wrote:
>> Is this an issue?
>> Can we fix like this?
>> Thanks!
>
> I do think it is a valid issue. No opinion on the fix yet, though.
> The report was sent on saturday, so patience ;-)

And for some others it was even a longer weekend. Anyway, the problem
can be reproduced if you apply the attached which introduces a failure
point, and then if you run the following commands:
create table aa as select 1;
delete from aa;
\! touch /tmp/truncate_flag
vacuum aa;
\! rm /tmp/truncate_flag
vacuum aa; -- panic on standby

This also points out that there are other things to worry about than
interruptions, as for example DropRelFileNodeLocalBuffers() could lead
to an ERROR, and this happens before the physical truncation is done
but after the WAL record is replayed on the standby, so any failures
happening at the truncation phase before the work is done would be a
problem. However we are talking about failures which should not
happen and these are elog() calls. It would be tempting to add a
critical section here, but we could still have problems if we have a
failure after the WAL record has been flushed, which means that it
would be replayed on the standby, and the surrounding comments are
clear about that. In short, as a matter of safety I'd like to think
that what you are suggesting is rather acceptable (aka hold interrupts
before the WAL record is written and release after the physical
truncate), so that the truncation avoids the failures that can be avoided.

Do others have thoughts to share on the matter?
--
Michael
Attachment
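The failure-injection patch referenced above is in the attachment and is not reproduced here. Purely as an illustration of the kind of hook it describes (the file name and error text below are hypothetical, not taken from the attachment), such an injection could be placed in RelationTruncate() between flushing the WAL record and calling smgrtruncate():

/* hypothetical failure point; assumes <sys/stat.h> is included */
{
    struct stat st;

    /* Fail after the truncation WAL record was flushed, before the physical truncate. */
    if (stat("/tmp/truncate_flag", &st) == 0)
        elog(ERROR, "simulated failure before physical truncation");
}

With this in place, the first "vacuum aa" errors out on the primary after the truncation record has already been written, and the second one generates WAL referencing blocks the standby has already truncated, triggering the PANIC from the original report.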
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Kyotaro Horiguchi
Date:
Hello.

At Tue, 24 Sep 2019 10:40:19 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20190924014019.GB2012@paquier.xyz>
> On Mon, Sep 23, 2019 at 01:45:14PM +0200, Tomas Vondra wrote:
> > On Mon, Sep 23, 2019 at 03:48:50PM +0800, Thunder wrote:
> >> Is this an issue?
> >> Can we fix like this?
> >> Thanks!
> >
> > I do think it is a valid issue. No opinion on the fix yet, though.
> > The report was sent on saturday, so patience ;-)
>
> And for some others it was even a longer weekend. Anyway, the problem
> can be reproduced if you apply the attached which introduces a failure
> point, and then if you run the following commands:
> create table aa as select 1;
> delete from aa;
> \! touch /tmp/truncate_flag
> vacuum aa;
> \! rm /tmp/truncate_flag
> vacuum aa; -- panic on standby
>
> This also points out that there are other things to worry about than
> interruptions, as for example DropRelFileNodeLocalBuffers() could lead
> to an ERROR, and this happens before the physical truncation is done
> but after the WAL record is replayed on the standby, so any failures
> happening at the truncation phase before the work is done would be a

Indeed.

> problem. However we are talking about failures which should not
> happen and these are elog() calls. It would be tempting to add a
> critical section here, but we could still have problems if we have a
> failure after the WAL record has been flushed, which means that it
> would be replayed on the standby, and the surrounding comments are

Agreed.

> clear about that. In short, as a matter of safety I'd like to think
> that what you are suggesting is rather acceptable (aka hold interrupts
> before the WAL record is written and release after the physical
> truncate), so that the truncation avoids the failures that can be avoided.
>
> Do others have thoughts to share on the matter?

Agreed with the concept, but does the patch work as described? It seems
that a query cancel doesn't fire during the held-off section, since there
is no CHECK_FOR_INTERRUPTS() there.

regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Kyotaro Horiguchi
Date:
At Tue, 24 Sep 2019 12:46:19 +0900 (Tokyo Standard Time), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in <20190924.124619.248088532.horikyota.ntt@gmail.com>
> > clear about that. In short, as a matter of safety I'd like to think
> > that what you are suggesting is rather acceptable (aka hold interrupts
> > before the WAL record is written and release after the physical
> > truncate), so that the truncation avoids the failures that can be avoided.
> >
> > Do others have thoughts to share on the matter?
>
> Agreed with the concept, but does the patch work as described? It seems
> that a query cancel doesn't fire during the held-off section, since there
> is no CHECK_FOR_INTERRUPTS() there.

Of course I found no *explicit* ones. But I found one ereport(DEBUG1) call
in register_dirty_segment. So it will work at least for the case where the
fsync request queue is full.

regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
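To make the mechanism pointed at here explicit: errfinish() ends with a CHECK_FOR_INTERRUPTS(), so an ereport(DEBUG1) inside the held-off section can reach ProcessInterrupts(). The cancel-holdoff counter is what keeps that safe; a simplified sketch of the relevant logic (not the exact code in postgres.c) is:

/* simplified sketch of the cancel-holdoff handling */
#define CHECK_FOR_INTERRUPTS() \
    do { \
        if (InterruptPending) \
            ProcessInterrupts(); \
    } while (0)

/* inside ProcessInterrupts() */
if (QueryCancelPending)
{
    if (QueryCancelHoldoffCount != 0)
    {
        /* cancel is held off: keep it pending and retry later */
        InterruptPending = true;
    }
    else
    {
        QueryCancelPending = false;
        ereport(ERROR,
                (errcode(ERRCODE_QUERY_CANCELED),
                 errmsg("canceling statement due to user request")));
    }
}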
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Tue, Sep 24, 2019 at 02:48:16PM +0900, Kyotaro Horiguchi wrote:
> Of course I found no *explicit* ones. But I found one ereport(DEBUG1) call
> in register_dirty_segment. So it will work at least for the case where the
> fsync request queue is full.

Exactly. I have not checked the patch in detail, but I think that we
should not rely on the assumption that no code path in this area checks
for interrupts via CHECK_FOR_INTERRUPTS(), as smgrtruncate() does much
more than just the physical segment truncation.
--
Michael
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Kyotaro Horiguchi
Date:
At Wed, 25 Sep 2019 12:24:03 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20190925032403.GF1815@paquier.xyz>
> On Tue, Sep 24, 2019 at 02:48:16PM +0900, Kyotaro Horiguchi wrote:
> > Of course I found no *explicit* ones. But I found one ereport(DEBUG1) call
> > in register_dirty_segment. So it will work at least for the case where the
> > fsync request queue is full.
>
> Exactly. I have not checked the patch in detail, but I think that we
> should not rely on the assumption that no code path in this area checks
> for interrupts via CHECK_FOR_INTERRUPTS(), as smgrtruncate() does much
> more than just the physical segment truncation.

Agreed on that point. I just doubted that it really fixes the author's
problem, and confirmed that it can.

regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Fujii Masao
Date:
On Tue, Sep 24, 2019 at 10:41 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Sep 23, 2019 at 01:45:14PM +0200, Tomas Vondra wrote:
> > On Mon, Sep 23, 2019 at 03:48:50PM +0800, Thunder wrote:
> >> Is this an issue?
> >> Can we fix like this?
> >> Thanks!
> >
> > I do think it is a valid issue. No opinion on the fix yet, though.
> > The report was sent on saturday, so patience ;-)
>
> And for some others it was even a longer weekend. Anyway, the problem
> can be reproduced if you apply the attached which introduces a failure
> point, and then if you run the following commands:
> create table aa as select 1;
> delete from aa;
> \! touch /tmp/truncate_flag
> vacuum aa;
> \! rm /tmp/truncate_flag
> vacuum aa; -- panic on standby
>
> This also points out that there are other things to worry about than
> interruptions, as for example DropRelFileNodeLocalBuffers() could lead
> to an ERROR, and this happens before the physical truncation is done
> but after the WAL record is replayed on the standby, so any failures
> happening at the truncation phase before the work is done would be a
> problem. However we are talking about failures which should not
> happen and these are elog() calls. It would be tempting to add a
> critical section here, but we could still have problems if we have a
> failure after the WAL record has been flushed, which means that it
> would be replayed on the standby, and the surrounding comments are
> clear about that.

Could you elaborate on what problem adding a critical section there
would cause?

Regards,
--
Fujii Masao
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Thu, Sep 26, 2019 at 01:13:56AM +0900, Fujii Masao wrote:
> On Tue, Sep 24, 2019 at 10:41 AM Michael Paquier <michael@paquier.xyz> wrote:
>> This also points out that there are other things to worry about than
>> interruptions, as for example DropRelFileNodeLocalBuffers() could lead
>> to an ERROR, and this happens before the physical truncation is done
>> but after the WAL record is replayed on the standby, so any failures
>> happening at the truncation phase before the work is done would be a
>> problem. However we are talking about failures which should not
>> happen and these are elog() calls. It would be tempting to add a
>> critical section here, but we could still have problems if we have a
>> failure after the WAL record has been flushed, which means that it
>> would be replayed on the standby, and the surrounding comments are
>> clear about that.
>
> Could you elaborate on what problem adding a critical section there
> would cause?

Wrapping the call of smgrtruncate() within RelationTruncate() to use a
critical section would make things worse from the user perspective on
the primary, no? If the physical truncation fails, we would still
fail WAL replay on the standby, but instead of generating an ERROR in
the session of the user attempting the TRUNCATE, the whole primary
would be taken down.
--
Michael
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Fujii Masao
Date:
On Fri, Sep 27, 2019 at 3:14 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Sep 26, 2019 at 01:13:56AM +0900, Fujii Masao wrote:
> > On Tue, Sep 24, 2019 at 10:41 AM Michael Paquier <michael@paquier.xyz> wrote:
> >> This also points out that there are other things to worry about than
> >> interruptions, as for example DropRelFileNodeLocalBuffers() could lead
> >> to an ERROR, and this happens before the physical truncation is done
> >> but after the WAL record is replayed on the standby, so any failures
> >> happening at the truncation phase before the work is done would be a
> >> problem. However we are talking about failures which should not
> >> happen and these are elog() calls. It would be tempting to add a
> >> critical section here, but we could still have problems if we have a
> >> failure after the WAL record has been flushed, which means that it
> >> would be replayed on the standby, and the surrounding comments are
> >> clear about that.
> >
> > Could you elaborate on what problem adding a critical section there
> > would cause?
>
> Wrapping the call of smgrtruncate() within RelationTruncate() to use a
> critical section would make things worse from the user perspective on
> the primary, no? If the physical truncation fails, we would still
> fail WAL replay on the standby, but instead of generating an ERROR in
> the session of the user attempting the TRUNCATE, the whole primary
> would be taken down.

Thanks for elaborating on that! Understood.

But this can cause subsequent recovery to always fail with an invalid-pages
error and the server not to start up. This is bad. So, to alleviate the
situation, I'm thinking it would be worth adding something like an
ignore_invalid_pages developer parameter. When this parameter is set to
true, the startup process always ignores invalid-pages errors. Thoughts?

Regards,
--
Fujii Masao
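As a rough sketch of what such a developer parameter usually looks like (the description string below is illustrative, not taken from any posted patch), a boolean GUC of this kind is registered in the ConfigureNamesBool table of src/backend/utils/misc/guc.c along these lines, together with a global "bool ignore_invalid_pages = false;" definition visible to the startup code:

/* illustrative sketch of a boolean developer GUC entry in guc.c */
{
    {"ignore_invalid_pages", PGC_POSTMASTER, DEVELOPER_OPTIONS,
        gettext_noop("Continues recovery after detecting references to invalid pages."),
        NULL,
        GUC_NOT_IN_SAMPLE
    },
    &ignore_invalid_pages,
    false,
    NULL, NULL, NULL
},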
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Thu, Oct 03, 2019 at 01:49:34PM +0900, Fujii Masao wrote:
> But this can cause subsequent recovery to always fail with an invalid-pages
> error and the server not to start up. This is bad. So, to alleviate the
> situation, I'm thinking it would be worth adding something like an
> ignore_invalid_pages developer parameter. When this parameter is set to
> true, the startup process always ignores invalid-pages errors. Thoughts?

That could be helpful.
--
Michael
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Fujii Masao
Date:
On Thu, Oct 3, 2019 at 1:57 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Oct 03, 2019 at 01:49:34PM +0900, Fujii Masao wrote:
> > But this can cause subsequent recovery to always fail with an invalid-pages
> > error and the server not to start up. This is bad. So, to alleviate the
> > situation, I'm thinking it would be worth adding something like an
> > ignore_invalid_pages developer parameter. When this parameter is set to
> > true, the startup process always ignores invalid-pages errors. Thoughts?
>
> That could be helpful.

So the attached patch adds a new developer GUC, "ignore_invalid_pages".
Setting ignore_invalid_pages to true causes the system to ignore the
failure (but still report a warning), and continue recovery.

I will add this to the next CommitFest.

Regards,
--
Fujii Masao
Attachment
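The patch itself is in the attachment. The core of the idea, as I understand it, is to downgrade the PANIC in the invalid-pages check of src/backend/access/transam/xlogutils.c based on the new GUC, roughly like this (a sketch of the approach, not the exact hunk from the attached patch):

/* sketch: XLogCheckInvalidPages() in xlogutils.c, after scanning the
 * hash table of invalid page references collected during recovery */
if (foundone)
    elog(ignore_invalid_pages ? WARNING : PANIC,
         "WAL contains references to invalid pages");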
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Thu, Oct 03, 2019 at 05:54:40PM +0900, Fujii Masao wrote:
> On Thu, Oct 3, 2019 at 1:57 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Thu, Oct 03, 2019 at 01:49:34PM +0900, Fujii Masao wrote:
> > > But this can cause subsequent recovery to always fail with an
> > > invalid-pages error and the server not to start up. This is bad. So, to
> > > alleviate the situation, I'm thinking it would be worth adding something
> > > like an ignore_invalid_pages developer parameter. When this parameter is
> > > set to true, the startup process always ignores invalid-pages errors.
> > > Thoughts?
> >
> > That could be helpful.
>
> So the attached patch adds a new developer GUC, "ignore_invalid_pages".
> Setting ignore_invalid_pages to true causes the system to ignore the
> failure (but still report a warning), and continue recovery.
>
> I will add this to the next CommitFest.

No actual objections against this patch from me as a dev option.

+ Detection of WAL records having references to invalid pages during
+ recovery causes <productname>PostgreSQL</productname> to report
+ an error, aborting the recovery. Setting

Well, that's not really an error. This triggers a PANIC, aka crashes
the server. And in this case the actual problem is that you may not
be able to move on with recovery when restarting the server again,
except if luck is on your side because you would continuously face
it..

+ recovery. This behavior may <emphasis>cause crashes, data loss,
+ propagate or hide corruption, or other serious problems</emphasis>.

Nit: indentation on the second line here.

+ However, it may allow you to get past the error, finish the recovery,
+ and cause the server to start up.

For consistency here I would suggest the second part of the sentence
to be "TO finish recovery, and TO cause the server to start up".

+ The default setting is off, and it can only be set at server start.

Nit^2: Missing a <literal> markup for "off"?
--
Michael
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Fujii Masao
Date:
On Fri, Nov 29, 2019 at 11:39 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Oct 03, 2019 at 05:54:40PM +0900, Fujii Masao wrote:
> > On Thu, Oct 3, 2019 at 1:57 PM Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Thu, Oct 03, 2019 at 01:49:34PM +0900, Fujii Masao wrote:
> > > > But this can cause subsequent recovery to always fail with an
> > > > invalid-pages error and the server not to start up. This is bad. So, to
> > > > alleviate the situation, I'm thinking it would be worth adding something
> > > > like an ignore_invalid_pages developer parameter. When this parameter is
> > > > set to true, the startup process always ignores invalid-pages errors.
> > > > Thoughts?
> > >
> > > That could be helpful.
> >
> > So the attached patch adds a new developer GUC, "ignore_invalid_pages".
> > Setting ignore_invalid_pages to true causes the system to ignore the
> > failure (but still report a warning), and continue recovery.
> >
> > I will add this to the next CommitFest.
>
> No actual objections against this patch from me as a dev option.

Thanks for the review! Attached is the updated version of the patch.

> + Detection of WAL records having references to invalid pages during
> + recovery causes <productname>PostgreSQL</productname> to report
> + an error, aborting the recovery. Setting
>
> Well, that's not really an error. This triggers a PANIC, aka crashes
> the server. And in this case the actual problem is that you may not
> be able to move on with recovery when restarting the server again,
> except if luck is on your side because you would continuously face
> it..

So you're thinking that "report an error" should be changed to
"trigger a PANIC"? Personally "report an error" sounds OK to me because
a PANIC is one kind of "error". But if that misleads people, I will
change the sentence.

> + recovery. This behavior may <emphasis>cause crashes, data loss,
> + propagate or hide corruption, or other serious problems</emphasis>.
>
> Nit: indentation on the second line here.

Yes, I fixed that.

> + However, it may allow you to get past the error, finish the recovery,
> + and cause the server to start up.
>
> For consistency here I would suggest the second part of the sentence
> to be "TO finish recovery, and TO cause the server to start up".

Yes, I fixed that.

> + The default setting is off, and it can only be set at server start.
>
> Nit^2: Missing a <literal> markup for "off"?

Yes, I fixed that.

Regards,
--
Fujii Masao
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Mon, Dec 16, 2019 at 12:22:18PM +0900, Fujii Masao wrote:
> > + Detection of WAL records having references to invalid pages during
> > + recovery causes <productname>PostgreSQL</productname> to report
> > + an error, aborting the recovery. Setting
> >
> > Well, that's not really an error. This triggers a PANIC, aka crashes
> > the server. And in this case the actual problem is that you may not
> > be able to move on with recovery when restarting the server again,
> > except if luck is on your side because you would continuously face
> > it..
>
> So you're thinking that "report an error" should be changed to
> "trigger a PANIC"? Personally "report an error" sounds OK to me because
> a PANIC is one kind of "error". But if that misleads people, I will
> change the sentence.

In the context of recovery, an ERROR is promoted to a FATAL, but here we
are talking about something that bypasses the crash of the server. So this
could bring confusion. I think that the documentation should be crystal
clear about that, with two aspects outlined when the parameter is disabled,
somewhat like data_sync_retry actually:
- A PANIC-level error is triggered.
- It crashes the cluster.
--
Michael
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Fujii Masao
Date:
On Tue, Dec 17, 2019 at 2:19 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Dec 16, 2019 at 12:22:18PM +0900, Fujii Masao wrote:
> > > + Detection of WAL records having references to invalid pages during
> > > + recovery causes <productname>PostgreSQL</productname> to report
> > > + an error, aborting the recovery. Setting
> > >
> > > Well, that's not really an error. This triggers a PANIC, aka crashes
> > > the server. And in this case the actual problem is that you may not
> > > be able to move on with recovery when restarting the server again,
> > > except if luck is on your side because you would continuously face
> > > it..
> >
> > So you're thinking that "report an error" should be changed to
> > "trigger a PANIC"? Personally "report an error" sounds OK to me because
> > a PANIC is one kind of "error". But if that misleads people, I will
> > change the sentence.
>
> In the context of recovery, an ERROR is promoted to a FATAL, but here we
> are talking about something that bypasses the crash of the server. So this
> could bring confusion. I think that the documentation should be crystal
> clear about that, with two aspects outlined when the parameter is disabled,
> somewhat like data_sync_retry actually:
> - A PANIC-level error is triggered.
> - It crashes the cluster.

OK, I updated the patch that way.
Attached is the updated version of the patch.

Regards,
--
Fujii Masao
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Thu, Jan 16, 2020 at 11:17:36PM +0900, Fujii Masao wrote:
> OK, I updated the patch that way.
> Attached is the updated version of the patch.

Thanks. I have a few tweaks to propose to the docs.

+ raise a PANIC-level error, aborting the recovery. Setting

Instead of "PANIC-level error", I would just use "PANIC error", and
instead of "aborting the recovery" just "crashing the server".

+ causes the system to ignore those WAL records

WAL records are not ignored, but errors caused by incorrect page
references in those WAL records are. The current phrasing sounds like
the WAL records are not applied.

Another thing that I just recalled. Do you think that it would be
better to mention that invalid page references can only be seen after
reaching the consistent point during recovery? The information given
looks enough, but I was just wondering if that's worth documenting or
not.
--
Michael
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Fujii Masao
Date:
On Fri, Jan 17, 2020 at 1:47 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Jan 16, 2020 at 11:17:36PM +0900, Fujii Masao wrote:
> > OK, I updated the patch that way.
> > Attached is the updated version of the patch.
>
> Thanks. I have a few tweaks to propose to the docs.
>
> + raise a PANIC-level error, aborting the recovery. Setting
>
> Instead of "PANIC-level error", I would just use "PANIC error", and

I have no strong opinion about this, but I used "PANIC-level error" because
the description for data_sync_retry already uses it.

> instead of "aborting the recovery" just "crashing the server".

PANIC implies a server crash, so IMO "crashing the server" is a bit
redundant, and "aborting the recovery" is better because "continue the
recovery" is used later.

> + causes the system to ignore those WAL records
>
> WAL records are not ignored, but errors caused by incorrect page
> references in those WAL records are. The current phrasing sounds like
> the WAL records are not applied.

So, what about

---------------
causes the system to ignore invalid page references in WAL records
(but still report a warning), and continue the recovery.
---------------

> Another thing that I just recalled. Do you think that it would be
> better to mention that invalid page references can only be seen after
> reaching the consistent point during recovery? The information given
> looks enough, but I was just wondering if that's worth documenting or
> not.

ISTM that this is not information that users need to understand...

Regards,
--
Fujii Masao
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Fri, Jan 17, 2020 at 07:36:51PM +0900, Fujii Masao wrote:
> On Fri, Jan 17, 2020 at 1:47 PM Michael Paquier <michael@paquier.xyz> wrote:
>> Thanks. I have a few tweaks to propose to the docs.
>>
>> + raise a PANIC-level error, aborting the recovery. Setting
>>
>> Instead of "PANIC-level error", I would just use "PANIC error", and
>
> I have no strong opinion about this, but I used "PANIC-level error" because
> the description for data_sync_retry already uses it.

Okay. Fine with what you think is good.

>> instead of "aborting the recovery" just "crashing the server".
>
> PANIC implies a server crash, so IMO "crashing the server" is a bit
> redundant, and "aborting the recovery" is better because "continue the
> recovery" is used later.

Okay. I see your point here.

> So, what about
>
> ---------------
> causes the system to ignore invalid page references in WAL records
> (but still report a warning), and continue the recovery.
> ---------------

And that sounds good to me. Switching the patch to ready for committer.
--
Michael
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Amit Kapila
Date:
On Sat, Jan 18, 2020 at 9:18 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Jan 17, 2020 at 07:36:51PM +0900, Fujii Masao wrote:
> > So, what about
> >
> > ---------------
> > causes the system to ignore invalid page references in WAL records
> > (but still report a warning), and continue the recovery.
> > ---------------
>
> And that sounds good to me. Switching the patch to ready for committer.

Are we planning to do something about the original problem reported in
this thread?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Mon, Jan 20, 2020 at 02:13:53PM +0530, Amit Kapila wrote:
> Are we planning to do something about the original problem reported in
> this thread?

We should. This is on my TODO list, though seeing that it involved
full_page_writes=off I drifted a bit away from it.
--
Michael
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Amit Kapila
Date:
On Tue, Jan 21, 2020 at 6:05 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Jan 20, 2020 at 02:13:53PM +0530, Amit Kapila wrote:
> > Are we planning to do something about the original problem reported in
> > this thread?
>
> We should. This is on my TODO list, though seeing that it involved
> full_page_writes=off I drifted a bit away from it.

The original email doesn't say so. I might be missing something, but can
you explain what makes you think so?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Michael Paquier
Date:
On Tue, Jan 21, 2020 at 08:45:14AM +0530, Amit Kapila wrote:
> The original email doesn't say so. I might be missing something, but can
> you explain what makes you think so?

Oops. Incorrect thread; I was thinking about this one previously:
https://www.postgresql.org/message-id/822113470.250068.1573246011818@connect.xfinity.com

Re-reading the root of the thread, I am still not sure what we could do,
as that's rather tricky. See here:
https://www.postgresql.org/message-id/20190927061414.GF8485@paquier.xyz
--
Michael
Attachment
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Fujii Masao
Date:
On 2020/01/21 13:39, Michael Paquier wrote:
> On Tue, Jan 21, 2020 at 08:45:14AM +0530, Amit Kapila wrote:
>> The original email doesn't say so. I might be missing something, but can
>> you explain what makes you think so?
>
> Oops. Incorrect thread; I was thinking about this one previously:
> https://www.postgresql.org/message-id/822113470.250068.1573246011818@connect.xfinity.com
>
> Re-reading the root of the thread, I am still not sure what we could do,
> as that's rather tricky. See here:
> https://www.postgresql.org/message-id/20190927061414.GF8485@paquier.xyz

Is the original proposal, i.e., holding the interrupts during the truncation,
worth considering? It is not a perfect solution but might improve the
situation a bit.

Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Fujii Masao
Date:
On 2020/01/18 12:48, Michael Paquier wrote:
> On Fri, Jan 17, 2020 at 07:36:51PM +0900, Fujii Masao wrote:
>> On Fri, Jan 17, 2020 at 1:47 PM Michael Paquier <michael@paquier.xyz> wrote:
>>> Thanks. I have a few tweaks to propose to the docs.
>>>
>>> + raise a PANIC-level error, aborting the recovery. Setting
>>>
>>> Instead of "PANIC-level error", I would just use "PANIC error", and
>>
>> I have no strong opinion about this, but I used "PANIC-level error" because
>> the description for data_sync_retry already uses it.
>
> Okay. Fine with what you think is good.
>
>>> instead of "aborting the recovery" just "crashing the server".
>>
>> PANIC implies a server crash, so IMO "crashing the server" is a bit
>> redundant, and "aborting the recovery" is better because "continue the
>> recovery" is used later.
>
> Okay. I see your point here.
>
>> So, what about
>>
>> ---------------
>> causes the system to ignore invalid page references in WAL records
>> (but still report a warning), and continue the recovery.
>> ---------------
>
> And that sounds good to me. Switching the patch to ready for committer.

Thanks! Committed!

Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
From
Andres Freund
Date:
Hi,

On 2020-01-21 15:41:54 +0900, Fujii Masao wrote:
> On 2020/01/21 13:39, Michael Paquier wrote:
> > On Tue, Jan 21, 2020 at 08:45:14AM +0530, Amit Kapila wrote:
> > > The original email doesn't say so. I might be missing something, but can
> > > you explain what makes you think so?
> >
> > Oops. Incorrect thread; I was thinking about this one previously:
> > https://www.postgresql.org/message-id/822113470.250068.1573246011818@connect.xfinity.com
> >
> > Re-reading the root of the thread, I am still not sure what we could do,
> > as that's rather tricky.

Did anybody consider the proposal at
https://www.postgresql.org/message-id/20191223005430.yhf4n3zr4ojwbcn2%40alap3.anarazel.de ?
I think we're going to have to do something like that to actually fix the
problem, rather than polish around the edges.

> See here:
> https://www.postgresql.org/message-id/20190927061414.GF8485@paquier.xyz

On 2019-09-27 15:14:14 +0900, Michael Paquier wrote:
> Wrapping the call of smgrtruncate() within RelationTruncate() to use a
> critical section would make things worse from the user perspective on
> the primary, no? If the physical truncation fails, we would still
> fail WAL replay on the standby, but instead of generating an ERROR in
> the session of the user attempting the TRUNCATE, the whole primary
> would be taken down.

FWIW, to me this argument just doesn't make any sense - even if a few
people have argued it. A failure in the FS truncate currently leaves the
cluster in a corrupted state in multiple ways:

1) Dirty buffer contents were thrown away, and going forward their old
   contents will be read back.

2) We have WAL-logged something that we haven't done. That's *obviously*
   something *completely* violating WAL logging rules. And it breaks WAL
   replay (including locally, should we crash before the next checkpoint -
   there could be subsequent WAL records relying on the block's existence).

That's so obviously worse than a PANIC restart that I really don't
understand the "worse from the user perspective" argument from your email
above. Obviously it sucks that the error might re-occur during recovery.
But that's something that usually actually can be fixed - whereas the data
corruption can't.

> Is the original proposal, i.e., holding the interrupts during the truncation,
> worth considering? It is not a perfect solution but might improve the
> situation a bit.

I don't think it's useful in isolation.

Greetings,

Andres Freund
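For readers following the argument: the critical-section alternative debated above would, in rough terms, make any failure between WAL-logging the truncation and finishing the physical truncation escalate to a PANIC instead of an ERROR, keeping the primary from continuing with buffers and WAL that disagree with the on-disk file. A sketch of that shape inside RelationTruncate() (an illustration of the idea under discussion, not the fix adopted in this thread) could look like:

/* illustration of the critical-section alternative, not a committed change */
START_CRIT_SECTION();

/* WAL-log the truncation and make sure the record reaches disk ... */
XLogFlush(lsn);

/* Do the real work; any ERROR raised in here is now promoted to PANIC. */
smgrtruncate(rel->rd_smgr, MAIN_FORKNUM, nblocks);

END_CRIT_SECTION();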