Re: TOAST corruption in standby database - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: TOAST corruption in standby database
Date
Msg-id CAA4eK1LXuDok8HM+L21o4UiVp5u56fL_NUYJAwugvYdHqgT5ag@mail.gmail.com
In response to TOAST corruption in standby database  (Alex Adriaanse <alex@oseberg.io>)
List pgsql-hackers
On Fri, Oct 25, 2019 at 1:50 AM Alex Adriaanse <alex@oseberg.io> wrote:
>
> Standby (corrupted):
>
> # dd if=data/base/18034/16103928.13 bs=8192 skip=89185 count=1 status=none | hexdump -C | head -8
> 00000000  a3 0e 00 00 48 46 88 0e  00 00 05 00 30 00 58 0f  |....HF......0.X.|
> 00000010  00 20 04 20 00 00 00 00  00 00 00 00 00 00 00 00  |. . ............|
> 00000020  10 98 e0 0f 98 97 e8 00  a8 8f e0 0f 58 8f 96 00  |............X...|
> 00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00000f50  00 00 00 00 00 00 00 00  32 b0 0a 01 00 00 00 00  |........2.......|
> 00000f60  00 00 00 00 1b 00 61 5c  06 00 03 00 02 09 18 00  |......a\........|
> 00000f70  9b 90 75 02 01 00 00 00  ac 00 00 00 83 9f 64 00  |..u...........d.|
>
> Primary:
>
> # dd if=data/base/18034/16103928.13 bs=8192 skip=89185 count=1 status=none | hexdump -C | head -8
> 00000000  bd 0e 00 00 08 ad 32 b7  00 00 05 00 30 00 90 04  |......2.....0...|
> 00000010  00 20 04 20 00 00 00 00  68 87 e0 0f 90 84 a8 05  |. . ....h.......|
> 00000020  10 98 e0 0f 98 97 e8 00  a8 8f e0 0f 58 8f 96 00  |............X...|
> 00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00000490  a6 07 7e 02 00 00 00 00  00 00 00 00 1b 00 61 5c  |..~...........a\|
> 000004a0  02 00 03 00 02 09 18 00  ae 9d d4 03 01 00 00 00  |................|
> 000004b0  d0 0a 00 00 23 25 10 07  88 02 13 0f 2c 04 78 01  |....#%......,.x.|
>
> Based on the above observations it seems to me that occasionally some of the changes aren't replicating to or
> persisting by the standby database.
>

I am not sure what the best way to detect this is, but one idea could
be to enable wal_consistency_checking [1].  At the very least, this
can detect whether the block is replicated correctly the very first
time.  Also, if there is some corruption issue on the standby, you
might be able to detect it.  Note, however, that enabling this option
has overhead, so you need to be careful.
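
As an aside, the two page images quoted above can be decoded by hand to
see exactly which header fields and line pointers differ.  The following
is only a rough Python sketch, not something from the original report.
It assumes the standard 24-byte heap page header (page layout version 4)
and the usual little-endian bit layout of line pointers (lp_off in the
low 15 bits, lp_flags in the next 2, lp_len in the high 15).  The helper
names decode_page_header and decode_line_pointers are made up for
illustration, and the byte strings are copied from the first three
hexdump lines of each quoted dump.

```python
import struct

# Assumed PageHeaderData layout, little-endian, 24 bytes:
# pd_lsn (xlogid, xrecoff), pd_checksum, pd_flags, pd_lower, pd_upper,
# pd_special, pd_pagesize_version, pd_prune_xid
PAGE_HEADER = struct.Struct("<IIHHHHHHI")

# First 48 bytes of block 89185, copied from the quoted hexdump output.
STANDBY = bytes.fromhex(
    "a3 0e 00 00 48 46 88 0e 00 00 05 00 30 00 58 0f "
    "00 20 04 20 00 00 00 00 00 00 00 00 00 00 00 00 "
    "10 98 e0 0f 98 97 e8 00 a8 8f e0 0f 58 8f 96 00"
)
PRIMARY = bytes.fromhex(
    "bd 0e 00 00 08 ad 32 b7 00 00 05 00 30 00 90 04 "
    "00 20 04 20 00 00 00 00 68 87 e0 0f 90 84 a8 05 "
    "10 98 e0 0f 98 97 e8 00 a8 8f e0 0f 58 8f 96 00"
)


def decode_page_header(raw: bytes) -> dict:
    """Decode the fixed-size page header at the start of a page image."""
    (xlogid, xrecoff, checksum, flags, lower, upper,
     special, size_version, prune_xid) = PAGE_HEADER.unpack_from(raw)
    return {
        "lsn": f"{xlogid:X}/{xrecoff:X}",
        "checksum": checksum,
        "flags": flags,
        "lower": lower,
        "upper": upper,
        "special": special,
        "page_size": size_version & 0xFF00,
        "layout_version": size_version & 0x00FF,
        "prune_xid": prune_xid,
    }


def decode_line_pointers(raw: bytes) -> list:
    """Return (lp_off, lp_flags, lp_len) for each line pointer slot."""
    lower = decode_page_header(raw)["lower"]
    count = (lower - PAGE_HEADER.size) // 4
    pointers = []
    for i in range(count):
        (value,) = struct.unpack_from("<I", raw, PAGE_HEADER.size + 4 * i)
        pointers.append((value & 0x7FFF, (value >> 15) & 0x3, value >> 17))
    return pointers


if __name__ == "__main__":
    for name, raw in (("standby", STANDBY), ("primary", PRIMARY)):
        print(name, decode_page_header(raw))
        for slot, lp in enumerate(decode_line_pointers(raw), start=1):
            print(f"  slot {slot}: off={lp[0]} flags={lp[1]} len={lp[2]}")
```

Under those assumptions, both pages have pd_lower = 48 (six line pointer
slots), but the standby page carries LSN EA3/E884648 with pd_upper = 3928
and slots 1 and 2 zeroed, while the primary page carries LSN EBD/B732AD08
with pd_upper = 1168 and slots 1 and 2 pointing at offsets 1896 and 1168.
Slots 3 to 6 are identical on both sides.  That would be consistent with
two tuple insertions not having reached the standby's copy of the block,
though the dump alone cannot say why.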


[1] - https://www.postgresql.org/docs/devel/runtime-config-developer.html
-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


