On 2019-03-19 13:00:50 -0700, Andres Freund wrote:
> As it stands, the logic seems to give more false confidence than
> anything else.
To demonstrate that I ran a loop that verified that a) a normal backend
query using the tale detects the corruption b) pg_basebackup doesn't.
i=0;
while true; do
i=$(($i+1));
echo attempt $i;
dd if=/dev/urandom of=/srv/dev/pgdev-dev/base/13390/16384 bs=8192 count=1 conv=notrunc 2>/dev/null;
psql -X -c 'SELECT * FROM corruptme;' 2>/dev/null && break;
~/build/postgres/dev-assert/vpath/src/bin/pg_basebackup/pg_basebackup -X fetch -F t -D - -c fast > /dev/null ||
break;
done
(excuse the crappy one-off sh)
had, during ~12k iterations, always detected the corruption in the
backend, and never via pg_basebackup. Given the likely LSNs in a
cluster, that's not too surprising.
Greetings,
Andres Freund