Re: Why standby restores some WALs many times from archive? - Mailing list pgsql-hackers

From Sergey Burladyan
Subject Re: Why standby restores some WALs many times from archive?
Date
Msg-id 87efndt334.fsf@gmail.com
Whole thread Raw
In response to Re: Why standby restores some WALs many times from archive?  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Why standby restores some WALs many times from archive?
List pgsql-hackers
Jeff Janes <jeff.janes@gmail.com> writes:

> On Thu, Dec 28, 2017 at 7:02 AM, Victor Yagofarov <xnasx@yandex.ru> wrote:
> > I have postgres 9.4 standby with archive-based replication (via
> > restore_command).

> Can you show us both your archive_command and your restore_command?

We use this scripts:
https://github.com/avito-tech/dba-utils/tree/master/pg_archive

But I can reproduce problem with simple cp & mv:
archive_command:
  test ! -f /var/lib/postgresql/wals/%f && \
  test ! -f /var/lib/postgresql/wals/%f.tmp && \
  cp %p /var/lib/postgresql/wals/%f.tmp && \
  mv /var/lib/postgresql/wals/%f.tmp /var/lib/postgresql/wals/%f

recovery.conf:
  restore_command = 'cp /var/lib/postgresql/wals/%f %p'
  archive_cleanup_command = 'pg_archivecleanup /var/lib/postgresql/wals/ %r'
  standby_mode = on

I use docker for tests and run standby with DEBUG3 (full log in attachments).

I run this commands in test:
PGVER=9.4 docker-compose up
    docker-compose exec -u postgres master psql \
        -c "create database test"
    docker-compose exec -u postgres master psql -d test \
        -c "create table xxx (id serial, v text)"
    # we need some recycled WALs
    docker-compose exec -u postgres master psql -d test \
        -c "insert into xxx (v) select 'this is long line for fill space ' || n from generate_series(1, 5000000) n"
    docker-compose exec -u postgres standby psql -c "checkpoint"
    docker-compose exec -u postgres master psql -d test \
        -c "insert into xxx (v) select 'this is long line for fill space ' || n from generate_series(1, 1000000) n;
truncatexxx;" 
    docker-compose logs standby | grep 'restored log file' | sort | uniq -c | awk '$1 > 1 { print }'
docker-compose down -v

Log have two duplicated WALS:
4 | 000000010000000000000024
3 | 00000001000000000000002B

========================================
DEBUG:  switched WAL source from stream to archive after failure
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/000000010000000000000024 pg_xlog/RECOVERYXLOG"
LOG:  restored log file "000000010000000000000024" from archive
DEBUG:  got WAL segment from archive
DEBUG:  skipping restartpoint, already performed at 0/22258058
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/000000010000000000000025 pg_xlog/RECOVERYXLOG"
cp: cannot stat ‘/var/lib/postgresql/wals/000000010000000000000025’: No such file or directory
DEBUG:  could not restore file "000000010000000000000025" from archive: child process exited with exit code 1
DEBUG:  unexpected pageaddr 0/10000000 in log segment 000000010000000000000025, offset 0
DEBUG:  switched WAL source from archive to stream after failure
DEBUG:  switched WAL source from stream to archive after failure
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/000000010000000000000024 pg_xlog/RECOVERYXLOG"
LOG:  restored log file "000000010000000000000024" from archive
DEBUG:  got WAL segment from archive
DEBUG:  skipping restartpoint, already performed at 0/22258058
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/000000010000000000000025 pg_xlog/RECOVERYXLOG"
cp: cannot stat ‘/var/lib/postgresql/wals/000000010000000000000025’: No such file or directory
DEBUG:  could not restore file "000000010000000000000025" from archive: child process exited with exit code 1
DEBUG:  unexpected pageaddr 0/10000000 in log segment 000000010000000000000025, offset 0
DEBUG:  switched WAL source from archive to stream after failure
DEBUG:  switched WAL source from stream to archive after failure
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/000000010000000000000024 pg_xlog/RECOVERYXLOG"
LOG:  restored log file "000000010000000000000024" from archive
DEBUG:  got WAL segment from archive
DEBUG:  skipping restartpoint, already performed at 0/22258058
========================================

========================================
LOG:  restartpoint starting: xlog
DEBUG:  performing replication slot checkpoint
DEBUG:  switched WAL source from stream to archive after failure
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/00000001000000000000002C pg_xlog/RECOVERYXLOG"
cp: cannot stat ‘/var/lib/postgresql/wals/00000001000000000000002C’: No such file or directory
DEBUG:  could not restore file "00000001000000000000002C" from archive: child process exited with exit code 1
LOG:  unexpected pageaddr 0/21000000 in log segment 00000001000000000000002C, offset 0
DEBUG:  switched WAL source from archive to stream after failure
DEBUG:  switched WAL source from stream to archive after failure
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/00000001000000000000002B pg_xlog/RECOVERYXLOG"
LOG:  restored log file "00000001000000000000002B" from archive
DEBUG:  got WAL segment from archive
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/00000001000000000000002C pg_xlog/RECOVERYXLOG"
cp: cannot stat ‘/var/lib/postgresql/wals/00000001000000000000002C’: No such file or directory
DEBUG:  could not restore file "00000001000000000000002C" from archive: child process exited with exit code 1
DEBUG:  unexpected pageaddr 0/21000000 in log segment 00000001000000000000002C, offset 0
DEBUG:  switched WAL source from archive to stream after failure
DEBUG:  switched WAL source from stream to archive after failure
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/00000001000000000000002B pg_xlog/RECOVERYXLOG"
LOG:  restored log file "00000001000000000000002B" from archive
DEBUG:  got WAL segment from archive
DEBUG:  executing restore command "cp /var/lib/postgresql/wals/00000001000000000000002C pg_xlog/RECOVERYXLOG"
LOG:  restored log file "00000001000000000000002C" from archive
DEBUG:  got WAL segment from archive
========================================

--
Sergey Burladyan


Attachment

pgsql-hackers by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: Fix a Oracle-compatible instr function in thedocumentation
Next
From: Tom Lane
Date:
Subject: Re: Converting plpgsql to use DTYPE_REC for named composite types