
Re: Automatic restore corruption problem - Mailing list pgsql-admin

From	Matthieu Lejeune
Subject	Re: Automatic restore corruption problem
Date	July 14, 2015 06:09:07
Msg-id	55A4A77F.8060606@exxoss.com
In response to	Re: Automatic restore corruption problem (Guillaume Lelarge <guillaume@lelarge.info>)
Responses	Re: Automatic restore corruption problem
List	pgsql-admin


Hi,

I had no recovery.conf on this server because I relaunch the replication every night: I need an H-24 copy of the database.

This is my recovery.conf
root@p2prddnmdbc:/var/lib/postgresql# cat recovery.conf
standby_mode = 'off'
primary_conninfo = 'host=10.10.11.1 port=5432 user=replicator password=XXXXXX'
trigger_file = '/var/lib/postgresql/9.1/main/trigger'
restore_command = 'cp /mnt/p2prddnmdbm_pg_xlog/%f %p'
root@p2prddnmdbc:/var/lib/postgresql#

But with or without a recovery.conf file, I can't start the database service:
root@p2prddnmdbc:/var/lib/postgresql# /etc/init.d/postgresql start
[....] Starting PostgreSQL 9.3 database server: main[....] The PostgreSQL server failed to start. Please check the log output:
2015-07-14 08:02:59 CEST LOG: database system was interrupted; last known up at 2015-07-13 23:33:46 CEST
2015-07-14 08:02:59 CEST LOG: invalid checkpoint record
2015-07-14 08:02:59 CEST FATAL: could not locate required checkpoint record
2015-07-14 08:02:59 CEST HINT: If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.3/main/backup_label".
2015-07-14 08:02:59 CEST LOG: startup process (PID 24617) exited with exit co[FAIL
2015-07-14 08:02:59 CEST LOG: aborting startup due to startup process failure ... failed!
failed!

Thanks
Matthieu
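
A sketch of a recovery.conf for a one-shot archive recovery (no standby), reusing the restore_command from the listing above. It assumes the file is placed in the data directory, /var/lib/postgresql/9.3/main, which is where a 9.3 server looks for it. primary_conninfo and trigger_file are left out because they have no effect when standby_mode is off.

```
# recovery.conf (assumed location: /var/lib/postgresql/9.3/main/recovery.conf)
# One-shot recovery: replay archived WAL, then open the database read-write.
standby_mode = 'off'
# Fetch WAL segments from the mounted WAL directory of the primary.
restore_command = 'cp /mnt/p2prddnmdbm_pg_xlog/%f %p'
```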

On 12/07/15 13:46, Guillaume Lelarge wrote:

Hi,

2015-07-12 10:18 GMT+02:00 Matthieu Lejeune <matthieu.lejeune@exxoss.com>:
Hi, thanks for your reply.

My goal is to provide a database on which the business users can test queries; they modify the database during the business day.

Now I get this error if I keep the backup_label file:

root@p2prddnmdbc:/var/lib/postgresql/9.3/main# /etc/init.d/postgresql start
[....] Starting PostgreSQL 9.3 database server: main[....] The PostgreSQL server failed to start. Please check the log output:
2015-07-12 10:12:45 CEST LOG: database system was shut down at 2015-07-12 10:07:10 CEST
2015-07-12 10:12:45 CEST LOG: invalid checkpoint record
2015-07-12 10:12:45 CEST FATAL: could not locate required checkpoint record
2015-07-12 10:12:45 CEST HINT: If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.3/main/backup_label".
2015-07-12 10:12:45 CEST LOG: startup process (PID 28492) exited with exit code 1
2015-07-12 10:12:45 CEST LOG: abo[FAIL startup due to startup process failure ... failed!
failed!

If I put the recovery.conf in place, the database waits for WAL in order to resume the replication.

postgres 27817 0.7 0.9 631212 39892 ? S 09:55 0:00 /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/postgresql/9.3/main
postgres 27818 0.0 0.0 631472 2076 ? Ss 09:55 0:00 \_ postgres: startup process waiting for 0000000100000178000000B9
root@p2prddnmdbc:/var/lib/postgresql/9.3/main# su - postgres
postgres@p2prddnmdbc:~$ psql energycomm
psql: FATAL: the database system is starting up

Do you have any idea how to stop the replication process and start the database?

What did you put in the recovery.conf file? (hint: standby_mode must be off)

Kind regards
Matthieu

On 10/07/15 16:46, Keith wrote:
A recent, relevant post

http://tbeitr.blogspot.com/2015/07/deleting-backuplabel-on-restore-will.html

On Fri, Jul 10, 2015 at 10:07 AM, Guillaume Lelarge <guillaume@lelarge.info> wrote:
Hi,
On 10 Jul 2015 3:02 PM, "Matthieu Lejeune" <matthieu.lejeune@exxoss.com> wrote:
>
> Hi all,
>
> I have a script that restores a database every night onto another PostgreSQL server.
>
> root@p2prddnmdbc:~# cat /var/admin/script/restoredb.sh
> #/bin/bash
> /etc/init.d/postgresql stop
> mv /var/log/postgresql/postgresql-9.3-main.log /var/log/postgresql/postgresql-9.3-main.log.old
> cd /var/lib/postgresql/9.3/main
> psql --host=p2prddnmdbm --username=replicator postgres -c "SELECT pg_start_backup('sync');"
> rsync -av --delete root@10.10.11.1:/var/lib/postgresql/9.3/main/* /var/lib/postgresql/9.3/main/
> rm backup_label
> chown -R postgres:postgres *
> psql --host=p2prddnmdbm --username=replicator postgres -c "SELECT pg_stop_backup();"
> /etc/init.d/postgresql start
> chmod 777 /var/log/postgresql/postgresql-9.3-main.log
> psql -U postgres -c "ALTER USER xxxx WITH PASSWORD 'XXXX';"
> psql -U postgres xxxx -c "CREATE EXTENSION dblink;"
> root@p2prddnmdbc:~#
>
>
> But during the day, when the users are using the new database, we get errors like this:
>
> 2015-06-25 16:20:58 CEST ERROR: could not read block 257985 in file "base/16386/14064061.1": read only 0 of 8192 bytes
> 2015-06-22 15:21:11 CEST ERROR: could not read block 256801 in file "base/16386/14064061.1": read only 0 of 8192 bytes
>
> I have checked the filesystem on the VM, on the HW SAN, ...
>
> Any idea how to fix this problem?
Sure. Don't remove the backup_label file, and add the recovery.conf file.
--
Guillaume

--
Guillaume.
http://blog.guillaume.lelarge.info
http://www.dalibo.com
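
The advice given in this thread (keep backup_label, add a recovery.conf) could be applied to the original restore script roughly as follows. This is only a sketch under stated assumptions: the run helper, the DRY_RUN switch, and the RECOVERY_TEMPLATE location are hypothetical names introduced for illustration, and the post-start steps of the original script (log permissions, ALTER USER, CREATE EXTENSION) are left out because they do not change.

```shell
#!/bin/bash
# Sketch of the nightly restore with the two suggested changes applied:
#   1. backup_label is no longer removed
#   2. a recovery.conf is copied into the data directory before startup
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes them.
set -u

PGDATA=/var/lib/postgresql/9.3/main
# Assumption: a recovery.conf template is kept outside the data directory,
# so the nightly rsync cannot overwrite or delete it.
RECOVERY_TEMPLATE=/var/lib/postgresql/recovery.conf
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

restore_steps() {
    run /etc/init.d/postgresql stop
    run psql --host=p2prddnmdbm --username=replicator postgres -c "SELECT pg_start_backup('sync');"
    run rsync -av --delete "root@10.10.11.1:$PGDATA/*" "$PGDATA/"
    run psql --host=p2prddnmdbm --username=replicator postgres -c "SELECT pg_stop_backup();"
    # The label file written by pg_start_backup is deliberately left in place:
    # it tells the server where WAL replay has to begin.
    run cp "$RECOVERY_TEMPLATE" "$PGDATA/recovery.conf"
    run chown -R postgres:postgres "$PGDATA"
    run /etc/init.d/postgresql start
}

restore_steps
```

Run as is, the script only lists the commands it would execute; DRY_RUN=0 performs the restore. The WAL generated between pg_start_backup and pg_stop_backup has to be reachable through the restore_command in recovery.conf, otherwise the server will keep waiting for the missing segment.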
