Re: Automatic restore corruption problem - Mailing list pgsql-admin

From Matthieu Lejeune
Subject Re: Automatic restore corruption problem
Date
Msg-id 55A4A77F.8060606@exxoss.com
Whole thread Raw
In response to Re: Automatic restore corruption problem  (Guillaume Lelarge <guillaume@lelarge.info>)
Responses Re: Automatic restore corruption problem
List pgsql-admin
Hi,

I had no recovery.conf on this server because I launch my replication every night I need a H-24 copy database.

This is my recovery.conf
root@p2prddnmdbc:/var/lib/postgresql# cat recovery.conf
standby_mode = 'off'
primary_conninfo = 'host=10.10.11.1 port=5432 user=replicator password=XXXXXX'
trigger_file = '/var/lib/postgresql/9.1/main/trigger'
restore_command = 'cp /mnt/p2prddnmdbm_pg_xlog/%f %p'
root@p2prddnmdbc:/var/lib/postgresql#

But with or without a recovery.conf file I can't start the database service :
root@p2prddnmdbc:/var/lib/postgresql# /etc/init.d/postgresql start
[....] Starting PostgreSQL 9.3 database server: main[....] The PostgreSQL server failed to start. Please check the log output: 2015-07-14 08:02:59 CEST LOG: database system was interrupted; last known up at 2015-07-13 23:33:46 CEST 2015-07-14 08:02:59 CEST LOG: invalid checkpoint record 2015-07-14 08:02:59 CEST FATAL: could not locate required checkpoint record 2015-07-14 08:02:59 CEST HINT: If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.3/main/backup_label". 2015-07-14 08:02:59 CEST LOG: startup process (PID 24617) exited with exit co[FAIL2015-07-14 08:02:59 CEST LOG: aborting startup due to startup process failure ... failed!
 failed!

Thanks
Matthieu

Le 12/07/15 13:46, Guillaume Lelarge a écrit :
Hi,

2015-07-12 10:18 GMT+02:00 Matthieu Lejeune <matthieu.lejeune@exxoss.com>:
Hi thank for your reply

My target is to give a database for buisness testing query and they are modify the database during the buisness day.

Now I got this error if I keep the file backup_label :

root@p2prddnmdbc:/var/lib/postgresql/9.3/main# /etc/init.d/postgresql start
[....] Starting PostgreSQL 9.3 database server: main[....] The PostgreSQL server failed to start. Please check the log output: 2015-07-12 10:12:45 CEST LOG: database system was shut down at 2015-07-12 10:07:10 CEST 2015-07-12 10:12:45 CEST LOG: invalid checkpoint record 2015-07-12 10:12:45 CEST FATAL: could not locate required checkpoint record 2015-07-12 10:12:45 CEST HINT: If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.3/main/backup_label". 2015-07-12 10:12:45 CEST LOG: startup process (PID 28492) exited with exit code 1 2015-07-12 10:12:45 CEST LOG: abo[FAIL startup due to startup process failure ... failed!
 failed!


If I put the recovery.conf the database is waiting for the wal to relaunch the replication.

postgres 27817  0.7  0.9 631212 39892 ?        S    09:55   0:00 /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/postgresql/9.3/main
postgres 27818  0.0  0.0 631472  2076 ?        Ss   09:55   0:00  \_ postgres: startup process   waiting for 0000000100000178000000B9
root@p2prddnmdbc:/var/lib/postgresql/9.3/main# su - postgres
postgres@p2prddnmdbc:~$ psql energycomm
psql: FATAL:  the database system is starting up

Have you got an idea to stop the replication process and start the database ?


What did you put in the recovery.conf file? (hint: standby_mode must be off)

 
Kind regards
Matthieu

Le 10/07/15 16:46, Keith a écrit :

On Fri, Jul 10, 2015 at 10:07 AM, Guillaume Lelarge <guillaume@lelarge.info> wrote:

Hi,

Le 10 juil. 2015 3:02 PM, "Matthieu Lejeune" <matthieu.lejeune@exxoss.com> a écrit :
>
> Hi all,
>
> I have a script for restoring a database every night to an other postgresql database
>
> root@p2prddnmdbc:~# cat /var/admin/script/restoredb.sh
> #/bin/bash
> /etc/init.d/postgresql stop
> mv /var/log/postgresql/postgresql-9.3-main.log /var/log/postgresql/postgresql-9.3-main.log.old
> cd /var/lib/postgresql/9.3/main
> psql --host=p2prddnmdbm --username=replicator postgres -c "SELECT pg_start_backup('sync');"
> rsync -av --delete root@10.10.11.1:/var/lib/postgresql/9.3/main/* /var/lib/postgresql/9.3/main/
> rm backup_label
> chown -R postgres:postgres *
> psql --host=p2prddnmdbm --username=replicator postgres -c "SELECT pg_stop_backup();"
> /etc/init.d/postgresql start
> chmod 777 /var/log/postgresql/postgresql-9.3-main.log
> psql -U postgres -c "ALTER USER xxxx WITH PASSWORD 'XXXX';"
> psql -U postgres xxxx -c "CREATE EXTENSION dblink;"
> root@p2prddnmdbc:~#
>
>
> But during the day when the user are using the new database we got error like this :
>
> 2015-06-25 16:20:58 CEST ERROR:  could not read block 257985 in file "base/16386/14064061.1": read only 0 of 8192 bytes
> 2015-06-22 15:21:11 CEST ERROR:  could not read block 256801 in file "base/16386/14064061.1": read only 0 of 8192 bytes
>
> I have check the : filesystem on the vm, on the HW SAN,...
>
> Any idea to fix this problem?

Sure. Don't remove the backup_label file, and add the recovery.conf file.

--
Guillaume







--

pgsql-admin by date:

Previous
From: Scott Ribe
Date:
Subject: Re: could not create shared memory segment: Invalid argument
Next
From: Guillaume Lelarge
Date:
Subject: Re: Automatic restore corruption problem