Re: Broken primary key after backup restore. - Mailing list pgsql-general

From Michael Chau
Subject Re: Broken primary key after backup restore.
Date
Msg-id CALE++3TWuii4eo=xnvqtgAo2eHRp+M2sY5swKHVu6OO6EGQ6DA@mail.gmail.com
Whole thread Raw
In response to Re: Broken primary key after backup restore.  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: Broken primary key after backup restore.
List pgsql-general
1) 
In Production, I have a DB2 which is replicated partially using Londiste
from DB1.

Well I think the above needs more explanation to help understand how the DB2 backup got into this state and possibly prevent it in the future.

A: So, the DB1 has several schemas in the database. We use Londiste to replicate just one of the schemas to DB2.  The table in question is in that schema. Backup is done on both DB1 and DB2.

2) 
I make file-system backups nightly on both DBs.

How is that done exactly?

A: To backup:

1) pg_start_backup()
2) tar up the files under the data directory
3) pg_stop_backup()

To restore on test server:

1) Just untar the tar ball, then start up Postgres. Of course the data directory is empty beforehand.

This has been working for almost 2 years without any problem until last Monday. I remember that I just ran vacuum analyze that table on both DB1 and DB2  that morning. But, I don't think that it harms anything.

Thanks


On Thu, Sep 17, 2015 at 4:53 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 09/17/2015 04:31 PM, Michael Chau wrote:
Hi,

In Production, I have a DB2 which is replicated partially using Londiste
from DB1.

Well I think the above needs more explanation to help understand how the DB2 backup got into this state and possibly prevent it in the future.


I make file-system backups nightly on both DBs.

How is that done exactly?


Last Monday, when I restored the backup made from DB2 to a test server,
Postgres(9.3.5) started up fine. But, I found out that the primary key
of one of the tables is broken

# select * from <mytable> order by id desc;
ERROR:  could not find left sibling of block 17392 in index "mytable_pkey"

I am able to select without using the id index. On Prod DB1 , DB2 and on
another test server restored from backup made from DB1, there is no
problem, as I am able to select the table with and without index.

Did you restore to the DB2 derived test server in the same way as you did the other servers?



The table has 5 million rows. And I run Vacuum Analyze once a week.

1) To fix the above error, I tried to run vacuum full on the table and
run 'reindex table <mytable>;. But it didn't help as the reindexing has
taken very very long time and not sure if it has finished or just timed out.

There is also a suggestion to recreate the primary key constraint
concurrently which I will look into later.

2) However, my main concern right now is whether there is any corruption
in the Prods table as it does look fine. Is there any way to check? And
also should we trust a file-system backup in this case?

Thanks








--
Adrian Klaver
adrian.klaver@aklaver.com



--
Michael Chau
Database Administrator
GAME GOLF
77 Geary St, 5th floor
San Francisco, CA 94108
e) michael.chau@gameyourgame.com
t) @GAMEGOLF

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Broken primary key after backup restore.
Next
From: Jerry Sievers
Date:
Subject: Re: Hiding name and version