Re: Recovery inconsistencies, standby much larger than primary - Mailing list pgsql-hackers

From	Greg Stark
Subject	Re: Recovery inconsistencies, standby much larger than primary
Date	February 1, 2014 05:33:06
Msg-id	CAM-w4HOnmqPXXRiA+-ojbVmt0-rtAEo0zM47Kea7dV9a0T-W+g@mail.gmail.com
In response to	Re: Recovery inconsistencies, standby much larger than primary (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On Fri, Jan 31, 2014 at 10:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, I'd been wondering if the WAL record somehow got corrupted while
> in memory (presumably after being CRC-checked).  It's a bit hard to see
> how though.

One thing I mentioned early on that bears repeating is that this
instance is 9.1.11.

Also something that occurred to me at 3am -- the "reference to invalid
pages" recovery errors that replayed correctly after the panic might
also explain why the slave seems to operate correctly. It's possible
after the panic it replayed those same records correctly.

> Are all the bloated-on-the-slave relations indexes?  I think the most
> fruitful thing to do at this point is to try to isolate the bloating
> events for the other affected rels as you've done for this one.
> Maybe we'll see a pattern.

I'll poke at those others tomorrow/today. I can also try to bring up a
new standby from the same base backup but it'll take time. It's a
large database. Also the fear I have above is that if I set a recovery
target I might make it miss the bug.

-- 
greg
