Re: Recovery inconsistencies, standby much larger than primary - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Recovery inconsistencies, standby much larger than primary
Date	February 1, 2014 00:11:58
Msg-id	21858.1391202711@sss.pgh.pa.us
In response to	Re: Recovery inconsistencies, standby much larger than primary (Greg Stark <stark@mit.edu>)
Responses	Re: Recovery inconsistencies, standby much larger than primary (Greg Stark <stark@mit.edu>)
List	pgsql-hackers

Greg Stark <stark@mit.edu> writes:
> One thing I keep coming back to is a bad RAM chip setting a bit in the
> block number. But I just can't seem to get it to add up. The difference is
> not a power of two, it has happened on two different machines, and we don't
> see other weirdness on the machine. It seems like a strange coincidence it
> would happen to the same variable twice and not to other variables.

I also looked at the bit patterns for the two block numbers, and couldn't
detect any relationship.
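
For anyone who wants to repeat that kind of check, it amounts to roughly
the following sketch (Python). The helper names and the block numbers used
below are made up for illustration; they are not the values from the
affected relation. A single flipped bit would leave the XOR of the two
numbers with exactly one bit set.

```python
def single_bit_flip(a: int, b: int) -> bool:
    """Return True if a and b differ in exactly one bit position."""
    diff = a ^ b
    # A nonzero value with exactly one bit set satisfies diff & (diff - 1) == 0.
    return diff != 0 and (diff & (diff - 1)) == 0


def differing_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# Hypothetical block numbers, chosen only to exercise the helpers.
expected_block = 1000
flipped_block = expected_block ^ (1 << 20)   # one bit flipped
unrelated_block = 1003                       # differs in two low bits

print(single_bit_flip(expected_block, flipped_block))
print(single_bit_flip(expected_block, unrelated_block))
print(differing_bits(expected_block, unrelated_block))
```

If the list of differing positions has more than one entry, a single-bit
memory error alone does not explain the bad value.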

> Unless there's some unrelated code writing through a wild pointer, possibly
> to a stack allocated object that just happens to often be that variable?

Yeah, I'd been wondering if the WAL record somehow got corrupted while
in memory (presumably after being CRC-checked).  It's a bit hard to see
how though.
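
To illustrate why the CRC would not catch this, here is a toy sketch
(Python). It is not PostgreSQL's WAL code: the record layout is invented,
and zlib.crc32 is only a stand-in for the real WAL checksum. The point is
just that a check done once at read time says nothing about bytes that
are overwritten afterwards.

```python
import struct
import zlib


def make_record(block_no: int) -> bytes:
    """Build a toy record: 4-byte block number followed by its CRC-32."""
    payload = struct.pack("<I", block_no)
    return payload + struct.pack("<I", zlib.crc32(payload))


def crc_ok(record: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored value."""
    payload = record[:-4]
    stored = struct.unpack("<I", record[-4:])[0]
    return zlib.crc32(payload) == stored


# Hypothetical block number; the buffer is mutable to mimic memory.
record = bytearray(make_record(1000))

checked = crc_ok(bytes(record))   # validation happens once, here

# Hypothetical stray write (wild pointer, bad RAM) after the check.
record[2] ^= 0x10

# Replay would now use whatever is in memory, not what was validated.
block_no = struct.unpack("<I", bytes(record[:4]))[0]
print(checked, block_no, crc_ok(bytes(record)))
```

Re-running the check after the stray write fails, but nothing in this toy
flow re-runs it, which is the scenario being speculated about.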

Are all the bloated-on-the-slave relations indexes?  I think the most
fruitful thing to do at this point is to try to isolate the bloating
events for the other affected rels as you've done for this one.
Maybe we'll see a pattern.
        regards, tom lane
