Thread: Postgresql 9.1 replication failing

Postgresql 9.1 replication failing

From
Jim Buttafuoco
Date:
All,

I have a large PG 9.1.1 server (over 1TB of data) and a replica using log shipping.  I had some hardware issues on the
replica system and now I am getting the following in my pg_log/* files.  Same 2 lines over and over since yesterday.

2011-12-01 07:46:30 EST  >LOG:  restored log file "000000010000028E000000E5" from archive
2011-12-01 07:46:30 EST  >LOG:  incorrect resource manager data checksum in record at 28E/E555E1B8

Anything I can do on the replica or do I have to start over?

Finally, I know this is not the correct list, I tried general with no answer.

Thanks
Jim
___________________________________________________________

Jim Buttafuoco
jim@contacttelecom.com
603-647-7170 ext. 2222- Office
603-490-3409 - Cell
jimbuttafuoco - Skype

Re: Postgresql 9.1 replication failing

From
Robert Haas
Date:
On Thu, Dec 1, 2011 at 1:41 PM, Jim Buttafuoco <jim@contacttelecom.com> wrote:
> 2011-12-01 07:46:30 EST  >LOG:  restored log file "000000010000028E000000E5" from archive
> 2011-12-01 07:46:30 EST  >LOG:  incorrect resource manager data checksum in record at 28E/E555E1B8
>
> Anything I can do on the replica or do I have to start over?

I think you want to rebuild the standby.  Even if you could repair the
damaged WAL record, how can you have any confidence that there is no
other corruption?

Note that rsync has some options to only copy the changed data, which
might greatly accelerate resyncing the standby from the master.
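
A rough sketch of such a resync, for illustration only. The host name,
data directory paths and backup label are placeholders, and it assumes
the standby server is stopped and the copy is bracketed by
pg_start_backup()/pg_stop_backup() on the master. The helper only builds
the commands so they can be reviewed before anything is run.

```python
import shlex

def resync_commands(master_pgdata, standby_host, standby_pgdata, label="resync"):
    """Build (but do not run) the commands for an rsync-based standby rebuild."""
    # Put the master into backup mode so the copied files can be recovered consistently.
    start = ["psql", "-c", "SELECT pg_start_backup('%s', true);" % label]
    rsync = [
        "rsync", "-a", "--delete",
        "--exclude=postmaster.pid",   # belongs to the running master only
        "--exclude=recovery.conf",    # keep the standby's own recovery settings
        "--exclude=pg_xlog/*",        # WAL is replayed from the archive instead
        master_pgdata.rstrip("/") + "/",
        "%s:%s/" % (standby_host, standby_pgdata.rstrip("/")),
    ]
    stop = ["psql", "-c", "SELECT pg_stop_backup();"]
    return [start, rsync, stop]

# Placeholder paths and host name; print the commands for review.
for cmd in resync_commands("/var/lib/pgsql/9.1/data",
                           "standby.example.com",
                           "/var/lib/pgsql/9.1/data"):
    print(" ".join(shlex.quote(part) for part in cmd))
```

By default rsync decides what to skip from file size and modification
time. After a hardware fault on the standby, adding --checksum makes it
compare file contents instead, at the cost of reading every file on both
sides.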

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Postgresql 9.1 replication failing

From
Jerry Sievers
Date:
Jim Buttafuoco <jim@contacttelecom.com> writes:

> All,
>
> I have a large PG 9.1.1 server (over 1TB of data) and replica using log shipping.  I had some hardware issues on the
> replica system and now I am getting the following in my pg_log/* files.  Same 2 lines over and over since yesterday.
>
> 2011-12-01 07:46:30 EST  >LOG:  restored log file "000000010000028E000000E5" from archive
> 2011-12-01 07:46:30 EST  >LOG:  incorrect resource manager data checksum in record at 28E/E555E1B8
>
> Anything I can do on the replica or do I have to start over?

Inspect that WAL segment, or possibly the one immediately following it,
in comparison to another copy if you still have it on the master or a
central WAL repository.
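
To see which segment file a record position falls in, the position from
the log can be mapped to a file name. This is a small sketch that assumes
the default 16MB segment size and timeline 1 (the timeline shown in the
restored file name); the function name is made up for illustration.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default segment size; an assumption here

def segment_for_lsn(lsn, timeline=1):
    """Return (segment file name, byte offset inside it) for an LSN like '28E/E555E1B8'."""
    log_hex, off_hex = lsn.split("/")
    log_id = int(log_hex, 16)     # high part of the position
    offset = int(off_hex, 16)     # byte offset within that log id
    seg_no = offset // WAL_SEGMENT_SIZE
    # File name is timeline, log id and segment number, each as 8 hex digits.
    name = "%08X%08X%08X" % (timeline, log_id, seg_no)
    return name, offset % WAL_SEGMENT_SIZE

name, pos = segment_for_lsn("28E/E555E1B8")
print(name, hex(pos))  # prints: 000000010000028E000000E5 0x55e1b8
```

So the failing record sits inside the segment that was just restored,
about a third of the way into the file.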

A standby crashing while copying in a WAL segment and/or syncing
one to disk could result in random corruption.

If you have another copy of the segment and it does not compare equal to
the one your standby is trying to read, try that other copy.
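
One way to do that comparison is to hash both files. This is only a
sketch; the function names are invented for this example, and a plain
cmp(1) on the two files would do the same job.

```python
import hashlib

def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so a 16MB segment is never held in memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_segment(path_a, path_b):
    """True if the two copies of a WAL segment are byte-for-byte identical."""
    return file_digest(path_a) == file_digest(path_b)
```

Calling same_segment() with the standby's copy of
000000010000028E000000E5 and the copy from the archive or master (wherever
those live on your systems) returns False if the standby's copy was
damaged in transit or on disk.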

> Finally, I know this is not the correct list, I tried general with no answer.

The admin list is the right one for such a post probably.

HTH

> Thanks
> Jim

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 305.321.1144


Re: Postgresql 9.1 replication failing

From
Jim Buttafuoco
Date:
The WAL file on the master is long gone, so how would one inspect the WAL segment?  Any way to have PG "move" on?



Re: Postgresql 9.1 replication failing

From
Simon Riggs
Date:
On Thu, Dec 1, 2011 at 7:09 PM, Jim Buttafuoco <jim@contacttelecom.com> wrote:
> the WAL file on the master is long gone, how would one inspect the WAL segment?  Any way to have PG "move" on?

Regenerate the master.

 
--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Postgresql 9.1 replication failing

From
Simon Riggs
Date:
On Thu, Dec 1, 2011 at 9:08 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, Dec 1, 2011 at 7:09 PM, Jim Buttafuoco <jim@contacttelecom.com>
> wrote:
>>
>> the WAL file on the master is long gone, how would one inspect the WAL
>> segment?  Any way to have PG "move" on?
>
>
> Regenerate the master.

typo: regenerate *from* the master

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgresql 9.1 replication failing

From
Jim Buttafuoco
Date:
Simon,

What do you mean, start over with a base backup?

Jim


Re: Postgresql 9.1 replication failing

From
desmodemone
Date:
Hello Jim,

I think you have no other option: if the archived WAL is corrupted and there is no way to restore it,
you need to recreate the standby starting from a base backup.
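
As an illustration of what that rebuild could look like on 9.1, the
sketch below builds a pg_basebackup command and a minimal recovery.conf
for a log-shipping standby. The host name, user, directories and the
cp-based restore_command are placeholders, not taken from Jim's setup,
and pg_basebackup also needs a replication connection to be allowed on
the master (max_wal_senders and a replication entry in pg_hba.conf).

```python
def basebackup_command(master_host, target_dir, user="postgres"):
    """Build (but do not run) a pg_basebackup call that copies the whole cluster."""
    # -x includes the WAL needed to make the copy consistent; -P shows progress.
    return ["pg_basebackup", "-h", master_host, "-U", user,
            "-D", target_dir, "-P", "-x"]

def recovery_conf(archive_dir):
    """Contents of a minimal recovery.conf for a standby replaying archived WAL."""
    return (
        "standby_mode = 'on'\n"
        "restore_command = 'cp {0}/%f \"%p\"'\n".format(archive_dir)
    )

# Placeholder values; print both pieces for review.
print(" ".join(basebackup_command("master.example.com", "/var/lib/pgsql/9.1/data")))
print(recovery_conf("/mnt/wal_archive"))
```

Unlike rsync, pg_basebackup always copies everything, so with over 1TB
of data the transfer time is worth estimating first.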

Kind Regards

