Thread: Interrupted streaming replication

Interrupted streaming replication

From
Tony Nelson
Date:

Hi all,

 

I have a small problem of my own making.  I have a simple streaming replication setup, one master and two slaves.  One slave is in the same rack as the master, the other is across a “slow” link in another server room.

 

Last night I executed some updates that caused a lot of changes.  The slave in the same room did worked just fine, and the slave across the slow link was working fine, applying logs.

 

Then my backups executed, and moved all of the WAL archives, to an archive directory.

 

The good news is, I *think* I have them all.

 

The slave is waiting:

 

postgres 12562 12561  0 Oct16 ?        00:11:07 postgres: startup process   waiting for 00000001000000890000002C

 

And I definitely have the file in my archive directory on the master. 

 

The master is logging this error:

 

2015-12-07 08:24:50 EST FATAL:  requested WAL segment 00000001000000890000002C has already been removed

 

Can I simply copy the file from my archive directory back to the WAL directory?

 

Thanks in advance

Tony Nelson



Since 1982, Starpoint Solutions has been a trusted source of human capital and solutions. We are committed to our clients, employees, environment, community and social concerns. We foster an inclusive culture based on trust, respect, honesty and solid performance. Learn more about Starpoint and our social responsibility at http://www.starpoint.com/social_responsibility


This email message from Starpoint Solutions LLC is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. Opinions, conclusions and other information in this message that do not relate to the official business of Starpoint Solutions shall be understood as neither given nor endorsed by it.

Re: Interrupted streaming replication

From
Andreas Kretschmer
Date:
Tony Nelson <tnelson@starpoint.com> wrote:

> 2015-12-07 08:24:50 EST FATAL:  requested WAL segment 00000001000000890000002C
> has already been removed
>
>
>
> Can I simply copy the file from my archive directory back to the WAL directory?

I'm afraid that won't work, because of the master knows nothing about
this copy and can't stream its content.

Maybe you can create/update your recovery.conf, see restore_command.


for the future: increase wal_keep_segments, or better: use replication
slots (since 9.4 a really good feature to prevent such errors).





Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°


Re: Interrupted streaming replication

From
Tony Nelson
Date:

> -----Original Message-----
> Tony Nelson <tnelson@starpoint.com> wrote:
>
> > 2015-12-07 08:24:50 EST FATAL:  requested WAL segment
> > 00000001000000890000002C has already been removed
> >
> >
> >
> > Can I simply copy the file from my archive directory back to the WAL
> directory?
>
> I'm afraid that won't work, because of the master knows nothing about this
> copy and can't stream its content.
>
> Maybe you can create/update your recovery.conf, see restore_command.
>
>
> for the future: increase wal_keep_segments, or better: use replication slots
> (since 9.4 a really good feature to prevent such errors).
>
>

Thank you.  I'm still running 9.1, but I'll look into 9.4.  Fortunately my database is small so doing a new base backup
isnot the end of the world.  It's been a long time since I've set this up, so I'm going to have to go through a
refreshercourse on exactly what I need to do. 

Thanks for your help.

Tony

Since 1982, Starpoint Solutions has been a trusted source of human capital and solutions. We are committed to our
clients,employees, environment, community and social concerns.  We foster an inclusive culture based on trust, respect,
honestyand solid performance. Learn more about Starpoint and our social responsibility at
http://www.starpoint.com/social_responsibility

This email message from Starpoint Solutions LLC is for the sole use of  the intended recipient(s) and may contain
confidentialand privileged  information.  Any unauthorized review, use, disclosure or distribution is prohibited.  If
youare not the intended recipient, please contact the sender by reply email and destroy all copies of the original
message. Opinions, conclusions and other information in this message that do not relate to the official business of
StarpointSolutions shall be understood as neither given nor endorsed by it. 


Re: Interrupted streaming replication

From
Melvin Davidson
Date:
As long as you have 00000001000000890000002C and the subsequent WALs in your archive directory, then you should
be able to simply scp them to you second slave's pg_xlog directory.


On Mon, Dec 7, 2015 at 8:50 AM, Tony Nelson <tnelson@starpoint.com> wrote:


> -----Original Message-----
> Tony Nelson <tnelson@starpoint.com> wrote:
>
> > 2015-12-07 08:24:50 EST FATAL:  requested WAL segment
> > 00000001000000890000002C has already been removed
> >
> >
> >
> > Can I simply copy the file from my archive directory back to the WAL
> directory?
>
> I'm afraid that won't work, because of the master knows nothing about this
> copy and can't stream its content.
>
> Maybe you can create/update your recovery.conf, see restore_command.
>
>
> for the future: increase wal_keep_segments, or better: use replication slots
> (since 9.4 a really good feature to prevent such errors).
>
>

Thank you.  I'm still running 9.1, but I'll look into 9.4.  Fortunately my database is small so doing a new base backup is not the end of the world.  It's been a long time since I've set this up, so I'm going to have to go through a refresher course on exactly what I need to do.

Thanks for your help.

Tony

Since 1982, Starpoint Solutions has been a trusted source of human capital and solutions. We are committed to our clients, employees, environment, community and social concerns.  We foster an inclusive culture based on trust, respect, honesty and solid performance. Learn more about Starpoint and our social responsibility at http://www.starpoint.com/social_responsibility

This email message from Starpoint Solutions LLC is for the sole use of  the intended recipient(s) and may contain confidential and privileged  information.  Any unauthorized review, use, disclosure or distribution is prohibited.  If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.  Opinions, conclusions and other information in this message that do not relate to the official business of Starpoint Solutions shall be understood as neither given nor endorsed by it.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: Interrupted streaming replication

From
Adrian Klaver
Date:
On 12/07/2015 05:50 AM, Tony Nelson wrote:
>
>
>> -----Original Message-----
>> Tony Nelson <tnelson@starpoint.com> wrote:
>>
>>> 2015-12-07 08:24:50 EST FATAL:  requested WAL segment
>>> 00000001000000890000002C has already been removed
>>>
>>>
>>>
>>> Can I simply copy the file from my archive directory back to the WAL
>> directory?
>>
>> I'm afraid that won't work, because of the master knows nothing about this
>> copy and can't stream its content.
>>
>> Maybe you can create/update your recovery.conf, see restore_command.
>>
>>
>> for the future: increase wal_keep_segments, or better: use replication slots
>> (since 9.4 a really good feature to prevent such errors).
>>
>>
>
> Thank you.  I'm still running 9.1, but I'll look into 9.4.  Fortunately my database is small so doing a new base
backupis not the end of the world.  It's been a long time since I've set this up, so I'm going to have to go through a
refreshercourse on exactly what I need to do. 

I have not done it, but is seems to me given that you have an WAL
archive you could enable archive recovery on the stalled slave and
restart it:

http://www.postgresql.org/docs/9.1/interactive/archive-recovery-settings.html

>
> Thanks for your help.
>
> Tony
>
> Since 1982, Starpoint Solutions has been a trusted source of human capital and solutions. We are committed to our
clients,employees, environment, community and social concerns.  We foster an inclusive culture based on trust, respect,
honestyand solid performance. Learn more about Starpoint and our social responsibility at
http://www.starpoint.com/social_responsibility
>
> This email message from Starpoint Solutions LLC is for the sole use of  the intended recipient(s) and may contain
confidentialand privileged  information.  Any unauthorized review, use, disclosure or distribution is prohibited.  If
youare not the intended recipient, please contact the sender by reply email and destroy all copies of the original
message. Opinions, conclusions and other information in this message that do not relate to the official business of
StarpointSolutions shall be understood as neither given nor endorsed by it. 
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Interrupted streaming replication

From
Tony Nelson
Date:

Thank you very much, this is working perfectly.  I copied a couple over and the server applied them, and is waiting for the next set.

 

Tony

 

From: Melvin Davidson [mailto:melvin6925@gmail.com]
Sent: Monday, December 07, 2015 8:59 AM
To: Tony Nelson
Cc: Andreas Kretschmer; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Interrupted streaming replication

 

As long as you have 00000001000000890000002C and the subsequent WALs in your archive directory, then you should
be able to simply scp them to you second slave's pg_xlog directory.

 

On Mon, Dec 7, 2015 at 8:50 AM, Tony Nelson <tnelson@starpoint.com> wrote:



> -----Original Message-----
> Tony Nelson <tnelson@starpoint.com> wrote:
>
> > 2015-12-07 08:24:50 EST FATAL:  requested WAL segment
> > 00000001000000890000002C has already been removed
> >
> >
> >
> > Can I simply copy the file from my archive directory back to the WAL
> directory?
>
> I'm afraid that won't work, because of the master knows nothing about this
> copy and can't stream its content.
>
> Maybe you can create/update your recovery.conf, see restore_command.
>
>
> for the future: increase wal_keep_segments, or better: use replication slots
> (since 9.4 a really good feature to prevent such errors).
>
>

Thank you.  I'm still running 9.1, but I'll look into 9.4.  Fortunately my database is small so doing a new base backup is not the end of the world.  It's been a long time since I've set this up, so I'm going to have to go through a refresher course on exactly what I need to do.

Thanks for your help.

Tony

Since 1982, Starpoint Solutions has been a trusted source of human capital and solutions. We are committed to our clients, employees, environment, community and social concerns.  We foster an inclusive culture based on trust, respect, honesty and solid performance. Learn more about Starpoint and our social responsibility at http://www.starpoint.com/social_responsibility

This email message from Starpoint Solutions LLC is for the sole use of  the intended recipient(s) and may contain confidential and privileged  information.  Any unauthorized review, use, disclosure or distribution is prohibited.  If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.  Opinions, conclusions and other information in this message that do not relate to the official business of Starpoint Solutions shall be understood as neither given nor endorsed by it.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




--

Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.
Image removed by sender.



Since 1982, Starpoint Solutions has been a trusted source of human capital and solutions. We are committed to our clients, employees, environment, community and social concerns. We foster an inclusive culture based on trust, respect, honesty and solid performance. Learn more about Starpoint and our social responsibility at http://www.starpoint.com/social_responsibility


This email message from Starpoint Solutions LLC is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. Opinions, conclusions and other information in this message that do not relate to the official business of Starpoint Solutions shall be understood as neither given nor endorsed by it.
Attachment

Re: Interrupted streaming replication

From
Andreas Kretschmer
Date:
Melvin Davidson <melvin6925@gmail.com> wrote:

> As long as you have 00000001000000890000002C and the subsequent WALs in your
> archive directory, then you should
> be able to simply scp them to you second slave's pg_xlog directory.

Nice idea ;-)

wasn't sure if that works, but yes, nice.


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°


Re: Interrupted streaming replication

From
Adrian Klaver
Date:
On 12/07/2015 06:04 AM, Tony Nelson wrote:
> Thank you very much, this is working perfectly.  I copied a couple over
> and the server applied them, and is waiting for the next set.

To prevent this sort of thing in the future I would still take a look at
the restore_command, for the following reason:

http://www.postgresql.org/docs/9.1/interactive/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based continuous
archiving, you have to set wal_keep_segments in the master to a value
high enough to ensure that old WAL segments are not recycled too early,
while the standby might still need them to catch up. If the standby
falls behind too much, it needs to be reinitialized from a new base
backup. If you set up a WAL archive that's accessible from the standby,
wal_keep_segments is not required as the standby can always use the
archive to catch up."

A belt and suspenders approach where streaming is the default and WAL
archive recovery is the backup.

>
> Tony
>
> *From:*Melvin Davidson [mailto:melvin6925@gmail.com]
> *Sent:* Monday, December 07, 2015 8:59 AM
> *To:* Tony Nelson
> *Cc:* Andreas Kretschmer; pgsql-general@postgresql.org
> *Subject:* Re: [GENERAL] Interrupted streaming replication
>
> As long as you have 00000001000000890000002C and the subsequent WALs in
> your archive directory, then you should
> be able to simply scp them to you second slave's pg_xlog directory.
>
> On Mon, Dec 7, 2015 at 8:50 AM, Tony Nelson <tnelson@starpoint.com
> <mailto:tnelson@starpoint.com>> wrote:
>
>
>
>  > -----Original Message-----
>  > Tony Nelson <tnelson@starpoint.com <mailto:tnelson@starpoint.com>> wrote:
>  >
>  > > 2015-12-07 08:24:50 EST FATAL:  requested WAL segment
>  > > 00000001000000890000002C has already been removed
>  > >
>  > >
>  > >
>  > > Can I simply copy the file from my archive directory back to the WAL
>  > directory?
>  >
>  > I'm afraid that won't work, because of the master knows nothing about
> this
>  > copy and can't stream its content.
>  >
>  > Maybe you can create/update your recovery.conf, see restore_command.
>  >
>  >
>  > for the future: increase wal_keep_segments, or better: use
> replication slots
>  > (since 9.4 a really good feature to prevent such errors).
>  >
>  >
>
> Thank you.  I'm still running 9.1, but I'll look into 9.4.  Fortunately
> my database is small so doing a new base backup is not the end of the
> world.  It's been a long time since I've set this up, so I'm going to
> have to go through a refresher course on exactly what I need to do.
>
> Thanks for your help.
>
> Tony
>
> Since 1982, Starpoint Solutions has been a trusted source of human
> capital and solutions. We are committed to our clients, employees,
> environment, community and social concerns.  We foster an inclusive
> culture based on trust, respect, honesty and solid performance. Learn
> more about Starpoint and our social responsibility at
> http://www.starpoint.com/social_responsibility
>
> This email message from Starpoint Solutions LLC is for the sole use of
> the intended recipient(s) and may contain confidential and privileged
> information.  Any unauthorized review, use, disclosure or distribution
> is prohibited.  If you are not the intended recipient, please contact
> the sender by reply email and destroy all copies of the original
> message.  Opinions, conclusions and other information in this message
> that do not relate to the official business of Starpoint Solutions shall
> be understood as neither given nor endorsed by it.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org
> <mailto:pgsql-general@postgresql.org>)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>
>
>
> --
>
> *Melvin Davidson*
> *I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you. Image removed by sender.*
>
>
> ------------------------------------------------------------------------
> Since 1982, Starpoint Solutions has been a trusted source of human
> capital and solutions. We are committed to our clients, employees,
> environment, community and social concerns. We foster an inclusive
> culture based on trust, respect, honesty and solid performance. Learn
> more about Starpoint and our social responsibility at
> http://www.starpoint.com/social_responsibility
>
> ------------------------------------------------------------------------
> This email message from Starpoint Solutions LLC is for the sole use of
> the intended recipient(s) and may contain confidential and privileged
> information. Any unauthorized review, use, disclosure or distribution is
> prohibited. If you are not the intended recipient, please contact the
> sender by reply email and destroy all copies of the original message.
> Opinions, conclusions and other information in this message that do not
> relate to the official business of Starpoint Solutions shall be
> understood as neither given nor endorsed by it.


--
Adrian Klaver
adrian.klaver@aklaver.com