Home > mailing lists

Thread: Cascade Standby - HELP!

Cascade Standby - HELP!

From

Murthy Nunna

Date:

02 April 2014, 19:10:08

I am not sure if I am hitting a bug… I am running 9.2.4

I created cascade standby as follows:

Master->cascading-standby->standby

I configured streaming replication. At first I thought everything worked. I created some data in master and went to cascading-standby and from there to standby. So, I thought it worked.

And then, I shutdown standby (the last one in the chain) and created data load in master for several hours. Everything got replicated to cascading-standby. However, when I restarted the standby (the one I shut down before), I got an error saying

“[2014-04-02 12:55:49 CDT][][]FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001C00000013 has already been removed”

So, everything worked in standby as long as the logs were available in pg_xlog area of cascading-standby. Once they are removed from pg_xlog, the streaming broke (this is expected).

Is there a way I can make cascading-standby send every log it gets from master down to standby server before it gets cleaned up by archive_cleanup_command? Or some other way??? (I don’t want master send logs to both slaves. That will be defeating the purpose of cascade set up.)

Thanks for your help.

Murthy

Re: Cascade Standby - HELP!

From

"Burgholzer, Robert (DEQ)"

Date:

02 April 2014, 20:29:06

Murthy,

I have experienced something similar to this issue, and recall that it had to do with log file difficulties. I tried my best to document this, but recall that I had problems working back through it, so please, note that this is a procedure that carries no guarantees. Note that the step-by-step calls a couple of custom scripts that I wrote (they are at the end of the page), AND this is on an 8.3 postgresql install:

http://deq3.bse.vt.edu/sifnwiki/index.php/Db_warm_standby-rescue

Regards,

/r/b

Re: Cascade Standby - HELP!

From

Murthy Nunna

Date:

02 April 2014, 20:35:22

Hi Robert,

Thanks for the response. Mine is 9.2.4 and the issue is specifically with cascading standby. I don’t think 8.3 had that feature.

Murthy

From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Burgholzer, Robert (DEQ)
Sent: Wednesday, April 02, 2014 3:29 PM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Cascade Standby - HELP!

Murthy,

http://deq3.bse.vt.edu/sifnwiki/index.php/Db_warm_standby-rescue

Regards,

/r/b

Re: Cascade Standby - HELP!

From

Scott Whitney

Date:

02 April 2014, 20:48:46

My only response that might help you with this is that my setup is:

Server 1) primary production db server (5432)

Server 2) replicates production (5432) also primary for non-production use (6432)

Server 3) (offsite) replicates #2 (5432 -> 7432 internally and 6432 -> 6432) AND hosts internal databases

So:

1) Production

2) Replication + non production

3) Internal + replication of NON production + replication of PRODUCTION (via the cascaded server)

I would find that when replicating the PRODUCTION server internally (#3), I had not set wal_keep_segments high enough,

due to the lag of WAN replication, production and non production would replicate fine (gigabit backend), but sometimes when

a load of data was being updated/moved/deleted, the offsite (#3) would fall behind far enough that it could not recover

without a new pg base backup, in essence setting up replication again.

I solved this by upping the wal_keep_segments.

Hi Robert,

Thanks for the response. Mine is 9.2.4 and the issue is specifically with cascading standby. I don’t think 8.3 had that feature.

Murthy

From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Burgholzer, Robert (DEQ)
Sent: Wednesday, April 02, 2014 3:29 PM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Cascade Standby - HELP!

Murthy,
I have experienced something similar to this issue, and recall that it had to do with log file difficulties. I tried my best to document this, but recall that I had problems working back through it, so please, note that this is a procedure that carries no guarantees. Note that the step-by-step calls a couple of custom scripts that I wrote (they are at the end of the page), AND this is on an 8.3 postgresql install:

http://deq3.bse.vt.edu/sifnwiki/index.php/Db_warm_standby-rescue

Regards,
/r/b

Re: Cascade Standby - HELP!

From

Murthy Nunna

Date:

02 April 2014, 21:07:01

Hi Scott,

Yes, I thought of increasing wal_keep_segments but it doesn’t seem to be a bullet proof solution. It locks that much space and most of the time it is a wasted space since streaming will be working and you really don’t need these logs hanging around.

Besides, things work fine in primary->slave configuration because archive_command ships the logs constantly to slave and if slave is in good hands of streaming it cleans up the logs. It will use it only when it is necessary for gap resolution. However, it doesn’t work from slave->slave configuration because the archive_command has no meaning there while it is recovery mode and it has no logs of its own.

This restriction limits the cascade feature and makes it fragile in my opinion. It could break anytime and you end up rebuilding.

Thanks,

Murthy

From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Scott Whitney
Sent: Wednesday, April 02, 2014 3:49 PM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Cascade Standby - HELP!

My only response that might help you with this is that my setup is:

Server 1) primary production db server (5432)

Server 2) replicates production (5432) also primary for non-production use (6432)

Server 3) (offsite) replicates #2 (5432 -> 7432 internally and 6432 -> 6432) AND hosts internal databases

So:

1) Production

2) Replication + non production

3) Internal + replication of NON production + replication of PRODUCTION (via the cascaded server)

I would find that when replicating the PRODUCTION server internally (#3), I had not set wal_keep_segments high enough,

due to the lag of WAN replication, production and non production would replicate fine (gigabit backend), but sometimes when

a load of data was being updated/moved/deleted, the offsite (#3) would fall behind far enough that it could not recover

without a new pg base backup, in essence setting up replication again.

I solved this by upping the wal_keep_segments.

Hi Robert,

Thanks for the response. Mine is 9.2.4 and the issue is specifically with cascading standby. I don’t think 8.3 had that feature.

Murthy

From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Burgholzer, Robert (DEQ)
Sent: Wednesday, April 02, 2014 3:29 PM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Cascade Standby - HELP!

Murthy,
I have experienced something similar to this issue, and recall that it had to do with log file difficulties. I tried my best to document this, but recall that I had problems working back through it, so please, note that this is a procedure that carries no guarantees. Note that the step-by-step calls a couple of custom scripts that I wrote (they are at the end of the page), AND this is on an 8.3 postgresql install:

http://deq3.bse.vt.edu/sifnwiki/index.php/Db_warm_standby-rescue

Regards,
/r/b