Thread: Strange issues with 9.2 pg_basebackup & replication
Doing some beta testing, I managed to produce this issue using the daily snapshot from Tuesday:

1. Created the master server, loaded it with a couple of dummy databases.

2. Created the standby server.

3. Did pg_basebackup -x stream on the standby server.

4. Started the standby server.

5. Realized I'd forgotten to create a recovery.conf. Shut down the standby server, wrote a recovery.conf, and restarted it.

6. The standby server looked normal and appeared to be replicating. The master server showed it as replicating:

postgres=# select * from pg_stat_replication;
 pid  | usesysid |  usename   | application_name |  client_addr   | client_hostname | client_port |         backend_start         |   state   | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
------+----------+------------+------------------+----------------+-----------------+-------------+-------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
 1278 |    16393 | replicator | walreceiver      | ###.###.61.227 |                 |       45391 | 2012-05-13 18:44:18.603122+00 | streaming | 0/70000B8     | 0/70000B8      | 0/70000B8      | 0/70000E0       |             0 | async

7. Did a "create table" on the master server, creating an empty table.

8. Got this fatal error on the standby server:

LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0

... this error message repeated every 5s.

Either the swap of the standby into proper standby mode should have been OK (since there were no writes on the master or the standby in that time), or it should have failed immediately. Clearly there's something broken here.

Note that I did more or less the same test on 9.1, and it didn't break in this way.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
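[Editor's note: for reference, the recovery.conf that was missing in step 5 would be a minimal file along these lines. The host, port, and user here are illustrative placeholders, not taken from Josh's setup.]

```ini
# Minimal 9.2-style recovery.conf for a streaming standby (illustrative values)
standby_mode = 'on'
primary_conninfo = 'host=192.0.2.10 port=5432 user=replicator'
recovery_target_timeline = 'latest'
```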
More issues: the pg_basebackup -x stream on the cascading replica won't complete until the xlog rotates on the master. (Again, this is Tuesday's snapshot.)

Servers:
.226 == master-master, the writeable master
.227 == master-replica, a direct replica of master-master
.228 == replica-replica, a cascading replica of master-replica

1. Recreated master-master, loaded it with some dummy databases.

2. Created master-replica.

3. Took a pg_basebackup -x stream of master-master on master-replica.

4. Edited recovery.conf and started master-replica. It started normally.

5. Created a table on master-master. The change replicated to master-replica.

6. Created replica-replica. Started a pg_basebackup -x stream from master-replica.

7. pg_basebackup hung forever. Output of -v:

xlog start point: 0/A000020
pg_basebackup: starting background WAL receiver
xlog end point: 0/A01C188
pg_basebackup: waiting for background process to finish streaming...

8. Tried creating a table on master-master to generate a write. This had no effect (although the table did replicate to master-replica).

Here's pg_stat_replication on master-master while the basebackup is hung:

 1385 | 16393 | replicator | walreceiver | ###.###.61.227 | | 45396 | 2012-05-13 19:05:11.972471+00 | streaming | 0/A024F50 | 0/A024F50 | 0/A024F50 | 0/A024F50 | 0 | async

Here's pg_stat_replication on master-replica while the basebackup is hung:

 1243 | 16393 | replicator | pg_basebackup | ###.###.61.228 | | 49218 | 2012-05-13 19:06:07.606378+00 | startup   | 0/0       | | | | 0 | async
 1244 | 16393 | replicator | pg_basebackup | ###.###.61.228 | | 49219 | 2012-05-13 19:06:07.611996+00 | streaming | 0/A024F50 | | | | 0 | async

9. Ran pg_switch_xlog() on the master.

10. The basebackup completed on replica-replica.

11. Edited recovery.conf on replica-replica and started it. It works fine.

12. Created a new table on master-master; the change replicated to replica-replica.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
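[Editor's note: the forced WAL rotation in step 9 can be issued from psql on the master; note that the function's actual name in 9.x is pg_switch_xlog().]

```sql
-- Run on the master: closes out the current WAL segment and starts a new
-- one, which is what allowed the hung basebackup to finish in this report.
SELECT pg_switch_xlog();
```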
More issues: promoting an intermediate standby breaks replication.

To be a bit blunt here, has anyone tested cascading replication *at all* before this?

So, same setup as the previous message.

1. Shut down master-master.

2. pg_ctl promote master-replica

3. Replication breaks. Error message on replica-replica:

FATAL:  timeline 2 of the primary does not match recovery target timeline 1

4. No amount of adjustment on replica-replica will get it replicating again.

Note that replica-replica was configured with:

recovery_target_timeline = 'latest'

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 13 May 2012 20:23, Josh Berkus <josh@agliodbs.com> wrote:
> More issues: the pg_basebackup -x stream on the cascading replica won't
> complete until the xlog rotates on the master. (Again, this is
> Tuesday's snapshot.)

This is already on the open items list:
http://wiki.postgresql.org/wiki/PostgreSQL_9.2_Open_Items#pg_basebackup.2Fpg_receivexlog

--
Thom
On May 13, 2012, at 3:08 PM, Josh Berkus wrote:
> More issues: promoting intermediate standby breaks replication.
>
> To be a bit blunt here, has anyone tested cascading replication *at all*
> before this?

Josh, do you have scripts that you're using to do this testing? If so, can you post them somewhere?

AFAIK we don't have any regression tests for all this replication stuff, but ISTM that we need some...

--
Jim C. Nasby, Database Architect       jim@nasby.net
512.569.9461 (cell)                    http://jim.nasby.net
Jim,

I didn't get as far as running any tests, actually. All I did was try to set up 3 servers in cascading replication. Then I tried shutting down master-master and promoting master-replica. That's it.

----- Original Message -----
> On May 13, 2012, at 3:08 PM, Josh Berkus wrote:
> > More issues: promoting intermediate standby breaks replication.
> >
> > To be a bit blunt here, has anyone tested cascading replication *at all*
> > before this?
>
> Josh, do you have scripts that you're using to do this testing? If so
> can you post them somewhere?
>
> AFAIK we don't have any regression tests for all this replication
> stuff, but ISTM that we need some...
> --
> Jim C. Nasby, Database Architect       jim@nasby.net
> 512.569.9461 (cell)                    http://jim.nasby.net
On 13 May 2012 16:08, Josh Berkus <josh@agliodbs.com> wrote:
> More issues: promoting intermediate standby breaks replication.
>
> To be a bit blunt here, has anyone tested cascading replication *at all*
> before this?
>
> So, same setup as previous message.
>
> 1. Shut down master-master.
>
> 2. pg_ctl promote master-replica
>
> 3. Replication breaks. Error message on replica-replica:
>
> FATAL:  timeline 2 of the primary does not match recovery target timeline 1
>
> 4. No amount of adjustment on replica-replica will get it replicating
> again.
>
> Note that replica-replica was configured with:
>
> recovery_target_timeline = 'latest'

I can recreate this "issue", although the docs say:

"Promoting a cascading standby terminates the immediate downstream replication connections which it serves. This is because the timeline becomes different between standbys, and they can no longer continue replication. The affected standby(s) may reconnect to reestablish streaming replication."
(http://www.postgresql.org/docs/9.2/static/warm-standby.html#CASCADING-REPLICATION)

However, this isn't true even when I restart the standby. I've been informed that this should work fine if a WAL archive has been configured (which should be used anyway).

But one new problem I appear to have is that once I set up archiving and restart, then try pg_basebackup, it gets stuck and never shows any progress. If I terminate pg_basebackup in this state and attempt to restart it more times than max_wal_senders, it can no longer run at all, because pg_basebackup didn't disconnect its stream and so ends up using all the senders. And these connections still show up in pg_stat_replication. My theory is that if archiving is enabled, and you restart postgres and then generate enough WAL that there is a file or two in the archive, pg_basebackup can't stream anything. Once I restart the server, it's fine and continues as normal. This has the same symptoms as the "pg_basebackup from running standby with streaming" issue.
Steps to recreate:

1) initdb a new cluster
2) Start the new cluster
3) Make an archive dir (in my case, /tmp/arch) and set the following:

wal_level = hot_standby
max_wal_senders = 3
archive_mode = on
archive_command = 'cp %p /tmp/arch/%f'

4) Set pg_hba.conf to allow streaming replication connections
5) Restart the cluster
6) Create a table and insert a few hundred thousand rows until /tmp/arch shows some WAL files
7) Run: pg_basebackup -x stream -D s1 -Pv

This actually does finish eventually, but it appears to need some encouragement by generating some WAL and issuing a checkpoint:

thom@swift:~/Development$ time pg_basebackup -x stream -D s1 -Pv
xlog start point: 0/4000020
pg_basebackup: starting background WAL receiver
53951/53951 kB (100%), 1/1 tablespace
xlog end point: 0/5DE15E0
pg_basebackup: waiting for background process to finish streaming...
pg_basebackup: base backup completed

real    2m37.456s
user    0m0.016s
sys     0m0.724s

If I terminate pg_basebackup and restart it without generating additional WAL, it doesn't appear to release the streaming connection ever (or at least not within my patience limit of a few minutes). And I can't free these connections without restarting the cluster.

But once I get the standby up and running and acting as a hot standby, and ignore the current issue with it getting stuck creating a standby from a standby, I still get the mismatched-timeline issue, so the addition of WAL archiving didn't appear to resolve this for me.

--
Thom
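[Editor's note: step 4's pg_hba.conf change would be an entry along these lines; the role name and network are illustrative placeholders, not taken from Thom's setup.]

```
# pg_hba.conf: allow the replicator role to make streaming replication
# connections from the standby's network (illustrative values)
host    replication    replicator    192.168.61.0/24    md5
```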
On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
> However, this isn't true when I restart the standby. I've been
> informed that this should work fine if a WAL archive has been
> configured (which should be used anyway).

The WAL archive should be shared by master-replica and replica-replica, and recovery_target_timeline should be set to 'latest' on replica-replica. If you configure it that way, replica-replica will successfully reconnect to master-replica with no need to restart it.

> But one new problem I appear to have is that once I set up archiving
> and restart, then try pg_basebackup, it gets stuck and never shows any
> progress. If I terminate pg_basebackup in this state and attempt to
> restart it more times than max_wal_senders, it can no longer run, as
> pg_basebackup didn't disconnect the stream, so ends up using all
> senders. And these show up in pg_stat_replication. I have a theory
> that if archiving is enabled, restart postgres then generate some WAL
> to the point there is a file or two in the archive, pg_basebackup
> can't stream anything. Once I restart the server, it's fine and
> continues as normal. This has the same symptoms of the "pg_basebackup
> from running standby with streaming" issue.

This seems to be caused by the spread checkpoint which is requested by pg_basebackup. IOW, this looks like normal behavior rather than a bug or an issue. What if you specify the "-c fast" option to pg_basebackup?

Regards,

--
Fujii Masao
On Mon, May 14, 2012 at 4:04 AM, Josh Berkus <josh@agliodbs.com> wrote:
> Doing some beta testing, managed to produce this issue using the daily
> snapshot from Tuesday:
>
> 1. Created master server, loaded it with a couple dummy databases.
>
> 2. Created standby server.
>
> 3. Did pg_basebackup -x stream on standby server
>
> 4. Started standby server.
>
> 5. Realized I'd forgotten to create a recovery.conf. Shut down the
> standby server, wrote a recovery.conf, and restarted it.

Before restarting it, you need to run pg_basebackup and take a base backup onto the standby again. Since you started the standby without a recovery.conf, the series of WAL on the standby has become inconsistent with that on the master. So you need a fresh backup before restarting the standby.

Regards,

--
Fujii Masao
On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>> However, this isn't true when I restart the standby. I've been
>> informed that this should work fine if a WAL archive has been
>> configured (which should be used anyway).
>
> The WAL archive should be shared by master-replica and replica-replica,
> and recovery_target_timeline should be set to latest in replica-replica.
> If you configure that way, replica-replica would successfully reconnect to
> master-replica with no need to restart it.

I had set archive_command on the primary, then produced a base backup which would have copied the archive settings, but I also added a corresponding restore_command setting, so everything was pointing at the same archive.

>> But one new problem I appear to have is that once I set up archiving
>> and restart, then try pg_basebackup, it gets stuck and never shows any
>> progress. If I terminate pg_basebackup in this state and attempt to
>> restart it more times than max_wal_senders, it can no longer run, as
>> pg_basebackup didn't disconnect the stream, so ends up using all
>> senders. And these show up in pg_stat_replication. I have a theory
>> that if archiving is enabled, restart postgres then generate some WAL
>> to the point there is a file or two in the archive, pg_basebackup
>> can't stream anything. Once I restart the server, it's fine and
>> continues as normal. This has the same symptoms of the "pg_basebackup
>> from running standby with streaming" issue.
>
> This seems to be caused by spread checkpoint which is requested by
> pg_basebackup. IOW, this looks a normal behavior rather than a bug
> or an issue. What if you specify "-c fast" option in pg_basebackup?

Yes, it works fine with that option. And it appears this isn't to do with there being an archive, as I get the same symptoms without setting one up.

But in any case, shouldn't the replication connection be terminated when pg_basebackup is terminated?

--
Thom
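[Editor's note: for reference, the fast-checkpoint invocation discussed above would look like this; the -D target directory is the one from Thom's earlier repro steps.]

```
# Request an immediate checkpoint instead of a spread one, so the base
# backup starts without waiting out checkpoint_timeout:
pg_basebackup -x stream -c fast -D s1 -Pv
```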
Fujii,

Wait, are you telling me that we *still* can't remaster from streaming replication? Why wasn't that fixed in 9.2?

And: if we still have to ship logs, what's the point in even having cascading replication?

----- Original Message -----
> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
> > However, this isn't true when I restart the standby. I've been
> > informed that this should work fine if a WAL archive has been
> > configured (which should be used anyway).
>
> The WAL archive should be shared by master-replica and replica-replica,
> and recovery_target_timeline should be set to latest in replica-replica.
> If you configure that way, replica-replica would successfully reconnect
> to master-replica with no need to restart it.
>
> > But one new problem I appear to have is that once I set up archiving
> > and restart, then try pg_basebackup, it gets stuck and never shows any
> > progress. If I terminate pg_basebackup in this state and attempt to
> > restart it more times than max_wal_senders, it can no longer run, as
> > pg_basebackup didn't disconnect the stream, so ends up using all
> > senders. And these show up in pg_stat_replication. I have a theory
> > that if archiving is enabled, restart postgres then generate some WAL
> > to the point there is a file or two in the archive, pg_basebackup
> > can't stream anything. Once I restart the server, it's fine and
> > continues as normal. This has the same symptoms of the "pg_basebackup
> > from running standby with streaming" issue.
>
> This seems to be caused by spread checkpoint which is requested by
> pg_basebackup. IOW, this looks a normal behavior rather than a bug
> or an issue. What if you specify "-c fast" option in pg_basebackup?
>
> Regards,
>
> --
> Fujii Masao
> Before restarting it, you need to do pg_basebackup and make a base backup
> onto the standby again. Since you started the standby without recovery.conf,
> a series of WAL in the standby has gotten inconsistent with that in the master.
> So you need a fresh backup to restart the standby.

You're not understanding the bug. The problem is that the standby came up and reported that it was replicating OK, when clearly it wasn't.

--Josh
On Wed, May 16, 2012 at 2:29 AM, Thom Brown <thom@linux.com> wrote:
> On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>>> However, this isn't true when I restart the standby. I've been
>>> informed that this should work fine if a WAL archive has been
>>> configured (which should be used anyway).
>>
>> The WAL archive should be shared by master-replica and replica-replica,
>> and recovery_target_timeline should be set to latest in replica-replica.
>> If you configure that way, replica-replica would successfully reconnect to
>> master-replica with no need to restart it.
>
> I had set archive_command on the primary, then produced a base
> backup which would have copied the archive settings, but I also added
> a corresponding restore_command setting, so everything was pointing
> at the same archive.

Hmm... when I do the same, replica-replica successfully reconnects to master-replica after I shut down master-master and promote master-replica. archive_command is the same on all three servers, restore_command is the same on the two standby servers (i.e., master-replica and replica-replica), and recovery_target_timeline is set to 'latest' on the two standbys.

>>> But one new problem I appear to have is that once I set up archiving
>>> and restart, then try pg_basebackup, it gets stuck and never shows any
>>> progress. If I terminate pg_basebackup in this state and attempt to
>>> restart it more times than max_wal_senders, it can no longer run, as
>>> pg_basebackup didn't disconnect the stream, so ends up using all
>>> senders. And these show up in pg_stat_replication. I have a theory
>>> that if archiving is enabled, restart postgres then generate some WAL
>>> to the point there is a file or two in the archive, pg_basebackup
>>> can't stream anything. Once I restart the server, it's fine and
>>> continues as normal. This has the same symptoms of the "pg_basebackup
>>> from running standby with streaming" issue.
>>
>> This seems to be caused by spread checkpoint which is requested by
>> pg_basebackup. IOW, this looks a normal behavior rather than a bug
>> or an issue. What if you specify "-c fast" option in pg_basebackup?
>
> Yes, it works fine with that option. And it appears this isn't to do
> with there being an archive as I get the same symptoms without setting
> one up.

Yes.

> But in any case, shouldn't the replication connection be
> terminated when pg_basebackup is terminated?

+1. To do this, we would need to define a SIGINT signal handler and make it send a QueryCancel packet when Ctrl-C is typed.

Regards,

--
Fujii Masao
On Wed, May 16, 2012 at 3:42 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> Fujii,
>
> Wait, are you telling me that we *still* can't remaster from streaming replication?

What do you mean by "remaster"?

> And: if we still have to ship logs, what's the point in even having cascading replication?

At least cascading replication (1) allows you to adopt a more flexible configuration of servers, (2) reduces the number of standby servers which connect directly to the master, which reduces the overhead on the master, and (3) provides the infrastructure for the standby-only base backup feature.

Regards,

--
Fujii Masao
On Wed, May 16, 2012 at 3:43 AM, Joshua Berkus <josh@agliodbs.com> wrote:
>> Before restarting it, you need to do pg_basebackup and make a base backup
>> onto the standby again. Since you started the standby without recovery.conf,
>> a series of WAL in the standby has gotten inconsistent with that in the master.
>> So you need a fresh backup to restart the standby.
>
> You're not understanding the bug. The problem is that the standby came up
> and reported that it was replicating OK, when clearly it wasn't.

> 8. Got this fatal error on the standby server:
>
> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
>
> ... this error message repeated every 5s.

According to your first report, ISTM you did get error messages.

Regards,

--
Fujii Masao
On 16 May 2012 11:36, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, May 16, 2012 at 2:29 AM, Thom Brown <thom@linux.com> wrote:
>> On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>>>> However, this isn't true when I restart the standby. I've been
>>>> informed that this should work fine if a WAL archive has been
>>>> configured (which should be used anyway).
>>>
>>> The WAL archive should be shared by master-replica and replica-replica,
>>> and recovery_target_timeline should be set to latest in replica-replica.
>>> If you configure that way, replica-replica would successfully reconnect to
>>> master-replica with no need to restart it.
>>
>> I had set archive_command on the primary, then produced a base
>> backup which would have copied the archive settings, but I also added
>> a corresponding restore_command setting, so everything was pointing
>> at the same archive.
>
> Hmm.. when doing the same, the replica-replica successfully reconnected
> to the master-replica after I shutdown the master-master and promoted the
> master-replica. archive_command is the same in three servers,
> restore_command is the same in two standby servers (i.e., master-replica
> and replica-replica), and recovery_target_timeline is set to 'latest' in two
> standby servers.

I didn't shut down the master-master, but I didn't expect to need to.

I also had recovery_target_timeline set to 'latest'. I also tried explicitly setting it to the new timeline, and got an error saying there was no such timeline.

>> But in any case, shouldn't the replication connection be
>> terminated when pg_basebackup is terminated?
>
> +1. To do this, we would need to define a SIGINT signal handler and make it
> send a QueryCancel packet when Ctrl-C is typed.

Also, could we provide some feedback when using the -c spread option when there isn't progress within a short period of time? Something like "Waiting for checkpoint. This can take up to %checkpoint_timeout%", or something similar, rather than seeing nothing happening and wondering if something has gone wrong. And also a note in the documentation saying that, on "quiet" clusters, it may take some time before the base backup commences.

In fact, since pg_start_backup will exhibit the same behaviour (i.e. no feedback when waiting for a checkpoint), maybe it should return a notice (if there are dirty pages) stating that it will complete when the next checkpoint occurs.

--
Thom
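[Editor's note: on the pg_start_backup point, the function already takes a second boolean argument that requests an immediate checkpoint, which avoids the wait being discussed.]

```
-- The second argument requests an immediate (fast) checkpoint rather than
-- a spread one, so the call returns without waiting for the next
-- scheduled checkpoint.
SELECT pg_start_backup('label', true);
```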
Well, that is a form of testing. :)

My point was that we need some kind of regression tests around all the new replication stuff, and if you had some scripts, they would be a useful starting point. But it sounds like you haven't gotten that far with it, so...

On 5/15/12 10:12 AM, Joshua Berkus wrote:
> Jim,
>
> I didn't get as far as running any tests, actually. All I did was try to
> set up 3 servers in cascading replication. Then I tried shutting down
> master-master and promoting master-replica. That's it.
>
> ----- Original Message -----
>> On May 13, 2012, at 3:08 PM, Josh Berkus wrote:
>>> More issues: promoting intermediate standby breaks replication.
>>>
>>> To be a bit blunt here, has anyone tested cascading replication *at all*
>>> before this?
>>
>> Josh, do you have scripts that you're using to do this testing? If so
>> can you post them somewhere?
>>
>> AFAIK we don't have any regression tests for all this replication
>> stuff, but ISTM that we need some...

--
Jim C. Nasby, Database Architect       jim@nasby.net
512.569.9461 (cell)                    http://jim.nasby.net
On 5/16/12 10:53 AM, Fujii Masao wrote:
> On Wed, May 16, 2012 at 3:43 AM, Joshua Berkus <josh@agliodbs.com> wrote:
>>> Before restarting it, you need to do pg_basebackup and make a base backup
>>> onto the standby again. Since you started the standby without recovery.conf,
>>> a series of WAL in the standby has gotten inconsistent with that in the master.
>>> So you need a fresh backup to restart the standby.
>>
>> You're not understanding the bug. The problem is that the standby came up
>> and reported that it was replicating OK, when clearly it wasn't.
>
>> 8. Got this fatal error on the standby server:
>>
>> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
>> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
>>
>> ... this error message repeated every 5s.
>
> According to your first report, ISTM you got error messages.

Only *after* it was correctly set up.

Josh's point is that if you flub the configuration, you should get an error, which is not what's happening now. Right now it just comes up and acts as if nothing's wrong.

--
Jim C. Nasby, Database Architect       jim@nasby.net
512.569.9461 (cell)                    http://jim.nasby.net
On Thu, May 17, 2012 at 1:07 AM, Thom Brown <thom@linux.com> wrote:
> On 16 May 2012 11:36, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Wed, May 16, 2012 at 2:29 AM, Thom Brown <thom@linux.com> wrote:
>>> On 15 May 2012 13:15, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom@linux.com> wrote:
>>>>> However, this isn't true when I restart the standby. I've been
>>>>> informed that this should work fine if a WAL archive has been
>>>>> configured (which should be used anyway).
>>>>
>>>> The WAL archive should be shared by master-replica and replica-replica,
>>>> and recovery_target_timeline should be set to latest in replica-replica.
>>>> If you configure that way, replica-replica would successfully reconnect to
>>>> master-replica with no need to restart it.
>>>
>>> I had set archive_command on the primary, then produced a base
>>> backup which would have copied the archive settings, but I also added
>>> a corresponding restore_command setting, so everything was pointing
>>> at the same archive.
>>
>> Hmm.. when doing the same, the replica-replica successfully reconnected
>> to the master-replica after I shutdown the master-master and promoted the
>> master-replica. archive_command is the same in three servers,
>> restore_command is the same in two standby servers (i.e., master-replica
>> and replica-replica), and recovery_target_timeline is set to 'latest' in two
>> standby servers.
>
> I didn't shut down the master-master, but I didn't expect to need to.
>
> I also had recovery_target_timeline set to latest. I also tried
> explicitly setting it to the new timeline, and got an error saying
> there was no such timeline.

What did replica-replica do after you got that error? Repeat the error? Emit a PANIC and exit? Get stuck? Successfully reconnect to master-replica? ...

In theory, the timeline gap should be resolved as follows:

1. Promote master-replica, which terminates cascade replication.

2. While replica-replica is repeatedly trying to reconnect to master-replica, it checks the archive for a new timeline history file.

3. As a result of the promotion, master-replica increments its timeline, creates the timeline history file, and archives it.

4. Finally, replica-replica finds the new timeline history file in the archive, adjusts its timeline to the new one, and successfully reconnects to master-replica.

Note that you might see the timeline-mismatch error a few times before replication is successfully restarted, because of the timing.

>>> But in any case, shouldn't the replication connection be
>>> terminated when pg_basebackup is terminated?
>>
>> +1. To do this, we would need to define a SIGINT signal handler and make it
>> send a QueryCancel packet when Ctrl-C is typed.
>
> Also could we provide some feedback when using the -c spread option,
> when there isn't progress within a short period of time? Something
> like "Waiting for checkpoint. This can take up to
> %checkpoint_timeout%", or something similar, rather than seeing
> nothing happening and wondering if something has gone wrong.

+1, at least for the case where the -P option is specified to pg_basebackup.

Regards,

--
Fujii Masao
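[Editor's note: the resolution sequence above can be watched from the shell. After the promotion, the new timeline's history file should land in the shared archive, where the downstream standby's restore_command can pick it up. This assumes the /tmp/arch archive directory from Thom's earlier repro steps.]

```
# After promoting master-replica, look for the new timeline history file
# in the shared WAL archive (the filename is timeline-dependent, e.g.
# 00000002.history for timeline 2):
ls /tmp/arch/*.history
```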
>> And: if we still have to ship logs, what's the point in even having
>> cascading replication?
>
> At least cascading replication (1) allows you to adopt more flexible
> configuration of servers,

I'm just pretty shocked. The last time we talked about this, at the end of the 9.1 development cycle, you almost had remastering using streaming-only replication working; you just ran out of time. Now it appears that you've abandoned working on that completely. What's going on?
Jim, Fujii,

Even more fun:

1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, standby_mode = on).

2) Connect the server to *itself* as a replica.

3) This will work and report success, up until you do your first write.

4) Then ... segfault!

----- Original Message -----
> On 5/16/12 10:53 AM, Fujii Masao wrote:
> > On Wed, May 16, 2012 at 3:43 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> >>> Before restarting it, you need to do pg_basebackup and make a base
> >>> backup onto the standby again. Since you started the standby without
> >>> recovery.conf, a series of WAL in the standby has gotten inconsistent
> >>> with that in the master. So you need a fresh backup to restart the
> >>> standby.
> >>
> >> You're not understanding the bug. The problem is that the standby
> >> came up and reported that it was replicating OK, when clearly it
> >> wasn't.
> >
> >> 8. Got this fatal error on the standby server:
> >>
> >> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
> >> LOG:  record with incorrect prev-link 0/70000B8 at 0/70000E0
> >>
> >> ... this error message repeated every 5s.
> >
> > According to your first report, ISTM you got error messages.
>
> Only *after* it was correctly set up.
>
> Josh's point is that if you flub the configuration, you should get an
> error, which is not what's happening now. Right now it just comes up
> and acts as if nothing's wrong.
> --
> Jim C. Nasby, Database Architect       jim@nasby.net
> 512.569.9461 (cell)                    http://jim.nasby.net
On Thu, May 17, 2012 at 3:42 PM, Joshua Berkus <josh@agliodbs.com> wrote:
> Even more fun:
>
> 1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, standby_mode = on)
>
> 2) Connect the server to *itself* as a replica.
>
> 3) This will work and report success, up until you do your first write.
>
> 4) Then ... segfault!

I cannot reproduce this. Attached is the script that I use for cascade replication testing. With it I can see the replica connecting to itself, but no segfault.

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
On Thu, May 17, 2012 at 12:01 PM, Joshua Berkus <josh@agliodbs.com> wrote:
>>> And: if we still have to ship logs, what's the point in even having
>>> cascading replication?
>>
>> At least cascading replication (1) allows you to adopt more flexible
>> configuration of servers,
>
> I'm just pretty shocked. The last time we talked about this, at the end of
> the 9.1 development cycle, you almost had remastering using streaming-only
> replication working; you just ran out of time. Now it appears that you've
> abandoned working on that completely. What's going on?

You mean that "remastering" is, after promoting one of the standby servers, making the remaining standbys reconnect to the new master and resolving the timeline gap without a shared archive? Yep, that's one of my TODO items, but I'm not sure if I'll have enough time to implement it for 9.3...

Regards,

--
Fujii Masao
On Thu, May 17, 2012 at 10:42 PM, Ants Aasma <ants@cybertec.at> wrote:
> On Thu, May 17, 2012 at 3:42 PM, Joshua Berkus <josh@agliodbs.com> wrote:
>> Even more fun:
>>
>> 1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, standby_mode = on)
>>
>> 2) Connect the server to *itself* as a replica.
>>
>> 3) This will work and report success, up until you do your first write.
>>
>> 4) Then ... segfault!
>
> I cannot reproduce this.

Me neither.

Josh, could you show me the detailed procedure to reproduce the problem?

Regards,

--
Fujii Masao
Yeah, I don't know how I produced the crash in the first place, because of course the self-replica should block all writes, and retesting it I can't get it to accept a write.

So the bug is just that you can connect a server to itself as its own replica. Since I can't think of any good reason to do this, we should simply error out on startup if someone sets things up that way. How can we detect that we've connected streaming replication to the same server?

----- Original Message -----
> On Thu, May 17, 2012 at 10:42 PM, Ants Aasma <ants@cybertec.at> wrote:
> > On Thu, May 17, 2012 at 3:42 PM, Joshua Berkus <josh@agliodbs.com> wrote:
> >> Even more fun:
> >>
> >> 1) Set up a server as a cascading replica (e.g. max_wal_senders = 3,
> >> standby_mode = on)
> >>
> >> 2) Connect the server to *itself* as a replica.
> >>
> >> 3) This will work and report success, up until you do your first write.
> >>
> >> 4) Then ... segfault!
> >
> > I cannot reproduce this.
>
> Me neither.
>
> Josh, could you show me the detailed procedure to reproduce the
> problem?
>
> Regards,
>
> --
> Fujii Masao
On Fri, May 18, 2012 at 3:57 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> Yeah, I don't know how I produced the crash in the first place, because of
> course the self-replica should block all writes, and retesting it I can't
> get it to accept a write.
>
> So the bug is just that you can connect a server to itself as its own
> replica. Since I can't think of any good reason to do this, we should
> simply error out on startup if someone sets things up that way. How can
> we detect that we've connected streaming replication to the same server?

It might be easy to detect the situation where the standby has connected to itself, e.g. by assigning an ID to each instance and checking whether the IDs of the two servers are the same. But it seems much harder to detect two or more standbys connected in a circle.

Regards,

--
Fujii Masao
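[Editor's note: a minimal sketch of the simplest check, catching only the direct self-connection case: compare the host/port in primary_conninfo against the server's own address and port. This is purely illustrative shell, not PostgreSQL code; the conninfo string and addresses are made up, and a real check would live in the walreceiver startup path.]

```shell
# Hypothetical self-connection check, sketched in shell. Parse host= and
# port= out of a conninfo string and compare them to this server's own
# listen address and port; refuse to proceed on a match.
conninfo="host=127.0.0.1 port=5432 user=replicator"
local_host="127.0.0.1"
local_port="5432"

conn_host=$(printf '%s\n' "$conninfo" | tr ' ' '\n' | sed -n 's/^host=//p')
conn_port=$(printf '%s\n' "$conninfo" | tr ' ' '\n' | sed -n 's/^port=//p')

if [ "$conn_host" = "$local_host" ] && [ "$conn_port" = "$local_port" ]; then
    echo "FATAL: primary_conninfo points back at this server"
else
    echo "self-check ok"
fi
```

As noted, this would only catch a standby connected directly to itself; a chain of two or more standbys connected in a circle needs something like the per-instance ID idea above.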
> It might be easy to detect the situation where the standby has
> connected to itself, e.g. by assigning an ID to each instance and
> checking whether the IDs of the two servers are the same. But it seems
> much harder to detect two or more standbys connected in a circle.

Well, I think it would be fine not to worry about circles for now.
Fujii,

> You mean that "remastering" is, after promoting one of the standby servers,
> making the remaining standbys reconnect to the new master and resolving the
> timeline gap without a shared archive? Yep, that's one of my TODO items,
> but I'm not sure if I'll have enough time to implement it for 9.3...

Well, not being able to remaster from the stream is the single largest usability obstacle for streaming replication, and it severely limits the utility of cascading replication. Is there any way you could get it done for 9.3? I'm happy to spend lots of time testing it, if necessary.

--Josh Berkus