Thread: dropdb breaks replication?

dropdb breaks replication?

From: Edson Richter
Date: 31 October 2012, 17:32:36

I have two PostgreSQL 9.1.6 servers running on 64-bit CentOS 5.8 Linux.
They replicate asynchronously.

Yesterday I dropped a 20 GB database, and replication then broke,
requiring me to manually resynchronize both servers.

Is it expected that dropdb (or perhaps createdb) breaks existing
replication between servers?

Thanks,

Edson

Re: dropdb breaks replication?

From: Lonni J Friedman
Date: 31 October 2012, 17:40:02

On Wed, Oct 31, 2012 at 10:32 AM, Edson Richter
<edsonrichter@hotmail.com> wrote:
> I've two PostgreSQL 9.1.6 running on Linux CentOS 5.8 64bit.
> They are replicated asynchronously.
>
> Yesterday, I've dropped a database of 20Gb, and then replication has broken,
> requiring me to manually synchronize both servers again.
>
> It is expected that dropdb (or, perhaps, createdb) break existing
> replication between servers?

How did you determine that replication was broken, and how did you
manually synchronize the servers?  Are you certain that replication
was working prior to dropping the database?

Re: dropdb breaks replication?

From: Edson Richter
Date: 31 October 2012, 18:01:48

On 31/10/2012 15:39, Lonni J Friedman wrote:
> On Wed, Oct 31, 2012 at 10:32 AM, Edson Richter
> <edsonrichter@hotmail.com> wrote:
>> I've two PostgreSQL 9.1.6 running on Linux CentOS 5.8 64bit.
>> They are replicated asynchronously.
>>
>> Yesterday, I've dropped a database of 20Gb, and then replication has broken,
>> requiring me to manually synchronize both servers again.
>>
>> It is expected that dropdb (or, perhaps, createdb) break existing
>> replication between servers?
> How did you determine that replication was broken, and how did you
> manually synchronize the servers?  Are you certain that replication
> was working prior to dropping the database?
>
>
I'm sure replication was running.
I usually keep a window open on each server, running:

On the master:

watch -n 2 "ps aux | egrep sender"

On the slave:

watch -n 2 "ps aux | egrep receiver"


When the dropdb command was executed, both processes disappeared from
my "radar".
Also, the log contains the following error:

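LOG:  replicação em fluxo conectou-se com sucesso ao servidor principal
FATAL:  não pôde receber dados do fluxo do WAL: FATAL:  segmento do WAL
solicitado 0000000100000001000000BE já foi removido

[In English:
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment
0000000100000001000000BE has already been removed]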


Could the cause be that I don't keep enough WAL segments (currently 80) for
the dropdb command? Does dropdb log every deleted page in the transaction log?


Thanks,

Edson
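
The segment name in the error above can be decoded to see which piece of WAL the slave was asking for. The following is a minimal sketch, assuming the usual 24-hex-digit WAL file name layout (8 digits each for timeline, log id, and segment id); the function name `parse_wal_segment` is made up for illustration and is not part of PostgreSQL.

```python
import re

# A WAL segment file name is 24 hex digits:
# 8 for the timeline, 8 for the log id, 8 for the segment id.
WAL_NAME = re.compile(r"\b([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})\b")


def parse_wal_segment(text):
    """Return the first WAL segment name found in text, split into its parts.

    Returns None when the text contains no 24-hex-digit segment name.
    """
    match = WAL_NAME.search(text)
    if match is None:
        return None
    timeline, log_id, seg_id = (int(part, 16) for part in match.groups())
    return {"timeline": timeline, "log": log_id, "segment": seg_id}


line = "FATAL:  segmento do WAL solicitado 0000000100000001000000BE já foi removido"
print(parse_wal_segment(line))  # {'timeline': 1, 'log': 1, 'segment': 190}
```

Comparing the decoded segment with the oldest file still present in the master's pg_xlog directory shows how far behind the slave had fallen.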

Re: dropdb breaks replication?

From: Lonni J Friedman
Date: 31 October 2012, 18:09:49

On Wed, Oct 31, 2012 at 11:01 AM, Edson Richter
<edsonrichter@hotmail.com> wrote:
> On 31/10/2012 15:39, Lonni J Friedman wrote:
>>
>> On Wed, Oct 31, 2012 at 10:32 AM, Edson Richter
>> <edsonrichter@hotmail.com> wrote:
>>>
>>> I've two PostgreSQL 9.1.6 running on Linux CentOS 5.8 64bit.
>>> They are replicated asynchronously.
>>>
>>> Yesterday, I've dropped a database of 20Gb, and then replication has
>>> broken,
>>> requiring me to manually synchronize both servers again.
>>>
>>> It is expected that dropdb (or, perhaps, createdb) break existing
>>> replication between servers?
>>
>> How did you determine that replication was broken, and how did you
>> manually synchronize the servers?  Are you certain that replication
>> was working prior to dropping the database?
>>
>>
> I'm sure replication was running.
> I usually keep two windows open in both servers, running
>
> In master:
>
> watch -n 2 "ps aux | egrep sender"
>
> In slave:
>
> watch -n 2 "ps aux | egrep receiver"
>
>
> At the point the dropdb command has been executed, both disappeared from my
> "radar".
> Also, in the log there is the following error:
>
> LOG:  replicação em fluxo conectou-se com sucesso ao servidor principal
> FATAL:  não pôde receber dados do fluxo do WAL: FATAL:  segmento do WAL
> solicitado 0000000100000001000000BE já foi removido
>
>
> May the cause not having enough segments (currently 80) for dropdb command?
> Is dropdb logged in transaction log page-by-page excluded?

I can't read Portuguese(?), but I think the gist of the error is that
the WAL segment was already removed before the slave could consume it.
I'm guessing that you aren't keeping enough of them, and dropping the
database generated a huge volume of WAL that flushed out the old segments
before your slave could consume them.

Re: dropdb breaks replication?

From: Tom Lane
Date: 31 October 2012, 18:34:24

Lonni J Friedman <netllama@gmail.com> writes:
> On Wed, Oct 31, 2012 at 11:01 AM, Edson Richter
> <edsonrichter@hotmail.com> wrote:
>> May the cause not having enough segments (currently 80) for dropdb command?
>> Is dropdb logged in transaction log page-by-page excluded?

> I can't read portugese(?), but i think the gist of the error is that
> the WAL segment was already removed before the slave could consume it.
>  I'm guessing that you aren't keeping enough of them, and dropping the
> database generated a huge volume which flushed out the old ones before
> they could get consumed by your slave.

dropdb generates one, not very large, WAL record saying "go rm -rf this
directory".  So sheer WAL volume is not the correct explanation.  It's
possible though that the slave spent long enough executing the rm -rf
to fall behind the master.

In any case, it should have been able to catch up automatically if WAL
archiving was configured properly.

            regards, tom lane
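
The "fell behind" explanation can be put in rough numbers. This is only a sketch under stated assumptions: that the "80 segments" mentioned earlier in the thread is the wal_keep_segments setting, that segments have the default 16 MB size, and that the master writes WAL at a steady rate (the 2 MB/s figure below is purely hypothetical). The master may keep somewhat more WAL than this, so treat the result as a lower bound.

```python
import math

SEGMENT_MB = 16  # default WAL segment size (assumed; it can be changed at build time)


def retained_wal_mb(wal_keep_segments, segment_mb=SEGMENT_MB):
    """Minimum amount of WAL (in MB) the master keeps around for standbys."""
    return wal_keep_segments * segment_mb


def max_standby_stall_seconds(wal_keep_segments, wal_rate_mb_per_s,
                              segment_mb=SEGMENT_MB):
    """How long a standby can stall before the WAL it needs may be recycled.

    wal_rate_mb_per_s is the master's WAL generation rate (hypothetical input).
    """
    if wal_rate_mb_per_s <= 0:
        return math.inf  # an idle master never recycles the needed segment
    return retained_wal_mb(wal_keep_segments, segment_mb) / wal_rate_mb_per_s


print(retained_wal_mb(80))                 # 1280
print(max_standby_stall_seconds(80, 2.0))  # 640.0
```

With these assumed figures, a slave that spends more than about ten minutes removing the dropped database's files could come back to find its next segment already gone from the master.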

Re: dropdb breaks replication?

From: Edson Richter
Date: 31 October 2012, 18:35:09

On 31/10/2012 16:09, Lonni J Friedman wrote:
> On Wed, Oct 31, 2012 at 11:01 AM, Edson Richter
> <edsonrichter@hotmail.com> wrote:
>> On 31/10/2012 15:39, Lonni J Friedman wrote:
>>> On Wed, Oct 31, 2012 at 10:32 AM, Edson Richter
>>> <edsonrichter@hotmail.com> wrote:
>>>> I've two PostgreSQL 9.1.6 running on Linux CentOS 5.8 64bit.
>>>> They are replicated asynchronously.
>>>>
>>>> Yesterday, I've dropped a database of 20Gb, and then replication has
>>>> broken,
>>>> requiring me to manually synchronize both servers again.
>>>>
>>>> It is expected that dropdb (or, perhaps, createdb) break existing
>>>> replication between servers?
>>> How did you determine that replication was broken, and how did you
>>> manually synchronize the servers?  Are you certain that replication
>>> was working prior to dropping the database?
>>>
>>>
>> I'm sure replication was running.
>> I usually keep two windows open in both servers, running
>>
>> In master:
>>
>> watch -n 2 "ps aux | egrep sender"
>>
>> In slave:
>>
>> watch -n 2 "ps aux | egrep receiver"
>>
>>
>> At the point the dropdb command has been executed, both disappeared from my
>> "radar".
>> Also, in the log there is the following error:
>>
>> LOG:  replicação em fluxo conectou-se com sucesso ao servidor principal
>> FATAL:  não pôde receber dados do fluxo do WAL: FATAL:  segmento do WAL
>> solicitado 0000000100000001000000BE já foi removido
>>
>>
>> May the cause not having enough segments (currently 80) for dropdb command?
>> Is dropdb logged in transaction log page-by-page excluded?
> I can't read portugese(?), but i think the gist of the error is that
> the WAL segment was already removed before the slave could consume it.
>   I'm guessing that you aren't keeping enough of them, and dropping the
> database generated a huge volume which flushed out the old ones before
> they could get consumed by your slave.
>
>
Sorry for the Portuguese text. Yes, your assumption is correct: the WAL
segment was removed before it could be replicated.
I keep 80 WAL segments, but I was wondering whether a database drop is
logged at all: it's so fast that I thought it wasn't.
And what is the purpose of logging (and replicating) the database drop if
you will not be able to recover it? IMHO, dropdb should be replicated as a
"database deactivation" or something like that...

Edson

Re: dropdb breaks replication?

From: Edson Richter
Date: 31 October 2012, 18:42:22

On 31/10/2012 16:34, Tom Lane wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
>> On Wed, Oct 31, 2012 at 11:01 AM, Edson Richter
>> <edsonrichter@hotmail.com> wrote:
>>> May the cause not having enough segments (currently 80) for dropdb command?
>>> Is dropdb logged in transaction log page-by-page excluded?
>> I can't read portugese(?), but i think the gist of the error is that
>> the WAL segment was already removed before the slave could consume it.
>>   I'm guessing that you aren't keeping enough of them, and dropping the
>> database generated a huge volume which flushed out the old ones before
>> they could get consumed by your slave.
> dropdb generates one, not very large, WAL record saying "go rm -rf this
> directory".  So sheer WAL volume is not the correct explanation.  It's
> possible though that the slave spent long enough executing the rm -rf
> to fall behind the master.

Your assumption is right: the slave server is a slow, single-processor,
low-memory cloud machine, and it would have taken a very long time to
delete everything.

>
> In any case, it should have been able to catch up automatically if WAL
> archiving was configured properly.

I don't use WAL archiving: the two servers are miles away from each other,
and nothing connects them except PostgreSQL async replication over a VPN.

Edson

>
>             regards, tom lane
>
>
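
The options discussed so far reduce to a small decision: a lagging slave can keep streaming if the master still has the segment, can fetch it from a WAL archive if one is configured and holds it, and otherwise needs a fresh base backup (the manual resynchronization described at the start of the thread). The sketch below only illustrates that reasoning; the function and its arguments are hypothetical and do not correspond to any PostgreSQL API.

```python
def recovery_path(segment_on_master, archive_configured, segment_in_archive=False):
    """Pick how a lagging standby can obtain the WAL segment it needs.

    All three arguments are plain booleans supplied by the caller
    (hypothetical inputs for illustration).
    """
    if segment_on_master:
        return "stream"                # master still has the segment
    if archive_configured and segment_in_archive:
        return "restore from archive"  # standby replays it from the archive
    return "new base backup"           # nothing left to replay from


# The situation in this thread: segment already removed, no WAL archive.
print(recovery_path(segment_on_master=False, archive_configured=False))
# new base backup
```

Raising the number of kept segments makes the first branch more likely; configuring an archive adds the second branch as a fallback.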

Re: dropdb breaks replication?

From: John R Pierce
Date: 31 October 2012, 18:43:13

On 10/31/12 11:34 AM, Edson Richter wrote:
> Sorry for the portguese text. Yes, your assumption is correct: WAL
> segment has been excluded before being able to replicate.
> I keep 80 WAL segments, but I was wondering if a drop database is
> being logged: it's just so fast, I thought it wasn't logged.
> And what is the purpose to log (and replicate) the database drop, if
> you will not be able to recover it - IMHO, dropdb should be replicated
> as "database deactivation" or something more or like that...


WAL is not a 'redo' log like Oracle uses.



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast

Re: dropdb breaks replication?

From: Greg Williamson
Date: 31 October 2012, 22:48:03

Edson --

>I've two PostgreSQL 9.1.6 running on Linux CentOS 5.8 64bit.
>They are replicated asynchronously.
>
>Yesterday, I've dropped a database of 20Gb, and then replication has broken,
>requiring me to manually synchronize both servers again.
>
>It is expected that dropdb (or, perhaps, createdb) break existing replication between servers?
>


Sorry for the slow response -- as others have indicated, the drop db is
probably not the problem. We have one system that drops a several-gig
database hourly and the replication has never failed. We see issues on the
master with dead filehandles but the replication itself is rock solid.

Greg

Re: dropdb breaks replication?

From: Edson Richter
Date: 31 October 2012, 23:56:27

On 31/10/2012 20:47, Greg Williamson wrote:
> Edson --
>
>> I've two PostgreSQL 9.1.6 running on Linux CentOS 5.8 64bit.
>> They are replicated asynchronously.
>>
>> Yesterday, I've dropped a database of 20Gb, and then replication has broken,
>> requiring me to manually synchronize both servers again.
>>
>> It is expected that dropdb (or, perhaps, createdb) break existing replication between servers?
>>
>
> Sorry for the slow response -- as others have indicated, the drop db is
> probably not the problem. We have one system that drops a several-gig
> database hourly and the replication has never failed. We see issues on the
> master with dead filehandles but the replication itself is rock solid.
>
> Greg
>
>

Our application should (almost) never delete databases, but just in case
I'll keep an eye on it and manually resync the replication if needed. It
is not a major issue; it was more a matter of curiosity.

Also, John pointed out that the xlog in PostgreSQL is not the same as the
concept I had from my Oracle days.

Thanks, Greg (and everyone).

Edson