Thread: WAL segment question

WAL segment question

From

"Donald Fraser"

Date:

03 August 2007, 20:19:12

Our company has two sites, a master and a disaster recovery site.

I am trying to assess whether data has been lost during a fail-over, due to the asynchronous method of transporting WAL file data between sites (we use DRBD).

Disaster recovery node start-up log:

user= pid=9907 timestamp=[2007-08-01 17:09:51 BST] tid= LOG: record with zero length at 0/7C62F52C
user= pid=9907 timestamp=[2007-08-01 17:09:51 BST] tid= LOG: redo done at 0/7C62F4E8

My question is: given the above WAL segment information, can I use it to decide whether information was lost in the master database site, at the point of failure, without starting the database up?

Regards

Donald Fraser.

Re: WAL segment question

From

"Simon Riggs"

Date:

05 August 2007, 17:57:50

On Sat, 2007-08-04 at 00:00 +0100, Donald Fraser wrote:
> Our company has two sites, a master and a disaster recovery site.
> I am trying to assess whether data has been lost during a fail-over,
> due to the asynchronous method of transporting WAL file data between
> sites (we use DRBD).
>
> Disaster recovery node start-up log:
> user= pid=9907 timestamp=[2007-08-01 17:09:51 BST] tid= LOG:  record
> with zero length at 0/7C62F52C
> user= pid=9907 timestamp=[2007-08-01 17:09:51 BST] tid= LOG:  redo
> done at 0/7C62F4E8
>
> My question is: given the above WAL segment information, can I use it
> to decide whether information was lost in the master database site, at
> the point of failure, without starting the database up?

Currently there is no built-in means of finding out the time of the last
record transferred, if the primary stays down. You need to use xlogdump
to locate the time of the last commit/abort record.

--
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com

Re: WAL segment question

From

Chander Ganesan

Date:

06 August 2007, 19:54:36

Donald Fraser wrote:

Our company has two sites, a master and a disaster recovery site.
I am trying to assess whether data has been lost during a fail-over, due to the asynchronous method of transporting WAL file data between sites (we use DRBD).

Disaster recovery node start-up log:
user= pid=9907 timestamp=[2007-08-01 17:09:51 BST] tid= LOG: record with zero length at 0/7C62F52C
user= pid=9907 timestamp=[2007-08-01 17:09:51 BST] tid= LOG: redo done at 0/7C62F4E8

My question is: given the above WAL segment information, can I use it to decide whether information was lost in the master database site, at the point of failure, without starting the database up?

Regards
Donald Fraser.

DRBD is a synchronous method of transferring data - it's not asynchronous. Assuming you are using Protocol C you should find that both are always in sync. If you are using protocol B then things will be in sync so long as the remote system didn't crash, and if you are using Protocol A then things should be in sync as long as the local system didn't crash.

-- 
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
919-463-0999/866-229-3386
http://www.otg-nc.com

Re: WAL segment question

From

"Donald Fraser"

Date:

06 August 2007, 20:59:25

----- Original Message -----
From: Chander Ganesan
> DRBD is a synchronous method of transferring data - it's not asynchronous.
Assuming you are using Protocol C you should find that both are always in
sync.  If you are using protocol B then things will be in sync so long as
the remote system didn't crash, and if you are using Protocol A then things
should be in sync as long as the local system didn't crash.

Apologies if you thought I was insinuating DRBD is categorically
asynchronous.
But you assumed wrong, we use protocol A, because we don't have a fast WAN
link < 10Mbps.
According to the DRBD documentation:
Protocol A: write IO is reported as completed, if it has reached local disk
and local tcp send buffer.

Which I interpret as: A for Asynchronous, which means I can complete writing
data and start writing the next set of data before the first set of data has
been synchronised at the remote site.

This is the cost of our system - we are willing to loose data for improved
performance of our master server. We've had one failure in 7 years that I've
been at the company... Many thanks to Linux and more importantly PostgreSQL!

One day when our local telecomms company stop the monopoly on leased data
lines, we may upgrade our DR link to a 100Mbps or even 1Gbps link, in which
case, protocol B or C will be a better solution. Unfortunately I think
there's a better chance that the cost of postage will go down first.

Thread: WAL segment question

WAL segment question

Re: WAL segment question

Attachment

Re: WAL segment question

Re: WAL segment question

Re: WAL segment question