Thread: performance hit for replication
I'd like to create a fail-over server in case of a problem. Ideally, it would be synchronized with our main database server, but I don't see any major problem with having a delay of up to 4 hours between syncs.

My database is a little shy of 10 GB, with much of that data being in an archived log table. Every day a batch job runs that adds 100,000 records over the course of 3 hours (the batch job does a lot of pre/post processing).

Doing a restore of the db backup in VMware takes about 3 hours. I suspect a powerful server with a better disk setup could do it faster, but I don't have servers like that at my disposal, so I need to assume a worst case of 3-4 hours is typical.

So, my question is this: my server currently works great, performance-wise. I need to add fail-over capability, but I'm afraid that introducing a stressful task such as replication will hurt my server's performance. Is there any foundation to my fears? I don't need to replicate the archived log data because I can easily restore that in a separate step from the nightly backup if disaster occurs. Also, my database load is largely selects. My application works great with PostgreSQL 7.3 and 7.4, but I'm currently using 7.3.

I'm eager to hear your thoughts and experiences,

-- 
Matthew Nuzum <matt@followers.net>
www.followers.net - Makers of "Elite Content Management System"

Earn a commission of $100 - $750 by recommending Elite CMS.
Visit http://www.elitecms.com/Contact_Us.partner for details.
> So, my question is this: my server currently works great, performance-wise. I need to add fail-over capability, but I'm afraid that introducing a stressful task such as replication will hurt my server's performance. Is there any foundation to my fears? I don't need to replicate the archived log data because I can easily restore that in a separate step from the nightly backup if disaster occurs. Also, my database load is largely selects. My application works great with PostgreSQL 7.3 and 7.4, but I'm currently using 7.3.
>
> I'm eager to hear your thoughts and experiences,

Well, with Replicator you are going to take a pretty big hit initially during the full sync, but then you could use batch replication and only replicate every 2-3 hours.

I am pretty sure Slony has similar capabilities.

Sincerely,

Joshua D. Drake
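(To make "batch replication every 2-3 hours" concrete: a batch sync usually ends up as a scheduled job. The sketch below is a system crontab entry only; "replicator-batch-sync" is a hypothetical placeholder, not an actual Replicator command, so substitute whatever batch-sync invocation your replication product actually provides.)

    # /etc/crontab sketch: run a batch sync every 3 hours, logging output.
    # "replicator-batch-sync" is a hypothetical placeholder command.
    0 */3 * * *  postgres  /usr/local/bin/replicator-batch-sync >> /var/log/replication-batch.log 2>&1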
On Tuesday 12 April 2005 09:25, Matthew Nuzum wrote:
> I'd like to create a fail-over server in case of a problem. Ideally, it would be synchronized with our main database server, but I don't see any major problem with having a delay of up to 4 hours between syncs.
>
> So, my question is this: my server currently works great, performance-wise. I need to add fail-over capability, but I'm afraid that introducing a stressful task such as replication will hurt my server's performance. Is there any foundation to my fears? I don't need to replicate the archived log data because I can easily restore that in a separate step from the nightly backup if disaster occurs. Also, my database load is largely selects. My application works great with PostgreSQL 7.3 and 7.4, but I'm currently using 7.3.
>
> I'm eager to hear your thoughts and experiences,

Your application sounds like a perfect candidate for Slony-I (http://www.slony.info). Using Slony-I, I see about a 5-7% performance hit in terms of the number of inserts/updates/deletes per second I can process. Depending on your network connection, DML volume, and the power of your backup server, the replica could be as little as 10 seconds behind the origin. A failover/switchover could occur in under 60 seconds.

-- 
Darcy Buskermolen
Wavefire Technologies Corp.

http://www.wavefire.com
ph: 250.717.0200
fx: 250.763.1759
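(For anyone who wants to see what a Slony-I setup along these lines looks like, here is a minimal slonik sketch for a two-node cluster. The cluster name, conninfo strings, and table name are placeholder assumptions, and the big archived-log table is deliberately left out of the replication set, matching the original poster's plan to restore it separately from the nightly backup. Note that Slony-I needs a primary key on each replicated table.)

    # Minimal slonik sketch: replicate selected tables from node 1
    # (origin) to node 2 (standby). All names/conninfo are placeholders.
    cluster name = failover;
    node 1 admin conninfo = 'dbname=appdb host=primary user=slony';
    node 2 admin conninfo = 'dbname=appdb host=standby user=slony';

    init cluster (id = 1, comment = 'origin');
    store node (id = 2, comment = 'standby');
    store path (server = 1, client = 2, conninfo = 'dbname=appdb host=primary user=slony');
    store path (server = 2, client = 1, conninfo = 'dbname=appdb host=standby user=slony');

    # Replicate only the hot tables; the archived log table is omitted.
    create set (id = 1, origin = 1, comment = 'hot tables');
    set add table (set id = 1, origin = 1, id = 1, fully qualified name = 'public.orders');

    subscribe set (id = 1, provider = 1, receiver = 2, forward = no);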
> Well, with Replicator you are going to take a pretty big hit initially during the full sync, but then you could use batch replication and only replicate every 2-3 hours.
>
> Sincerely,
>
> Joshua D. Drake

Thanks, I'm looking at your product and will contact you off-list for more details soon.

Out of curiosity, does batch mode produce a lighter load? Live updating will provide maximum data security, and I'm most interested in how it affects the server.

-- 
Matthew Nuzum <matt@followers.net>
www.followers.net - Makers of "Elite Content Management System"

Earn a commission of $100 - $750 by recommending Elite CMS.
Visit http://www.elitecms.com/Contact_Us.partner for details.
Matthew Nuzum wrote:
> Thanks, I'm looking at your product and will contact you off-list for more details soon.
>
> Out of curiosity, does batch mode produce a lighter load?

Well, more of a burstier load. You could also do live replication, but Replicator requires some I/O, which VMware just isn't that good at :)

Sincerely,

Joshua D. Drake
jd@commandprompt.com ("Joshua D. Drake") writes:
>> So, my question is this: my server currently works great, performance-wise. I need to add fail-over capability, but I'm afraid that introducing a stressful task such as replication will hurt my server's performance. Is there any foundation to my fears? I don't need to replicate the archived log data because I can easily restore that in a separate step from the nightly backup if disaster occurs. Also, my database load is largely selects. My application works great with PostgreSQL 7.3 and 7.4, but I'm currently using 7.3.
>>
>> I'm eager to hear your thoughts and experiences,
>
> Well, with Replicator you are going to take a pretty big hit initially during the full sync, but then you could use batch replication and only replicate every 2-3 hours.
>
> I am pretty sure Slony has similar capabilities.

Yes, similar capabilities, and a similar "pretty big hit." There's a downside to batch replication in that some of the internal data structures grow in size if you let appreciable periods pass between batches.

-- 
(format nil "~S@~S" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/slony.html
Rules of the Evil Overlord #78. "I will not tell my Legions of Terror
'And he must be taken alive!' The command will be: 'And try to take him
alive if it is reasonably practical.'"
<http://www.eviloverlord.com/>
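(The growth Chris mentions is visible directly: Slony-I queues pending changes and events in tables such as sl_log_1 and sl_event inside the cluster's own schema. Assuming the cluster name "failover" from the sketch above, hence a schema named "_failover", a rough way to watch the backlog between batches is:)

    -- Approximate size, in 8 kB pages, of Slony-I's queue tables.
    -- relpages is only as fresh as the last VACUUM/ANALYZE, but it
    -- shows the trend between batch runs. Schema name is an assumption.
    SELECT c.relname, c.relpages
    FROM pg_class c, pg_namespace n
    WHERE c.relnamespace = n.oid
      AND n.nspname = '_failover'
      AND c.relname IN ('sl_log_1', 'sl_event')
    ORDER BY c.relpages DESC;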
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Matthew Nuzum
> Sent: 12 April 2005 17:25
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] performance hit for replication
>
> So, my question is this: my server currently works great, performance-wise. I need to add fail-over capability, but I'm afraid that introducing a stressful task such as replication will hurt my server's performance. Is there any foundation to my fears? I don't need to replicate the archived log data because I can easily restore that in a separate step from the nightly backup if disaster occurs. Also, my database load is largely selects. My application works great with PostgreSQL 7.3 and 7.4, but I'm currently using 7.3.

If it's possible to upgrade to 8.0, then perhaps you could make use of PITR and continuously ship log files to your standby machine:

http://www.postgresql.org/docs/8.0/interactive/backup-online.html

I can't help further with this as I've yet to give it a go myself, but others here may have tried it.

Regards, Dave.
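(A minimal sketch of the PITR approach Dave describes, for PostgreSQL 8.0: the primary copies each completed WAL segment to the standby, and after restoring a base backup taken between pg_start_backup() and pg_stop_backup(), the standby replays the shipped segments via recovery.conf. Host names and paths below are placeholders.)

    # postgresql.conf on the primary: ship each completed 16 MB WAL
    # segment to the standby as it fills. Host/path are placeholders.
    archive_command = 'scp %p standby:/var/lib/pgsql/wal_archive/%f'

    # recovery.conf on the standby, created after restoring the base
    # backup; starting the postmaster then replays everything that has
    # been shipped, up to the end of the archive.
    restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p'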