Thread: Best Replication Tool

Best Replication Tool

From
Kiswono Prayogo
Date:
Hi, I'm really new to PostgreSQL replication. Is there a replication
tool for PostgreSQL that can do real-time replication from one
database to ~3-10 databases (on other machines) with little overhead,
for load-balancing purposes?

--
Regards,
Kiswono P
GB

Re: Best Replication Tool

From
Scott Marlowe
Date:
On Sun, Feb 7, 2010 at 8:22 PM, Kiswono Prayogo <kiswono@gmail.com> wrote:
> Hi, i'm really new to postgresql replication, was there any
> replication tool for postgresql that can do real-time replication from
> 1 database to ~3-10 database (on the other machines) with little
> overhead for load balancing purpose?

Well, you're not telling us how much traffic you're looking at or how
powerful your servers will be.  Where I work we have two large master
servers (lots of SAS drives on powerful hardware RAID controllers with
battery-backed cache) which are our master and slave-master, so to
speak; we currently have two slaves (possibly more later) synced off
of the slave-master machine.  We use Slony.  It works well enough,
but there are some issues with large numbers of db objects (lots of
tables / indexes / schemas) that we've run into with it that are
making me pull some hair out, and I haven't had time to troubleshoot
them.

Overall, though, it's a quite capable system, and once it's set up and
running it's fast and stable.

There are also Londiste, Bucardo, and a few others worth looking at.

Re: Best Replication Tool

From
Rosser Schwarz
Date:
On Sun, Feb 7, 2010 at 8:22 PM, Kiswono Prayogo <kiswono@gmail.com> wrote:
> Hi, i'm really new to postgresql replication, was there any
> replication tool for postgresql that can do real-time replication from
> 1 database to ~3-10 database (on the other machines) with little
> overhead for load balancing purpose?

If, by "real time", you mean that a tuple created or changed on the
master is visible to queries on an arbitrary slave within a few
seconds, then Bucardo and Slony are both good candidates.  I've used
Bucardo in just such an environment, along with its master-master
capability.  Of the two, I definitely prefer the former, but this
isn't an advocacy list, and they both work.

The amount of overhead you'll experience is a function of how "chatty"
your data is -- the higher the volume of changes it needs to
replicate, the more work it will take to keep those changes flowing
out to the slave nodes.  If you update the same tuple three times in a
five second period, that's three tuples you'll be replicating out to
each slave.

Another important consideration is the size of your replication sets
(in Slony terms; "herds" in Bucardo).  The bigger they are -- the more
tables they include -- the more work it will take to replicate any
changes in any of their member objects.  IME, you'll typically do well
to keep your sets down to the smallest meaningful groups you can.
Particularly, you'll want to group together tables that are
interrelated by foreign keys (this is especially important if you're
using Slony, and have any plans to support switching masters), or that
you expect to be modified together -- for example, all the tables that
might be modified in the course of a new order being entered into your
OLTP application.

rls

--
:wq

Re: Best Replication Tool

From
Gerd Koenig
Date:
Hi,

pgpool-II is probably worth trying. A new version was released two days
ago:
http://pgfoundry.org/projects/pgpool

hth ..:GERD:..



--
/====================================\
| Gerd König
| - Infrastruktur -
|
| TRANSPOREON GmbH
| Magirus-Deutz-Str. 16
| DE - 89077 Ulm
|
| Tel: +49 [0]731 16906 106
| Fax: +49 [0]731 16906 99
| koenig@transporeon.com
| www.transporeon.com
|
\====================================/


TRANSPOREON GmbH, Amtsgericht Ulm, HRB 722056
Geschäftsf.: Peter Förster, Roland Hötzl, Marc-Oliver Simon

Re: Best Replication Tool

From
Dimitri Fontaine
Date:
Kiswono Prayogo <kiswono@gmail.com> writes:

> Hi, i'm really new to postgresql replication, was there any
> replication tool for postgresql that can do real-time replication from
> 1 database to ~3-10 database (on the other machines) with little
> overhead for load balancing purpose?

My preferred asynchronous replication solution for PostgreSQL is Londiste
from Skytools; don't miss its tutorial:

  http://wiki.postgresql.org/wiki/Londiste_Tutorial

Regards,
--
dim

Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Iñigo Martinez Lasala
Date:
Hi everybody.

We are migrating an ancient gforge Postgres 7.4 database to a less ancient Postgres 8.2 one. We have not upgraded to 8.3 or 8.4 due to problems with implicit conversion of data types (we would have had to review all the SQL code, and we don't have enough time to do it). It's quite a large database, with a 45GB dump.

We are facing some problems with this.

The first one is tsearch. It changed between Postgres 7.4 and 8.2, but I think we have solved it. We dumped the schema from 7.4, removed everything tsearch2-related, and loaded the new tsearch2 from the contrib folder in 8.2. We did the same in a previous migration from 7.4 to 8.1 (two years ago) and everything worked, so we think this one is solved.

The second one is that the database has a lot of junk and bad characters inside. We can dump, but we cannot restore due to malformed UTF-8 characters. We also solved this problem through a long conversion process with iconv (from UTF-8 to UTF-8), removing bad characters. It appears Postgres 7.4 didn't check UTF-8 characters, but 8.1 and 8.2 do. We performed a data dump in plain text so we could use iconv to fix this.

The third one, which we haven't been able to fix, is that we cannot restore the data dump via psql, because psql does not have an option to disable triggers; so when restoring data, it fails. Is there any way to disable all triggers in a psql session, the same way as with pg_restore --disable-triggers?  Or can we convert the plain-text dump to be processed by pg_restore? Perhaps a table-by-table trigger-disabling script?

Thanks in advance.

Of course, if there is an easier way to proceed with a 7.4 to 8.2 gforge database migration, it would be fantastic if anybody could explain to me how to do it. :-)
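[Editor's note] On the "table-by-table trigger-disabling script" idea: one way to sketch it is to generate the ALTER TABLE statements around the data load. The table list and dump filename below are hypothetical (in practice you would pull the names from pg_tables), and note that DISABLE TRIGGER ALL also disables foreign-key constraint triggers, which typically requires superuser:

```shell
# Hypothetical table list; in practice, query pg_tables for these.
tables="artifact artifact_history users_idx"

{
  for t in $tables; do
    echo "ALTER TABLE $t DISABLE TRIGGER ALL;"
  done
  # \i sources the data-only dump from within psql (hypothetical filename)
  printf '%s\n' '\i gforge-data.sql'
  for t in $tables; do
    echo "ALTER TABLE $t ENABLE TRIGGER ALL;"
  done
} > restore_wrapper.sql

# Then: psql -U gforge -d gforge -f restore_wrapper.sql
cat restore_wrapper.sql
```

This is essentially what pg_dump --disable-triggers emits for you, wrapped around each table's COPY block.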


Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Iñigo Martinez Lasala
Date:
Could a valid strategy be to dump in tar format, untar the files, fix each .dat file via iconv, and tar them together again?

... and use pg_restore with --disable-triggers in order to avoid constraints, of course.



Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Josh Kupershmidt
Date:

On Feb 10, 2010, at 4:30 AM, Iñigo Martinez Lasala wrote:

Third one, and we haven't been able to fix it, is we cannot restore data dump via psql due to psql does not have an option to disable triggers. So, when restoring data, it fails. So... is there any way to disable in a psql session all triggers same way like with pg_restore --disable-triggers?  Or can we convert plain text dump to be processed by pg_restore? Perhaps a table by table trigger disabling script?

How about:
 1. Restore your schema, with triggers, using pg_dump --schema-only ... ; pg_restore ...
 2. Create a psql-loadable data-only dump using pg_dump --data-only --disable-triggers --format=p ...
 3. Clean the data-only dump and restore it.

Or, you could do a full pg_dump -Fc ... of your database, then use pg_restore to restore the compressed dumpfile into a plaintext file (i.e. pg_restore [options] filename). Then run your cleanup on the plaintext file, and reload. 

Josh

Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Iñigo Martinez Lasala
Date:
Two questions.

I could, of course, create a data-only dump (in fact I've already done it). However, when restoring, I cannot use pg_restore, since the dump is in plain format, can I? pg_restore only works with tar or compressed formats, I think. I could restore the data-only dump with psql, but then I'd have to disable the triggers, and psql does not have an option to disable them.

Second one: you say I could restore a compressed dumpfile into a plaintext file. Is this possible? How? And after cleaning this plaintext file, how do I restore it into the database without using psql, since pg_restore only accepts tar or compressed file formats?

Thank you, Josh.



Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Iñigo Martinez Lasala
Date:
Hmmmm...

What about:
UPDATE pg_trigger SET tgenabled = false;

Restore the data, and when done:
UPDATE pg_trigger SET tgenabled = true;

I'm going to test it.


Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Josh Kupershmidt
Date:
[Resending, forgot to CC list]

On Wed, Feb 10, 2010 at 10:47 AM, Josh Kupershmidt <schmiddy@gmail.com> wrote:

On Wed, Feb 10, 2010 at 10:09 AM, Iñigo Martinez Lasala <imartinez@vectorsf.com> wrote:
Two questions.

I could, of course, create a data-only dump (in fact I've already done it). However, when restoring, I cannot use pg_restore since it's in plain format, don't you? pg_restore only works with tar or compressed formats, I think. I could restore data-only dump with psql, but then I've to disable triggers and psql does not have an option in order to disable them.


Here's what I meant by the first routine. Let's dump database "test" and restore into database "restoration".

  pg_dump --schema-only -Ujosh --format=c --file=test.schema.pgdump test
  pg_restore -Ujosh --dbname=restoration test.schema.pgdump
  pg_dump --data-only -Ujosh --format=p --disable-triggers --file=test.data.pgdump test
  # clean up test.data.pgdump here
  psql -Ujosh restoration < test.data.pgdump

So for the restoration of the actual data you'd use psql, but the disabling of triggers would be handled for you (you should see ALTER TABLE ... DISABLE TRIGGER ALL; and ALTER TABLE ... ENABLE TRIGGER ALL;, or similar, in test.data.pgdump).

Second one. You say I could restore a compressed dumpfile into a plaintext file. Is this possible? How? And after cleaning this plaintext file, how do I restore it again into database without using psql since pg_restore only accept tar or compressed file formats?

To turn a pg_dump file which was dumped with, say, --format=c into a plaintext file loadable by psql:

  pg_dump -Ujosh --format=c --file=test.Fc.pgdump test
  pg_restore test.Fc.pgdump > test.plaintext.pgdump
  # clean up test.plaintext.pgdump here
  psql -Ujosh restoration < test.plaintext.pgdump

This was the second option I mentioned. You would then have to use psql to restore this plaintext file. You might be able to jump through some hoops and turn the plaintext dump back into a pg_restore compressed dump, but I don't see any point in this -- the plaintext dump here should have the CREATE TRIGGER statements after the table creation and population steps, which is likely exactly what you want. 
 
Josh

Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Josh Kupershmidt
Date:
[Forgot to CC list, again]

Found the doc. page on pg_trigger I was looking for. You were right, pg_trigger changed between 8.2 and 8.3:
http://www.postgresql.org/docs/8.2/static/catalog-pg-trigger.html
http://www.postgresql.org/docs/8.3/static/catalog-pg-trigger.html

Josh

On Wed, Feb 10, 2010 at 11:03 AM, Josh Kupershmidt <schmiddy@gmail.com> wrote:
On Wed, Feb 10, 2010 at 10:30 AM, Iñigo Martinez Lasala <imartinez@vectorsf.com> wrote:
Hmmmm...

What about
UPDATE pg_trigger SET tgenabled = false;

Restore data, and after done:
UPDATE pg_trigger SET tgenabled = true;

I'm going to test it.


Perhaps this was still a boolean column in 8.2, but at least in 8.3 and above tgenabled is a "char" field. The only explanation I found of its values is here:
http://archives.postgresql.org/pgsql-hackers/2007-01/msg01302.php

Anyway, I suspect your idea will work, but if you're going to the trouble of disabling+reenabling triggers for performance reasons, you might still be better off with pg_restore, since it can handle creation of constraints and indexes at the appropriate time as well, which can really help. 

Josh

Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Alvaro Herrera
Date:
Iñigo Martinez Lasala wrote:

> Third one, and we haven't been able to fix it, is we cannot restore data
> dump via psql due to psql does not have an option to disable triggers.
> So, when restoring data, it fails. So... is there any way to disable in
> a psql session all triggers same way like with pg_restore
> --disable-triggers?  Or can we convert plain text dump to be processed
> by pg_restore? Perhaps a table by table trigger disabling script?

Have you tried using 8.2's pg_dump instead of the old one?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Iñigo Martinez Lasala
Date:
Hmm...
No, I haven't tried it. Does pg_dump perform some kind of truncation of bad chars? If so, my problem would be solved!

When the current test finishes, I will try again with pg_dump from 8.2.

Thank you, Alvaro.



Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Alvaro Herrera
Date:
Iñigo Martinez Lasala wrote:
> Hmm...
> No, I've not tried it. Does pg_dump perform some kind of truncating over
> bad chars? If yes, my problem would be solved!

No, I was thinking of the disabling of triggers ...


--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Disabling triggers with psql (gforge 7.4 to 8.2 migration)

From
Iñigo Martinez Lasala
Date:
Hi Josh.

I have followed your suggestion and restored the bad tables to files. This shortened the conversion process, since I only restored to file the tables with bad chars, and only 8 showed this issue. So, instead of having to fix the full dump (about 45GB), I only had to process 10MB (8 tables). This is what I've done:

- Dump the schema, fix it to accommodate Postgres 8.2, and import it into the database via psql.
- Dump the data and import it into the database via pg_restore with triggers disabled:

pg_dump -f gforge-database.dmp -Fc -a -O --disable-triggers -x -E utf-8 -U gforge -h 192.168.7.54 gforge -W
pg_restore -Fc -a -O --disable-triggers -x -U gforge -d gforge gforge-database.dmp 2>import-error.log

- Locate in import-error.log which tables have problems (8 in my import).
- Restore the bad tables into plain-text files:

pg_restore -t artifact ../gforge.dmp > artifact.dmp
pg_restore -t artifact_history ../gforge.dmp > artifact_history.dmp
pg_restore -t artifact_idx ../gforge.dmp > artifact_idx.dmp
pg_restore -t doc_data_idx ../gforge.dmp > doc_data_idx.dmp
pg_restore -t frs_release_idx ../gforge.dmp > frs_release_idx.dmp
pg_restore -t groups_idx ../gforge.dmp > groups_idx.dmp
pg_restore -t project_task_idx ../gforge.dmp > project_task_idx.dmp
pg_restore -t users_idx ../gforge.dmp > users_idx.dmp

- Modify each dump file, changing the declared client encoding from UTF-8 to ISO-8859-1 (it appears this was my problem: ISO-8859-1 chars in the database that had never been properly converted to UTF-8); that is, change the line
SET client_encoding = 'UTF8';
to use 'LATIN1' (Postgres's name for ISO-8859-1) instead.
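[Editor's note] That per-file encoding edit is easy to script. A minimal sketch, assuming each extracted dump starts with the standard SET line that pg_restore emits (the filename is just an example):

```shell
# Fabricate a one-table dump header the way pg_restore emits it
printf "SET client_encoding = 'UTF8';\n" > artifact.dmp

# Re-declare the encoding so the server itself converts the Latin-1
# bytes to UTF-8 while loading (LATIN1 = Postgres name for ISO-8859-1)
sed -i "s/SET client_encoding = 'UTF8';/SET client_encoding = 'LATIN1';/" artifact.dmp

cat artifact.dmp
```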

Alternatively, you could use iconv to eliminate the bad chars (iconv -f utf-8 -t utf-8 -c table_exported.dmp > table_exported_fixed.dmp). But truncating bad chars can lead to errors when importing, since some rows could end up malformed and fail on import.
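[Editor's note] A minimal demonstration of the trade-off between the two approaches, using a made-up one-line file: byte 0xE9 is 'é' in ISO-8859-1 but an invalid sequence in UTF-8.

```shell
# One Latin-1 byte (octal \351 = 0xE9, 'é') that is invalid as UTF-8
printf 'caf\351 latte\n' > table_exported.dmp

# Option 1: strip invalid bytes -- lossy, 'café' becomes 'caf'
# (|| true: some iconv builds exit non-zero even when -c drops bytes)
iconv -f utf-8 -t utf-8 -c table_exported.dmp > stripped.dmp || true

# Option 2: reinterpret as ISO-8859-1 and convert -- lossless here
iconv -f iso-8859-1 -t utf-8 table_exported.dmp > fixed.dmp
```

Option 2 only helps when the bad bytes really are Latin-1, as they turned out to be in this migration.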

- Import each table, temporarily DISABLING TRIGGERS
(add ALTER TABLE tablename DISABLE TRIGGER ALL; at the beginning and ALTER TABLE tablename ENABLE TRIGGER ALL; at the end).

Et voila. Import finished.

Thank you very much to all for your suggestions. They have been very useful for me.

