Thread: Choosing PostgreSQL as the database for our next project

Choosing PostgreSQL as the database for our next project

From
Johnny Ljunggren
Date:
Hello everyone!

I've been assigned the task of choosing the database soloution for our
next project and have two questions I need to figure out:

1. Replication - multimaster
I'll try to explain the setup to the best of my ability:
Three centers:
Main center - database with a backup database
Center 1 - database with a backup database
Center 2 - database with a backup database (same as center 1)

Application on the three centers will use the local database but the
data should be replicated to the others as well. So when the link
between the main center and center 1/2 goes down applications will work
as usual and when the link is up the data will be replicated back and
forth so they are equal. I assume that not all of the databases and
tables will be replicated though...
I know that Oracle can do this and they call it multimaster. I read in
the FAQ that presumably pgcluster can do this, but instead of digging
through tons of info I'll ask try my luck here.
Will PostgreSQL be able to do what I want? Any third party (commercial
or not) solutions?

2. Commercial support
<Excerpt from the customer specifications:>
The Database System shall be based (on penalty of rejection) on a
commercial relational database package to ensure simple future
extensions and modifications.
</excerpt>
They're asking the wrong thing her IMO, but that is nothing I can do about.
So, are there any companies that I can point to and say, 'There's your
commecial vendor'?

regards
--
Johnny Ljunggren, Vestlia 6, 3080  HOLMESTRAND, 918 50 411

Re: Choosing PostgreSQL as the database for our next project

From
William Yu
Date:
Johnny Ljunggren wrote:
> 1. Replication - multimaster
> I'll try to explain the setup to the best of my ability:
> Three centers:
> Main center - database with a backup database
> Center 1 - database with a backup database
> Center 2 - database with a backup database (same as center 1)
>
> Application on the three centers will use the local database but the
> data should be replicated to the others as well. So when the link
> between the main center and center 1/2 goes down applications will work
> as usual and when the link is up the data will be replicated back and
> forth so they are equal. I assume that not all of the databases and
> tables will be replicated though...
> I know that Oracle can do this and they call it multimaster. I read in
> the FAQ that presumably pgcluster can do this, but instead of digging
> through tons of info I'll ask try my luck here.
> Will PostgreSQL be able to do what I want? Any third party (commercial
> or not) solutions?

If you are talking multiple data centers located relatively far away
from each other in order to assure uptime, I have bad news for you.
Nobody has an "out of the box" multi-master replication solution for
your situation -- not even Oracle. Synchronous multi-master solutions
pretty much demand LAN-level latency in order to assure performance.
Async multi-master can be done across high latency connections but async
multi-master requires rolling your own replication code to handle/avoid
data conflict issues.

Re: Choosing PostgreSQL as the database for our next project

From
vishal saberwal
Date:
hi johny,

would pgreplicator help ... i am not sure if it would solve your purpose but may be you want to look at it ... it does offer multimaster replication ...

thanks,
vish

On 11/11/05, William Yu <wyu@talisys.com> wrote:
Johnny Ljunggren wrote:
> 1. Replication - multimaster
> I'll try to explain the setup to the best of my ability:
> Three centers:
> Main center - database with a backup database
> Center 1 - database with a backup database
> Center 2 - database with a backup database (same as center 1)
>
> Application on the three centers will use the local database but the
> data should be replicated to the others as well. So when the link
> between the main center and center 1/2 goes down applications will work
> as usual and when the link is up the data will be replicated back and
> forth so they are equal. I assume that not all of the databases and
> tables will be replicated though...
> I know that Oracle can do this and they call it multimaster. I read in
> the FAQ that presumably pgcluster can do this, but instead of digging
> through tons of info I'll ask try my luck here.
> Will PostgreSQL be able to do what I want? Any third party (commercial
> or not) solutions?

If you are talking multiple data centers located relatively far away
from each other in order to assure uptime, I have bad news for you.
Nobody has an "out of the box" multi-master replication solution for
your situation -- not even Oracle. Synchronous multi-master solutions
pretty much demand LAN-level latency in order to assure performance.
Async multi-master can be done across high latency connections but async
multi-master requires rolling your own replication code to handle/avoid
data conflict issues.

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Re: Choosing PostgreSQL as the database for our next project

From
Johnny Ljunggren
Date:
William Yu wrote:
> Johnny Ljunggren wrote:
>
>> 1. Replication - multimaster
>> I'll try to explain the setup to the best of my ability:
>> Three centers:
>> Main center - database with a backup database
>> Center 1 - database with a backup database
>> Center 2 - database with a backup database (same as center 1)

Hmm, okay. We're now looking at another, simpler, way to do it. The same
setup but the clients on Center 1/2 will connect directly to the Main
center (2Mb leased line). The databases on Center 1/2 will then just be
a replica of the Main databases.
The biggest issue though is what happens when the lines go down:
1. connect to local database server
2. line comes up => changes to local database sent to main database
3. connect to main database server

 From what I understand this will be possible but I may need to roll my
own replication code to handle data conflict issues? Especially since
center 1/2 may be down at the same time and then might change the same data.
How do the tools for this compare against the ones from Oracle?

--
Johnny Ljunggren, Vestlia 6, 3080  HOLMESTRAND, 918 50 411

Re: Choosing PostgreSQL as the database for our next project

From
Johnny Ljunggren
Date:
Scott Ribe wrote:
>From what I understand this will be possible but I may need to roll my
>>own replication code to handle data conflict issues? Especially since
>>center 1/2 may be down at the same time and then might change the same data.
>>How do the tools for this compare against the ones from Oracle?

<snip some useful inputs>

> There is no simple solution, and the out-of-the-box solutions require a
> whole lotta configuration work.

That is what I have been suspecting. Luckily the databases aren't that
complex so we may be able to fix them so that there will be no conflicts
(in my dreams perhaps...) that need manual intervention.

Anyhow, are there any (big) companies that will do support for a
PostgreSQL installation?

--
Johnny Ljunggren, Vestlia 6, 3080  HOLMESTRAND, 918 50 411

Re: Choosing PostgreSQL as the database for our next project

From
"Jim C. Nasby"
Date:
On Mon, Nov 14, 2005 at 06:02:03PM +0100, Johnny Ljunggren wrote:
> Scott Ribe wrote:
> >>From what I understand this will be possible but I may need to roll my
> >>own replication code to handle data conflict issues? Especially since
> >>center 1/2 may be down at the same time and then might change the same
> >>data.
> >>How do the tools for this compare against the ones from Oracle?
>
> <snip some useful inputs>
>
> >There is no simple solution, and the out-of-the-box solutions require a
> >whole lotta configuration work.
>
> That is what I have been suspecting. Luckily the databases aren't that
> complex so we may be able to fix them so that there will be no conflicts
> (in my dreams perhaps...) that need manual intervention.
>
> Anyhow, are there any (big) companies that will do support for a
> PostgreSQL installation?

Yes, there are a number of companies offering commercial support.
Pervasive (who I work for) is one; there's also Command Prompt and
others. http://www.postgresql.org/support/professional_support has a
good list of options.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Choosing PostgreSQL as the database for our next project

From
William Yu
Date:
Johnny Ljunggren wrote:
> Hmm, okay. We're now looking at another, simpler, way to do it. The same
> setup but the clients on Center 1/2 will connect directly to the Main
> center (2Mb leased line). The databases on Center 1/2 will then just be
> a replica of the Main databases.
> The biggest issue though is what happens when the lines go down:
> 1. connect to local database server
> 2. line comes up => changes to local database sent to main database
> 3. connect to main database server
>
>  From what I understand this will be possible but I may need to roll my
> own replication code to handle data conflict issues? Especially since
> center 1/2 may be down at the same time and then might change the same
> data.
> How do the tools for this compare against the ones from Oracle?
>

It sounds like now is you have a master -> multi-slave setup. If master
goes down, people use slave instead w/ master re-replicating after it
comes back up. This config should be much easier to configure out of the
box -- you shouldn't need to write any custom app code.

What I'm not sure about though is what would happen with 2 slaves if
main goes down AND the connection between center 1 & 2 is also
disconnected. Others with more experience with the various master/slave
replication solutions might want to chime in now on how it would work.

Re: Choosing PostgreSQL as the database for our next project

From
Johnny Ljunggren
Date:
William Yu wrote:

>> Hmm, okay. We're now looking at another, simpler, way to do it. The
>> same setup but the clients on Center 1/2 will connect directly to the
>> Main center (2Mb leased line). The databases on Center 1/2 will then
>> just be a replica of the Main databases.
>> The biggest issue though is what happens when the lines go down:
>> 1. connect to local database server
>> 2. line comes up => changes to local database sent to main database
>> 3. connect to main database server
>
> It sounds like now is you have a master -> multi-slave setup. If master
> goes down, people use slave instead w/ master re-replicating after it
> comes back up. This config should be much easier to configure out of the
> box -- you shouldn't need to write any custom app code.
>
> What I'm not sure about though is what would happen with 2 slaves if
> main goes down AND the connection between center 1 & 2 is also
> disconnected. Others with more experience with the various master/slave
> replication solutions might want to chime in now on how it would work.

Well, this might be a problem, because there is no connection between
center 1 and 2, so they will basically live their own lives as long as
the connection to main center is down.


--
Johnny Ljunggren, Vestlia 6, 3080  HOLMESTRAND, 918 50 411

Re: Choosing PostgreSQL as the database for our next project

From
Christopher Browne
Date:
> William Yu wrote:
>> Johnny Ljunggren wrote:
>>
>>> 1. Replication - multimaster
>>> I'll try to explain the setup to the best of my ability:
>>> Three centers:
>>> Main center - database with a backup database
>>> Center 1 - database with a backup database
>>> Center 2 - database with a backup database (same as center 1)
>
> Hmm, okay. We're now looking at another, simpler, way to do it. The
> same setup but the clients on Center 1/2 will connect directly to the
> Main center (2Mb leased line). The databases on Center 1/2 will then
> just be a replica of the Main databases.
> The biggest issue though is what happens when the lines go down:
> 1. connect to local database server
> 2. line comes up => changes to local database sent to main database
> 3. connect to main database server
>
>  From what I understand this will be possible but I may need to roll
> my own replication code to handle data conflict issues? Especially
> since center 1/2 may be down at the same time and then might change
> the same data.

If you're considering the major choices in replication systems
(Slony-I, Mammoth Replicator), there is no problem with this, as
updates go to the main database server, and there are therefore
necessarily no conflicts.

Both of those systems are asynchronous single master replication
systems.
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','ntlug.org').
http://linuxdatabases.info/info/slony.html
"They laughed at Columbus, they laughed at Fulton, they laughed at the
Wright brothers.  But they also laughed at Bozo the Clown."
-- Carl Sagan

Re: Choosing PostgreSQL as the database for our next project

From
William Yu
Date:
Johnny Ljunggren wrote:
> William Yu wrote:
>
>> What I'm not sure about though is what would happen with 2 slaves if
>> main goes down AND the connection between center 1 & 2 is also
>> disconnected. Others with more experience with the various
>> master/slave replication solutions might want to chime in now on how
>> it would work.
>
>
> Well, this might be a problem, because there is no connection between
> center 1 and 2, so they will basically live their own lives as long as
> the connection to main center is down.

This would mean your system goes from Master-Slave to Multi-Master if
your main DB goes down. Roll your own replication time.