Thread: High Availability / Replication with Sequoia

High Availability / Replication with Sequoia

From
"Mr.Frog.to.you@googlemail.com"
Date:
Hi Guys,

I have been testing / working with Postgres for a work project, and so
far I am really impressed with this DB system. Takes a little getting
used to, but I am really beginning to love it.

I am looking now at a scenario that does not seem to be a native
ability of Postgres, but might possibly be overcome with Sequoia. I am
hoping that there exists the possibility of using Sequoia to replicate
a DB between / among a number of machines in the office, some of which
are not always connected to the lan.

The scenario is like this..... On each of the machines I would want to
have Postgres installed and only to accepting connections from the
local machine. Also on each of these machines would be running Tomcat
or similar hosting the required application (app to connect to local
Postgres installation). Sequoia would then be used as a form of
replication from machine to machine to ensure that the database is
kept up to date.

The application does not allow writeback to the db, so for all intents
and purposes you can consider it read only.

To keep the applications database up to date with new information I
would be using ETL applications like Spoon / PDI. This will be done to
an as yet undecided 'point of origin', but it is probably safe to say
that it will be a commercial db server somewhere on our network. The
latency from our network to the 'Data Warehouse' (read as badly
managed dogs breakfast) is huge. Suffice to say the desire for local
db's is high, as is the desire to make the application portable for
our sometimes connected laptop users.

Does anyone have any experience or comments that they would like to
share about this sort of scenario? Its a fairly big jump from just
having Postgres running on my laptop for dev purposes to pushing this
to multiple machines and I would really appreciate any feedback you
guys might have.

Thanks in advance

The Frog

Re: High Availability / Replication with Sequoia

From
Richard Huxton
Date:
Mr.Frog.to.you@googlemail.com wrote:
> Hi Guys,
>
> I have been testing / working with Postgres for a work project, and so
> far I am really impressed with this DB system. Takes a little getting
> used to, but I am really beginning to love it.

Good to hear it Mr ...Umm... Frog.

> I am looking now at a scenario that does not seem to be a native
> ability of Postgres, but might possibly be overcome with Sequoia. I am
> hoping that there exists the possibility of using Sequoia to replicate
> a DB between / among a number of machines in the office, some of which
> are not always connected to the lan.

You might want to look at slony with log-shipping
   http://www.slony.info/documentation/logshipping.html
Have a master server on the lan, others grab files as and when they
need. Do read the "limitations" though.

> The application does not allow writeback to the db, so for all intents
> and purposes you can consider it read only.

Fine with slony.

> To keep the applications database up to date with new information I
> would be using ETL applications like Spoon / PDI. This will be done to
> an as yet undecided 'point of origin', but it is probably safe to say
> that it will be a commercial db server somewhere on our network. The
> latency from our network to the 'Data Warehouse' (read as badly
> managed dogs breakfast) is huge. Suffice to say the desire for local
> db's is high, as is the desire to make the application portable for
> our sometimes connected laptop users.

The syncing with your commercial DB is probably the most fiddly bit.
That's not so bad, since it's one-way.

> Does anyone have any experience or comments that they would like to
> share about this sort of scenario? Its a fairly big jump from just
> having Postgres running on my laptop for dev purposes to pushing this
> to multiple machines and I would really appreciate any feedback you
> guys might have.

Not used the log-shipping variant of slony, but I'm happy enough with
the regular connected-version.

--
   Richard Huxton
   Archonet Ltd

Re: High Availability / Replication with Sequoia

From
Shane Ambler
Date:
Richard Huxton wrote:
> Mr.Frog.to.you@googlemail.com wrote:

>> I am looking now at a scenario that does not seem to be a native
>> ability of Postgres, but might possibly be overcome with Sequoia. I am
>> hoping that there exists the possibility of using Sequoia to replicate
>> a DB between / among a number of machines in the office, some of which
>> are not always connected to the lan.
>
> You might want to look at slony with log-shipping
>   http://www.slony.info/documentation/logshipping.html
> Have a master server on the lan, others grab files as and when they
> need. Do read the "limitations" though.

I haven't used this or Sequoia before but slony log-shipping sounds
closer to what you want for the mobile users.

Sequoia appears to cater for always available servers to replicate to so
may not suite your mobile users.

>> The application does not allow writeback to the db, so for all intents
>> and purposes you can consider it read only.
>
> Fine with slony.
>
>> To keep the applications database up to date with new information I
>> would be using ETL applications like Spoon / PDI. This will be done to
>> an as yet undecided 'point of origin', but it is probably safe to say
>> that it will be a commercial db server somewhere on our network. The
>> latency from our network to the 'Data Warehouse' (read as badly
>> managed dogs breakfast) is huge. Suffice to say the desire for local
>> db's is high, as is the desire to make the application portable for
>> our sometimes connected laptop users.
>
> The syncing with your commercial DB is probably the most fiddly bit.
> That's not so bad, since it's one-way.
>

Not sure I get your whole plan here or not....

The main thing I am thinking here is whether you plan to have each user
sync from the commercial db or another postgresql db?

Sequoia appears to support different db sources but I would check
whether it supports replicating the same data between different 'brands'
of db or whether the master and slave must be the same brand if this is
where you plan to use it.


You could look at having postgresql draw data from your commercial db on
a set schedule something like odbclink at pgfoundry may help there.


How many of your users are mobile? Could most users be accommodated by
one central server with some mobile users getting the local copy to take
with them? Maybe syncing data when they request it? Or are you set on
automating the syncing?





--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

Re: High Availability / Replication with Sequoia

From
"Mr.Frog.to.you@googlemail.com"
Date:
Thanks for the help guys,

I should clear up a little what I am trying to achieve I think.

The primary users of this db and application will be located in an
office, each user with a desktop machine, all networked. They need to
work with this DB in a fairly heavy kind of way, in so far as to say
that 80% of their day will be working with the application and the db.

The primary source of data will / must be located on a database server
that is actually in a different facility. It is possible to reach this
server from the office, and is done so daily, however the speed of
connection is very slow and is frequently disconnected - in short
unrelaible. To implement an extension of this 'primary' db with the
associated hardware and licensing costs at the local site is beyond
what the business is willing to pay. It also goes directly against the
'structure' that has been laid out by the IT group in that they want
all the db servers in a single location - regardless of business
impact they want to make their budget savings.

So, what I want to do is to satisfy the IT group by keeping a 'master'
copy of the db on their off-site facility, which in fact will be
populated from a source system sitting on my desk. The ETL tools will
be used for creating a completely (or as near as possible) automated
system for populating the 'master' that is offsite.

What I wanted to do next was to have Postgres installed on each of the
local users machines, along with the application they require, and run
them as a cluster - if one db goes down or one machine dies the client
software / app can still connect to the cluster and keep functioning
from another machine. I could then have the defective machine attended
to and if necessary re-built... In short the ability to work would not
be interrupted. Or at least thats the hope.

These desktops shut down each night too, as the staff leave to go
home. There is no possibility to install a server locally
(unfortunately). So with this in mind I was hoping that the
'automatic' nature of Sequoia would allow for recovery / updating from
the master or others in the cluster and keep all the local db's up to
date without the users having to do anything.

There is also a desire to have a mobile copy of this db / app for some
of the mobile users that come in to the office. They wont be able to
update while external due to the way the network is designed, but once
back in the office they could do this. I was hoping once again to keep
this as effortless as possible for the users. I am still hoping that
this may be achieveable.

In summary, what we are looking at is an install of Postgres on each
machine, a copy of Tomcat running the application, and maybe Sequoia
or Slony or some combination of both. ETL is handled separately (by
me) and the users are supposed to just be able to get on with their
work.

Do you think this is achieveable or am I up the creek and reaching too
far here?

Cheers

The Frog (you caught me out - its not my real name!)