Thread: Disconnected editing - versioning of databases

Disconnected editing - versioning of databases

From
William Temperley
Date:
Hi All

I'm wondering if anyone can share any insights or experience with
temporary versions of databases, allowing "disconnected editing"
during Internet downtime.

The use-case is that I run a Postgres database, hosted in the UK, but
used by scientists in several other countries - Ecuador, Vietnam,
Kenya and Indonesia.
Unfortunately the connectivity between the UK and these countries is
flaky at best - perhaps usable 70% of the time. This means a lot of
wasted time for technicians and scientists working at remote
locations.

I could potentially run a database in each of these countries and
provide 100% uptime, obviously raising the issue of version conflicts
that would require hand-merging. Do people think this is feasible, or
a total non-starter? Performance is not an issue. Development time is
:)


Cheers

Will T

Re: Disconnected editing - versioning of databases

From
Greg Smith
Date:
On Tue, 14 Apr 2009, William Temperley wrote:

> I could potentially run a database in each of these countries and
> provide 100% uptime, obviously raising the issue of version conflicts
> that would require hand-merging.

It sounds like you want an asynchronous master-slave database architecture
where the slaves can also send changes back to the master, but didn't know
that's what you should be looking for.  A quick glance at
http://wiki.postgresql.org/wiki/Replication%2C_Clustering%2C_and_Connection_Pooling
suggests Bucardo and Londiste might be useful tools for you to
investigate; I don't think Slony or Mammoth can handle slaves generating
their own transactions and feeding them to the master, but given how
complicated Slony is maybe I just don't know how to do it there.

The Londiste tutorial at
http://pgsql.tapoueh.org/site/html/londiste/londiste.html even starts out
with a business situation that sounds similar to yours.

I would suggest that if you're new to the area of replication, do not
assume that just because these tools look complicated that you'd be better
off rolling your own.  The reason they're complicated is because they're
filled with solutions to hard problems most people never even think they
need to solve, until they get bit by one.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Disconnected editing - versioning of databases

From
Johan Nel
Date:
Will,

William Temperley wrote:
> Hi All
>
> I'm wondering if anyone can share any insights or experience with
> temporary versions of databases, allowing "disconnected editing"
> during Internet downtime.

What do your users use for updating data?  Software that you developed
for them?  Do they use a web-interface?

What I do in one of my applications (.NET) is to allow the users to make
data available offline with a simple XML serialisation.  So they are
then allowed to make updates to the local XML data, and when
online/connected again do a "upload" back into the central database.

Regards,

Johan Nel
Pretoria, South Africa.

Re: Disconnected editing - versioning of databases

From
Dimitri Fontaine
Date:
Le Wednesday 15 April 2009, Greg Smith a écrit :
> On Tue, 14 Apr 2009, William Temperley wrote:
> > I could potentially run a database in each of these countries and
> > provide 100% uptime, obviously raising the issue of version conflicts
> > that would require hand-merging.

Can you partition data on origin?
If that's possible, then do it and use a schema per origin to simplify the
administration thereafter, and "just" replicate some tables from orginin to UK
and central data from UK to editing countries.

> It sounds like you want an asynchronous master-slave database architecture
> where the slaves can also send changes back to the master, but didn't know
> that's what you should be looking for.  A quick glance at
> http://wiki.postgresql.org/wiki/Replication%2C_Clustering%2C_and_Connection
>_Pooling suggests Bucardo and Londiste might be useful tools for you to
> investigate; I don't think Slony or Mammoth can handle slaves generating
> their own transactions and feeding them to the master, but given how
> complicated Slony is maybe I just don't know how to do it there.
>
> The Londiste tutorial at
> http://pgsql.tapoueh.org/site/html/londiste/londiste.html even starts out
> with a business situation that sounds similar to yours.

The updated Londiste Tutorial is now to be found on the PostgreSQL wiki, and
I've updated the Replication/Clustering page to add links to it:
  http://wiki.postgresql.org/wiki/Skytools
  http://wiki.postgresql.org/wiki/Londiste_Tutorial

> I would suggest that if you're new to the area of replication, do not
> assume that just because these tools look complicated that you'd be better
> off rolling your own.  The reason they're complicated is because they're
> filled with solutions to hard problems most people never even think they
> need to solve, until they get bit by one.

If you're afraid about their complexity, try londiste and enjoy :)
--
dim

Attachment

Re: Disconnected editing - versioning of databases

From
William Temperley
Date:
On Wed, Apr 15, 2009 at 9:34 AM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
>> On Tue, 14 Apr 2009, William Temperley wrote:
>> > I could potentially run a database in each of these countries and
>> > provide 100% uptime, obviously raising the issue of version conflicts
>> > that would require hand-merging.
>
> Can you partition data on origin?
> If that's possible, then do it and use a schema per origin to simplify the
> administration thereafter, and "just" replicate some tables from orginin to UK
> and central data from UK to editing countries.
>

Thanks Dimitri, that might be the way forward for us. I guess that's
what's referred to in the Federated database on
http://wiki.postgresql.org/wiki/Londiste_Tutorial.
Eventually all data will need to find it's way into the main schema,
however. Once the multi-origin tables are replicated back to the UK
they could then be merged into this under supervision, though I'm not
sure what would happen to them once they were merged - I guess this
gives a multi-master situation.

>> Le Wednesday 15 April 2009, Greg Smith a écrit :
>> It sounds like you want an asynchronous master-slave database architecture
>> where the slaves can also send changes back to the master, but didn't know
>> that's what you should be looking for.
>>

That is what I was looking for, thanks Greg.

>
> If you're afraid about their complexity, try londiste and enjoy :)
> --
> dim
>

Londiste seems to be a consensus. It looks good -  modular,
configurable and it's Python! Cool.

Thanks again,

Will