Thread: Is there a peer-to-peer server solution with PG?

Is there a peer-to-peer server solution with PG?

From
Mike Nolan
Date:
I have need to set up a 2nd database server for a client in their new
offices in another state this month.  We will be shutting down the old
offices later this year but we really don't want to have 2-3 days of
downtime while we physically transfer equipment 800 miles.

We should have decent data connections between the two offices starting
next week, but I was wonding if there is a good peer-to-peer option for
PostgreSQL at this time.

As I understand Slony, it is master-slave only.
--
Mike Nolan

Re: Is there a peer-to-peer server solution with PG?

From
David Fetter
Date:
On Thu, Feb 03, 2005 at 06:25:50PM -0600, Mike Nolan wrote:
> I have need to set up a 2nd database server for a client in their
> new offices in another state this month.  We will be shutting down
> the old offices later this year but we really don't want to have 2-3
> days of downtime while we physically transfer equipment 800 miles.
>
> We should have decent data connections between the two offices
> starting next week, but I was wonding if there is a good
> peer-to-peer option for PostgreSQL at this time.
>
> As I understand Slony, it is master-slave only.

Slony-1 is perfectly capable of replicating to a slave database, then
letting you decide to promote it to master, which is just what you'd
need.  Why are you asking about multi-master?

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

Re: Is there a peer-to-peer server solution with PG?

From
Mike Nolan
Date:
> Slony-1 is perfectly capable of replicating to a slave database, then
> letting you decide to promote it to master, which is just what you'd
> need.  Why are you asking about multi-master?

I am concerned that if I have to support the traffic to keep the slave
unit in sync PLUS support general database use from the 'slaved' office
to the master one, on the same comm line, I might start running into
congestion issues.

We will have people actively working the database in both office for
a period of several weeks to several months, depending on how the final
transfer plan unfolds.

Master/Slave is probably an acceptable solution, I was just wondering if
there was a multi-master one available yet.
--
Mike Nolan

Re: Is there a peer-to-peer server solution with PG?

From
David Fetter
Date:
On Thu, Feb 03, 2005 at 07:03:36PM -0600, Mike Nolan wrote:
> > Slony-1 is perfectly capable of replicating to a slave database,
> > then letting you decide to promote it to master, which is just
> > what you'd need.  Why are you asking about multi-master?
>
> I am concerned that if I have to support the traffic to keep the
> slave unit in sync PLUS support general database use from the
> 'slaved' office to the master one, on the same comm line, I might
> start running into congestion issues.

Slony-1 does its level best to ship transactions in a compact way.
Any write operations are done as the net result of the write
transaction, not necessarily all the steps in between.  IOW, don't
worry too much :)

> We will have people actively working the database in both office for
> a period of several weeks to several months, depending on how the
> final transfer plan unfolds.

Sounds like a fit for Slony-1.  Just make sure that nobody tries to
write to a slave, as such writes will fail.

> Master/Slave is probably an acceptable solution, I was just
> wondering if there was a multi-master one available yet.

Not really.  If you *must* have multi-master, you probably have to get
Oracle or DB2 and pay /mucho dinero/.

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

Re: Is there a peer-to-peer server solution with PG?

From
Christopher Browne
Date:
Martha Stewart called it a Good Thing when nolan@gw.tssi.com (Mike Nolan) wrote:
>> Slony-1 is perfectly capable of replicating to a slave database,
>> then letting you decide to promote it to master, which is just what
>> you'd need.  Why are you asking about multi-master?
>
> I am concerned that if I have to support the traffic to keep the
> slave unit in sync PLUS support general database use from the
> 'slaved' office to the master one, on the same comm line, I might
> start running into congestion issues.
>
> We will have people actively working the database in both office for
> a period of several weeks to several months, depending on how the
> final transfer plan unfolds.
>
> Master/Slave is probably an acceptable solution, I was just
> wondering if there was a multi-master one available yet.

There is an effort under way; in planning stages at this point.  Don't
expect that to be "productized" next month...

Let me wag a finger at one of your assumptions...

You should re-examine assumptions with great care if you start
imagining that you'll get more throughput out of a general purpose
"multimaster" system.  (Something designed specifically for your
application is quite another matter, particularly if your application
turns out to be, in some fashion "embarassingly parallelizable.")

Synchronization can't _conceivably_ come for free; it has _got_ to
have some cost in terms of decreasing overall performance.

If you have so much update load that one server cannot accomodate that
load, then you should wonder why you'd expect that causing every one
of these updates to be applied to (say) 3 servers would "diminish"
this burden.

Each of the 3 servers may only have to take on 1/3 of the updates from
the outside, but they surely have to accomodate the other 2/3 as well.

This not to say that there can't be some benefits from multimaster
replication; that's why such projects are proceeding.

But it's NOT a panacea; it's NOT an easy "general purpose solution."

I was in a room with The Thinkers; I got the sense that the lights
dimmed for blocks around when they put their thinking caps on :-).  To
this group of Rather Smart Folk, perceiving the array of concurrency
and locking problems required great attention on their part.  'Easy'
is definitely not the right word...
--
(reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
http://linuxdatabases.info/info/slony.html
Rules of the Evil Overlord #31. "All naive, busty tavern wenches in my
realm  will be replaced  with surly,  world-weary waitresses  who will
provide no  unexpected reinforcement  and/or romantic subplot  for the
hero or his sidekick." <http://www.eviloverlord.com/>

Re: Is there a peer-to-peer server solution with PG?

From
Mike Nolan
Date:
> If you have so much update load that one server cannot accomodate that
> load, then you should wonder why you'd expect that causing every one
> of these updates to be applied to (say) 3 servers would "diminish"
> this burden.

The update/query load isn't the real issue here, it's that these two
servers will be 800 miles apart and there are some advantages in having
each office connect to its local database rather than having one of
them connect to the remote master.

The Slony-1 approach will work, assuming I've got suffient network
bandwidth to support it plus the traffic from the remote office plus
exixting outside traffic from our public website.

That's one of those things you just don't know will work until you
have it built, so I'm looking for other options now while I have time
to consider them.  Once I get on-site in two weeks it'll a lot more hectic.
--
Mike Nolan

Re: Is there a peer-to-peer server solution with PG?

From
Christopher Browne
Date:
Martha Stewart called it a Good Thing when david@fetter.org (David Fetter) wrote:
> On Thu, Feb 03, 2005 at 07:03:36PM -0600, Mike Nolan wrote:
>> > Slony-1 is perfectly capable of replicating to a slave database,
>> > then letting you decide to promote it to master, which is just
>> > what you'd need.  Why are you asking about multi-master?
>>
>> I am concerned that if I have to support the traffic to keep the
>> slave unit in sync PLUS support general database use from the
>> 'slaved' office to the master one, on the same comm line, I might
>> start running into congestion issues.
>
> Slony-1 does its level best to ship transactions in a compact way.
> Any write operations are done as the net result of the write
> transaction, not necessarily all the steps in between.  IOW, don't
> worry too much :)

Sorta.  If there were SQL queries involving in _preparing_ for the
writes, those queries do not need to be run again.

On the other hand, if you run a SQL query like:

  delete from t1 where id in (select id from t1 limit
        7000);

(Which is a query I have recently used for some testing...)

you'll discover that ultimately turns into somewhere around 7000
delete statements when it hits the replica.

-> An insert of 7000 rows becomes 7000 insert statements
-> A delete of 7000 rows becomes 7000 delete statements
-> An update to 7000 rows becomes 7000 update statements

Mass updates can therefore get fairly expensive, alas.

>> We will have people actively working the database in both office
>> for a period of several weeks to several months, depending on how
>> the final transfer plan unfolds.

> Sounds like a fit for Slony-1.  Just make sure that nobody tries to
> write to a slave, as such writes will fail.

... Which is actually a Remarkably Good Feature.  I once pointed a
report that wanted to update data to a wrong node, and would have been
Seriously Chagrined if it had silently gone along with the updates...
--
output = ("cbbrowne" "@" "acm.org")
http://linuxdatabases.info/info/slony.html
"...In my phone conversation with Microsoft's lawyer I copped to the
fact that just maybe his client might see me as having been in the
past just a bit critical of their products and business
practices. This was too bad, he said with a sigh, because they were
having a very hard time finding a reporter who both knew the industry
well enough to be called an expert and who hadn't written a negative
article about Microsoft." -- Robert X. Cringely

Re: Is there a peer-to-peer server solution with PG?

From
Marco Colombo
Date:
On Fri, 4 Feb 2005, Mike Nolan wrote:

>> If you have so much update load that one server cannot accomodate that
>> load, then you should wonder why you'd expect that causing every one
>> of these updates to be applied to (say) 3 servers would "diminish"
>> this burden.
>
> The update/query load isn't the real issue here, it's that these two
> servers will be 800 miles apart and there are some advantages in having
> each office connect to its local database rather than having one of
> them connect to the remote master.

Considering that the two masters need to be connected, I don't see the
advantage. Any write on the _local_ master will have to wait for the
_remote_ master to complete as well.

> The Slony-1 approach will work, assuming I've got suffient network
> bandwidth to support it plus the traffic from the remote office plus
> exixting outside traffic from our public website.

Local read-only access won't travel on the network, both with multi-master
and with master-slave.

Write traffic _will_ be transmitted over the wire, both with multi-master
and with master-slave. With multi-master _every write operation_ will be
remotely duplicated, _both ways_, _synchronously_. That is, master-1 has
to wait for master-2 and vice versa. If you're concerned with network
performances, multi-master will only make it worse.

With master-slave, _only_ the write operations performed on
the slave side need to travel, since clients will perform them directly
on the master. On the master side writes are only local.
Of course, the _results_ of the writes will have to be propagated to the
slave (and thus they travel on the network as well), but that's another
matter (delay is usually acceptable, and fits MVCC - the semantics not
broken).

If you're willing to break semantics, you may run two splitted masters
and find a way to synchronize them. But that requires application level
knowledge, if not human intervention, to resolve conflicts.


> That's one of those things you just don't know will work until you
> have it built, so I'm looking for other options now while I have time
> to consider them.  Once I get on-site in two weeks it'll a lot more hectic.
> --
> Mike Nolan
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>      joining column's datatypes do not match
>

.TM.
--
       ____/  ____/   /
      /      /       /            Marco Colombo
     ___/  ___  /   /              Technical Manager
    /          /   /             ESI s.r.l.
  _____/ _____/  _/               Colombo@ESI.it

Re: Is there a peer-to-peer server solution with PG?

From
Tzahi Fadida
Date:
I am just a newbie but logically:
Maybe the answer to that is much simpler.
Ask your network officer to tell you whats the bandwidth you
have on your current office and remote office.
whats the avg:
a. website bandwidth.
b. current postgress office bandwidth.

I never used replication but it seems to me you'll need
a+2*b bandwidth at your current office and 2*b at your remote office
for the period of transition.
If your db size is C then you'll need (C/b)/3600 hrs in transition time.
do the math and if it fits great. If not, well...


Regards,
    tzahi.

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Mike Nolan
> Sent: Friday, February 04, 2005 12:57 PM
> To: Christopher Browne
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Is there a peer-to-peer server
> solution with PG?
>
>
> > If you have so much update load that one server cannot
> accomodate that
> > load, then you should wonder why you'd expect that causing
> every one
> > of these updates to be applied to (say) 3 servers would "diminish"
> > this burden.
>
> The update/query load isn't the real issue here, it's that
> these two servers will be 800 miles apart and there are some
> advantages in having each office connect to its local
> database rather than having one of them connect to the remote
> master.
>
> The Slony-1 approach will work, assuming I've got suffient
> network bandwidth to support it plus the traffic from the
> remote office plus
> exixting outside traffic from our public website.
>
> That's one of those things you just don't know will work
> until you have it built, so I'm looking for other options now
> while I have time to consider them.  Once I get on-site in
> two weeks it'll a lot more hectic.
> --
> Mike Nolan
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index
> scan if your
>       joining column's datatypes do not match
>
>



How many connections now ?

From
"NTPT"
Date:
Is there a way to determine how many connections to the database are  active
?
some form of select  ? ie how many client application are connected to
server  ?  I need this value for client application ...

Thank for help.


Re: How many connections now ?

From
Devrim GUNDUZ
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Hi,

On Sat, 5 Feb 2005, NTPT wrote:

> Is there a way to determine how many connections to the database are  active
> ?
> some form of select  ? ie how many client application are connected to server
> ?  I need this value for client application ...

Check  pg_stat_activity  table...

Regards,
- --
Devrim GUNDUZ
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.tdmsoft.com                         http://www.gunduz.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFCBKz9tl86P3SPfQ4RAiTpAKDwUUuvju8xLhAjFzCKYsF4WQp7JACgrpQC
hhw0sDz2XDHElHCfBZFP8gs=
=pYgE
-----END PGP SIGNATURE-----

Re: Is there a peer-to-peer server solution with PG?

From
Jan Wieck
Date:
On 2/4/2005 5:56 AM, Mike Nolan wrote:

>> If you have so much update load that one server cannot accomodate that
>> load, then you should wonder why you'd expect that causing every one
>> of these updates to be applied to (say) 3 servers would "diminish"
>> this burden.
>
> The update/query load isn't the real issue here, it's that these two
> servers will be 800 miles apart and there are some advantages in having
> each office connect to its local database rather than having one of
> them connect to the remote master.

You do realize that any multimaster replication system, that is designed
to avoind complex business process structure based conflict resolution
mechanisms, necessarily has to be based on 2 phase commit or similar? So
your global write transaction throughput will be limited by the latency
of your WAN, no matter what bandwidth you have. And as per RFC 1925: No
matter how hard you push and no matter what the priority, you can't
increase the speed of light.

I think what you are really looking for is an application internal
abstraction layer based multmaster replication approach.


Jan


>
> The Slony-1 approach will work, assuming I've got suffient network
> bandwidth to support it plus the traffic from the remote office plus
> exixting outside traffic from our public website.
>
> That's one of those things you just don't know will work until you
> have it built, so I'm looking for other options now while I have time
> to consider them.  Once I get on-site in two weeks it'll a lot more hectic.
> --
> Mike Nolan
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: Is there a peer-to-peer server solution with PG?

From
Christopher Browne
Date:
Quoth JanWieck@Yahoo.com (Jan Wieck):
> On 2/4/2005 5:56 AM, Mike Nolan wrote:
>
>>> If you have so much update load that one server cannot accomodate that
>>> load, then you should wonder why you'd expect that causing every one
>>> of these updates to be applied to (say) 3 servers would "diminish"
>>> this burden.

>> The update/query load isn't the real issue here, it's that these two
>> servers will be 800 miles apart and there are some advantages in having
>> each office connect to its local database rather than having one of
>> them connect to the remote master.
>
> You do realize that any multimaster replication system, that is
> designed to avoind complex business process structure based conflict
> resolution mechanisms, necessarily has to be based on 2 phase commit
> or similar? So your global write transaction throughput will be
> limited by the latency of your WAN, no matter what bandwidth you
> have. And as per RFC 1925: No matter how hard you push and no matter
> what the priority, you can't increase the speed of light.
>
> I think what you are really looking for is an application internal
> abstraction layer based multmaster replication approach.

Note also that there can be some "embarassingly parallel" systems that
can scale _perfectly well_ given some reasonable 'global sequencing
system.'  ("Global sequences" seemed to be the only 'surprise' that
popped up at the conference, which surprised me because it was one of
the few things I was entirely certain needed discussion ;-).)

Consider a general ledger transaction system for a retail operation
with 200 stores.

Each accounting transaction actually is pretty independent of the
others; with suitable application design, it's perfectly reasonable to
generate transactions locally at each site and [somehow; there lies
the grand detail] roll those together into the central G/L at the end.

Purely accounting transactions don't usually have any need to conflict
with anything at all.

Of course, as soon as you have _shared_ objects in your business, such
as account balances [that people can lay claim to], inventory [that we
can promise to customers], and 8 people that want to change prices,
then comes the trouble of conflict resolution.

You've got to think about which of these sorts of conflicts do and do
not exist before heading down any of the "multimaster" roads otherwise
trouble awaits...

It's quite likely for reality to need to be a mix.  For account
balances, you more than likely need to go to only ONE source;
"conflict resolution" would be liable to 'break the bank.'  For
inventory, it's probably not unreasonable for different sites to fight
over it :-).
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/slony.html
What's another word for synonym?

Re: Is there a peer-to-peer server solution with PG?

From
Pailloncy Jean-Gerard
Date:
> You do realize that any multimaster replication system, that is
> designed to avoind complex business process structure based conflict
> resolution mechanisms, necessarily has to be based on 2 phase commit
> or similar? So your global write transaction throughput will be
> limited by the latency of your WAN, no matter what bandwidth you have.
> And as per RFC 1925: No matter how hard you push and no matter what
> the priority, you can't increase the speed of light.
>
> I think what you are really looking for is an application internal
> abstraction layer based multmaster replication approach.
Hi,

I found a paper about Clustra DB
http://www.nuug.no/pub/dist/20011017-clustra.pdf

Clustra is a cluster database for high-availability, any node has one
other fail-over node designed. But the interesting idea is that the
data is distributed across the node.
A transaction implies many nodes, but is managed by a two phase commit
and the log is written on ONLY few (two) nodes.
I suppose (because I never used it) that there is a speed boost.

Cordialement,
Jean-Gérard Pailloncy


Re: Is there a peer-to-peer server solution with PG?

From
Christopher Browne
Date:
jg@rilk.com (Pailloncy Jean-Gerard) writes:
> I suppose (because I never used it) that there is a speed boost.

I would suppose (because synchronization has substantial costs) that
there is more than likely _no_ material improvement in performance.

"Clustering" can only provide material performance improvements if the
problem turns out to be "embarrassingly parallelizable."  That is a
rather difficult condition to satisfy, and it is very likely to be
FALSE whenever there is need for conflict resolution between nodes.
--
"cbbrowne","@","ca.afilias.info"
<http://linuxdatabases.info/info/slony.html>
Christopher Browne
(416) 673-4124 (land)

Re: Is there a peer-to-peer server solution with PG?

From
Robert Treat
Date:
On Sat, 2005-02-05 at 15:03, Jan Wieck wrote:
> On 2/4/2005 5:56 AM, Mike Nolan wrote:
>
> >> If you have so much update load that one server cannot accomodate that
> >> load, then you should wonder why you'd expect that causing every one
> >> of these updates to be applied to (say) 3 servers would "diminish"
> >> this burden.
> >
> > The update/query load isn't the real issue here, it's that these two
> > servers will be 800 miles apart and there are some advantages in having
> > each office connect to its local database rather than having one of
> > them connect to the remote master.
>
> You do realize that any multimaster replication system, that is designed
> to avoind complex business process structure based conflict resolution
> mechanisms, necessarily has to be based on 2 phase commit or similar? So
> your global write transaction throughput will be limited by the latency
> of your WAN, no matter what bandwidth you have. And as per RFC 1925: No
> matter how hard you push and no matter what the priority, you can't
> increase the speed of light.
>

I think the advantage Mike is looking for is to not have his READ
traffic have to travel 1600 miles for the remote office. If the read's
outnumber the writes by enough, he might have something to gain.

Mike, I've yet to see a thorough review of daffodil replicator but it
may be able to help get you to a little closer to what your looking for.
If you have time please check it out and see if it can be of any help,
I'm sure many of us would be interested in hearing some feedback on it.
http://www.daffodildb.com/dbreplicator.html



Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL