Re: PostgreSQL clustering VS MySQL clustering - Mailing list pgsql-performance

From William Yu
Subject Re: PostgreSQL clustering VS MySQL clustering
Date
Msg-id 41FB47B6.3010507@talisys.com
In response to Re: PostgreSQL clustering VS MySQL clustering  (Greg Stark <gsstark@mit.edu>)
List pgsql-performance
>>I know what I would choose. I'd get the mega server w/ a ton of RAM and skip
>>all the trickiness of partitioning a DB over multiple servers. Yes, your data
>>will grow to a point where even the XXGB can't cache everything. On the
>>other hand, memory prices drop just as fast. By that time, you can ebay your
>>original 16/32GB and get 64/128GB.
>
>
> a) What do you do when your calculations show you need 256G of ram? [Yes, such
> machines exist but you're no longer in the realm of simply "add more RAM".
> Administering such machines is nigh as complex as clustering]

If you need that much memory, you've got enough customers paying you
cash to pay for anything. :) Technology always improves: 8X Opterons
would double your memory capacity, as would higher-capacity DIMMs, etc.

> b) What do you do when you find you need multiple machines anyways to divide
> the CPU or I/O or network load up. Now you need n big beefy servers when n
> servers 1/nth as large would really have sufficed. This is a big difference
> when you're talking about the difference between colocating 16 1U boxen with
> 4G of ram vs 16 4U opterons with 64G of RAM...
>
> All that said, yes, speaking as a user I think the path of least resistance is
> to build n complete slaves using Slony and then just divide the workload.
> That's how I'm picturing going when I get to that point.

Replication is good for uptime and read-heavy systems. The problem is
that if your system has a high volume of writes and you need near-
realtime data syncing, clusters don't get you anything. A write on one
server means a write on every server. Spreading out the damage over
multiple machines doesn't help a bit.
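
To put rough numbers on that, here is a minimal sketch, assuming every
server holds a full copy of the data, reads are split evenly, and every
write is applied on every server. The function name and the workload
figures are made up for illustration, not measurements.

```python
def per_server_load(reads_per_sec, writes_per_sec, n_servers):
    """Operations per second each server must handle under full replication.

    Assumption: reads are divided evenly across the servers, while each
    write has to be applied on every server, so writes are not divided.
    """
    reads = reads_per_sec / n_servers
    writes = writes_per_sec
    return reads + writes


# Hypothetical workloads, 10,000 ops/sec in total each.
read_heavy = [per_server_load(9000, 1000, n) for n in (1, 2, 4)]
write_heavy = [per_server_load(1000, 9000, n) for n in (1, 2, 4)]

print(read_heavy)   # [10000.0, 5500.0, 3250.0]
print(write_heavy)  # [10000.0, 9500.0, 9250.0]
```

Under these assumptions, going from 1 to 4 servers cuts per-server load
by about two thirds for the read-heavy case but by well under 10% for
the write-heavy one.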

Plus, the fact that we don't have multi-master replication yet is quite a
bugaboo. It requires writing quite extensive code if you can't afford
to have 1 server be your single point of failure. We wrote our own
multi-master replication code at the client app level, and it's quite a
chore making sure the replication acts logically. Every table needs
separate logic to handle situations like "voucher was posted on
server 1 but voided afterward on server 2, what's the correct action here?"
So I've got a slew of complicated if-then-else statements that have to
take into account not only the type of update being made but also the
sequence.
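
To give a flavor of that kind of logic, here is a heavily simplified
sketch for the voucher case. It is not the code described above: the
`Update` type, the `seq` ordering field, and the rules themselves are
all hypothetical, and a real system would also need a trustworthy way
to order updates across servers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    server: str   # which server recorded the update
    action: str   # "post" or "void"
    seq: int      # hypothetical global ordering value (e.g. a timestamp)


def resolve_voucher(updates):
    """Merge updates from all servers into one final voucher state.

    Hypothetical rules: replay updates in sequence order; a post only
    applies to a new voucher; a void only applies to a posted voucher.
    Anything else (void before post, re-post after void) is ignored.
    """
    state = "new"
    for u in sorted(updates, key=lambda u: u.seq):
        if u.action == "post" and state == "new":
            state = "posted"
        elif u.action == "void" and state == "posted":
            state = "voided"
    return state


# Posted on server 1, then voided on server 2, received out of order.
merged = resolve_voucher([
    Update("server2", "void", seq=2),
    Update("server1", "post", seq=1),
])
print(merged)  # voided
```

Even this toy version depends on both the type of update and the order
in which the updates happened, and every table needs its own rules.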

And yes, I tried doing realtime locks over a VPN link between our servers
in SF and VA. Ugh... latency was absolutely horrible and made
transactions run 1000X slower.
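
A back-of-the-envelope sketch of why the round trips dominate. All of
the figures here are assumed for illustration (LAN and cross-country
round-trip times, number of lock round trips per transaction); none of
them are measurements from the setup described above.

```python
def txn_time_ms(local_work_ms, lock_round_trips, rtt_ms):
    """Rough transaction time: local work plus one network round trip
    per lock acquired. Ignores queuing, retries, and VPN overhead."""
    return local_work_ms + lock_round_trips * rtt_ms


# Assumed figures: 0.5 ms of local work, 5 lock round trips,
# 0.1 ms RTT on a LAN versus 80 ms RTT coast to coast.
lan = txn_time_ms(0.5, 5, 0.1)
wan = txn_time_ms(0.5, 5, 80.0)
slowdown = wan / lan

print(f"LAN: {lan:.1f} ms, WAN: {wan:.1f} ms, slowdown: {slowdown:.0f}x")
```

With these assumed numbers the slowdown is already in the hundreds, and
more lock round trips per transaction or extra VPN overhead push it
higher.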
