Re: Scalability (both vertical and horizontal)? - Mailing list pgsql-general

From Ron Johnson
Subject Re: Scalability (both vertical and horizontal)?
Msg-id 1063938123.11739.1673.camel@haggis
In response to Re: Scalability (both vertical and horizontal)?  ("scott.marlowe" <scott.marlowe@ihs.com>)
List pgsql-general
On Thu, 2003-09-18 at 16:32, scott.marlowe wrote:
> On Thu, 18 Sep 2003, Dennis Gearon wrote:
>
> > scott.marlowe wrote:
> >
> > >On Thu, 18 Sep 2003, Duffey, Kevin wrote:
[snip]
> > Are there any databases that do well in horizontal scaling? What really
> > *IS* Oracle Real Application Clusters?

DECpaq ported VMSclusters to Tru64.  Oracle then licensed that
and built it into RAC.

> I've heard Vax Clusters running RDB do well.

It's seamless and extremely simple.  On each node that you want
the database open, you run the command:
$RMU/OPEN <yourdatabase>

The "root file" knows what transactions are open on each node, so
if node FOO crashes, the MONITOR picks another node to apply the
RECOVERY UNIT JOURNAL files from the transactions that were open
on FOO when it crashed.
(RDB doesn't use MVCC, so it keeps an RUJ file for each process,
that has the "before images" of all table & index tuples involved
in transactions.  If the txn commits successfully, the file is
zeroed out, but if it must be rolled back, the data is read back
from the RUJ and applied back to the appropriate tablespace pages.)
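
To make the before-image idea concrete, here is a minimal sketch in Python.
It is not RDB's code or its on-disk format. The names (UndoJournal,
Transaction, recover) and the dict standing in for tablespace pages are all
made up for illustration. It only shows the general technique: save the old
value before overwriting, discard the journal on commit, and replay it in
reverse on rollback or after a crash.

```python
# Hypothetical sketch of a per-process "before image" undo journal.
# A plain dict stands in for the tablespace pages; nothing here is RDB's API.

MISSING = object()  # marks "this key did not exist before the write"


class UndoJournal:
    """Holds the before images written by one process's open transaction."""

    def __init__(self):
        self.before_images = []  # list of (key, old_value), in write order

    def record(self, key, old_value):
        self.before_images.append((key, old_value))

    def truncate(self):
        # Analogous to zeroing out the journal once it is no longer needed.
        self.before_images.clear()


def recover(store, journal):
    """Undo every change in the journal, newest first, then empty it.

    Any surviving node could run this against the journal left behind by a
    crashed process, since the journal alone describes how to undo the work.
    """
    for key, old_value in reversed(journal.before_images):
        if old_value is MISSING:
            store.pop(key, None)    # the key was created by the transaction
        else:
            store[key] = old_value  # put the before image back
    journal.truncate()


class Transaction:
    def __init__(self, store):
        self.store = store
        self.journal = UndoJournal()

    def write(self, key, value):
        # Save the before image first, then change the data in place.
        self.journal.record(key, self.store.get(key, MISSING))
        self.store[key] = value

    def commit(self):
        # The data is already in place, so only the journal is discarded.
        self.journal.truncate()

    def rollback(self):
        recover(self.store, self.journal)


if __name__ == "__main__":
    pages = {"a": 1}
    txn = Transaction(pages)
    txn.write("a", 2)
    txn.write("b", 3)
    txn.rollback()
    print(pages)
```

Since rollback and crash recovery share one code path here, the only thing
another node needs from the dead process is its journal, which matches the
description above of the MONITOR handing the RUJ files to another node.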

> TPF on a mainframe is highly recommended by Sabre, the Airline reservation
> folks.
>
> I've heard horror stories about RAC though.
>
> I don't think there's any such thing as an easily configurable high
> performance clustering solution.  The better they run the more
> infrastructure (hardware, software, support) they seem to need.

VMSclusters are very easy to support, but the speed problem is there.
It's just so much faster to go from CPU0 to CPU15 in a 16x SMP or NUMA
box than it is to go between nodes in a 4x cluster of 4x SMP boxes.

Also, you need so much extra *expensive* "stuff" to connect a bunch
of boxen with dual-redundancy in a high-speed cluster.

Database replication to a remote site that has a "smaller" box is
usually a cheaper solution nowadays, since the remote box can still
be used as a report engine.

A recent development is "in-box clustering".  Recent large Alpha
systems (along with recent versions of VMS) allow for h/w
partitioning (like mainframes have done for, what, 20 years?, and
Sun does with the Starfire).
Thus, you can take a 16x machine and divvy it up into 4 nodes.
The cool part is that inter-node chatter takes place at "in-box"
speeds, instead of at wire speeds.
So, you could have nodes dedicated to batch jobs, on-line access,
etc.
On-line on-the-fly re-partitioning allows you to take CPUs from
underutilized nodes and allocate them to overtaxed nodes.
Of course, RDB doesn't need a clue about this; it just runs.

--
-----------------------------------------------------------------
Ron Johnson, Jr. ron.l.johnson@cox.net
Jefferson, LA USA

"What other evidence do you have that they are terrorists, other
than that they trained in these camps?"
17-Sep-2002 Katie Couric to an FBI agent regarding the 5 men
arrested near Buffalo NY

