Re: Replication - Mailing list pgsql-general

From	Jan Wieck
Subject	Re: Replication
Date	April 21, 2004 16:01:41
Msg-id	40869207.8020509@Yahoo.com
In response to	Re: Replication (Andrew Sullivan <ajs@crankycanuck.ca>)
Responses	Re: Replication
List	pgsql-general
Andrew Sullivan wrote:

> On Tue, Apr 20, 2004 at 11:26:24AM +0200, Pailloncy Jean-Gérard wrote:
>> Hi,
>>
>> I just see that Mysql will propose at the end of the month a full
>> synchronous replication system with auto-recovery.
>
> Well, sort of.  It seems to be yet another 80/20 Solution From MySQL
> (tm).
>
> It looks like it's based on a new table type.  It stores everything
> in memory, and then writes out asynchronously.  This strikes me as
> pretty dangerous from the point of view of reliability: what if the
> box dies before the write is complete?  (And don't tell me about
> super-redundant high-availability hardware.  I _have_ all that.  All
> hardware sucks; HA stuff just sucks less often at a higher price.)
> Also, it doesn't support the other table types.  I don't want to
> contemplate the horrible mess you'd have to clean up if you had a
> transaction crossing three table types and get a hardware failure.
>
> I'm afraid I agree with the recently-posted Oracle Veep interview:
> this does not represent any serious challenge to the core ORAC
> market.

Quoting from the MySQL(tm) FAQ about MySQL(tm) Cluster(tm), available at
http://www.mysql.com/products/cluster/faq.html

<quote>
Q: Does MySQL Cluster work with MyISAM and InnoDB?

A: MySQL Cluster can include the MyISAM and InnoDB storage engines. Of
these, the high-availability data must reside in the MySQL Cluster
storage engine.

The MySQL Cluster DB node stores MySQL Cluster data, the MySQL Server
parses SQL and sends requests to the DB node. The MySQL Server does not
store any data belonging to the MySQL Cluster storage engine.

InnoDB/MyISAM data is still stored in the MySQL server and can be used
in the standard way, but that data is not replicated, so that data is
not visible from any other MySQL server that is connected to the MySQL
Cluster.
</quote>

It is just another table handler made available to the SQL query
engine. Touting loudly and on all available channels that "MySQL Cluster
combines the world's most popular open source database with a
parallel-server" naturally leads to the misinterpretation that all the
wonderful new features like foreign keys, MVCC and rollback will now
scale horizontally over multiple, highly available nodes. This is not true.

The NDB table type does not support foreign keys, constraints, or
triggers. It does support transactions, but these are not the same
transactions as those of the InnoDB table handler, so a COMMIT is not
atomic across different table types. MySQL likes to point out that the
largest systems, like SAP R/3, do not use referential integrity on the
database level. That is true so far, but having worked for many years
as an SAP Basis consultant I can tell you that the reason for that is
NOT performance. SAP spends several times that effort implementing
their own custom integrity control and data domain system in the DB
abstraction layer, to gain DB vendor independence. That abstraction
layer is larger than PHP and Apache together, so this example is IMHO
totally irrelevant for the typical MySQL user.
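
To illustrate why a COMMIT that spans two independent table handlers is
not atomic, here is a toy sketch in Python. It is not MySQL code: the
ToyEngine class, commit_all() and the fail_after parameter are all
hypothetical names made up for this illustration. It only models two
engines that each commit on their own, with a crash in between.

```python
class ToyEngine:
    """Hypothetical stand-in for one table handler with its own transactions."""

    def __init__(self, name):
        self.name = name
        self.committed = []  # rows that survived a commit
        self.pending = []    # rows written inside the open transaction

    def write(self, row):
        self.pending.append(row)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []

    def crash(self):
        # Uncommitted work is lost when the box dies.
        self.pending = []


def commit_all(engines, fail_after=None):
    """Commit each engine in turn; optionally simulate a crash after N commits."""
    for done, engine in enumerate(engines):
        if fail_after is not None and done == fail_after:
            for e in engines:
                e.crash()
            return False
        engine.commit()
    return True


ndb = ToyEngine("ndb")
innodb = ToyEngine("innodb")
ndb.write("sensor reading")
innodb.write("matching order row")

# The box dies after the first engine has committed, before the second.
ok = commit_all([ndb, innodb], fail_after=1)
print(ok, ndb.committed, innodb.committed)
```

After the simulated crash one engine holds its half of the "transaction"
and the other does not, which is exactly the mess one has to clean up by
hand when no common commit protocol spans the table types.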

Also, the NDB table type is based on an in-memory, partitioned storage
engine (that's where the speed comes from), and to get high availability
one needs at least two times the full database size in RAM (plus some
for the OS and other overhead), and a higher factor to really achieve
the 99.999%. So to serve, let's say, a 100 GB database, we're talking
about 220-240 GB of RAM. Now that's 8 boxes with 32 GB each? And
according to a MySQL consultant I spoke with, the real bottleneck is the
network, so these boxes like to have "better than Gigabit Ethernet" as a
backbone. Those are some decent hardware requirements; make sure you have
a forklift on your next shopping list.
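
The back-of-the-envelope sizing above can be sketched as a few lines of
Python. The 10% and 20% overhead figures are assumptions picked only
because they reproduce the 220-240 GB range; the function names are
made up for this example and are not part of any MySQL tool.

```python
import math


def ram_needed_gb(db_size_gb, replicas, overhead_pct):
    """RAM for `replicas` full in-memory copies plus a percentage of overhead."""
    return db_size_gb * replicas * (100 + overhead_pct) / 100


def boxes_needed(total_ram_gb, ram_per_box_gb):
    """Number of boxes, rounded up, needed to hold the total RAM."""
    return math.ceil(total_ram_gb / ram_per_box_gb)


# 100 GB database, two full copies, assumed 10% and 20% overhead.
low = ram_needed_gb(100, 2, 10)
high = ram_needed_gb(100, 2, 20)

print(low, high)                                    # total RAM range in GB
print(boxes_needed(low, 32), boxes_needed(high, 32))  # 32 GB boxes required
```

With those assumed overheads the upper end works out to 8 boxes of 32 GB
each, and that is before going to a higher replication factor.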

So what one gets with NDB on the bottom line is another table type that
is useful for some special cases. I can imagine, for example, systems
that read sensor data, which cannot be interrupted. Sensors usually
don't care much about referential integrity, so for the logging system
this is in fact irrelevant; the data has to be stored now and corrected
later. I think it is indeed a big plus for a system to make that
logging data available inside the same SQL query engine where the more
complicated bits and pieces of the application are implemented. But
that is all, and that can pretty easily be achieved by doing bulk loads
of the log data into regular database tables. Unless one really needs
the ability to query and analyse up to the last second of log data,
running several hundred kilodollars' worth of hardware and network
equipment just for the fun of a memory cluster solution is a bit of
overkill.

As the Oracle VP of product strategy, Ken Jacobs, pointed out: "MySQL is
trying to address certain product shortcomings by acquiring a
third-party technology. This does not mean they now have a product that
is competitive with Oracle—or even other—database products, whether
clustered or not." Absolutely right, Mr. Jacobs. They have done that
before by adding InnoDB; now they have added some limited multimaster
replication capabilities. But instead of developing an integrated
solution that includes the InnoDB table handler, where this
functionality would be useful, they just added a fifth wheel to the cart.

>
>> I use PostgreSQL and I would appreciate to have the same features in
>> PostgreSQL.
>
> Sure, so would I.  Talk to Jan Wieck about what he plans to do
> about it, and maybe consider supporting that development work too ;-)

Ken Jacobs further said "No one has anything at all like Oracle's Real
Application Clusters". And that is right too. However well PostgreSQL by
now compares on SQL features and standalone DB performance, on
replication we are 2 years or more behind.

Right now we need to get the Slony-I project out the door and let that
settle a bit and maybe get enhanced over one more release. With that as
the base, we will start designing a synchronous multimaster system that
can be jump-started from a running, asynchronous replication setup. All
this "high-availability" babble is IMHO totally pointless as long as
there is no way of (re)creating a (failed) node from scratch without
taking an outage. And that functionality is listed on the MySQL roadmap
for 5.1 ... so somewhere in 2008? Slony does that for async master-slave
today.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #