Re: [HACKERS] Mariposa - Mailing list pgsql-hackers

From Bob Devine
Subject Re: [HACKERS] Mariposa
Date
Msg-id 37A72837.762BF6E1@cs.utah.edu
Responses Re: [HACKERS] Mariposa
List pgsql-hackers
Ross J. Reedstrom wrote:

> Right. As I've been able to make out so far, in Mariposa a query passes
> through the regular parser and single-site optimizer, then the selected
> plan tree is handed to a 'fragmenter' to break the work up into chunks,
> which are then handed around to a 'broker' which uses a microeconomic
> 'bid' process to parcel them out to both local and remote executors. The
> results from each site then go through a local 'coordinator' which merges
> the result sets, and hands them back to the original client.
> 
> Whew!
> 
> It's interesting to compare the theory describing the workings of Mariposa
> (such as the paper in VLDB), and the code. For the fragmenter, the paper
> describes basically a rational decomposition of the plan, while the code
> applies non-deterministic, but tuneable, methods (lots of calls to random
> and comparisons to user-specified odds ratios).
> 
> It strikes me as a bit odd to optimize the plan for a single site,
> then break it all apart again. My thoughts on this are to implement
> two new node types: one a remote table, and one which represents
> access to a remote table. Remote tables have host info in them, and
> would always be added to the plan with a remote-access node directly above
> them. Remote-access nodes would be separate from their remote-table,
> to allow the communications cost to be slid up the plan tree, and merged
> with other remote-access nodes talking to the same server. This should
> maintain the order-agnostic nature of the optimizer. The executor will
> need to build SQL statements from the sub-plans and submit them via
> standard network db access client libraries.
> 
> First step, create a remote-table node, and teach the executor how to get
> info from it. Later, add the separable remote-access node.
> 
> How insane does this sound now? Am I still a mad scientist? (...always!)
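
A rough sketch of the quoted proposal, to make the two node types
concrete. This is not code from Mariposa or PostgreSQL: every class and
function name here (RemoteTable, RemoteAccess, add_access, slide_up,
remote_sql) is hypothetical, and the plan tree is reduced to tables and
joins. It only shows how a remote-access node could start directly above
each remote table, get merged with a sibling talking to the same host,
and then be turned into one SQL statement for that server.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class LocalTable:
    name: str


@dataclass(frozen=True)
class RemoteTable:
    host: str  # host info lives in the remote-table node
    name: str


@dataclass(frozen=True)
class Join:
    left: "Node"
    right: "Node"
    on: str  # join condition kept as SQL text (simplifying assumption)


@dataclass(frozen=True)
class RemoteAccess:
    host: str
    child: "Node"  # sub-plan executed on `host`


Node = Union[LocalTable, RemoteTable, Join, RemoteAccess]


def add_access(node: Node) -> Node:
    """Put a remote-access node directly above every remote table."""
    if isinstance(node, RemoteTable):
        return RemoteAccess(node.host, node)
    if isinstance(node, Join):
        return Join(add_access(node.left), add_access(node.right), node.on)
    return node


def slide_up(node: Node) -> Node:
    """Merge remote-access nodes under a join when both use the same host."""
    if isinstance(node, Join):
        left, right = slide_up(node.left), slide_up(node.right)
        if (isinstance(left, RemoteAccess) and isinstance(right, RemoteAccess)
                and left.host == right.host):
            merged = Join(left.child, right.child, node.on)
            return RemoteAccess(left.host, merged)
        return Join(left, right, node.on)
    return node


def from_clause(node: Node) -> str:
    """Render a remote sub-plan as a FROM-clause expression."""
    if isinstance(node, (LocalTable, RemoteTable)):
        return node.name
    if isinstance(node, Join):
        return f"({from_clause(node.left)} JOIN {from_clause(node.right)} ON {node.on})"
    raise TypeError(f"unexpected node inside a remote sub-plan: {node!r}")


def remote_sql(access: RemoteAccess) -> str:
    """Build the statement the executor would submit to access.host."""
    return "SELECT * FROM " + from_clause(access.child)
```

With tables a and b on the same host and c local, slide_up() leaves one
remote-access node covering the a/b join, so a single statement goes to
that host and only the join against c runs locally.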

Let me give a brief "what, where, and why" about Mariposa.

The impetus for Mariposa was that every distributed database
idea either died or failed to scale up beyond a handful of nodes.
There was Ingres/Star, IBM's important R* project, DEC's failed
RdbStar project, and others.  The idea of grouping together a
bunch of servers is a seductive, but very hard, one.

What exists today is a group of simple extensions (basically:
treating remote tables differently, using replication instead
of distributed consistency, and pushing the problem up to the
programmer's level).  Not too transparent or even seamless.

Mariposa proposed that tables can dynamically move.  And fragments
of tables.  And queries too!  This helps a lot towards meeting
the big problem of load balancing and configuring any huge system.
Compare this with web servers -- web sites can become overloaded,
plus everyone knows the permanent location of the data on the
server (there is some tricky footwork behind the scenes that allows
multiple servers to share the load).

Soooo, how to optimize a query where _everything_ can move?
That is the basis for the "optimize for a single site, later
split it for distributed execution" idea.  The loss of data
mobility would have been more painful than a more complicated
optimizer (at least in theory! ;-)  So it does sound a bit insane, but
the alternative would have been to assume all tables are remote,
and this collapses to the same idea as "all local" tables that
just have higher access costs.
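
To make the "optimize once, split late" idea concrete, here is a
minimal sketch. It is not Mariposa's fragmenter (which, as noted
above, also involves bidding and tunable randomness); the names Scan,
Join, site_of and fragment, and the table and site names, are all made
up for illustration. The single-site plan is built with no knowledge of
where tables live, and only the split step consults the current
table-to-site map, so moving a table changes the fragments but never
the plan.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Join:
    left: "Plan"
    right: "Plan"


Plan = Union[Scan, Join]


def site_of(plan: Plan, location: Dict[str, str]) -> Optional[str]:
    """Return the one site holding every table under plan, or None if mixed."""
    if isinstance(plan, Scan):
        return location[plan.table]
    left = site_of(plan.left, location)
    right = site_of(plan.right, location)
    return left if left is not None and left == right else None


def fragment(plan: Plan, location: Dict[str, str]) -> List[Tuple[str, Plan]]:
    """Split a single-site plan into maximal (site, sub-plan) fragments.

    A join whose inputs live on different sites is not assigned to any
    site here; in this sketch it is left for whoever merges the results.
    """
    site = site_of(plan, location)
    if site is not None:
        return [(site, plan)]
    # Only a Join can be mixed-site, so it is safe to recurse into both sides.
    return fragment(plan.left, location) + fragment(plan.right, location)


# The plan is chosen once, as if everything were local.
plan = Join(Join(Scan("emp"), Scan("dept")), Scan("sales"))

# Same plan, two different placements of the "dept" table.
before = fragment(plan, {"emp": "siteA", "dept": "siteA", "sales": "siteB"})
after = fragment(plan, {"emp": "siteA", "dept": "siteB", "sales": "siteB"})
```

In "before" the emp/dept join ships to siteA as one fragment; in
"after" it falls apart into separate scans because dept has moved.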

Fast forward to 1999: Mariposa is now being commercialized by
Cohera (www.cohera.com) as a middleware distribution layer.

--
Bob Devine  devine@cs.utah.edu

(PS: Just for background, I proposed a lot of the Mariposa ideas
way back in 1992 at Berkeley after working on DEC's RdbStar.
Now working on Impulse - www.cs.utah.edu/projects/impulse)



