Re: Horizontal scalability/sharding - Mailing list pgsql-hackers

From Mason S
Subject Re: Horizontal scalability/sharding
Date
Msg-id CA+rR5x0dethD9+3iKRp92fTeCMWOxAdFTG6E4x08y3h-NjHJkw@mail.gmail.com
In response to Re: Horizontal scalability/sharding  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Horizontal scalability/sharding  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers


On Tue, Sep 1, 2015 at 6:55 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Aug 31, 2015 at 11:23:58PM -0300, Alvaro Herrera wrote:
> Bruce Momjian wrote:
>
> > My hope is that many FDW improvements will benefit sharding and
> > non-sharding workloads, but I bet some improvements are going to be
> > sharding-specific.  I would say we are still in the exploratory stage,
> > but based on the number of people who care about this feature and want
> > to be involved, I think we are off to a very good start.  :-)
>
> Having lots of interested people doesn't help with some problems,
> though.  The Citus document says:
>
>       And the issue with these four limitations wasn't with foreign
>       data wrappers. We wrote mongo_fdw and cstore_fdw, and we're
>       quite happy with the contract FDWs provide. The problem was that
>       we were trying to retrofit an API for something that it was
>       fundamentally not designed to do.

I had a chance to review the Citus Data document just now:

        https://goo.gl/vJWF85

Particularly, it links to this document, which is clearer about the
issues they are trying to solve:

        https://www.citusdata.com/blog/114-how-to-build-your-distributed-database

The document raises a big question --- when queries can't be processed in
a traditional top-down fashion, Citus has the goal of sending groups of
results up to the coordinator, reordering them, and then sending them back
to the shards for further processing, basically using the shards as
compute engines, because the shards are no longer using only local data
for their computations.  The two examples they give are COUNT(DISTINCT)
and a join across two sharded tables ("CANADA").

I assumed these queries were going to be solved by sending data to the
coordinator in as digested a form as possible, and having the coordinator
complete any remaining processing.  I think we are going to need to decide
if such "sending data back to shards" is something we are ever going to
implement.  I can see FDWs _not_ working well for that use-case.
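As an aside, the "send data as digested as possible" approach can be sketched for the COUNT(DISTINCT) case: each shard reduces its rows to a set of locally distinct values, and the coordinator finishes the aggregation by unioning those sets. This is an illustrative Python sketch, not PostgreSQL or Citus code, and the shard contents are made-up example data.

```python
# Sketch: global COUNT(DISTINCT x) over sharded data.
# Each shard digests its rows to a set of distinct values;
# the coordinator completes the remaining processing.

shards = [
    [1, 2, 2, 3],   # rows of column x on shard 0 (example data)
    [3, 4, 4, 5],   # shard 1
    [5, 5, 6],      # shard 2
]

def shard_distinct(rows):
    """Per-shard step: reduce rows to the set of distinct values."""
    return set(rows)

def coordinator_count_distinct(shards):
    """Coordinator step: union the per-shard sets and count."""
    seen = set()
    for rows in shards:
        seen |= shard_distinct(rows)
    return len(seen)

print(coordinator_count_distinct(shards))  # 6 distinct values: 1..6
```

The point of the sketch is that the per-shard sets may still be large, which is why Citus instead considers repartitioning and sending data back to the shards for further processing.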


For efficient inter-node joins with row shipping, FDWs may also not be easy to use. Maybe it is possible if we optionally pass in lists of the other nodes, along with information about how they are partitioned, so the FDW knows where data should be shipped.

A challenge for planning with arbitrary copies of different shards is that sometimes you can push down joins and sometimes you cannot, which makes planning and execution ugly. Maybe this can be simplified by having parent-child tables follow the same partitioning scheme.
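The push-down decision above can be sketched roughly as follows. This is a hypothetical illustration, not planner code: the table names, the metadata shape, and the `can_push_down_join` helper are all invented. The idea is that a join is shard-local only when both sides share the same partitioning scheme and join on their distribution columns.

```python
# Hypothetical sketch of the planning decision: a join between two
# distributed tables can be pushed down to the shards only when both
# tables are partitioned the same way on the join key.

from collections import namedtuple

Table = namedtuple("Table", ["name", "dist_column", "num_shards"])

def can_push_down_join(a, b, join_columns):
    """The join is shard-local if both sides have the same number of
    shards and the join is on each side's distribution column."""
    return (a.num_shards == b.num_shards
            and a.dist_column in join_columns.get(a.name, ())
            and b.dist_column in join_columns.get(b.name, ()))

orders = Table("orders", "customer_id", 32)
customers = Table("customers", "customer_id", 32)
events = Table("events", "event_id", 16)

# Join on the shared distribution column: shard-local, can push down.
print(can_push_down_join(orders, customers,
                         {"orders": ("customer_id",),
                          "customers": ("customer_id",)}))   # True

# Mismatched shard counts and keys: rows must be shipped between nodes.
print(can_push_down_join(orders, events,
                         {"orders": ("customer_id",),
                          "events": ("event_id",)}))         # False
```

If parent-child tables follow the same partitioning scheme, the first (push-down) case applies more often and the planner's job gets simpler.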

Mason
