Re: Horizontal scalability/sharding - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Horizontal scalability/sharding
Date
Msg-id 20150902140558.GA21716@momjian.us
Whole thread Raw
In response to Re: Horizontal scalability/sharding  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Tue, Sep  1, 2015 at 06:11:45PM -0400, Bruce Momjian wrote:
> Let me clearer about what the Citus Data paper shows.  I said originally
> that the data was sent to the coordinator, sorted, then resent to the
> shards, but the document:
> 
>     https://goo.gl/vJWF85
>     https://www.citusdata.com/blog/114-how-to-build-your-distributed-database
> 
> has the shards create the groups and the groups are sent to the other
> shards.  For example, to do COUNT(DISTINCT) if you have three shards,
> then each shard breaks its data into 3 buckets (1B in size), then the
> first bucket from each of the three shards goes to the first shard, and
> the second bucket goes to the second shared, etc.
> 
> Basically, they are doing map-reduce, and the shards are creating
> additional batches that get shipped to other shards.  I can see FDWs not
> working well in that case as you are really creating a new data layout
> just for the query.  This explains why the XC/XL people are saying they
> would use FDWs if they existed at the time they started development,
> while the Citus Data people are saying they couldn't use FDWs as they
> currently exist.  They probably both needed FDW improvements, but I
> think the Citus Data features would need a lot more.

To expand on this, using FDWs, it means each shard would create a
temporary table on the other shards and send some if its data to those
shards.  Once a shard gets all its data from the other shards, it will
process the data and send the result to the collector.

That certainly seems like something FDWs would not do well.  Frankly, I
am unclear how Citus Data was able to do this with only backend hooks.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Allow a per-tablespace effective_io_concurrency setting
Next
From: Andres Freund
Date:
Subject: Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore