Re: Merge a sharded master into a single read-only slave - Mailing list pgsql-general

From Sébastien Lorion
Subject Re: Merge a sharded master into a single read-only slave
Date
Msg-id CAGa5y0Pd3s4dB7sfLnpzd06Zg0HHL6719d5YW2ROQvcxEfUO5g@mail.gmail.com
In response to Merge a sharded master into a single read-only slave  (Sébastien Lorion <sl@thestrangefactory.com>)
Responses Re: Merge a sharded master into a single read-only slave  (John R Pierce <pierce@hogranch.com>)
Re: Merge a sharded master into a single read-only slave  (Kevin Goess <kgoess@bepress.com>)
List pgsql-general
On Thu, May 29, 2014 at 12:58 PM, Sébastien Lorion <sl@thestrangefactory.com> wrote:
I have a master database sharded by user_id, with globally unique IDs for everything except shared configuration data stored in global tables (resource strings, system parameters, etc.).

What would be the best way (i.e. both fast and reliable, with simple maintenance as a bonus) to merge all shards into a single read-only slave that will then be replicated and used for read queries? I took a look at Londiste and repmgr, and can see some ways to accomplish that, but would appreciate the advice of people here.

Thank you,

Sébastien

Answering myself; please correct me if my findings are wrong.

I cannot find a way to accomplish the above without using statement-level replication. That kind of defeats the point: my DB is sharded precisely to avoid having to scale vertically to sustain the write load, but with statement-level replication I would now have to scale the slave vertically, bringing me back to square one.

So my conclusion is that, for now, the best way to scale read-only queries for a sharded master is to implement map-reduce at the application level. Fortunately, the scope of most read queries can be limited to a single shard. Nonetheless, it would have been nice to avoid the additional complexity by merging sharded tables at the binary level (which should be much faster than statement-level replication), given that their records never overlap (i.e. the same record is never present in more than one shard).
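
To illustrate what application-level map-reduce could look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `orders` table, the modulo routing rule in `shard_for`, and the use of in-memory SQLite databases as stand-ins for the real PostgreSQL shards. It shows a single-shard read (the common case) and a cross-shard aggregate where each shard computes a partial result that the application then combines.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 3  # hypothetical shard count


def shard_for(user_id: int) -> int:
    """Assumed routing rule: user_id modulo the number of shards."""
    return user_id % NUM_SHARDS


def make_shards(rows):
    """Build in-memory SQLite databases standing in for the real shards."""
    shards = []
    for _ in range(NUM_SHARDS):
        # check_same_thread=False lets a worker thread use the connection.
        conn = sqlite3.connect(":memory:", check_same_thread=False)
        conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER)"
        )
        shards.append(conn)
    for order_id, user_id, amount in rows:
        shards[shard_for(user_id)].execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, user_id, amount)
        )
    return shards


def map_shard(conn):
    """Map step: partial aggregate computed inside one shard."""
    return conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM orders"
    ).fetchone()


def total_orders(shards):
    """Reduce step: combine the per-shard partial aggregates."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(map_shard, shards))
    return sum(c for c, _ in partials), sum(s for _, s in partials)


def orders_for_user(shards, user_id):
    """Single-shard read: only the shard owning user_id is queried."""
    conn = shards[shard_for(user_id)]
    return conn.execute(
        "SELECT id, amount FROM orders WHERE user_id = ? ORDER BY id", (user_id,)
    ).fetchall()
```

Because the same record never lives in more than one shard, the reduce step can simply add up counts and sums without any deduplication. Aggregates that are not decomposable this way (a median, for example) would need more work in the application.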

Sébastien
