Re: Horizontal scalability/sharding - Mailing list pgsql-hackers

From Sumedh Pathak
Subject Re: Horizontal scalability/sharding
Date
Msg-id CAEowBnzBFA85ZcQ20ab-ZTCwn55tanrr4kgJuriRtTNcpwg0tQ@mail.gmail.com
In response to Horizontal scalability/sharding  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Hi Bruce,

Sumedh from Citus Data here.

> August, 2015:  While speaking at SFPUG, Citus Data approached me about joining the FDW sharding team.  They have been invited to the September 1 meeting, as have the XC and XL people.

I'd like to add a clarification. We already tried the FDW APIs for pg_shard two years ago and failed. We figured sharing our learnings could contribute to the technical discussion and that's why we wanted to be in the call.

Ozgun summarized our technical learnings in this design document: https://goo.gl/vJWF85

In the document, we focused on one of the four learnings we had with the FDW APIs. We switched to a hook-API-based approach, and things went smoothly from there.

Best,
Sumedh

On Sat, Aug 29, 2015 at 7:17 PM, Bruce Momjian <bruce@momjian.us> wrote:
I have recently increased my public statements about the idea of adding
horizontal scaling/sharding to Postgres. I wanted to share with hackers
a timeline of how we got here, and where I think we are going in the
short term:

2012-2013:  As part of writing my scaling talk
(http://momjian.us/main/presentations/overview.html#scaling), studying
Oracle RAC, and talking to users, it became clear that an XC-like
architecture (sharding) was the only architecture that was going to allow
for write scaling.

Users and conference attendees I talked to were increasingly concerned
about the ability of Postgres to scale for high write volumes.  They didn't
necessarily need that scale now, but they needed to know they could get
it if they wanted it, and wouldn't need to switch to a new database in
the future.  This is similar to wanting a car that can get you on a highway
on-ramp fast --- even if you don't need it, you want to know it is there.

2014:  I started to shop around the idea that we could use FDWs,
parallelism, and a transaction/snapshot manager to get XC features
built into Postgres.  (I don't remember where the original idea
came from.)  It was clear that having separate forks of the source code
in XC and XL was never going to achieve critical mass --- there just
aren't enough people who need high write scale right now, and the fork
maintenance overhead is a huge burden.

I realized that we would never get community acceptance to dump the XC
(or XL) code needed for sharding into community Postgres, but with FDWs,
we could add the features as _part_ of improving FDWs, which would benefit
FDWs _and_ would be useful for sharding.  (We already see some of those
FDW features in 9.5.)

October, 2014:  EDB and NTT started working together in the community
to start improving FDWs as a basis for an FDW-based sharding solution.
Many of the 9.5 FDW improvements that also benefit sharding were developed
by a combined EDB/NTT team.  The features improved FDWs independent of
sharding, so they didn't need community buy-in on sharding to get them
accepted.

June, 2015:  I attended the PGCon sharding unconference session and
there was a huge discussion about where we should go with sharding.
I think the big take-away was that most people liked the FDW approach,
but had business/customer reasons for wanting to work on XC or XL because
those would be production-ready faster.

July, 2015:  Oleg Bartunov and his new company Postgres Professional (PP)
started to think about joining the FDW approach, rather than working on
XL, as they had stated at PGCon in June.  A joint NTT/EDB/PP phone-in
meeting is scheduled for September 1.

August, 2015:  While speaking at SFPUG, Citus Data approached me about
joining the FDW sharding team.  They have been invited to the September
1 meeting, as have the XC and XL people.

October, 2015:  EDB is sponsoring a free 3-hour summit about FDW sharding
at the PG-EU conference in Vienna.   Everyone is invited, but it is hoped
most of the September 1 folks can attend.

February, 2016:  Oleg is planning a similar meeting at their February
Moscow conference.

Anyway, I wanted to explain the work that has been happening around
sharding.  As things move forward, I am increasingly convinced that write
scaling will be needed soon, that the XC approach is the only reasonable
way to do it, and that FDWs are the cleanest way to get it into community
Postgres.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +





--
Sumedh Pathak
Citus Data
