Re: I'd like to discuss scaleout at PGCon - Mailing list pgsql-hackers
From | Merlin Moncure |
---|---|
Subject | Re: I'd like to discuss scaleout at PGCon |
Date | |
Msg-id | CAHyXU0xzrVfSVJTrU6g6dBhV25oM7k1m+tT75TJOK_hb4Mu2sg@mail.gmail.com |
In response to | Re: I'd like to discuss scaleout at PGCon (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: I'd like to discuss scaleout at PGCon
Re: I'd like to discuss scaleout at PGCon |
List | pgsql-hackers |
On Fri, Jun 22, 2018 at 12:34 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Fri, Jun 1, 2018 at 11:29:43AM -0500, Merlin Moncure wrote:
> > FWIW, Distributed analytical queries is the right market to be in.
> > This is the field in which I work, and this is where the action is at.
> > I am very, very, sure about this. My view is that many of the
> > existing solutions to this problem (in particular hadoop class
> > solutions) have major architectural downsides that make them
> > inappropriate in use cases that postgres really shines at; direct
> > hookups to low latency applications for example. postgres is
> > fundamentally a more capable 'node' with its multiple man-millennia of
> > engineering behind it. Unlimited vertical scaling (RAC etc) is
> > interesting too, but this is not the way the market is moving as
> > hardware advancements have reduced or eliminated the need for that in
> > many spheres.
> >
> > The direction of the project is sound and we are on the cusp of the
> > point where multiple independent coalescing features (FDW, logical
> > replication, parallel query, executor enhancements) will open new
> > scaling avenues that will not require trading off the many other
> > benefits of SQL that competing contemporary solutions might. The
> > broader development market is starting to realize this and that is a
> > major driver of the recent upswing in popularity. This is benefiting
> > me tremendously personally due to having gone 'all-in' with postgres
> > almost 20 years ago :-D. (Time sure flies) These are truly
> > wonderful times for the community.
>
> While I am glad people know a lot about how other projects handle
> sharding, these can be only guides to how Postgres will handle such
> workloads. I think we need to get to a point where we have all of the
> minimal sharding-specific code features done, at least as
> proof-of-concept, and then test Postgres with various workloads like
> OLTP/OLAP and read-write/read-only. This will tell us where
> sharding-specific code will have the greatest impact.
>
> What we don't want to do is to add a bunch of sharding-specific code
> without knowing which workloads it benefits, and how many of our users
> will actually use sharding. Some projects have done that, and it
> didn't end well since they then had a lot of product complexity with
> little user value.

Key features from my perspective:
*) fdw in parallel. how do i do it today? ghetto implemented parallel
queries with asynchronous dblink

*) column store

*) automatic partition management through shards

probably some more, gotta run :-)

merlin
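For readers unfamiliar with the "asynchronous dblink" workaround mentioned in the first item, here is a rough sketch of what such a scatter/gather could look like. It is not Merlin's actual implementation, which the post does not show. The helper names, shard names, connection strings, and the remote query are all hypothetical. The only things taken from the dblink extension are the functions dblink_connect, dblink_send_query, dblink_get_result, and dblink_disconnect. The Python code only builds the SQL text; it does not talk to a database, and it assumes trusted inputs and standard_conforming_strings = on for the literal quoting.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def scatter_gather_statements(shards, remote_query, result_columns):
    """Build the SQL for a hand-rolled parallel query over dblink.

    shards:         dict of hypothetical connection name -> libpq connection string
    remote_query:   query text sent unchanged to every shard
    result_columns: column definition list for the returned rows, e.g. "n bigint"
    """
    names = list(shards)
    statements = []

    # 1. Open one named connection per shard.
    for name in names:
        statements.append(
            f"SELECT dblink_connect({sql_literal(name)}, {sql_literal(shards[name])});"
        )

    # 2. Dispatch the query everywhere without waiting for results;
    #    this is the step that lets the shards work concurrently.
    for name in names:
        statements.append(
            f"SELECT dblink_send_query({sql_literal(name)}, {sql_literal(remote_query)});"
        )

    # 3. Collect the results from every shard in a single statement.
    gather = "\nUNION ALL\n".join(
        f"SELECT * FROM dblink_get_result({sql_literal(name)}) AS r({result_columns})"
        for name in names
    )
    statements.append(gather + ";")

    # 4. Close the connections.
    for name in names:
        statements.append(f"SELECT dblink_disconnect({sql_literal(name)});")

    return statements


if __name__ == "__main__":
    # Hypothetical two-shard setup, for illustration only.
    example = scatter_gather_statements(
        {"shard1": "host=10.0.0.1 dbname=app", "shard2": "host=10.0.0.2 dbname=app"},
        "SELECT count(*) FROM events",
        "n bigint",
    )
    print("\n".join(example))
```

The statements are meant to be run in order within one session. The point of the pattern is that every dblink_send_query call returns immediately, so the remote servers execute at the same time and the local session only blocks in the gather step.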