Re: I'd like to discuss scaleout at PGCon - Mailing list pgsql-hackers

From Robert Haas
Subject Re: I'd like to discuss scaleout at PGCon
Date
Msg-id CA+TgmoY9XpbamwJcy=LqhN08ZwydObNM-P-ch7Qmh7A3+Cavzg@mail.gmail.com
In response to I'd like to discuss scaleout at PGCon  ("MauMau" <maumau307@gmail.com>)
Responses Re: I'd like to discuss scaleout at PGCon
Re: I'd like to discuss scaleout at PGCon
List pgsql-hackers
On Sun, May 27, 2018 at 1:20 AM, MauMau <maumau307@gmail.com> wrote:
> I'm going to attend PGCon in Ottawa for the first time.  I am happy if
> I can meet you.

It was nice to meet you in person.

> I'd like to have a session on scaleout design at the unconference.
> I've created a wiki page for that (this is still just a memo; I'd like
> to populate this page with you as the discussion in the community
> progresses).  I'd appreciate it if someone could stand with me and
> facilitate the discussion at the unconference.
>
> https://wiki.postgresql.org/wiki/Scaleout_Design

We didn't have time in the unconference session to discuss these
topics in depth, because you have raised many issues here, each of
which deserves its own detailed discussion.  I wrote a blog post
somewhat related to this topic recently which you can find at
http://rhaas.blogspot.com/2018/05/built-in-sharding-for-postgresql.html
-- it focuses on a somewhat different set of issues than the ones you
raise but that's good, because we need to hear all perspectives.  In
terms of high-level architecture, I think you are right to wonder
about the possibility of a cloud-native mode based on separating
storage and compute.  Amazon Aurora offers that feature, but we might
want to have it in PostgreSQL.

Another, somewhat different thing that we might want is a fully
distributed database, with a distributed buffer cache, distributed
lock manager, distributed invalidation queue, etc.  That would be like
what Oracle RAC does, but it's a tremendous amount of work, and a lot
of that work has no value by itself.  You don't get the payoff until
it's all working.  There are a few pieces that might be independently
useful, though, like a distributed deadlock detector.  The goal of
this kind of effort is, I think, to support zillions of connections --
scale further than you can with just one node.  This would be a lot
better if combined with the cloud-native storage, but of course that's
even more work.
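One of the independently useful pieces mentioned above, a distributed
deadlock detector, typically works by merging each node's local
wait-for edges into one global wait-for graph and checking that graph
for cycles.  Here is a rough sketch in Python (the node and
transaction names are invented for illustration, and real detectors
must also cope with stale edge information):

```python
# Sketch of global deadlock detection: each node reports its local
# wait-for edges (waiter -> holder), and a coordinator merges them
# and searches the combined graph for a cycle.

def find_cycle(edges):
    """Return one cycle in the wait-for graph (as a path whose first
    and last elements match), or None if there is no deadlock."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, []).append(holder)

    def dfs(txn, path, on_path):
        if txn in on_path:
            return path[path.index(txn):]   # revisited a txn: cycle
        for nxt in graph.get(txn, []):
            found = dfs(nxt, path + [nxt], on_path | {txn})
            if found:
                return found
        return None

    for start in graph:
        cycle = dfs(start, [start], frozenset())
        if cycle:
            return cycle
    return None

# Edges reported by two hypothetical nodes: neither node sees a
# deadlock by itself, but the union of the edges contains a cycle.
node_a_edges = [("txn1", "txn2")]   # on node A, txn1 waits for txn2
node_b_edges = [("txn2", "txn1")]   # on node B, txn2 waits for txn1

print(find_cycle(node_a_edges))                 # prints None
print(find_cycle(node_a_edges + node_b_edges))  # prints ['txn1', 'txn2', 'txn1']
```

The point of the example is only that detection requires a global
view: each node's local graph is acyclic, so no single node can see
the deadlock on its own.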

The FDW approach, of which I have been a supporter for some years now,
is really aiming at a different target, which is to allow efficient
analytics queries across a multi-node cluster.  I think we're getting
pretty close to being able to do that -- IMHO, the last fundamental
building block that we need is asynchronous execution, which Andres is
working on.  After that, it's a matter of adding other features that
people want (like cross-node MVCC) and improving the plans for queries
that still don't perform well (like joins that involve redistributing
one of the tables involved).
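To illustrate why asynchronous execution is the missing building
block: without it, a query that touches several remote nodes scans
them one after another, so the total time is the sum of the per-node
times rather than the maximum.  The effect can be sketched (very
loosely, outside the executor, with invented node names and timings)
using Python's asyncio:

```python
import asyncio

# Loose illustration of asynchronous execution over a multi-node
# cluster: three simulated remote servers each take a fixed delay to
# return rows; launching the scans concurrently costs roughly one
# delay instead of three.

async def scan_node(name, rows):
    await asyncio.sleep(0.01)   # stand-in for network + remote scan time
    return [(name, r) for r in rows]

async def scan_all(shards):
    # Start every remote scan at once and gather the results, the way
    # an asynchronous append over foreign scans would.
    results = await asyncio.gather(
        *(scan_node(name, rows) for name, rows in shards.items())
    )
    return [row for part in results for row in part]

shards = {"node1": [1, 2], "node2": [3], "node3": [4, 5]}
print(asyncio.run(scan_all(shards)))
```

This is only an analogy for the executor work described above, not a
description of how it is implemented.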

We might not want to confine ourselves strictly to the FDW interface
-- for example, I've thought about the idea of introducing a
new relkind for a "distributed table".  A distributed table may be
present on the local node, in which case it can be scanned like a
local table, or it may not be present, in which case it can be scanned
like a foreign table by connecting to a node on which it is present.
The set of nodes on which a table is present is metadata that is
shared throughout the cluster.  Multi-master logical replication
propagates changes between all nodes on which the table is present.
With a concept like this, there is a lot of opportunity to optimize
queries by, for example, deciding to perform a join on a node on which
both input tables are present, to minimize data movement.  But even if
we create something like this, I see it as fundamentally an extension
of the FDW approach that would hopefully manage to reuse a lot of
what's already been built there.  I don't think we need to (or should)
throw away the work that's been done on FDW pushdown and start over --
we should find a way to build on top of it and add ease-of-use and
management features.

In fact, even if we said that we want a fully distributed database,
we'd probably still need some kind of distributed table concept.
Unless every node has a full copy of everything in the database, you
still need to reason about which data is present on which nodes and
optimize queries accordingly.
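The join-placement idea above can be reduced to a very small
decision: intersect the node sets of the two input tables and prefer
a node that holds both copies.  A toy sketch, with entirely invented
table names, node names, and metadata layout:

```python
# Sketch of metadata-driven join placement for "distributed tables":
# cluster metadata records which nodes hold a copy of each table, and
# a join is placed on a node holding both inputs when one exists.

LOCAL_NODE = "node1"

# Hypothetical shared metadata: table name -> set of nodes holding it.
placement = {
    "orders":    {"node1", "node2"},
    "customers": {"node2", "node3"},
    "archive":   {"node3"},
}

def choose_join_node(t1, t2):
    both = placement[t1] & placement[t2]
    if LOCAL_NODE in both:
        return LOCAL_NODE          # best case: join entirely locally
    if both:
        return sorted(both)[0]     # push the join to a co-located node
    return LOCAL_NODE              # no co-location: data must move

print(choose_join_node("orders", "customers"))  # prints node2
print(choose_join_node("orders", "archive"))    # prints node1 (no shared node)
```

A real planner would of course weigh this against costs such as table
sizes and network transfer, but the minimize-data-movement intuition
is the same.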

> By the way, the popularity score of PostgreSQL finally exceeded 400
> points in the DB-Engines ranking!  The popularity difference with the
> top products has shrunk greatly.  Let's make PostgreSQL more popular.
>
> https://db-engines.com/en/ranking
>
>     [as of May 27, 2018]
>     Oracle=1290.42  MySQL=1223.34  SQL Server=1085.84
>     PostgreSQL=400.90  MongoDB=342.11
>     (Oracle / PostgreSQL ratio is 3.2)
>
>     [as of Feb 2016, according to a memo at hand]
>     Oracle=1476.14  MySQL=1321.13  SQL Server=??
>     MongoDB=??  PostgreSQL=288.66
>     (Oracle / PostgreSQL ratio is 5.1)

From the chart view, in February 2016, SQL Server was at 1150.227, and
MongoDB was at 305.599.  Generally it looks like the "big three" --
Oracle, MySQL, and SQL Server -- are all slowly declining in the
rankings while PostgreSQL and MongoDB are slowly rising.
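For what it's worth, the Oracle/PostgreSQL ratios quoted above check
out against the scores given in the message:

```python
# Recompute the Oracle/PostgreSQL ratios from the DB-Engines scores
# quoted in the message.
may_2018 = {"Oracle": 1290.42, "PostgreSQL": 400.90}
feb_2016 = {"Oracle": 1476.14, "PostgreSQL": 288.66}

for label, scores in (("May 2018", may_2018), ("Feb 2016", feb_2016)):
    ratio = scores["Oracle"] / scores["PostgreSQL"]
    print(f"{label}: {ratio:.1f}")   # prints 3.2, then 5.1
```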

It's good to see PostgreSQL going up, but we have a long way to go to
really be in the big leagues.  I think in the last couple of years we
have done a much better job of adding interesting features than we did
for a few years before that.  Between PostgreSQL 9.6 and 11, we will
have added (and in many cases significantly improved) partitioning,
logical replication, parallel query, and stored procedures.  Along the
way we've improved a lot of other things, not least executor
performance.  I don't know how directly those things affect the
DB-Engines ranking, but they are such major things that I have to
believe they make PostgreSQL a real possibility for many users who
wouldn't previously have found it a viable option, and I assume that's
at least part of what is driving the numbers, though there are
probably other factors too, such as Oracle's acquisition of MySQL
(via Sun Microsystems).  I
think it's pretty clear that we need to both continue to improve some
of these major new features we've added and at the same time keep
introducing even more new things if we want to continue to gain market
share and mind share.  I hope that features like scale-out and
zheap are going to help us continue to whittle away at the gap, and I
look forward to seeing what else anyone may have in mind.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

