Re: The plan for FDW-based sharding - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: The plan for FDW-based sharding
Date
Msg-id 56D0B33F.3030102@postgrespro.ru
In response to Re: The plan for FDW-based sharding  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: The plan for FDW-based sharding
Re: The plan for FDW-based sharding
List pgsql-hackers
On 02/26/2016 09:30 PM, Alvaro Herrera wrote:
> Konstantin Knizhnik wrote:
>
>> Yes, it is certainly possible to develop cluster by cloning PostgreSQL.
>> But it cause big problems both for developers, which have to permanently
>> synchronize their branch with master,
>> and, what is more important, for customers, which can not use standard
>> version of PostgreSQL.
>> It may cause problems with system certification, with running Postgres in
>> cloud,...
>> Actually the history of Postgres-XL/XC and Greenplum IMHO shows that it is
>> wrong direction.
> That's not the point, though.  I don't think a Postgres clone with a GTM
> solves any particular problem that's not already solved by the existing
> forks.  However, if you have a clone at home and you make a GTM work on
> it, then you take the GTM as a patch and post it for discussion.
> There's no need for hooks for that.  Just make sure your GTM solves the
> problem that it is supposed to solve.
>
> Excuse me if I've missed the discussion elsewhere -- why does
> PostgresPro have *two* GTMs instead of a single one?
>
There are many different kinds of clusters, and they require different approaches to managing distributed transactions.
Some clusters do not need distributed transactions at all: if you are executing OLAP queries on a read-only database, a GTM will just add extra overhead.

pg_dtm uses a centralized arbiter. It is similar to the Postgres-XL DTM. The presence of a single arbiter significantly simplifies all distributed algorithms: failure detection, global deadlock elimination, ... But at the same time the arbiter is a SPOF and the main factor limiting cluster scalability.

pg_tsdtm is based on another approach: it uses system time as the CSN and doesn't require an arbiter. In theory there is no limit to scalability. But differences in system time and the need for more rounds of communication have a negative impact on performance.
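
To make the trade-off concrete, here is a minimal sketch of timestamp-based commit sequence numbers in general. It is not pg_tsdtm's code: the names csn_t, csn_from_clock, csn_visible and skew_wait are hypothetical, and waiting out the clock difference is only one possible way to handle skew, assumed here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type: a CSN taken from the node's system clock, in microseconds. */
typedef uint64_t csn_t;

/* Build a CSN from a (seconds, microseconds) clock reading. */
csn_t csn_from_clock(uint64_t sec, uint32_t usec)
{
    assert(usec < 1000000);
    return sec * 1000000ULL + usec;
}

/*
 * A committed version is visible to a snapshot if it committed at or
 * before the snapshot's CSN. A commit_csn of 0 means "not committed yet".
 */
bool csn_visible(csn_t commit_csn, csn_t snapshot_csn)
{
    return commit_csn != 0 && commit_csn <= snapshot_csn;
}

/*
 * Assumed skew handling: if a snapshot CSN issued on another node is ahead
 * of the local clock, the local node waits for the difference before it
 * serves the snapshot. Returns the wait in microseconds.
 */
csn_t skew_wait(csn_t snapshot_csn, csn_t local_clock)
{
    return snapshot_csn > local_clock ? snapshot_csn - local_clock : 0;
}
```

No arbiter appears anywhere in this scheme, but any difference between node clocks turns directly into waiting time, which is the kind of performance cost described above.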

So there is no ideal solution which works well for all clusters. This is why it is not possible to develop just one GTM, propose it as a patch for review and then (hopefully) commit it to the Postgres core. IMHO it will never happen. And I do not think that it is actually needed. What we need is a way to create our own transaction managers as Postgres extensions, without affecting its core.

All the arguments against XTM can be applied to any other extension API in Postgres, for example FDW.
Is it general enough? There are many useful operations which are currently not handled by this API, for example performing aggregation and grouping on the foreign server side. But it is still a very useful and flexible mechanism, allowing many wonderful things to be implemented.
From my point of view a good system should be as open and customizable as possible, as long as that doesn't affect performance.
Replacing direct function calls with indirect function calls in almost all cases cannot hurt performance, and neither can adding hooks.
So without any extra price we get better flexibility. What's wrong with it?
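
The idea of replacing direct calls with indirect ones can be sketched as a table of function pointers that an extension is allowed to swap. This is only an illustration: TxManager, install_tx_manager, core_begin_transaction and the xid values are hypothetical names chosen for this sketch, not the actual XTM API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t xid_t;

/* Hypothetical transaction manager interface: one slot per operation. */
typedef struct TxManager {
    const char *name;
    xid_t (*get_new_xid)(void);
} TxManager;

/* Built-in implementation: XIDs are assigned from a local counter. */
static xid_t next_local_xid = 100;
static xid_t local_get_new_xid(void) { return next_local_xid++; }
TxManager local_tm = { "local", local_get_new_xid };

/* Extension implementation: stands in for asking a distributed manager. */
static xid_t next_global_xid = 5000;
static xid_t dtm_get_new_xid(void) { return next_global_xid++; }
TxManager dtm_tm = { "dtm", dtm_get_new_xid };

/* The core always calls through this pointer instead of a fixed function. */
static TxManager *current_tm = &local_tm;

/* Called by an extension at load time to replace the manager. */
void install_tx_manager(TxManager *tm)
{
    assert(tm != NULL && tm->get_new_xid != NULL);
    current_tm = tm;
}

/* Core code path: the only cost over a direct call is one indirection. */
xid_t core_begin_transaction(void)
{
    return current_tm->get_new_xid();
}
```

With no extension loaded, core_begin_transaction() behaves exactly as the built-in code would. After install_tx_manager(&dtm_tm), the same core call is served by the extension, and the core itself is unchanged.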






-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



