Re: I'd like to discuss scaleout at PGCon - Mailing list pgsql-hackers
From:           MauMau
Subject:        Re: I'd like to discuss scaleout at PGCon
Date:
Msg-id:         CALO4oLMq9UYGa6PXs54j507F5oLj1q=8-Rdk=6Gd_+YMFYNOtg@mail.gmail.com
In response to: Re: I'd like to discuss scaleout at PGCon (Robert Haas <robertmhaas@gmail.com>)
Responses:      Re: I'd like to discuss scaleout at PGCon
                Re: I'd like to discuss scaleout at PGCon
List:           pgsql-hackers
2018-05-31 22:44 GMT+09:00, Robert Haas <robertmhaas@gmail.com>:
> On Thu, May 31, 2018 at 8:12 AM, MauMau <maumau307@gmail.com> wrote:
>> Oh, I didn't know you support FDW approach mainly for analytics.  I
>> guessed the first target was OLTP read-write scalability.
>
> That seems like a harder target to me, because you will have an extra
> hop involved -- SQL from the client to the first server, then via SQL
> to a second server.  The work of parsing and planning also has to be
> done twice, once for the foreign table and again for the table.  For
> longer-running queries this overhead doesn't matter as much, but for
> short-running queries it is significant.

Yes, that extra hop and double parsing/planning were the killer for
our performance goal when we tried to meet our customer's scaleout
needs with XL.  The application executes 82 DML statements in one
transaction.  Those DMLs consist of INSERTs, UPDATEs and SELECTs that
each access a single row via its primary key.  Only a few tables are
involved, so the application PREPAREs a few statements and EXECUTEs
them repeatedly (see the first sketch at the end of this mail).  We
placed the coordinator node of XL on the same host as the application,
and the data nodes and GTM on other individual nodes.

The response time of XL was 2.4 times that of PostgreSQL, and the
throughput (tps) was 43% of PostgreSQL's.  Interestingly, perf showed
that base_yyparse() was the top CPU consumer on both the coordinator
and the data node, while with plain PostgreSQL it appeared near the
bottom of the ranking.  The SQL processor should be one layer, not two
layers.

In the above benchmark, each transaction only accessed data on one
data node.  That's what sharding principles recommend.  The FDW
approach would be no problem as long as the application follows the
sharding recommendation.

But not all applications will/can follow the sharding recommendation.
The above application, which was migrated from a mainframe, uses
INSERTs to load data, inserting rows onto various nodes.  Considering
your concern about double parsing/planning for a local foreign table
and a remote real table, wouldn't the FDW approach hit the wall?  (The
second sketch at the end of this mail shows where the two parse/plan
steps happen.)

> I don't know what "node management" and "failure detection/failover"
> mean specifically.  I'd like to hear proposals, though.

That's nothing special or new.  Things like:

* Define a set of nodes that can join the cluster.
* Initialize or configure a node according to its role in the cluster.
* Decommission a node from the cluster.
* Define a node group in which all member nodes have the same data set
  for redundancy.
* One command to start and shut down the entire cluster.
* System tables to display the member nodes and node groups.
* Each node's in-memory view of the current cluster state.
* Which other nodes each node monitors.
* Elect a new primary node within a node group when the current
  primary node fails.
* Whether each node group should be configured with a master-slave
  replication topology, or a multi-master topology like MySQL Group
  Replication.

Some of the above may end up looking like XL's pgxc_node/pgxc_group
system tables, the pgxc_ctl command, the CREATE/DROP NODE/NODE GROUP
commands, etc. (the third sketch at the end of this mail shows roughly
what those look like).

Regards
MauMau
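P.S.  A few illustrative sketches for the points above.  These are
only meant to show the shape of things; the table names, column names,
host names and values are made up.

The single-row PREPARE/EXECUTE pattern of the application looks
roughly like this (the real transaction runs 82 such statements):

PREPARE sel_item (int) AS
    SELECT balance FROM item WHERE item_id = $1;
PREPARE upd_item (int, numeric) AS
    UPDATE item SET balance = balance + $2 WHERE item_id = $1;

BEGIN;
EXECUTE sel_item(1001);
EXECUTE upd_item(1001, 5.00);
-- ... more single-row INSERT/UPDATE/SELECT statements ...
COMMIT;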
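The double parsing/planning with the FDW approach, assuming
postgres_fdw and made-up object names:

CREATE EXTENSION postgres_fdw;
CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'dn1.example.com', dbname 'app');
CREATE USER MAPPING FOR CURRENT_USER SERVER shard1
    OPTIONS (user 'app');
CREATE FOREIGN TABLE item (item_id int, balance numeric)
    SERVER shard1 OPTIONS (table_name 'item');

-- Parsed and planned on the local (coordinator-like) node ...
SELECT balance FROM item WHERE item_id = 1001;
-- ... and the SQL that postgres_fdw deparses and sends to shard1 is
-- parsed and planned again on the remote node.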
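And the kind of node management commands I mean, roughly modelled on
what XL/XC already have (syntax quoted from memory, so it may not be
exact):

CREATE NODE dn1 WITH (TYPE = 'datanode', HOST = 'dn1.example.com', PORT = 5433);
CREATE NODE dn2 WITH (TYPE = 'datanode', HOST = 'dn2.example.com', PORT = 5433);
CREATE NODE GROUP rgroup1 WITH dn1, dn2;

-- System table showing the member nodes
SELECT node_name, node_type, node_host, node_port FROM pgxc_node;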