
Re: Re: Append with naive multiplexing of FDWs - Mailing list pgsql-hackers

From	movead.li@highgo.ca
Subject	Re: Re: Append with naive multiplexing of FDWs
Date	January 14, 2020 12:12:21
Msg-id	2020011417113872105895@highgo.ca
In response to	Append with naive multiplexing of FDWs (Thomas Munro <thomas.munro@gmail.com>)
List	pgsql-hackers


Hello

I have tested the patch with a partitioned table with several foreign
partitions living on separate data nodes. The initial testing was done
with a partitioned table having 3 foreign partitions, using a variety
of scale factors. The second test used a fixed amount of data per data
node, with the number of data nodes increased incrementally to see
the performance impact as more nodes are added to the cluster. The
third test is similar to the initial test but with much more data and
4 nodes.

A summary of the results is given below, and the test script is attached:

Test ENV

Parent node: 2 cores, 8G
Child nodes: 2 cores, 4G

Test One

1.1 The partition structure is as below:

    [ptf: (a int, b int, c varchar)]
            (Parent node)
       |          |          |
    [ptf1]     [ptf2]     [ptf3]
    (Node1)    (Node2)    (Node3)

The table data is partitioned across the nodes. The test uses a simple
select query and a count aggregate, shown below. Each result is the
average over multiple executions of the query, to ensure reliable and
consistent results.

①select * from ptf where b = 100;

②select count(*) from ptf;

1.2. Test Results

For ① result:

scalepernode   master   patched   performance
2G             7s       2s        350%
5G             173s     63s       275%
10G            462s     156s      296%
20G            968s     327s      296%
30G            1472s    494s      297%

For ② result:

scalepernode   master   patched   performance
2G             1079s    291s      370%
5G             2688s    741s      362%
10G            4473s    1493s     299%

Testing the aggregate takes too long, so this test was done with
smaller data sizes.

1.3. Summary

With the table partitioned over 3 nodes, the patched build reaches on
average almost 300% of master's performance across a variety of scale
factors, i.e. it needs roughly one-third of the run time.
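The performance column in the tables above appears to be the master time divided by the patched time, expressed as a percentage. That reading is an assumption, since the mail does not define the column. A minimal sketch that recomputes it for the query ① timings (the name speedup_percent is made up for illustration):

```python
from statistics import mean

# (scale per node, master seconds, patched seconds) for query ①,
# copied from the Test One table.
QUERY1_TIMINGS = [
    ("2G", 7, 2),
    ("5G", 173, 63),
    ("10G", 462, 156),
    ("20G", 968, 327),
    ("30G", 1472, 494),
]


def speedup_percent(master_s, patched_s):
    """Master run time as a percentage of patched run time (assumed definition)."""
    return 100.0 * master_s / patched_s


ratios = [speedup_percent(m, p) for _, m, p in QUERY1_TIMINGS]

for (scale, _, _), ratio in zip(QUERY1_TIMINGS, ratios):
    print(f"{scale:>4}  {ratio:.0f}%")
print(f"mean  {mean(ratios):.0f}%")
```

The recomputed values can differ from the reported ones by a point, presumably because the reported times are rounded to whole seconds.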

Test Two

2.1 The partition structure is as below:

    [ptf: (a int, b int, c varchar)]
            (Parent node)
       |          |          |
    [ptf1]      ...       [ptfN]
    (Node1)    (...)      (NodeN)

①select * from ptf;

②select * from ptf where b = 100;

This test is done with the same amount of data per node, but the table
is partitioned across N nodes. Each variation (master or patched) is
tested at least 3 times to get reliable and consistent results. The
purpose of the test is to see the impact on performance as the number
of data nodes is increased.

2.2 The results

For ① result (scalepernode=2G):

nodenumber   master   patched   performance
2            432s     180s      240%
3            636s     223s      285%
4            830s     283s      293%
5            1065s    361s      295%

For ② result (scalepernode=10G):

nodenumber   master   patched   performance
2            281s     140s      201%
3            421s     140s      300%
4            562s     141s      398%
5            702s     141s      497%
6            833s     139s      599%
7            986s     141s      699%
8            1125s    140s      803%
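One way to read the query ② numbers, offered as a rough model and not as a description of the patch internals: if every node needs about the same time for its scan and little data comes back to the parent, master pays roughly the sum of the per-node times (one node after another), while the patched build pays roughly the time of the slowest node (all nodes at once). The 140s per-node figure is taken from the patched column; the function names are hypothetical.

```python
def sequential_time(per_node_seconds):
    """Nodes scanned one after another: total is the sum of the scans."""
    return sum(per_node_seconds)


def overlapped_time(per_node_seconds):
    """Nodes scanned concurrently, parent-side work ignored:
    total is the slowest single scan."""
    return max(per_node_seconds)


PER_NODE_S = 140  # approximate patched time for query ② in the table above

for n in range(2, 9):
    nodes = [PER_NODE_S] * n
    seq = sequential_time(nodes)
    par = overlapped_time(nodes)
    print(f"{n} nodes: sequential ~{seq}s, overlapped ~{par}s, "
          f"ratio ~{100 * seq // par}%")
```

For 8 nodes the model gives about 1120s sequentially, close to the 1125s measured on master. It ignores parent-side processing, which may be why it fits query ① less well: there every row is returned to the parent.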

Test Three

This test is similar to Test One but with much more data and
4 nodes.

For ① result:

scalepernode   master   patched   performance
100G           6592s    1649s     399%

For ② result:

scalepernode   master   patched   performance
100G           35383s   12363s    286%

The results show that it also works well with much larger data.

Summary

The patch is pretty good; it works well when little data comes back to
the parent node. The patch doesn't provide parallel FDW scan: it ensures
that child nodes can send data to the parent in parallel, but the parent
can only process the data from the data nodes sequentially.
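The difference between the two behaviours can be illustrated with a toy simulation. This is only an analogy written with Python's asyncio, not the patch's code: remote_scan is a hypothetical stand-in for a foreign scan, and the sleep stands in for the work done on the remote node.

```python
import asyncio


async def remote_scan(name, delay, log):
    """Hypothetical stand-in for one foreign scan on a data node."""
    log.append(f"start {name}")
    await asyncio.sleep(delay)  # simulated remote execution time
    log.append(f"done {name}")


async def append_sequential(nodes):
    """Start each scan only after the previous one has finished."""
    log = []
    for name, delay in nodes:
        await remote_scan(name, delay, log)
    return log


async def append_multiplexed(nodes):
    """Start all scans up front, then wait for them together."""
    log = []
    await asyncio.gather(*(remote_scan(n, d, log) for n, d in nodes))
    return log


nodes = [("node1", 0.03), ("node2", 0.02), ("node3", 0.01)]
seq_log = asyncio.run(append_sequential(nodes))
mux_log = asyncio.run(append_multiplexed(nodes))

print("sequential: ", seq_log)
print("multiplexed:", mux_log)
```

In the sequential log each node starts only after the previous one is done; in the multiplexed log all three scans start before any of them finishes, while a single consumer still collects the results.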

Provided there is no performance degradation for non-FDW append queries,
I would recommend considering this patch as an interim solution while we
are waiting for parallel FDW scan.

Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: movead(dot)li(at)highgo(dot)ca

Attachment

script.tar
