How to make partitioning scale better for larger numbers ofpartitions - Mailing list pgsql-hackers

From Kato, Sho
Subject How to make partitioning scale better for larger numbers ofpartitions
Date
Msg-id 25C1C6B2E7BE044889E4FE8643A58BA963AA81D9@G01JPEXMBKW03
Whole thread Raw
Responses Re: How to make partitioning scale better for larger numbers ofpartitions
Re: How to make partitioning scale better for larger numbers of partitions
List pgsql-hackers

Hi,

 

I benchmarked on a RANGE partitioned table with 1.1k leaf partitions and no sub-partitioned tables.

But, statement latencies on a partitioned table is much slower than on a non-partitioned table.

 

UPDATE latency is 210 times slower than a non-partitioned table.

SELECT latency is 36 times slower than a non-partitioned table.

Surprisingly INSERT latency is almost same.

 

Of course I'm sure table partitioning work well with up to a hundred partitions as written on the postgresql document.

But, my customer will use partitioned table with 1.1k leaf partitions.

So, we need to improve performance.

 

Any ideas?

 

The results of pgbench and perf are listed below.

 

pgbench results

---------------

 

transaction type: update.sql

scaling factor: 1

query mode: simple

number of clients: 1

number of threads: 1

duration: 180 s

number of transactions actually processed: 648

latency average = 278.202 ms

tps = 3.594512 (including connections establishing)

tps = 3.594545 (excluding connections establishing)

statement latencies in milliseconds:

         0.011  \set aid random(1, 1100 * 1)

         0.004  \set delta random(-5000, 5000)

         0.038  BEGIN;

       277.005  UPDATE test.accounts SET abalance = abalance + :delta WHERE aid = :aid;

         1.140  END;

 

transaction type: select.sql

scaling factor: 1

query mode: simple

number of clients: 1

number of threads: 1

duration: 180 s

number of transactions actually processed: 19415

latency average = 9.281 ms

tps = 107.745068 (including connections establishing)

tps = 107.746067 (excluding connections establishing)

statement latencies in milliseconds:

         0.800  \set aid random(1, 1100 * 1)

         0.137  \set delta random(-5000, 5000)

         1.351  BEGIN;

         4.941  SELECT abalance FROM test.accounts WHERE aid = :aid;

         2.052  END;

 

transaction type: insert.sql

scaling factor: 1

query mode: simple

number of clients: 1

number of threads: 1

duration: 180 s

number of transactions actually processed: 31895

latency average = 5.654 ms

tps = 176.865541 (including connections establishing)

tps = 176.867086 (excluding connections establishing)

statement latencies in milliseconds:

         2.083  \set aid random(1, 1100 * 1)

         0.003  \set delta random(-5000, 5000)

         0.029  BEGIN;

         3.222  INSERT INTO test.accounts_history (aid, delta, mtime) VALUES (:aid, :delta, CURRENT_TIMESTAMP);

         0.317  END;

 

Top 10 of perf report

------------

 

UPDATE:

21.33%  postgres  postgres           [.] range_table_mutator

12.57%  postgres  postgres           [.] AllocSetAlloc

  4.97%  postgres  postgres           [.] palloc

  4.48%  postgres  postgres           [.] make_one_rel

  3.96%  postgres  postgres           [.] lappend

  2.74%  postgres  [kernel.kallsyms]  [k] get_page_from_freelist

  1.87%  postgres  postgres           [.] setup_append_rel_array

  1.68%  postgres  [kernel.kallsyms]  [k] list_del

  1.64%  postgres  [kernel.kallsyms]  [k] __alloc_pages_nodemask

  1.62%  postgres  [kernel.kallsyms]  [k] unmap_vmas

 

SELECT:

14.72%  postgres  postgres           [.] AllocSetAlloc

  5.14%  postgres  postgres           [.] hash_search_with_hash_value

  4.23%  postgres  postgres           [.] palloc

  4.06%  postgres  postgres           [.] MemoryContextAllocZeroAligned

  2.61%  postgres  postgres           [.] copyObjectImpl

  2.34%  postgres  postgres           [.] expression_tree_mutator

  2.13%  postgres  [kernel.kallsyms]  [k] _spin_lock

  1.91%  postgres  postgres           [.] lappend

  1.59%  postgres  [kernel.kallsyms]  [k] __link_path_walk

  1.50%  postgres  postgres           [.] set_rel_size

 

INSERT:

20.75%  postgres  postgres           [.] hash_search_with_hash_value

  6.03%  postgres  postgres           [.] hash_any

  4.88%  postgres  postgres           [.] AllocSetAlloc

  4.05%  postgres  postgres           [.] LWLockRelease

  4.05%  postgres  postgres           [.] LWLockAcquire

  3.27%  postgres  postgres           [.] oid_cmp

  3.06%  postgres  postgres           [.] SearchCatCache

  2.97%  postgres  postgres           [.] LockReleaseAll

  2.57%  postgres  postgres           [.] pg_qsort

  2.37%  postgres  postgres           [.] hash_seq_search

 

 

The following is information on the environment used for the benchmark.

 

Server spec

-----------

 

  Server has 16 cpu.

  Memory size is 264GB.

  Database directory is on SSD.

 

database tuning

---------------

 

 shared_buffers = 102GB

  max_locks_per_transactions = 1000000

 

postgresql version

------------------

 

  11beta2 + patch1 + patch2

 

  patch1: Allow direct lookups of AppendRelInfo by child relid

          commit 7d872c91a3f9d49b56117557cdbb0c3d4c620687

 

  patch2: 0001-Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch

          https://commitfest.postgresql.org/18/1690/

 

 

table definition

----------------

 

  create table test.accounts(aid serial, abalance int) partition by range(aid));

  create table test.accounts_history(id serial, aid int, delta int, mtime timestamp without time zone) partition by range(aid);

 

  create table test.account_part_1 partition of test.accounts for values from (1) to (2);

  create table test.account_part_2 partition of test.accounts for values from (2) to (3);

  .

  .

  create table test.account_part_1100 partition of test.accounts for values from (1100) to (1101);

 

  accounts_history is also partitioned in the same way.

 

  There is only one data in each leaf partitions for UPDATE/SELECT benchmark.

 

  regards,

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: requested timeline ... does not contain minimum recovery point...
Next
From: Amit Kapila
Date:
Subject: Re: Concurrency bug in UPDATE of partition-key