RE: Performance improvements of INSERTs to a partitioned table - Mailing list pgsql-hackers

From Kato, Sho
Subject RE: Performance improvements of INSERTs to a partitioned table
Date
Msg-id 25C1C6B2E7BE044889E4FE8643A58BA963B5796B@G01JPEXMBKW03
Whole thread Raw
In response to Performance improvements of INSERTs to a partitioned table  ("Kato, Sho" <kato-sho@jp.fujitsu.com>)
Responses Re: Performance improvements of INSERTs to a partitioned table
List pgsql-hackers

AFAIK, When CREATE INDEX on a partition and INSERT to a parent table are executed at the same time, this patch causes deadlock.

 

* partitions information

 

Partition key: RANGE (a)

Partitions: a_1 FOR VALUES FROM (1) TO (100),

            a_2 FOR VALUES FROM (100) TO (200)

 

T1: create index a_1_ix on a_1(a);

T2: insert into a values(101),(1);  locking a_2 and waiting releasing a_1’s lock

T1: create index a_2_ix on a_2(a); ERROR:  deadlock detected                 |

 

I think this situation does not mean unsafe because similar situation will occurs at no partitioned tables and DBMS could not prevent this situation.

But, I'm not sure this behavior is correct.

Does similar deadlock occur in other DBMS like Oracle?

 

From: Kato, Sho [mailto:kato-sho@jp.fujitsu.com]
Sent: Tuesday, November 6, 2018 6:36 PM
To: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Performance improvements of INSERTs to a partitioned table

 

Hi,

 

I want to discuss performance improvements of INSERTs to a partitioned table.

 

When an application inserts records into a table partitioned into thousands, find_all_inheritors() locks all of the partitions.

Updating partition key locks in the same way.

 

So, Execution time becomes longer as the number of partition increases.

 

* nparts 8

 

testdb=# explain analyze insert into test.accounts_history(aid, delta, mtime) values(8, 5000, current_timestamp);

                                               QUERY PLAN                                               

---------------------------------------------------------------------------------------------------------

Insert on accounts_history  (cost=0.00..0.02 rows=1 width=20) (actual time=0.281..0.281 rows=0 loops=1)

   ->  Result  (cost=0.00..0.02 rows=1 width=20) (actual time=0.079..0.080 rows=1 loops=1)

Planning Time: 0.080 ms

Execution Time: 0.362 ms

(4 rows)

 

* nparts 8192

 

testdb=# explain analyze insert into test.accounts_history(aid, delta, mtime) values(8192, 5000, current_timestamp);

                                               QUERY PLAN                                               

---------------------------------------------------------------------------------------------------------

Insert on accounts_history  (cost=0.00..0.02 rows=1 width=20) (actual time=0.058..0.059 rows=0 loops=1)

   ->  Result  (cost=0.00..0.02 rows=1 width=20) (actual time=0.032..0.034 rows=1 loops=1)

Planning Time: 0.032 ms

Execution Time: 12.508 ms

(4 rows)

 

Locking only the target partitions like the patch previously proposed by David[1], the performance will improve greatly.

 

* nparts 8192 (patched)

 

testdb=# explain analyze insert into test.accounts_history(aid, delta, mtime) values(8192, 5000, current_timestamp);

                                               QUERY PLAN                                               

---------------------------------------------------------------------------------------------------------

Insert on accounts_history  (cost=0.00..0.02 rows=1 width=20) (actual time=0.415..0.416 rows=0 loops=1)

   ->  Result  (cost=0.00..0.02 rows=1 width=20) (actual time=0.140..0.141 rows=1 loops=1)

Planning Time: 0.120 ms

Execution Time: 1.694 ms

(4 rows)

 

However, I am concerned that "unsafe" is included in the name of this patch.

If locking only target partitions and locking all of partitions are executed at the same time, a problem may occurs.

But, I'm not sure what kind of problem will occur.

Is it enough to lock only the target partitions?

 

If a problem occurs in above case, I think it is safer to divide the steps to acquire the lock into two.

 

In first step, locking only the parent table in share or exclusive mode.

In second step, locking only the target partitions after locking the parent table.

 

Thoughts?

 

[1]: https://www.postgresql.org/message-id/CAKJS1f_1RJyFquuCKRFHTdcXqoPX-PYqAd7nz=GVBwvGh4a6xA@mail.gmail.com

 

regards,

Sho Kato

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: csv format for psql
Next
From: Evgeniy Efimkin
Date:
Subject: Connection limit doesn't work for superuser