Thread: Delay locking partitions during INSERT and UPDATE

Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

23 November 2018, 00:14:01

As a follow-on from [1] and also discussed in [2], I'd like to propose
that we don't obtain locks on all partitions during INSERT into a
partitioned table and UPDATE of a partitioned key and instead, only
lock the partition when we first route a tuple to it. This means that
the order that the locks are obtained is no longer well defined and is
at the mercy of the order that tuples are INSERTed or UPDATEd.  It
seems worth relaxing this a bit for gains in performance, as when a
partitioned table contains many partitions, the overhead of locking
all partitions when inserting a single row, or just a few rows is
often significantly higher than the cost of doing the actual insert.

The current behaviour was added in 54cde0c4c058073 in order to
minimise deadlock risk.  It seems that the risk there only comes from
AELs that could be taken when a partition directly receives a TRUNCATE
/ CREATE INDEX / VACUUM FULL / CLUSTER. There's obviously no conflict
with other DML operations since two RowExclusiveLocks don't conflict
with each other.  I think all other AEL obtaining DDL must be
performed on the top level partitioned table, for example, ADD COLUMN
can't be done directly on a partition, so there's no added deadlock
risk from those. For a deadlock to occur one of the above DDL commands
would have to be executed inside a transaction in an order opposite to
the order rows are being INSERTed or UPDATEd in the partitioned table.
If required, such operations could LOCK TABLE the top partitioned
table to block the DML operation. There's already a risk of similar
deadlocks from such operations done on multiple separate tables when
the order they're done is not the same as the order the tables are
written in a query, although, in that case, the window for the
deadlock is likely to be much smaller.

With this done, the performance of an INSERT into a 10k partition
partitioned table looks like:

Setup:
create table hashp (a int) partition by hash(a);
select 'create table hashp'||x::Text || ' partition of hashp for
values with (modulus 10000, remainder '||x::text||');' from
generate_Series(0,9999) x;
\gexec

hashp_insert.sql:
\set p_a random(1,1000)
insert into hashp values(:p_a);

Results:
$ psql -c "truncate hashp;" postgres && pgbench -n -f hashp_insert.sql
-M prepared -c 4 -j 4 -T 60 postgres

Patched:
tps = 27811.427620 (excluding connections establishing)
tps = 28617.417308 (excluding connections establishing)

Unpatched:
tps = 130.446706 (excluding connections establishing)
tps = 119.726641 (excluding connections establishing)

The patch is attached.
I'll park this here until the January commitfest.

[1] https://www.postgresql.org/message-id/flat/CAKJS1f_1RJyFquuCKRFHTdcXqoPX-PYqAd7nz=GVBwvGh4a6xA@mail.gmail.com
[2]
https://www.postgresql.org/message-id/flat/25C1C6B2E7BE044889E4FE8643A58BA963B5796B%40G01JPEXMBKW03#3783596a794c6789a54a95a20971b6aa

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

v1-0001-Delay-locking-of-partitions-during-INSERT-and-UPD.patch

Re: Delay locking partitions during INSERT and UPDATE

From

Tomas Vondra

Date:

03 January 2019, 12:34:24

Hi,

On 11/23/18 1:14 AM, David Rowley wrote:
> As a follow-on from [1] and also discussed in [2], I'd like to propose
> that we don't obtain locks on all partitions during INSERT into a
> partitioned table and UPDATE of a partitioned key and instead, only
> lock the partition when we first route a tuple to it. This means that
> the order that the locks are obtained is no longer well defined and is
> at the mercy of the order that tuples are INSERTed or UPDATEd.  It
> seems worth relaxing this a bit for gains in performance, as when a
> partitioned table contains many partitions, the overhead of locking
> all partitions when inserting a single row, or just a few rows is
> often significantly higher than the cost of doing the actual insert.
> 

Yep, the locking seems like a significant bottleneck. I've done quite a
bit of testing on two machines, using a slightly modified version of
your test script with variable number of partitions (0 means not
partitioned), and the results look like this:

1) xeon e5-2620v4

    partitions        0     100     1000    10000
    ---------------------------------------------
    master        16643    6956     1039      108
    patched       16398   15522    15222    13228

2) i5-2500k

    partitions     0     100    1000    10000
    -----------------------------------------
    master      3901    2892     920       76
    patched     3894    3838    3845     3522

When using UNLOGGED tables to minimize the external noise, it looks like
this:

3) xeon e5-2620v4

    partitions      0      100     1000    10000
    --------------------------------------------
    master      30806     8740     1091      107
    patched     30455    28137    27582    24985

    partitions      0      100     1000    10000
    --------------------------------------------
    master      27662     9013     1277       79
    patched     28263    26474    25794    22434


So the performance benefit is pretty clear - up to 2 orders of magnitude
with 10k partitions, and gets us fairly close to non-partitioned table.

Me gusta.

> The current behaviour was added in 54cde0c4c058073 in order to
> minimise deadlock risk.  It seems that the risk there only comes from
> AELs that could be taken when a partition directly receives a TRUNCATE
> / CREATE INDEX / VACUUM FULL / CLUSTER. There's obviously no conflict
> with other DML operations since two RowExclusiveLocks don't conflict
> with each other.  I think all other AEL obtaining DDL must be
> performed on the top level partitioned table, for example, ADD COLUMN
> can't be done directly on a partition, so there's no added deadlock
> risk from those. For a deadlock to occur one of the above DDL commands
> would have to be executed inside a transaction in an order opposite to
> the order rows are being INSERTed or UPDATEd in the partitioned table.
> If required, such operations could LOCK TABLE the top partitioned
> table to block the DML operation. There's already a risk of similar
> deadlocks from such operations done on multiple separate tables when
> the order they're done is not the same as the order the tables are
> written in a query, although, in that case, the window for the
> deadlock is likely to be much smaller.
> 

Hmmm, yeah.

Per the discussion in [1] the locking was necessary also to ensure
partitions can't disappear while we're building the descriptors in
RelationBuildPartitionDesc(). But AFAICS 3f2393edef fixed this.

The other issue - as you note - is ensuring locking order, to prevent
(or rather reduce the risk of) deadlocks. I agree with your assessment
here, i.e. that locking the parent is a sufficient protection.

Maybe there's an alternative solution with the same benefits and not
sacrificing the lock ordering, but I fail to see how it would work.

> 
> [1] https://www.postgresql.org/message-id/flat/CAKJS1f_1RJyFquuCKRFHTdcXqoPX-PYqAd7nz=GVBwvGh4a6xA@mail.gmail.com

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

sho kato

Date:

18 January 2019, 06:07:37

The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:       tested, failed
Spec compliant:           tested, failed
Documentation:            tested, failed

Hi,

Increasing the number of clients, I benchmarked with a table partitioned into 1k partition.
I confirmed that this patch improve performance by 10 times or more.

master (commit: 90525d7b4e Date:   Tue Jan 15 12:19:21 2019 -0800)

cpu:
Xeon(R) CPU E5-2667 v3 * 2

setup:
create table history(aid int, delta int, mtime timestamp without time zone) partition by range(aid);
\o /dev/null
select 'create table history_' || x || ' partition of history for values from(' || x ||') to(' || x+1 || ')' from
generate_series(1,1000) x;
 
\gexec

benchmark.sql:
\set aid random(1, 1000)
\set delta random(-5000, 5000)
INSERT INTO history VALUES (:aid, :delta, CURRENT_TIMESTAMP);

command line:
pgbench -d testdb -f benchmark.sql -c number_of_clients -T 60 -r -n

Results:

 clients | tps_patched | tps_unpatched | tps_unpatched / tps_patched 
---------+-------------+---------------+-----------------------------
       1 |        8890 |           841 |                          11
       2 |       17484 |          1470 |                          12
       4 |       29218 |          2474 |                          12
       8 |       48789 |          3876 |                          13
      16 |       68794 |          4030 |                          17
      32 |       69550 |          2888 |                          24
      64 |       71196 |          2555 |                          28
     128 |       71739 |          2295 |                          31
     256 |       66548 |          2105 |                          32

I wonder why performance does not increase much when the number of clients exceeds 16.
Even though it is caused by competition of lightweight locks I thought 16 clients are small.


Also, I did make installcheck world, but test partition_prune failed.
However, this test case failed even before applying a patch, so this patch is not a problem.
One of the results is as follows.

 explain (analyze, costs off, summary off, timing off) execute ab_q1 (2, 2, 3);
-                       QUERY PLAN                        
----------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Append (actual rows=0 loops=1)
-   Subplans Removed: 6
    ->  Seq Scan on ab_a2_b1 (actual rows=0 loops=1)
-         Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
+         Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
    ->  Seq Scan on ab_a2_b2 (actual rows=0 loops=1)
-         Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
+         Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
    ->  Seq Scan on ab_a2_b3 (actual rows=0 loops=1)
-         Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
-(8 rows)
+         Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
+(7 rows)

regards,
sho kato

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

18 January 2019, 10:40:54

On Fri, 18 Jan 2019 at 19:08, sho kato <kato-sho@jp.fujitsu.com> wrote:
> I confirmed that this patch improve performance by 10 times or more.

Thanks for testing this out.

> Also, I did make installcheck world, but test partition_prune failed.
> However, this test case failed even before applying a patch, so this patch is not a problem.
> One of the results is as follows.
>
>  explain (analyze, costs off, summary off, timing off) execute ab_q1 (2, 2, 3);
> -                       QUERY PLAN
> ----------------------------------------------------------
> +                      QUERY PLAN
> +------------------------------------------------------
>   Append (actual rows=0 loops=1)
> -   Subplans Removed: 6
>     ->  Seq Scan on ab_a2_b1 (actual rows=0 loops=1)
> -         Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
> +         Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
>     ->  Seq Scan on ab_a2_b2 (actual rows=0 loops=1)
> -         Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
> +         Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
>     ->  Seq Scan on ab_a2_b3 (actual rows=0 loops=1)
> -         Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
> -(8 rows)
> +         Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
> +(7 rows)

Perhaps you're running with plan_cache_mode = force_custom_plan.
You'll likely need it set to auto for these to pass.


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

John Naylor

Date:

18 January 2019, 23:05:51

On 11/22/18, David Rowley <david.rowley@2ndquadrant.com> wrote:
> If required, such operations could LOCK TABLE the top partitioned
> table to block the DML operation. There's already a risk of similar
> deadlocks from such operations done on multiple separate tables when
> the order they're done is not the same as the order the tables are
> written in a query, although, in that case, the window for the
> deadlock is likely to be much smaller.

Is this something that would need documentation anywhere?

> With this done, the performance of an INSERT into a 10k partition
> partitioned table looks like:
>
> Setup:
> create table hashp (a int) partition by hash(a);
> select 'create table hashp'||x::Text || ' partition of hashp for
> values with (modulus 10000, remainder '||x::text||');' from
> generate_Series(0,9999) x;
> \gexec
>
> hashp_insert.sql:
> \set p_a random(1,1000)
> insert into hashp values(:p_a);
>
> Results:
> $ psql -c "truncate hashp;" postgres && pgbench -n -f hashp_insert.sql
> -M prepared -c 4 -j 4 -T 60 postgres

I used a similar test, but with unlogged tables, and "-c 2", and got:

normal table: 32000tps
10k partitions / master: 82tps
10k partitions / patch: 7000tps

So far I haven't gotten quite as good performance as you and Tomas,
although it's still a ~85x improvement.

-John Naylor

Re: Delay locking partitions during INSERT and UPDATE

From

Tomas Vondra

Date:

19 January 2019, 15:59:02

On 1/19/19 12:05 AM, John Naylor wrote:
> On 11/22/18, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> If required, such operations could LOCK TABLE the top partitioned
>> table to block the DML operation. There's already a risk of similar
>> deadlocks from such operations done on multiple separate tables when
>> the order they're done is not the same as the order the tables are
>> written in a query, although, in that case, the window for the
>> deadlock is likely to be much smaller.
> 
> Is this something that would need documentation anywhere?
> 

Not sure. Initially I was going to say "no" because it's an internal
implementation detail and the risk of the deadlock is already there
anyway. But maybe this patch is making it more likely and we should at
least mention how partitions are locked.

>> With this done, the performance of an INSERT into a 10k partition
>> partitioned table looks like:
>>
>> Setup:
>> create table hashp (a int) partition by hash(a);
>> select 'create table hashp'||x::Text || ' partition of hashp for
>> values with (modulus 10000, remainder '||x::text||');' from
>> generate_Series(0,9999) x;
>> \gexec
>>
>> hashp_insert.sql:
>> \set p_a random(1,1000)
>> insert into hashp values(:p_a);
>>
>> Results:
>> $ psql -c "truncate hashp;" postgres && pgbench -n -f hashp_insert.sql
>> -M prepared -c 4 -j 4 -T 60 postgres
> 
> I used a similar test, but with unlogged tables, and "-c 2", and got:
> 
> normal table: 32000tps
> 10k partitions / master: 82tps
> 10k partitions / patch: 7000tps
> 
> So far I haven't gotten quite as good performance as you and Tomas,
> although it's still a ~85x improvement.
> 

What hardware are you running the tests on? I wouldn't be surprised if
you were hitting some CPU or I/O bottleneck, which we're not hitting.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

John Naylor

Date:

20 January 2019, 04:45:25

On Sat, Jan 19, 2019 at 10:59 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On 1/19/19 12:05 AM, John Naylor wrote:
> > I used a similar test, but with unlogged tables, and "-c 2", and got:
> >
> > normal table: 32000tps
> > 10k partitions / master: 82tps
> > 10k partitions / patch: 7000tps
> >
> > So far I haven't gotten quite as good performance as you and Tomas,
> > although it's still a ~85x improvement.
>
> What hardware are you running the tests on? I wouldn't be surprised if
> you were hitting some CPU or I/O bottleneck, which we're not hitting.

9 year-old laptop, Core i3. Side note, I miswrote my test parameters
above -- should be "-c4 -j2".

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Tomas Vondra

Date:

20 January 2019, 14:42:35

On 1/20/19 5:45 AM, John Naylor wrote:
> On Sat, Jan 19, 2019 at 10:59 AM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>>
>> On 1/19/19 12:05 AM, John Naylor wrote:
>>> I used a similar test, but with unlogged tables, and "-c 2", and got:
>>>
>>> normal table: 32000tps
>>> 10k partitions / master: 82tps
>>> 10k partitions / patch: 7000tps
>>>
>>> So far I haven't gotten quite as good performance as you and Tomas,
>>> although it's still a ~85x improvement.
>>
>> What hardware are you running the tests on? I wouldn't be surprised if
>> you were hitting some CPU or I/O bottleneck, which we're not hitting.
> 
> 9 year-old laptop, Core i3. Side note, I miswrote my test parameters
> above -- should be "-c4 -j2".
> 

Hmmm, I wouldn't be surprised if it was getting throttled for some
reasons (e.g. temperature).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

20 January 2019, 21:04:22

On Sat, 19 Jan 2019 at 12:05, John Naylor <jcnaylor@gmail.com> wrote:
>
> On 11/22/18, David Rowley <david.rowley@2ndquadrant.com> wrote:
> > If required, such operations could LOCK TABLE the top partitioned
> > table to block the DML operation. There's already a risk of similar
> > deadlocks from such operations done on multiple separate tables when
> > the order they're done is not the same as the order the tables are
> > written in a query, although, in that case, the window for the
> > deadlock is likely to be much smaller.
>
> Is this something that would need documentation anywhere?

Probably at least the release notes. I'm unsure where else to mention
it.  I don't feel the workaround of using LOCK TABLE is special to
this case. The patch does, however, make it a possible requirement for
performing DDL on individual partitions where it was not previously.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

RE: Delay locking partitions during INSERT and UPDATE

From

"Kato, Sho"

Date:

21 January 2019, 00:09:32

on Fri, 18 2019 at 19:41, David Rowley <david.rowley@2ndquadrant.com> wrote:
>Perhaps you're running with plan_cache_mode = force_custom_plan.
>You'll likely need it set to auto for these to pass.

Thank you.
I was running with plan_cache_mode = force_custom_plan.
When it set to auto, all tests are passed.

regards,

sho kato
> -----Original Message-----
> From: David Rowley [mailto:david.rowley@2ndquadrant.com]
> Sent: Friday, January 18, 2019 7:41 PM
> To: Kato, Sho/加藤 翔 <kato-sho@jp.fujitsu.com>
> Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>; David
> Rowley <dgrowley@gmail.com>
> Subject: Re: Delay locking partitions during INSERT and UPDATE
> 
> On Fri, 18 Jan 2019 at 19:08, sho kato <kato-sho@jp.fujitsu.com> wrote:
> > I confirmed that this patch improve performance by 10 times or more.
> 
> Thanks for testing this out.
> 
> > Also, I did make installcheck world, but test partition_prune failed.
> > However, this test case failed even before applying a patch, so this
> patch is not a problem.
> > One of the results is as follows.
> >
> >  explain (analyze, costs off, summary off, timing off) execute ab_q1
> (2, 2, 3);
> > -                       QUERY PLAN
> > ----------------------------------------------------------
> > +                      QUERY PLAN
> > +------------------------------------------------------
> >   Append (actual rows=0 loops=1)
> > -   Subplans Removed: 6
> >     ->  Seq Scan on ab_a2_b1 (actual rows=0 loops=1)
> > -         Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
> > +         Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
> >     ->  Seq Scan on ab_a2_b2 (actual rows=0 loops=1)
> > -         Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
> > +         Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
> >     ->  Seq Scan on ab_a2_b3 (actual rows=0 loops=1)
> > -         Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
> > -(8 rows)
> > +         Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
> > +(7 rows)
> 
> Perhaps you're running with plan_cache_mode = force_custom_plan.
> You'll likely need it set to auto for these to pass.
> 
> 
> --
>  David Rowley                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>

Re: Delay locking partitions during INSERT and UPDATE

From

John Naylor

Date:

22 January 2019, 15:35:15

The cfbot reports this patch no longer applies.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

22 January 2019, 21:31:15

On Wed, 23 Jan 2019 at 04:35, John Naylor <john.naylor@2ndquadrant.com> wrote:
> The cfbot reports this patch no longer applies.

Thanks. I've attached a rebased version.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

v2-0001-Delay-locking-of-partitions-during-INSERT-and-UPD.patch

Re: Delay locking partitions during INSERT and UPDATE

From

John Naylor

Date:

24 January 2019, 00:37:56

On Tue, Jan 22, 2019 at 4:31 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
> Thanks. I've attached a rebased version.

Hmm, now instead of an 85x speedup over master in the 10k partition
case, I only get 20x. Anyone else see this?


--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

24 January 2019, 00:56:36

On Thu, 24 Jan 2019 at 13:38, John Naylor <john.naylor@2ndquadrant.com> wrote:
>
> On Tue, Jan 22, 2019 at 4:31 PM David Rowley
> <david.rowley@2ndquadrant.com> wrote:
> > Thanks. I've attached a rebased version.
>
> Hmm, now instead of an 85x speedup over master in the 10k partition
> case, I only get 20x. Anyone else see this?

What's it like with fsync off?

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

John Naylor

Date:

24 January 2019, 20:50:17

On Wed, Jan 23, 2019 at 7:56 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On Thu, 24 Jan 2019 at 13:38, John Naylor <john.naylor@2ndquadrant.com> wrote:
> > Hmm, now instead of an 85x speedup over master in the 10k partition
> > case, I only get 20x. Anyone else see this?
>
> What's it like with fsync off?

No change. Just in case, I tried git bisect and also recreated the
cluster, but never got the same performance as my first test, so not
sure what happened.
--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Tomas Vondra

Date:

24 January 2019, 21:17:12

On 1/24/19 9:50 PM, John Naylor wrote:
> On Wed, Jan 23, 2019 at 7:56 PM David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> On Thu, 24 Jan 2019 at 13:38, John Naylor <john.naylor@2ndquadrant.com> wrote:
>>> Hmm, now instead of an 85x speedup over master in the 10k partition
>>> case, I only get 20x. Anyone else see this?
>>
>> What's it like with fsync off?
> 
> No change. Just in case, I tried git bisect and also recreated the
> cluster, but never got the same performance as my first test, so not
> sure what happened.

I can still see about the same performance as before (on both clusters).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

John Naylor

Date:

24 January 2019, 21:34:04

On Thu, Jan 24, 2019 at 4:17 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> I can still see about the same performance as before (on both clusters).

Thanks for checking! I think the thing to do now is have a committer
look at it. It's a small patch with obvious performance effects --
there's just the question of whether the locking behavior with
concurrent DDL is as safe as it is now.

Anyone have anything else?

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Tomas Vondra

Date:

24 January 2019, 21:42:55

On 1/24/19 10:34 PM, John Naylor wrote:
> On Thu, Jan 24, 2019 at 4:17 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> I can still see about the same performance as before (on both clusters).
> 
> Thanks for checking! I think the thing to do now is have a committer
> look at it. It's a small patch with obvious performance effects --
> there's just the question of whether the locking behavior with
> concurrent DDL is as safe as it is now.
>
> Anyone have anything else?
>

Yes, I don't see why would the patch change that and I've been looking
for such cases. I think David was looking at that this week too, and I
assume he'll send an update if he finds anything. If not, I plan to get
it committed soon-ish (possibly next week after the FOSDEM dev meeting,
unless there are some objections).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

31 January 2019, 18:46:33

On Thu, Jan 24, 2019 at 4:43 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Yes, I don't see why would the patch change that and I've been looking
> for such cases. I think David was looking at that this week too, and I
> assume he'll send an update if he finds anything. If not, I plan to get
> it committed soon-ish (possibly next week after the FOSDEM dev meeting,
> unless there are some objections).

I have reviewed this patch and I am in favor of it.  I think it likely
needs minor rebasing because of the heap_open -> table_open renaming.
I also agree that it's worth taking some deadlock risk for the rather
massive performance gain, although I suspect it's likely that a few
users are going to complain about deadlocks and I wonder whether we'll
have to put some energy into that problem at some point.  However, I
think what we probably want to do there is reduce the probability of
deadlocks through some trickery or maybe invent some new locking
mechanisms that work around the problem.  The alternative of trying to
block this patch seems unpalatable.

In short, +1 from me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

31 January 2019, 21:47:42

On Fri, 1 Feb 2019 at 07:46, Robert Haas <robertmhaas@gmail.com> wrote:
> I have reviewed this patch and I am in favor of it.  I think it likely
> needs minor rebasing because of the heap_open -> table_open renaming.

Many thanks for looking at it.   The v2 patch was based on top of the
heap_open -> table_open change.


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

01 February 2019, 14:07:37

On Thu, Jan 31, 2019 at 4:48 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On Fri, 1 Feb 2019 at 07:46, Robert Haas <robertmhaas@gmail.com> wrote:
> > I have reviewed this patch and I am in favor of it.  I think it likely
> > needs minor rebasing because of the heap_open -> table_open renaming.
>
> Many thanks for looking at it.   The v2 patch was based on top of the
> heap_open -> table_open change.

Oops.  I guess I opened the wrong version.

I'm now wondering whether the same issues discussed in
https://www.postgresql.org/message-id/CA%2BTgmoZN-80143F8OhN8Cn5-uDae5miLYVwMapAuc%2B7%2BZ7pyNg%40mail.gmail.com
also need discussion with respect to this patch.  But I haven't
thought about it very hard, so I'm not sure whether they do or don't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

01 February 2019, 14:16:32

On Sat, 2 Feb 2019 at 03:07, Robert Haas <robertmhaas@gmail.com> wrote:
> I'm now wondering whether the same issues discussed in
> https://www.postgresql.org/message-id/CA%2BTgmoZN-80143F8OhN8Cn5-uDae5miLYVwMapAuc%2B7%2BZ7pyNg%40mail.gmail.com
> also need discussion with respect to this patch.  But I haven't
> thought about it very hard, so I'm not sure whether they do or don't.

I really don't think it does, or if it does then the code is already
broken as it is now.

As the code is today, we obtain the locks well after the plan is
locked in.  The only difference with this patch is that we do the
locking on routing tuples to the partition for the first time rather
than do them all at once when setting up the routing data structure.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

01 February 2019, 14:29:41

On Fri, Feb 1, 2019 at 9:16 AM David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On Sat, 2 Feb 2019 at 03:07, Robert Haas <robertmhaas@gmail.com> wrote:
> > I'm now wondering whether the same issues discussed in
> > https://www.postgresql.org/message-id/CA%2BTgmoZN-80143F8OhN8Cn5-uDae5miLYVwMapAuc%2B7%2BZ7pyNg%40mail.gmail.com
> > also need discussion with respect to this patch.  But I haven't
> > thought about it very hard, so I'm not sure whether they do or don't.
>
> I really don't think it does, or if it does then the code is already
> broken as it is now.
>
> As the code is today, we obtain the locks well after the plan is
> locked in.  The only difference with this patch is that we do the
> locking on routing tuples to the partition for the first time rather
> than do them all at once when setting up the routing data structure.

OK, that is what I thought yesterday, and then I was just doubting
myself.  Thanks for thinking about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Delay locking partitions during INSERT and UPDATE

From

John Naylor

Date:

15 February 2019, 06:46:49

On 1/25/19, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> Yes, I don't see why would the patch change that and I've been looking
> for such cases. I think David was looking at that this week too, and I
> assume he'll send an update if he finds anything. If not, I plan to get
> it committed soon-ish (possibly next week after the FOSDEM dev meeting,
> unless there are some objections).

I forgot to mark it ready for committer, so I've done that now.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Andres Freund

Date:

16 February 2019, 02:22:54

Hi,

On 2019-01-31 13:46:33 -0500, Robert Haas wrote:
> I have reviewed this patch and I am in favor of it.  I think it likely
> needs minor rebasing because of the heap_open -> table_open renaming.
> I also agree that it's worth taking some deadlock risk for the rather
> massive performance gain, although I suspect it's likely that a few
> users are going to complain about deadlocks and I wonder whether we'll
> have to put some energy into that problem at some point.  However, I
> think what we probably want to do there is reduce the probability of
> deadlocks through some trickery or maybe invent some new locking
> mechanisms that work around the problem.  The alternative of trying to
> block this patch seems unpalatable.

Are you saying that such workarounds would have to be merged at the same
time as this patch? Or that we'd address such complaints that way at a
later time?

Greetings,

Andres Freund

Re: Delay locking partitions during INSERT and UPDATE

From

Tom Lane

Date:

18 February 2019, 22:10:59

What I'd like to understand about this patch is how it relates
to Amit L.'s work on making the planner faster for partitioned
UPDATE/DELETE cases (https://commitfest.postgresql.org/22/1778/).
I think that that might render this moot?  And if it doesn't,
how much does it really matter?  You can't really postpone taking
a lock on a relation that the planner is going to do anything
nontrivial with.

            regards, tom lane

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

18 February 2019, 22:38:57

On Tue, 19 Feb 2019 at 11:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> What I'd like to understand about this patch is how it relates
> to Amit L.'s work on making the planner faster for partitioned
> UPDATE/DELETE cases (https://commitfest.postgresql.org/22/1778/).

It's close to being unrelated and completely unrelated for the INSERT case.

It's true that currently we obtain locks on all partitions during the
planning of an UPDATE to a partitioned table. During execution, we
attempt to obtain all those locks again in
ExecSetupPartitionTupleRouting(), looking up the local lock table
isn't free, so it's beneficial to not wastefully do this again.  In
the case of PREPAREd UPDATE statements, delaying obtaining any
partition lock until we first route a tuple to the partition can save
big time.   Patches further on in Amit's patch series will reduce the
locking taken in the planner to just non-pruned partitions, these
partitions might not be the same ones that tuples get routed to.
Remember that tuples can go to any partition, not just the ones listed
in the ModifyTable node's subnodes.  So in that case, this patch just
complements Amit's patch series.

> I think that that might render this moot?  And if it doesn't,
> how much does it really matter?

Quite a bit.  Here's an INSERT benchmark from the first email in this
thread. The test was done with 10k partitions.

I wrote:
> Patched:
> tps = 27811.427620 (excluding connections establishing)
> tps = 28617.417308 (excluding connections establishing)
>
> Unpatched:
> tps = 130.446706 (excluding connections establishing)
> tps = 119.726641 (excluding connections establishing)

Remember that planner will obtain no locks on any partitions during
INSERT.  In the above case, the executor in the unpatched version
obtains 10k locks when only 1 is needed. In the patched version the
executor just takes a lock on the single partition that the tuple is
routed to.

> You can't really postpone taking
> a lock on a relation that the planner is going to do anything
> nontrivial with.

Not sure what you mean here.  This is an executor change. What does
the planner care about what the executor does?

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Tom Lane

Date:

18 February 2019, 22:47:07

David Rowley <david.rowley@2ndquadrant.com> writes:
> On Tue, 19 Feb 2019 at 11:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I'd like to understand about this patch is how it relates
>> to Amit L.'s work on making the planner faster for partitioned
>> UPDATE/DELETE cases (https://commitfest.postgresql.org/22/1778/).
>> I think that that might render this moot?  And if it doesn't,
>> how much does it really matter?

> Quite a bit.  Here's an INSERT benchmark from the first email in this
> thread. The test was done with 10k partitions.

Um, this doesn't really address my question, since you're comparing
HEAD to the patch.  I was wondering about the delta if we assume
Amit's patch is in and then apply this on top (assuming it even
applies...)

>> You can't really postpone taking
>> a lock on a relation that the planner is going to do anything
>> nontrivial with.

> Not sure what you mean here.  This is an executor change. What does
> the planner care about what the executor does?

Are we talking about the same patch?  The one I'm looking at seems
to be mostly planner and plancache changes:

 src/backend/catalog/dependency.c         |  1 +
 src/backend/commands/createas.c          |  1 +
 src/backend/executor/execUtils.c         | 20 +++++++++++---------
 src/backend/nodes/copyfuncs.c            |  1 +
 src/backend/nodes/equalfuncs.c           |  1 +
 src/backend/nodes/outfuncs.c             |  1 +
 src/backend/nodes/readfuncs.c            |  1 +
 src/backend/optimizer/plan/planner.c     |  2 ++
 src/backend/optimizer/util/inherit.c     | 11 +++++++++++
 src/backend/parser/parse_relation.c      |  2 ++
 src/backend/replication/logical/worker.c |  1 +
 src/backend/rewrite/rewriteHandler.c     |  1 +
 src/backend/utils/adt/ri_triggers.c      |  2 ++
 src/backend/utils/adt/ruleutils.c        |  3 +++
 src/backend/utils/cache/plancache.c      | 15 +++++++++++----
 src/include/nodes/parsenodes.h           |  2 ++

BTW, I'm doubtful that putting planner outputs into RangeTblEntry
is a particularly good idea.

            regards, tom lane

Re: Delay locking partitions during INSERT and UPDATE

From

Andres Freund

Date:

18 February 2019, 22:53:27

Hi,

On 2019-02-18 17:47:07 -0500, Tom Lane wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > Not sure what you mean here.  This is an executor change. What does
> > the planner care about what the executor does?
> 
> Are we talking about the same patch?  The one I'm looking at seems
> to be mostly planner and plancache changes:

I think so - the patch in this thread is much simpler:

diffstat < /tmp/v2-0001-Delay-locking-of-partitions-during-INSERT-and-UPD.patch
 execPartition.c |   10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

that's the patch upthread from here, at
https://www.postgresql.org/message-id/CAKJS1f_hWSiC9j9HkZCfNVQs-bbHY-P2RqbH4bYcEHmWtpyLRw%40mail.gmail.com

Greetings,

Andres Freund

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

18 February 2019, 22:56:22

On Tue, 19 Feb 2019 at 11:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Are we talking about the same patch?  The one I'm looking at seems
> to be mostly planner and plancache changes:
>
>  src/backend/catalog/dependency.c         |  1 +
>  src/backend/commands/createas.c          |  1 +
>  src/backend/executor/execUtils.c         | 20 +++++++++++---------
>  src/backend/nodes/copyfuncs.c            |  1 +
>  src/backend/nodes/equalfuncs.c           |  1 +
>  src/backend/nodes/outfuncs.c             |  1 +
>  src/backend/nodes/readfuncs.c            |  1 +
>  src/backend/optimizer/plan/planner.c     |  2 ++
>  src/backend/optimizer/util/inherit.c     | 11 +++++++++++
>  src/backend/parser/parse_relation.c      |  2 ++
>  src/backend/replication/logical/worker.c |  1 +
>  src/backend/rewrite/rewriteHandler.c     |  1 +
>  src/backend/utils/adt/ri_triggers.c      |  2 ++
>  src/backend/utils/adt/ruleutils.c        |  3 +++
>  src/backend/utils/cache/plancache.c      | 15 +++++++++++----
>  src/include/nodes/parsenodes.h           |  2 ++

Looks like you're looking at the patch from the "Delay locking
partitions during query execution" thread [1].  Different thing
altogether, although the names are confusingly similar.

[1] https://commitfest.postgresql.org/22/1897/

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Tom Lane

Date:

18 February 2019, 23:15:00

David Rowley <david.rowley@2ndquadrant.com> writes:
> On Tue, 19 Feb 2019 at 11:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Are we talking about the same patch?  The one I'm looking at seems
>> to be mostly planner and plancache changes:

> Looks like you're looking at the patch from the "Delay locking
> partitions during query execution" thread [1].  Different thing
> altogether, although the names are confusingly similar.

My apologies --- I searched my inbox for "Delay locking partitions",
and failed to notice that I was extracting a patch from the wrong
thread with that in its title.   (The questions I posed should
be posted to that thread instead, as they still apply there.)

Now that I've looked at *this* thread's patch, I agree that having
the executor acquire locks based on find_all_inheritors() is a
pretty horrid idea.  However, it still seems like this is being
done in a vacuum without attention to locks already acquired
upstream.  How much does the knowledge that the planner or plancache
would've already locked everything mentioned in the rangetable
affect the issues here?

I'm inclined to think that if we already have lock on the parent
partitioned table (thereby, IIUC, guaranteeing that its partitioning
info can't change) that the order in which we acquire the same lock
level on its partition(s) isn't very important.

            regards, tom lane

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

18 February 2019, 23:28:32

On Tue, 19 Feb 2019 at 12:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Now that I've looked at *this* thread's patch, I agree that having
> the executor acquire locks based on find_all_inheritors() is a
> pretty horrid idea.  However, it still seems like this is being
> done in a vacuum without attention to locks already acquired
> upstream.  How much does the knowledge that the planner or plancache
> would've already locked everything mentioned in the rangetable
> affect the issues here?

For the INSERT with VALUES case, there should be no existing locks on
the partition.  Only the partitioned table that's the target of the
INSERT will have been locked already.

For the INSERT with SELECT case, well, it's possible that one of the
partitions exist in the SELECT and it was already locked by the
planner or in AcquireExecutorLocks(), but how is that different from
having multiple RangeTblEntry objects for the same relation?  We
already try to lock those twice.  If we had code to resolve those
duplicates and take just the strongest of the locks for the relation
then I might feel we'd better handle the locking better in this patch,
but we don't, so I see no reason to add additional smarts for a corner
case if we have none for a more common case.

For the UPDATE case, we reuse result relations from the ModifyTable's
subnodes.  See ExecFindPartition() around:

/*
* We have not yet set up a ResultRelInfo for this partition,
* but if we have a subplan hash table, we might have one
* there.  If not, we'll have to create one.
*/
if (proute->subplan_resultrel_htab)

> I'm inclined to think that if we already have lock on the parent
> partitioned table (thereby, IIUC, guaranteeing that its partitioning
> info can't change) that the order in which we acquire the same lock
> level on its partition(s) isn't very important.

Yeah. I think most people seem to agree with that too. At least,
nobody seemed to raise any concerns about increased deadlock risk,
although it was talked about.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

18 February 2019, 23:30:17

On Tue, 19 Feb 2019 at 12:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > Looks like you're looking at the patch from the "Delay locking
> > partitions during query execution" thread [1].  Different thing
> > altogether, although the names are confusingly similar.
>
> My apologies --- I searched my inbox for "Delay locking partitions",
> and failed to notice that I was extracting a patch from the wrong
> thread with that in its title.   (The questions I posed should
> be posted to that thread instead, as they still apply there.)

Easily done.  Happy to answer the question there if you ask it there.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

19 February 2019, 17:09:46

On Fri, Feb 15, 2019 at 9:22 PM Andres Freund <andres@anarazel.de> wrote:
> On 2019-01-31 13:46:33 -0500, Robert Haas wrote:
> > I have reviewed this patch and I am in favor of it.  I think it likely
> > needs minor rebasing because of the heap_open -> table_open renaming.
> > I also agree that it's worth taking some deadlock risk for the rather
> > massive performance gain, although I suspect it's likely that a few
> > users are going to complain about deadlocks and I wonder whether we'll
> > have to put some energy into that problem at some point.  However, I
> > think what we probably want to do there is reduce the probability of
> > deadlocks through some trickery or maybe invent some new locking
> > mechanisms that work around the problem.  The alternative of trying to
> > block this patch seems unpalatable.
>
> Are you saying that such workarounds would have to be merged at the same
> time as this patch? Or that we'd address such complaints that way at a
> later time?

Later.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

19 February 2019, 17:36:26

On Mon, Feb 18, 2019 at 6:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm inclined to think that if we already have lock on the parent
> partitioned table (thereby, IIUC, guaranteeing that its partitioning
> info can't change) that the order in which we acquire the same lock
> level on its partition(s) isn't very important.

Well, as it turns out, there's also a pending patch (series) to remove
that guarantee which I would like to get committed.  The thread for
that is "ATTACH/DETACH PARTITION CONCURRENTLY".  I believe I've more
or less got the issues with that patch sorted out, but I'm concerned
about how it interacts with all of the other things that are currently
in flight.  Delaying or skipping lock acquisition in some cases and
weakening the locks we take in others is not a process that can
continue indefinitely without at some point hitting a wall.

But that being said, I don't think I quite see how the two things you
mention here are connected to each other.  If operation A locks
parents before children, and operation B also locks parents before
children, they generally won't deadlock against each other, even if
each locks the children in a different order.  An exception would be
if the locks involved are of different levels such that the locks on
the parents don't conflict but the locks on the children do, e.g.
RowExclusiveLock on partitioned tables and AccessExclusiveLock on
partitions.  But that's a pretty weird setup and I don't know of any
current or proposed code that does that.  So my question is - what do
you mean by the parenthetical note about the partitioning info not
changing?  Regardless of whether it does or does not, I think the same
property holds.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

19 February 2019, 21:07:10

On Wed, 20 Feb 2019 at 06:36, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Feb 18, 2019 at 6:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I'm inclined to think that if we already have lock on the parent
> > partitioned table (thereby, IIUC, guaranteeing that its partitioning
> > info can't change) that the order in which we acquire the same lock
> > level on its partition(s) isn't very important.
>
> Well, as it turns out, there's also a pending patch (series) to remove
> that guarantee which I would like to get committed.  The thread for
> that is "ATTACH/DETACH PARTITION CONCURRENTLY".  I believe I've more
> or less got the issues with that patch sorted out, but I'm concerned
> about how it interacts with all of the other things that are currently
> in flight.  Delaying or skipping lock acquisition in some cases and
> weakeni ng the locks we take in others is not a process that can
> continue indefinitely without at some point hitting a wall.

I'd say that here we should only discuss what this patch is doing, not
anything else that's in flight that you're concerned will conflict
with the ATTACH/DETACH PARTITION CONCURRENTLY patch.

During INSERT and UPDATE, not all partitions will always be locked
before executor startup. This patch removing the find_all_inheritors()
call from ExecSetupPartitionTupleRouting() is not going to break
ATTACH/DETACH PARTITION CONCURRENTLY if it wasn't already broken in
the first place.  With this patch, we're still obtaining locks after
execution has begun, it just may delay the locking until a bit later.
It was previously already happening after executor startup had begun,
so the window for the problems that you describe were already
non-zero.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

20 February 2019, 03:17:10

On Tue, Feb 19, 2019 at 4:07 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
> I'd say that here we should only discuss what this patch is doing, not
> anything else that's in flight that you're concerned will conflict
> with the ATTACH/DETACH PARTITION CONCURRENTLY patch.
>
> During INSERT and UPDATE, not all partitions will always be locked
> before executor startup. This patch removing the find_all_inheritors()
> call from ExecSetupPartitionTupleRouting() is not going to break
> ATTACH/DETACH PARTITION CONCURRENTLY if it wasn't already broken in
> the first place.  With this patch, we're still obtaining locks after
> execution has begun, it just may delay the locking until a bit later.
> It was previously already happening after executor startup had begun,
> so the window for the problems that you describe were already
> non-zero.

The issue of whether it's OK to lock child partitions in variable
order has come up on many different threads, and as far as I know this
is the only thread where Tom has expressed any view on it.  Since Tom
is a smart guy,[1] I was hoping to get him to expand on those views a
little bit.  Even if that leads this thread to detour slightly from
the matter immediately at hand, I think it will be worth it if it gets
us to some kind of consensus on what has been a thorny question for
some time now.

On the other hand, Tom may not reply, in which case the rest of us
will just have to keep faking it as best we can.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] citation needed, except not really

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

20 February 2019, 15:42:56

On Tue, Feb 19, 2019 at 4:07 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
> I'd say that here we should only discuss what this patch is doing, ...

On that note, I spent some more time looking at what the patch is doing today.

     /*
      * We locked all the partitions in ExecSetupPartitionTupleRouting
      * including the leaf partitions.
      */
-    partrel = table_open(dispatch->partdesc->oids[partidx], NoLock);
+    partrel = table_open(dispatch->partdesc->oids[partidx], RowExclusiveLock);

It seems to me that the reason for this change is precisely that the
comment is now false, and therefore the comment needs to be updated.

Does that sound right?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Delay locking partitions during INSERT and UPDATE

From

Tom Lane

Date:

20 February 2019, 16:03:32

Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Feb 18, 2019 at 6:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm inclined to think that if we already have lock on the parent
>> partitioned table (thereby, IIUC, guaranteeing that its partitioning
>> info can't change) that the order in which we acquire the same lock
>> level on its partition(s) isn't very important.

> But that being said, I don't think I quite see how the two things you
> mention here are connected to each other.  If operation A locks
> parents before children, and operation B also locks parents before
> children, they generally won't deadlock against each other, even if
> each locks the children in a different order.

Right, that's the same thing I was trying to say.

> ...  So my question is - what do
> you mean by the parenthetical note about the partitioning info not
> changing?  Regardless of whether it does or does not, I think the same
> property holds.

What I was wondering about was the possibility of the set of
tables-that-need-to-be-locked-at-all changing.  Maybe that won't
create an issue either, but I'm not quite sure.  I do have to say
that I find the idea of somebody changing the partitioning data
concurrently with queries on that partitioned table to be stone
cold petrifying.  I don't think I believe you can make that work.

            regards, tom lane

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

20 February 2019, 16:24:48

On Wed, Feb 20, 2019 at 11:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Right, that's the same thing I was trying to say.

OK, thanks.

> > ...  So my question is - what do
> > you mean by the parenthetical note about the partitioning info not
> > changing?  Regardless of whether it does or does not, I think the same
> > property holds.
>
> What I was wondering about was the possibility of the set of
> tables-that-need-to-be-locked-at-all changing.  Maybe that won't
> create an issue either, but I'm not quite sure.

That's pretty much what I was thinking, too.  I think it might be fair
to say, however, that if it does give rise to deadlock situations,
they will be corner cases.  For instance, suppose you lock are busy
locking top-down and, meanwhile, somebody detaches a partition you
haven't gotten around to locking yet and tries to attach it someplace
higher up in the partition hierarchy.  I think that there's a
more-or-less unavoidable deadlock there.  And there may be other cases
where it is practically avoidable but we will fail to avoid it.  But I
don't think it's such a common scenario that we have two concurrent
DDL commands on the same partitioning hierarchy that we should stress
about it too much.  If the common cases work OK, a theoretical
deadlock risk in some more obscure case seems acceptable to me, if it
means we get a significant performance boost.

> I do have to say
> that I find the idea of somebody changing the partitioning data
> concurrently with queries on that partitioned table to be stone
> cold petrifying.  I don't think I believe you can make that work.

I guess you won't be surprised to hear that I believe otherwise, but I
agree it's a pretty thorny issue.  I would welcome your comments on
the patches and issues discussed on the ATTACH/DETACH CONCURRENTLY
thread.  To get things to (apparently) work, I have had to separately
handle: (1) the case where things change as we are building the
PartitionDesc, (2) the case where the same PartitionDesc changes
during a planning cycle, (3) the case where the PartitionDesc changes
between planning and execution, and (4) the case where the
PartitionDesc changes during execution.  And even then I'm only able
so far to handle cases where a partition is added, just by ignoring
it, not cases where a partition is removed.  But it does seem to work.
There isn't that much code yet that depends on the PartitionDesc not
changing, so it seems to be an easier problem than, say, allowing a
column to be added concurrently.  As David says, we probably shouldn't
hijack this thread to talk about that in detail, but I would
definitely like it if you would comment on what I've done over there.
I think it's good work but it is very easy to be wrong about such
things -- and even if the basic approach is sound, there may be more
cases of which I have not thought.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Delay locking partitions during INSERT and UPDATE

From

Tom Lane

Date:

20 February 2019, 20:57:11

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Feb 20, 2019 at 11:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I was wondering about was the possibility of the set of
>> tables-that-need-to-be-locked-at-all changing.  Maybe that won't
>> create an issue either, but I'm not quite sure.

> That's pretty much what I was thinking, too.  I think it might be fair
> to say, however, that if it does give rise to deadlock situations,
> they will be corner cases.  For instance, suppose you lock are busy
> locking top-down and, meanwhile, somebody detaches a partition you
> haven't gotten around to locking yet and tries to attach it someplace
> higher up in the partition hierarchy.  I think that there's a
> more-or-less unavoidable deadlock there.  And there may be other cases
> where it is practically avoidable but we will fail to avoid it.  But I
> don't think it's such a common scenario that we have two concurrent
> DDL commands on the same partitioning hierarchy that we should stress
> about it too much.  If the common cases work OK, a theoretical
> deadlock risk in some more obscure case seems acceptable to me, if it
> means we get a significant performance boost.

I agree that any deadlock would have to involve somebody doing something
quite odd --- not just one partition-oriented operation, but something
taking multiple strong locks without regard to the partition structure.
So I don't see a problem with taking that risk; people doing that sort
of thing are probably at risk of deadlocks no matter what we do here.

Looking at the patch itself, I agree that a bit more attention to comments
is needed, and I wonder whether David has found all the places where
it's now necessary to s/NoLock/RowExclusiveLock/.  I don't have any
other objections.

            regards, tom lane

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

20 February 2019, 21:32:15

On Wed, Feb 20, 2019 at 3:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I agree that any deadlock would have to involve somebody doing something
> quite odd --- not just one partition-oriented operation, but something
> taking multiple strong locks without regard to the partition structure.
> So I don't see a problem with taking that risk; people doing that sort
> of thing are probably at risk of deadlocks no matter what we do here.

OK.

> Looking at the patch itself, I agree that a bit more attention to comments
> is needed, and I wonder whether David has found all the places where
> it's now necessary to s/NoLock/RowExclusiveLock/.  I don't have any
> other objections.

I spent some time thinking about that exact issue this morning and
studying the code to try to figure that out.  I wasn't able to find
any other places that seemed to need updating, but it could be that I
missed something that David also missed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

20 February 2019, 21:56:17

On Thu, 21 Feb 2019 at 10:32, Robert Haas <robertmhaas@gmail.com> wrote:
> I spent some time thinking about that exact issue this morning and
> studying the code to try to figure that out.  I wasn't able to find
> any other places that seemed to need updating, but it could be that I
> missed something that David also missed.

It looks like the comment that claimed the table was already locked
crept back in during a (seemingly) sloppy rebase after the
relation_open() -> table_open() change.

I've made a pass over this again and updated the header comments in
functions that now obtain a lock to mention that fact.

Also slightly updated commit msg in the patch.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

v3-0001-On-demand-locking-of-partitions-during-INSERT-and.patch

Re: Delay locking partitions during INSERT and UPDATE

From

Tom Lane

Date:

21 February 2019, 01:18:05

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Feb 20, 2019 at 3:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Looking at the patch itself, I agree that a bit more attention to comments
>> is needed, and I wonder whether David has found all the places where
>> it's now necessary to s/NoLock/RowExclusiveLock/.  I don't have any
>> other objections.

> I spent some time thinking about that exact issue this morning and
> studying the code to try to figure that out.  I wasn't able to find
> any other places that seemed to need updating, but it could be that I
> missed something that David also missed.

Actually, in the wake of b04aeb0a0, we probably need not be too stressed
about the possibility that we missed something.  Those assertions wouldn't
detect doing an unwanted lock upgrade, but I think that's unlikely to be
an issue here (or if it is, we have the problem anyway).

            regards, tom lane

Re: Delay locking partitions during INSERT and UPDATE

From

Robert Haas

Date:

21 February 2019, 16:33:16

On Wed, Feb 20, 2019 at 4:56 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
> I've made a pass over this again and updated the header comments in
> functions that now obtain a lock to mention that fact.

Thanks.  I have committed this version.  I know Tomas Vondra was
planning to do that, but it's been close to a month since he mentioned
those plans so I hope he will not mind me jumping in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Delay locking partitions during INSERT and UPDATE

From

Tomas Vondra

Date:

21 February 2019, 16:41:48

On 2/21/19 5:33 PM, Robert Haas wrote:
> On Wed, Feb 20, 2019 at 4:56 PM David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> I've made a pass over this again and updated the header comments in
>> functions that now obtain a lock to mention that fact.
> 
> Thanks.  I have committed this version.  I know Tomas Vondra was
> planning to do that, but it's been close to a month since he mentioned
> those plans so I hope he will not mind me jumping in.
> 

Fine with me. It was mostly a sign that I'd like to get that committed
eventually, trying to nudge others into looking at the patch a bit more.
Relaxing locks is tricky, and I wasn't quite sure I haven't missed
something obvious.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Delay locking partitions during INSERT and UPDATE

From

David Rowley

Date:

21 February 2019, 19:49:26

On Fri, 22 Feb 2019 at 05:33, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Feb 20, 2019 at 4:56 PM David Rowley
> <david.rowley@2ndquadrant.com> wrote:
> > I've made a pass over this again and updated the header comments in
> > functions that now obtain a lock to mention that fact.
>
> Thanks.  I have committed this version.  I know Tomas Vondra was
> planning to do that, but it's been close to a month since he mentioned
> those plans so I hope he will not mind me jumping in.

Great. Thanks for doing that, and thanks to everyone who reviewed it too.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services