Re: Delay locking partitions during query execution - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Delay locking partitions during query execution
Date
Msg-id 6c4b97f7-d906-b3b3-9830-7d8bb498c0d4@2ndquadrant.com
Whole thread Raw
In response to Re: Delay locking partitions during query execution  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Delay locking partitions during query execution
List pgsql-hackers

On 3/5/19 6:55 AM, David Rowley wrote:
> On Sat, 2 Feb 2019 at 02:52, Robert Haas <robertmhaas@gmail.com> wrote:
>> I think the key question here is whether or not you can cope with
>> someone having done arbitrary AEL-requiring modifications to the
>> delaylocked partitions.  If you can, the fact that the plan might
>> sometimes be out-of-date is an inevitable consequence of doing this at
>> all, but I think we can argue that it's an acceptable consequence as
>> long as the resulting behavior is not too bletcherous.  If you can't,
>> then this patch is dead.
> 
> I spent some time looking at this patch again and thinking about the
> locking. I ended up looking through each alter table subcommand by way
> of AlterTableGetLockLevel() and going through scenarios with each of
> them in my head to try to understand:
> 
> a) If the subcommand can even be applied to a leaf partition; and
> b) Can the subcommand cause a cached plan to become invalid?
> 
> I did all the easy ones first then started on the harder ones. I came
> to AT_DropConstraint and imagined the following scenario:
> 
> -- ** session 1
> create table listp (a int) partition by list(a);
> create table listp1 partition of listp for values in(1);
> create table listp2 partition of listp for values in(2);
> 
> create index listp1_a_idx on listp1 (a);
> create index listp2_a_idx on listp2 (a);
> 
> set enable_seqscan = off;
> set plan_cache_mode = force_generic_plan;
> prepare q1 (int) as select * from listp where a = $1;
> execute q1 (1);
> begin;
> execute q1 (1);
> 
> 
> -- ** session 2
> drop index listp2_a_idx;
> 
> -- ** session 1
> execute q1 (2);
> ERROR:  could not open relation with OID 16401
> 
> The only way I can think to fix this is to just never lock partitions
> at all, and if a lock is to be obtained on a partition, it must be
> instead obtained on the top-level partitioned table.  That's a rather
> large change that could have large consequences, and I'm not even sure
> it's possible since we'd need to find the top-level parent before
> obtaining the lock, by then the hierarchy might have changed and we'd
> need to recheck, which seems like quite a lot of effort just to obtain
> a lock... Apart from that, it's not this patch, so looks like I'll
> need to withdraw this one :-(
> 

So you're saying we could

1) lookup the parent and lock it
2) repeat the lookup to verify it did not change

I think that could still be a win, assuming that most hierarchies will
be rather shallow (I'd say 2-3 levels will cover like 95% of cases, and
4 levels would be 100% in practice). And the second lookup should be
fairly cheap thanks to syscache and the fact that the hierarchies do not
change very often.

I can't judge how invasive this patch would be, but I agree it's more
complex than the originally proposed patch.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: Andy Fan
Date:
Subject: any plan to support shared servers like Oracle in PG?
Next
From: Jesper Pedersen
Date:
Subject: Re: speeding up planning with partitions