Re: Declarative partitioning - another take - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Declarative partitioning - another take
Date
Msg-id CA+TgmoYKeVRmCEUXBVepHK30A4txVeLgL0v3G1mKY-cmWUxwjQ@mail.gmail.com
Whole thread Raw
In response to Re: Declarative partitioning - another take  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: Declarative partitioning - another take  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
On Wed, Aug 31, 2016 at 12:37 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>> If we need an AccessExclusiveLock on parent to add/remove a partition
>>> (IOW, changing that child table's partitioning information), then do we
>>> need to lock the individual partitions when reading partition's
>>> information?  I mean to ask why the simple syscache look-ups to get each
>>> partition's bound wouldn't do.
>>
>> Well, if X can't be changed without having an AccessExclusiveLock on
>> the parent, then an AccessShareLock on the parent is sufficient to
>> read X, right?  Because those lock modes conflict.
>
> Yes.  And hence we can proceed with performing partition elimination
> before locking any of children.  Lock on parent (AccessShareLock) will
> prevent any of existing partitions to be removed and any new partitions to
> be added because those operations require AccessExclusiveLock on the
> parent.

Agreed.

> What I was trying to understand is why this would not be possible
> with a design where partition bound is stored in the catalog as a property
> of individual partitions instead of a design where we store collection of
> partition bounds as a property of the parent.

From the point of view of feasibility, I don't think it matters very
much where the property is stored; it's the locking that is the key
thing.  In other words, I think this *would* be possible if the
partition bound is stored as a property of individual partitions, as
long as it can't change without a lock on the parent.

However, it seems a lot better to make it a property of the parent
from a performance point of view.  Suppose there are 1000 partitions.
Reading one toasted value for pg_class and running stringToNode() on
it is probably a lot faster than scanning pg_inherits to find all of
the child partitions and then doing an index scan to find the pg_class
tuple for each and then decoding all of those tuples and assembling
them into some data structure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: comment fix for CUSTOMPATH_* flags
Next
From: Michael Paquier
Date:
Subject: Re: Missing checks when malloc returns NULL...