Thread: Add pg_partition_root to get top-most parent of a partition tree

Add pg_partition_root to get top-most parent of a partition tree

From
Michael Paquier
Date:
Hi all,

Álvaro has given faced a use case where it would be useful to have a
function which is able to return the top-most parent of a partition
tree:
https://postgr.es/m/20181204184159.eue3wlchqrkh4vsc@alvherre.pgsql

This has been mentioned as well on the thread where was discussed
pg_partition_tree, but it got shaved from the committed patch as many
things happened when discussing the thing.

Attached is a patch to do the work, which includes documentation and
tests.  An argument could be made to include the top-most parent as part
of pg_partition_tree, but it feels more natural to me to have a separate
function.  This makes sure to handle invalid relations by returning
NULL, and it generates an error for incorrect relkind.

I have included as well a fix for the recent crash on pg_partition_tree
I have reported, but let's discuss the crash on its thread here:
https://www.postgresql.org/message-id/20181207010406.GO2407@paquier.xyz
The bug fix would most likely get committed first, and I'll rebase this
patch as need be.

I am adding this patch to the CF of January.  I think that Amit should
also be marked as a co-author of this patch, as that's inspired from
what has been submitted previously, still I have no reused the code.

Thanks,
--
Michael

Attachment

Re: Add pg_partition_root to get top-most parent of a partition tree

From
Alvaro Herrera
Date:
I think adding a pg_partition_root() call to as many pg_partition_tree
tests as you modified is overkill ... OTOH I'd have one test that
invokes pg_partition_tree(pg_partition_root(some-partition)) to verify
that starting from any point in the tree you get the whole tree.

I haven't actually tried to write a query that obtains a tree of
constraints using this, mind ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Add pg_partition_root to get top-most parent of a partition tree

From
Michael Paquier
Date:
On Thu, Dec 06, 2018 at 10:48:59PM -0300, Alvaro Herrera wrote:
> I think adding a pg_partition_root() call to as many pg_partition_tree
> tests as you modified is overkill ... OTOH I'd have one test that
> invokes pg_partition_tree(pg_partition_root(some-partition)) to verify
> that starting from any point in the tree you get the whole tree.

Good idea, thanks for the input.

> I haven't actually tried to write a query that obtains a tree of
> constraints using this, mind ...

Sure.  It would be good to agree on an interface.  I have not tried
either, but you should be able to get away with a join on relid returned
by pg_partition_tree() with pg_constraint.conrelid with
pg_get_constraintdef() instead of a WITH RECURSIVE, no?
--
Michael

Attachment

Re: Add pg_partition_root to get top-most parent of a partition tree

From
Michael Paquier
Date:
On Fri, Dec 07, 2018 at 11:46:05AM +0900, Michael Paquier wrote:
> On Thu, Dec 06, 2018 at 10:48:59PM -0300, Alvaro Herrera wrote:
>> I think adding a pg_partition_root() call to as many pg_partition_tree
>> tests as you modified is overkill ... OTOH I'd have one test that
>> invokes pg_partition_tree(pg_partition_root(some-partition)) to verify
>> that starting from any point in the tree you get the whole tree.
>
> Good idea, thanks for the input.

The recent commit cc53123 has fixed a couple of issues with
pg_partition_tree, so attached is a rebased patch which similarly makes
pg_partition_root return NULL for unsupported relkinds and undefined
relations.  I have also simplified the tests based on Alvaro's
suggestion to use pg_partition_tree(pg_partition_root(partfoo)).

Thanks,
--
Michael

Attachment

Re: Add pg_partition_root to get top-most parent of a partition tree

From
Amit Langote
Date:
Hi,

On 2018/12/12 10:48, Michael Paquier wrote:
> On Fri, Dec 07, 2018 at 11:46:05AM +0900, Michael Paquier wrote:
>> On Thu, Dec 06, 2018 at 10:48:59PM -0300, Alvaro Herrera wrote:
>>> I think adding a pg_partition_root() call to as many pg_partition_tree
>>> tests as you modified is overkill ... OTOH I'd have one test that
>>> invokes pg_partition_tree(pg_partition_root(some-partition)) to verify
>>> that starting from any point in the tree you get the whole tree.
>>
>> Good idea, thanks for the input.
> 
> The recent commit cc53123 has fixed a couple of issues with
> pg_partition_tree, so attached is a rebased patch which similarly makes
> pg_partition_root return NULL for unsupported relkinds and undefined
> relations.  I have also simplified the tests based on Alvaro's
> suggestion to use pg_partition_tree(pg_partition_root(partfoo)).

Thanks for working on this.  I have looked at this patch and here are some
comments.

+        Return the top-most parent of a partition tree for the given
+        partitioned table or partitioned index.

Given that pg_partition_root will return a valid result for any relation
that can be part of a partition tree, it seems strange that the above
sentence says "for the given partitioned table or partitioned index".  It
should perhaps say:

Return the top-most parent of the partition tree to which the given
relation belongs

+/*
+ * Perform several checks on a relation on which is extracted some
+ * information related to its partition tree.  Returns false if the
+ * relation cannot be processed, in which case it is up to the caller
+ * to decide what to do, by either raising an error or doing something
+ * else.
+ */
+static bool
+check_rel_for_partition_info(Oid relid)
+{
+    char        relkind;
+
+    /* Check if relation exists */
+    if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
+        return false;

This should be checked in the caller imho.

+
+    relkind = get_rel_relkind(relid);
+
+    /* Only allow relation types that can appear in partition trees. */
+    if (relkind != RELKIND_RELATION &&
+        relkind != RELKIND_FOREIGN_TABLE &&
+        relkind != RELKIND_INDEX &&
+        relkind != RELKIND_PARTITIONED_TABLE &&
+        relkind != RELKIND_PARTITIONED_INDEX)
+        return false;
+
+    return true;
+}

I can't imagine this function growing more code to perform additional
checks beside just checking the relkind, so the name of this function may
be a bit too ambitious.  How about calling it
check_rel_can_be_partition()?  The comment above the function could be a
much simpler sentence too.  I know I may be just bikeshedding here though.

+    /*
+     * If the relation is not a partition, return itself as a result.
+     */
+    if (!get_rel_relispartition(relid))
+        PG_RETURN_OID(relid);

Maybe the comment here could say "The relation itself may be the root parent".

+    /*
+     * If the relation is actually a partition, 'rootrelid' has been set to
+     * the OID of the root table in the partition tree.  It should be a valid
+     * valid per the previous check for partition leaf above.
+     */
+    Assert(OidIsValid(rootrelid));

"valid" is duplicated in the last sentence in the comment.  Anyway, what's
being Asserted can be described simply as:

/*
 * 'rootrelid' must contain a valid OID, given that the input relation is
 * a valid partition tree member as checked above.
 */

Thanks,
Amit



Re: Add pg_partition_root to get top-most parent of a partition tree

From
Michael Paquier
Date:
On Fri, Dec 14, 2018 at 02:20:27PM +0900, Amit Langote wrote:
> Given that pg_partition_root will return a valid result for any relation
> that can be part of a partition tree, it seems strange that the above
> sentence says "for the given partitioned table or partitioned index".  It
> should perhaps say:
>
> Return the top-most parent of the partition tree to which the given
> relation belongs

Check.

> +static bool
> +check_rel_for_partition_info(Oid relid)
> +{
> +    char        relkind;
> +
> +    /* Check if relation exists */
> +    if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
> +        return false;
>
> This should be checked in the caller imho.

On this one I disagree, both pg_partition_root and pg_partition_tree
share the same semantics on the matter.  If the set of functions gets
expanded again later on, I got the feeling that we could forget about it
again, and at least placing the check here has the merit to make out
future selves not forget about that pattern..

> I can't imagine this function growing more code to perform additional
> checks beside just checking the relkind, so the name of this function may
> be a bit too ambitious.  How about calling it
> check_rel_can_be_partition()?  The comment above the function could be a
> much simpler sentence too.  I know I may be just bikeshedding here
> though.

The review is also here for that.  The routine name you are suggesting
looks good to me.

> +    /*
> +     * If the relation is not a partition, return itself as a result.
> +     */
> +    if (!get_rel_relispartition(relid))
> +        PG_RETURN_OID(relid);
>
> Maybe the comment here could say "The relation itself may be the root
> parent".

Check.  I tweaked the comment in this sense.

> "valid" is duplicated in the last sentence in the comment.  Anyway, what's
> being Asserted can be described simply as:
>
> /*
>  * 'rootrelid' must contain a valid OID, given that the input relation is
>  * a valid partition tree member as checked above.
>  */

Changed in this sense.  Please find attached an updated patch.
--
Michael

Attachment

Re: Add pg_partition_root to get top-most parent of a partition tree

From
Michael Paquier
Date:
On Sat, Dec 15, 2018 at 10:16:08AM +0900, Michael Paquier wrote:
> Changed in this sense.  Please find attached an updated patch.

Rebased as per the attached, and moved to next CF.
--
Michael

Attachment

Re: Add pg_partition_root to get top-most parent of a partition tree

From
Amit Langote
Date:
Hi Michael,

Sorry about the long delay in replying.  Looking at the latest patch (v4)
but replying to an earlier email of yours.

On 2018/12/15 10:16, Michael Paquier wrote:
> On Fri, Dec 14, 2018 at 02:20:27PM +0900, Amit Langote wrote:
>> +static bool
>> +check_rel_for_partition_info(Oid relid)
>> +{
>> +    char        relkind;
>> +
>> +    /* Check if relation exists */
>> +    if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
>> +        return false;
>>
>> This should be checked in the caller imho.
> 
> On this one I disagree, both pg_partition_root and pg_partition_tree
> share the same semantics on the matter.  If the set of functions gets
> expanded again later on, I got the feeling that we could forget about it
> again, and at least placing the check here has the merit to make out
> future selves not forget about that pattern..

OK, no problem.

Some minor comments on v4:

+/*
+ * Perform several checks on a relation on which is extracted some
+ * information related to its partition tree.

This is a bit unclear to me.  How about:

Checks if a given relation can be part of a partition tree

+/*
+ * pg_partition_root
+ *
+ * For the given relation part of a partition tree, return its top-most
+ * root parent.
+ */

How about writing:

Returns the top-most parent of the partition tree to which a given
relation belongs, or NULL if it's not (or cannot be) part of any partition
tree


Given that a couple (?) of other patches depend on this, maybe it'd be a
good idea to proceed with this.  Sorry that I kept this hanging too long
by not sending these comments sooner.

Thanks,
Amit



Re: Add pg_partition_root to get top-most parent of a partition tree

From
Michael Paquier
Date:
On Wed, Feb 06, 2019 at 05:26:48PM +0900, Amit Langote wrote:
> Some minor comments on v4:

Thanks for the review.

> +/*
> + * Perform several checks on a relation on which is extracted some
> + * information related to its partition tree.
>
> This is a bit unclear to me.  How about:
>
> Checks if a given relation can be part of a partition tree

Done as suggested.

> Returns the top-most parent of the partition tree to which a given
> relation belongs, or NULL if it's not (or cannot be) part of any partition
> tree

Fine for me as well.

> Given that a couple (?) of other patches depend on this, maybe it'd be a
> good idea to proceed with this.

If you are happy with the version attached, I am fine to commit it.  I
think that we have the right semantics and the right test coverage for
this patch.

> Sorry that I kept this hanging too long by not sending these
> comments sooner.

No problem, don't worry.  There are many patches hanging around.
--
Michael

Attachment

Re: Add pg_partition_root to get top-most parent of a partition tree

From
Amit Langote
Date:
Hi,

On 2019/02/06 19:14, Michael Paquier wrote:
>> Given that a couple (?) of other patches depend on this, maybe it'd be a
>> good idea to proceed with this.
> 
> If you are happy with the version attached, I am fine to commit it.  I
> think that we have the right semantics and the right test coverage for
> this patch.

Yeah, I agree.  Should also check with Alvaro maybe?

Thanks,
Amit



Re: Add pg_partition_root to get top-most parent of a partition tree

From
Michael Paquier
Date:
On Thu, Feb 07, 2019 at 01:34:15PM +0900, Amit Langote wrote:
> Yeah, I agree.  Should also check with Alvaro maybe?

Good idea.  Let's wait for his input.
--
Michael

Attachment

Re: Add pg_partition_root to get top-most parent of a partition tree

From
Alvaro Herrera
Date:
On 2019-Feb-07, Michael Paquier wrote:

> On Thu, Feb 07, 2019 at 01:34:15PM +0900, Amit Langote wrote:
> > Yeah, I agree.  Should also check with Alvaro maybe?
> 
> Good idea.  Let's wait for his input.

I looked at it briefly a few weeks ago and it seemed sound, though I
haven't yet tried to write the constraint display query for psql using
it, yet -- but that should be straightforward anyway.  Let's get it
committed so we have one less thing to worry about.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Add pg_partition_root to get top-most parent of a partition tree

From
Michael Paquier
Date:
On Thu, Feb 07, 2019 at 12:11:49PM -0300, Alvaro Herrera wrote:
> I looked at it briefly a few weeks ago and it seemed sound, though I
> haven't yet tried to write the constraint display query for psql using
> it, yet -- but that should be straightforward anyway.  Let's get it
> committed so we have one less thing to worry about.

item_to_worry_about--;

Thanks for the successive reviews.
--
Michael

Attachment