Thread: unsupportable composite type partition keys

unsupportable composite type partition keys

From
Amit Langote
Date:
Hi,

It seems to me that we currently allow expressions that are anonymous
and self-referencing composite type records as partition key, but
shouldn't.  Allowing them leads to this:

create table foo (a int) partition by list ((row(a, b)));
create table foo1 partition of foo for values in ('(1)'::foo);
create table foo2 partition of foo for values in ('(2)'::foo);
explain select * from foo where row(a) = '(1)'::foo;
ERROR:  stack depth limit exceeded

Stack trace is this:

#0  errfinish (dummy=0) at elog.c:442
#1  0x0000000000911a51 in check_stack_depth () at postgres.c:3288
#2  0x00000000007970e6 in expression_tree_mutator (node=0x31890a0,
mutator=0x82095f <eval_const_expressions_mutator>,
context=0x7fff0578ef60) at nodeFuncs.c:2526
#3  0x000000000082340b in eval_const_expressions_mutator
(node=0x31890a0, context=0x7fff0578ef60) at clauses.c:3605
#4  0x000000000079875c in expression_tree_mutator (node=0x31890f8,
mutator=0x82095f <eval_const_expressions_mutator>,
context=0x7fff0578ef60) at nodeFuncs.c:2996
#5  0x000000000082340b in eval_const_expressions_mutator
(node=0x31890f8, context=0x7fff0578ef60) at clauses.c:3605
#6  0x000000000079810c in expression_tree_mutator (node=0x3188cc8,
mutator=0x82095f <eval_const_expressions_mutator>,
context=0x7fff0578ef60) at nodeFuncs.c:2863
#7  0x000000000082225d in eval_const_expressions_mutator
(node=0x3188cc8, context=0x7fff0578ef60) at clauses.c:3154
#8  0x000000000079875c in expression_tree_mutator (node=0x3189240,
mutator=0x82095f <eval_const_expressions_mutator>,
context=0x7fff0578ef60) at nodeFuncs.c:2996
#9  0x000000000082340b in eval_const_expressions_mutator
(node=0x3189240, context=0x7fff0578ef60) at clauses.c:3605
#10 0x000000000082090c in eval_const_expressions (root=0x0,
node=0x3189240) at clauses.c:2265
#11 0x0000000000a75169 in RelationBuildPartitionKey
(relation=0x7f5ca3e479a8) at partcache.c:139
#12 0x0000000000a7aa5e in RelationBuildDesc (targetRelId=17178,
insertIt=true) at relcache.c:1171
#13 0x0000000000a7c975 in RelationIdGetRelation (relationId=17178) at
relcache.c:2035
#14 0x000000000048e0c0 in relation_open (relationId=17178, lockmode=1)
at relation.c:59
#15 0x0000000000a8a4f7 in load_typcache_tupdesc (typentry=0x1c16bc0)
at typcache.c:793
#16 0x0000000000a8a3bb in lookup_type_cache (type_id=17180, flags=256)
at typcache.c:748
#17 0x0000000000a8bba4 in lookup_rowtype_tupdesc_internal
(type_id=17180, typmod=-1, noError=false) at typcache.c:1570
#18 0x0000000000a8be43 in lookup_rowtype_tupdesc (type_id=17180,
typmod=-1) at typcache.c:1656
#19 0x0000000000a0713f in record_cmp (fcinfo=0x7fff0578f4d0) at rowtypes.c:815
#20 0x0000000000a083e2 in btrecordcmp (fcinfo=0x7fff0578f4d0) at rowtypes.c:1276
#21 0x0000000000a97bd9 in FunctionCall2Coll (flinfo=0x2bb4a98,
collation=0, arg1=51939144, arg2=51940000) at fmgr.c:1162
#22 0x00000000008443f6 in qsort_partition_list_value_cmp (a=0x3188c50,
b=0x3188c58, arg=0x2bb46c0) at partbounds.c:1769
#23 0x0000000000af9dc6 in qsort_arg (a=0x3188c50, n=2, es=8,
cmp=0x84439a <qsort_partition_list_value_cmp>, arg=0x2bb46c0) at
qsort_arg.c:132
#24 0x000000000084186a in create_list_bounds (boundspecs=0x3188650,
nparts=2, key=0x2bb46c0, mapping=0x7fff0578f7d8) at partbounds.c:396
#25 0x00000000008410ec in partition_bounds_create
(boundspecs=0x3188650, nparts=2, key=0x2bb46c0,
mapping=0x7fff0578f7d8) at partbounds.c:206
#26 0x0000000000847622 in RelationBuildPartitionDesc
(rel=0x7f5ca3e47560) at partdesc.c:205
#27 0x0000000000a7aa6a in RelationBuildDesc (targetRelId=17178,
insertIt=true) at relcache.c:1172

Also:

create table foo (a int) partition by list ((row(a)));
create table foo1 partition of foo for values in (row(1));
create table foo2 partition of foo for values in (row(2));

explain select * from foo where row(a) = '(1)'::foo;
                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on foo1 foo  (cost=0.00..41.88 rows=13 width=4)
   Filter: (ROW(a) = '(1)'::foo)
(2 rows)

explain select * from foo where row(a) = '(2)'::foo;
                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on foo2 foo  (cost=0.00..41.88 rows=13 width=4)
   Filter: (ROW(a) = '(2)'::foo)
(2 rows)

-- another session
explain select * from foo where row(a) = '(1)'::foo;
ERROR:  record type has not been registered
LINE 1: explain select * from foo where row(a) = '(1)'::foo;

Attached a patch to fix that.

Thanks,
Amit

Attachment

Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> It seems to me that we currently allow expressions that are anonymous
> and self-referencing composite type records as partition key, but
> shouldn't.  Allowing them leads to this:

Hm.  Seems like the restrictions here ought to be just about the same
as on index columns, no?  That is, it should be roughly a test like
"no pseudo-types".  The check you're proposing seems awfully specific,
and I doubt that the equivalent check in CREATE INDEX looks the same.
(But I didn't go look ... I might be wrong.)

            regards, tom lane



Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Wed, Dec 18, 2019 at 2:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > It seems to me that we currently allow expressions that are anonymous
> > and self-referencing composite type records as partition key, but
> > shouldn't.  Allowing them leads to this:
>
> Hm.  Seems like the restrictions here ought to be just about the same
> as on index columns, no?  That is, it should be roughly a test like
> "no pseudo-types".  The check you're proposing seems awfully specific,
> and I doubt that the equivalent check in CREATE INDEX looks the same.
> (But I didn't go look ... I might be wrong.)

We also need to disallow self-referencing composite type in the case
of partitioning, because otherwise it leads to infinite recursion
shown in my first email.

The timing of building PartitionDesc is what causes it, because the
construction of PartitionBoundInfo in turn requires opening the parent
relation if the partition partition key is of self-referencing
composite type, because we need the TupleDesc when sorting the
partition bounds.  Maybe we'll need to rearrange that someday so that
PartitionDesc is built outside RelationBuildDesc path, so this
infinite recursion doesn't occur, but maybe allowing this case isn't
that useful to begin with?

Thanks,
Amit



Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Wed, Dec 18, 2019 at 2:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hm.  Seems like the restrictions here ought to be just about the same
>> as on index columns, no?

> We also need to disallow self-referencing composite type in the case
> of partitioning, because otherwise it leads to infinite recursion
> shown in my first email.

My point is basically that CheckAttributeType already covers that
issue, as well as a lot of others.  So why isn't the partitioning
code using it?

            regards, tom lane



Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Wed, Dec 18, 2019 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Wed, Dec 18, 2019 at 2:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Hm.  Seems like the restrictions here ought to be just about the same
> >> as on index columns, no?
>
> > We also need to disallow self-referencing composite type in the case
> > of partitioning, because otherwise it leads to infinite recursion
> > shown in my first email.
>
> My point is basically that CheckAttributeType already covers that
> issue, as well as a lot of others.  So why isn't the partitioning
> code using it?

My reason to not use it was that the error message that are produced
are not quite helpful in this case; compare what my patch produces vs.
what one gets with CheckAttributeType("expr", ...):

    a int,
    b int
 ) PARTITION BY RANGE (((a, b)));
-ERROR:  partition key cannot be of anonymous or self-referencing composite type
-LINE 4: ) PARTITION BY RANGE (((a, b)));
-                              ^
+ERROR:  column "expr" has pseudo-type record

 CREATE TABLE partitioned (
    a int,
    b int
 ) PARTITION BY RANGE ((row(a, b)));
-ERROR:  partition key cannot be of anonymous or self-referencing composite type
-LINE 4: ) PARTITION BY RANGE ((row(a, b)));
-                              ^
+ERROR:  column "expr" has pseudo-type record

 CREATE TABLE partitioned (
    a int,
    b int
 ) PARTITION BY RANGE ((row(a, b)::partitioned));
-ERROR:  partition key cannot be of anonymous or self-referencing composite type
-LINE 4: ) PARTITION BY RANGE ((row(a, b)::partitioned));
-                              ^
+ERROR:  composite type partitioned cannot be made a member of itself

Thanks,
Amit



Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Wed, Dec 18, 2019 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My point is basically that CheckAttributeType already covers that
>> issue, as well as a lot of others.  So why isn't the partitioning
>> code using it?

> My reason to not use it was that the error message that are produced
> are not quite helpful in this case;

I can't get terribly excited about that; but in any case, if we think
the errors aren't nice enough, the answer is to improve them, not
re-implement the function badly.

After further thought, it seems to me that we are dealing with two
nearly independent issues:

1. We must not accept partition bounds values that are of underdetermined
types, else (a) we are likely to get failures like "record type has not
been registered" while loading them back from disk, and (b) polymorphic
btree support functions are likely to complain that they can't identify
the type they're supposed to work on.  This is exactly the same issue that
expression indexes face, so we should be applying the same checks, that
is CheckAttributeType().  I do not believe that checking for RECORD is
adequate to close this hole.  At the very least, RECORD[] is equally
dangerous, and in general I think any pseudotype would be risky.

2. If the partitioning expression contains a reference to the partitioned
table's rowtype, we get infinite recursion while trying to load the
relcache entry.  The patch proposes to prevent that by checking whether
the expression's final result type is that type, but that's not nearly
adequate because a reference anywhere inside the expression is just as
bad.  In general, considering possibly-inlined SQL functions, I'm doubtful
that any precheck is going to be able to prevent this scenario.

Now as far as point 1 goes, I think it's not really that awful to use
CheckAttributeType() with a dummy attribute name.  The attached
incomplete patch uses "partition key" which causes it to emit errors
like

regression=# create table fool (a int, b int) partition by list ((row(a, b)));
ERROR:  column "partition key" has pseudo-type record

I don't think that that's unacceptable.  But if we wanted to improve it,
we could imagine adding another flag, say CHKATYPE_IS_PARTITION_KEY,
that doesn't affect CheckAttributeType's semantics, just the wording of
the error messages it throws.

As far as point 2 goes, I think this is another outgrowth of the
fundamental design error that we load a partitioned rel's partitioning
info immediately when the relcache entry is created, rather than later
on-demand.  If we weren't doing that then it wouldn't be problematic
to inspect the rel's rowtype while constructing the partitioning info.
I've bitched about this before, if memory serves, but couldn't light
a fire under anyone about fixing it.  Now I think we have no choice.
It was never a great idea that minimal construction of a relcache
entry could result in running arbitrary user-defined code.

Note that the end result of this would be to allow, not prohibit,
cases like your example.  I wonder whether we couldn't also lift
the restriction against whole-row Vars in partition expressions.
Doesn't seem like there is much difference between such a Var and
a row(...)::table_rowtype expression.

            regards, tom lane

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index e8e004e..29efc9c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15102,6 +15102,16 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
             attcollation = exprCollation(expr);

             /*
+             * The expression must be of a storable type (e.g., not RECORD).
+             * The test is the same as for whether a table column is of a safe
+             * type (which is why we needn't check for the non-expression
+             * case).
+             */
+            CheckAttributeType("partition key",
+                               atttype, attcollation,
+                               NIL, 0);
+
+            /*
              * Strip any top-level COLLATE clause.  This ensures that we treat
              * "x COLLATE y" and "(x COLLATE y)" alike.
              */

Re: unsupportable composite type partition keys

From
Tom Lane
Date:
I wrote:
> As far as point 2 goes, I think this is another outgrowth of the
> fundamental design error that we load a partitioned rel's partitioning
> info immediately when the relcache entry is created, rather than later
> on-demand.  If we weren't doing that then it wouldn't be problematic
> to inspect the rel's rowtype while constructing the partitioning info.
> I've bitched about this before, if memory serves, but couldn't light
> a fire under anyone about fixing it.  Now I think we have no choice.
> It was never a great idea that minimal construction of a relcache
> entry could result in running arbitrary user-defined code.

Here's a draft patch for that.  There are a couple of secondary issues
I didn't do anything about yet:

* When rebuilding an open relcache entry for a partitioned table, this
coding now always quasi-leaks the old rd_pdcxt, where before that happened
only if the partdesc actually changed.  (Even if I'd kept the
equalPartitionDescs call, it would always fail.)  I complained about the
quasi-leak behavior before, but this probably pushes it up to the level of
"must fix".  What I'm inclined to do is to hack
RelationDecrementReferenceCount so that, when the refcount goes to zero,
we delete any child contexts of rd_pdcxt.  That's pretty annoying but in
the big scheme of things it's unlikely to matter.

* It'd be better to declare RelationGetPartitionKey and
RelationGetPartitionDesc in relcache.h and get their callers out of the
business of including rel.h, where possible.

* equalPartitionDescs is now dead code, should we remove it?

> Note that the end result of this would be to allow, not prohibit,
> cases like your example.  I wonder whether we couldn't also lift
> the restriction against whole-row Vars in partition expressions.
> Doesn't seem like there is much difference between such a Var and
> a row(...)::table_rowtype expression.

I didn't look into that either.  I wouldn't propose back-patching that,
but it'd be interesting to try to fix it in HEAD.

            regards, tom lane

diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 8f242ae..a143998 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -812,7 +812,7 @@ DefineIndex(Oid relationId,
      */
     if (partitioned && (stmt->unique || stmt->primary))
     {
-        PartitionKey key = rel->rd_partkey;
+        PartitionKey key = RelationGetPartitionKey(rel);
         int            i;

         /*
diff --git a/src/backend/partitioning/partbounds.c b/src/backend/partitioning/partbounds.c
index d160d6f..a010b3e 100644
--- a/src/backend/partitioning/partbounds.c
+++ b/src/backend/partitioning/partbounds.c
@@ -775,6 +775,9 @@ partition_bounds_equal(int partnatts, int16 *parttyplen, bool *parttypbyval,
 /*
  * Return a copy of given PartitionBoundInfo structure. The data types of bounds
  * are described by given partition key specification.
+ *
+ * Note: it's critical that this function and its callees not do any
+ * catalog access, because we use it to copy data into the relcache.
  */
 PartitionBoundInfo
 partition_bounds_copy(PartitionBoundInfo src,
diff --git a/src/backend/partitioning/partdesc.c b/src/backend/partitioning/partdesc.c
index 6ede084..d0e253d 100644
--- a/src/backend/partitioning/partdesc.c
+++ b/src/backend/partitioning/partdesc.c
@@ -65,10 +65,12 @@ RelationBuildPartitionDesc(Relation rel)
     List       *inhoids;
     PartitionBoundSpec **boundspecs = NULL;
     Oid           *oids = NULL;
+    bool       *is_leaf = NULL;
     ListCell   *cell;
     int            i,
                 nparts;
     PartitionKey key = RelationGetPartitionKey(rel);
+    MemoryContext new_pdcxt;
     MemoryContext oldcxt;
     int           *mapping;

@@ -81,10 +83,11 @@ RelationBuildPartitionDesc(Relation rel)
     inhoids = find_inheritance_children(RelationGetRelid(rel), NoLock);
     nparts = list_length(inhoids);

-    /* Allocate arrays for OIDs and boundspecs. */
+    /* Allocate arrays for OIDs, leaf flags, and boundspecs. */
     if (nparts > 0)
     {
-        oids = palloc(nparts * sizeof(Oid));
+        oids = (Oid *) palloc(nparts * sizeof(Oid));
+        is_leaf = (bool *) palloc(nparts * sizeof(bool));
         boundspecs = palloc(nparts * sizeof(PartitionBoundSpec *));
     }

@@ -172,63 +175,75 @@ RelationBuildPartitionDesc(Relation rel)

         /* Save results. */
         oids[i] = inhrelid;
+        is_leaf[i] = (get_rel_relkind(inhrelid) != RELKIND_PARTITIONED_TABLE);
         boundspecs[i] = boundspec;
         ++i;
     }

-    /* Assert we aren't about to leak any old data structure */
-    Assert(rel->rd_pdcxt == NULL);
-    Assert(rel->rd_partdesc == NULL);
+    /*
+     * Create PartitionBoundInfo, working in the caller's context.  This could
+     * fail, but we haven't done any damage if so.
+     */
+    if (nparts > 0)
+        boundinfo = partition_bounds_create(boundspecs, nparts, key, &mapping);

     /*
      * Now build the actual relcache partition descriptor.  Note that the
      * order of operations here is fairly critical.  If we fail partway
-     * through this code, we won't have leaked memory because the rd_pdcxt is
-     * attached to the relcache entry immediately, so it'll be freed whenever
-     * the entry is rebuilt or destroyed.  However, we don't assign to
-     * rd_partdesc until the cached data structure is fully complete and
-     * valid, so that no other code might try to use it.
+     * through the data-copying code, we won't have leaked memory because the
+     * rd_pdcxt is attached to the relcache entry immediately, so it'll be
+     * freed whenever the entry is rebuilt or destroyed.  However, we don't
+     * assign to rd_partdesc until the cached data structure is fully complete
+     * and valid, so that no other code might try to use it.
      */
-    rel->rd_pdcxt = AllocSetContextCreate(CacheMemoryContext,
-                                          "partition descriptor",
-                                          ALLOCSET_SMALL_SIZES);
-    MemoryContextCopyAndSetIdentifier(rel->rd_pdcxt,
+    new_pdcxt = AllocSetContextCreate(CacheMemoryContext,
+                                      "partition descriptor",
+                                      ALLOCSET_SMALL_SIZES);
+    MemoryContextCopyAndSetIdentifier(new_pdcxt,
                                       RelationGetRelationName(rel));

+    /*
+     * This is a hack and a half: if there's an old rd_pdcxt, it might contain
+     * an old partition descriptor that is still being referenced somewhere.
+     * Preserve it, while not leaking it, by reattaching it as a child context
+     * of the new rd_pdcxt.  Eventually somebody will clean it up.  See
+     * related hack in RelationClearRelation.
+     */
+    rel->rd_partdesc = NULL;    /* let's just make sure of this */
+    if (rel->rd_pdcxt != NULL)
+        MemoryContextSetParent(rel->rd_pdcxt, new_pdcxt);
+    rel->rd_pdcxt = new_pdcxt;
+
+    /*
+     * Now copy everything into new_pdcxt.  Note it's critical that none of
+     * this code do any catalog access, else we could end up recursing here.
+     */
     partdesc = (PartitionDescData *)
-        MemoryContextAllocZero(rel->rd_pdcxt, sizeof(PartitionDescData));
+        MemoryContextAllocZero(new_pdcxt, sizeof(PartitionDescData));
     partdesc->nparts = nparts;
     /* If there are no partitions, the rest of the partdesc can stay zero */
     if (nparts > 0)
     {
-        /* Create PartitionBoundInfo, using the caller's context. */
-        boundinfo = partition_bounds_create(boundspecs, nparts, key, &mapping);
-
-        /* Now copy all info into relcache's partdesc. */
-        oldcxt = MemoryContextSwitchTo(rel->rd_pdcxt);
+        oldcxt = MemoryContextSwitchTo(new_pdcxt);
         partdesc->boundinfo = partition_bounds_copy(boundinfo, key);
         partdesc->oids = (Oid *) palloc(nparts * sizeof(Oid));
         partdesc->is_leaf = (bool *) palloc(nparts * sizeof(bool));
-        MemoryContextSwitchTo(oldcxt);

         /*
          * Assign OIDs from the original array into mapped indexes of the
          * result array.  The order of OIDs in the former is defined by the
          * catalog scan that retrieved them, whereas that in the latter is
          * defined by canonicalized representation of the partition bounds.
-         *
-         * Also record leaf-ness of each partition.  For this we use
-         * get_rel_relkind() which may leak memory, so be sure to run it in
-         * the caller's context.
+         * Also save leaf-ness of each partition.
          */
         for (i = 0; i < nparts; i++)
         {
             int            index = mapping[i];

             partdesc->oids[index] = oids[i];
-            partdesc->is_leaf[index] =
-                (get_rel_relkind(oids[i]) != RELKIND_PARTITIONED_TABLE);
+            partdesc->is_leaf[index] = is_leaf[i];
         }
+        MemoryContextSwitchTo(oldcxt);
     }

     rel->rd_partdesc = partdesc;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 50f8912..0a693fef 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1165,20 +1165,11 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
     relation->rd_fkeylist = NIL;
     relation->rd_fkeyvalid = false;

-    /* if a partitioned table, initialize key and partition descriptor info */
-    if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-    {
-        RelationBuildPartitionKey(relation);
-        RelationBuildPartitionDesc(relation);
-    }
-    else
-    {
-        relation->rd_partkey = NULL;
-        relation->rd_partkeycxt = NULL;
-        relation->rd_partdesc = NULL;
-        relation->rd_pdcxt = NULL;
-    }
-    /* ... but partcheck is not loaded till asked for */
+    /* partitioning data is not loaded till asked for */
+    relation->rd_partkey = NULL;
+    relation->rd_partkeycxt = NULL;
+    relation->rd_partdesc = NULL;
+    relation->rd_pdcxt = NULL;
     relation->rd_partcheck = NIL;
     relation->rd_partcheckvalid = false;
     relation->rd_partcheckcxt = NULL;
@@ -2527,7 +2518,6 @@ RelationClearRelation(Relation relation, bool rebuild)
         bool        keep_rules;
         bool        keep_policies;
         bool        keep_partkey;
-        bool        keep_partdesc;

         /* Build temporary entry, but don't link it into hashtable */
         newrel = RelationBuildDesc(save_relid, false);
@@ -2560,9 +2550,6 @@ RelationClearRelation(Relation relation, bool rebuild)
         keep_policies = equalRSDesc(relation->rd_rsdesc, newrel->rd_rsdesc);
         /* partkey is immutable once set up, so we can always keep it */
         keep_partkey = (relation->rd_partkey != NULL);
-        keep_partdesc = equalPartitionDescs(relation->rd_partkey,
-                                            relation->rd_partdesc,
-                                            newrel->rd_partdesc);

         /*
          * Perform swapping of the relcache entry contents.  Within this
@@ -2617,34 +2604,46 @@ RelationClearRelation(Relation relation, bool rebuild)
         SWAPFIELD(Oid, rd_toastoid);
         /* pgstat_info must be preserved */
         SWAPFIELD(struct PgStat_TableStatus *, pgstat_info);
-        /* preserve old partitioning info if no logical change */
+        /* preserve old partition key if we have one */
         if (keep_partkey)
         {
             SWAPFIELD(PartitionKey, rd_partkey);
             SWAPFIELD(MemoryContext, rd_partkeycxt);
         }
-        if (keep_partdesc)
-        {
-            SWAPFIELD(PartitionDesc, rd_partdesc);
-            SWAPFIELD(MemoryContext, rd_pdcxt);
-        }
-        else if (rebuild && newrel->rd_pdcxt != NULL)
+        if (rebuild && newrel->rd_pdcxt != NULL)
         {
             /*
              * We are rebuilding a partitioned relation with a non-zero
              * reference count, so keep the old partition descriptor around,
              * in case there's a PartitionDirectory with a pointer to it.
-             * Attach it to the new rd_pdcxt so that it gets cleaned up
-             * eventually.  In the case where the reference count is 0, this
-             * code is not reached, which should be OK because in that case
-             * there should be no PartitionDirectory with a pointer to the old
-             * entry.
+             * (This is necessary because RelationGetPartitionDesc hands out
+             * direct pointers to the relcache's data structure, unlike our
+             * usual practice which is to hand out copies.  We'd have the same
+             * problem with rd_partkey, except that we always preserve that
+             * once created.)
+             *
+             * To ensure that it's not leaked completely, re-attach it to the
+             * new reldesc, or make it a child of the new reldesc's rd_pdcxt
+             * in the unlikely event that there is one already.  (See hack in
+             * RelationBuildPartitionDesc.)  Currently, it'll stick around
+             * until such time as we destroy the reldesc, or rebuild it while
+             * it has zero refcount.  That's pretty undesirable and should get
+             * fixed.
+             *
+             * In the case where the reference count is 0, this code is not
+             * reached, which should be OK because in that case there should
+             * be no PartitionDirectory with a pointer to the old entry.
              *
              * Note that newrel and relation have already been swapped, so the
              * "old" partition descriptor is actually the one hanging off of
              * newrel.
              */
-            MemoryContextSetParent(newrel->rd_pdcxt, relation->rd_pdcxt);
+            if (relation->rd_pdcxt != NULL) /* probably never happens */
+                MemoryContextSetParent(newrel->rd_pdcxt, relation->rd_pdcxt);
+            else
+                relation->rd_pdcxt = newrel->rd_pdcxt;
+            relation->rd_partdesc = NULL;    /* just to be sure */
+            /* drop newrel's pointers so we don't destroy it below */
             newrel->rd_partdesc = NULL;
             newrel->rd_pdcxt = NULL;
         }
@@ -3907,27 +3906,7 @@ RelationCacheInitializePhase3(void)
             restart = true;
         }

-        /*
-         * Reload the partition key and descriptor for a partitioned table.
-         */
-        if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
-            relation->rd_partkey == NULL)
-        {
-            RelationBuildPartitionKey(relation);
-            Assert(relation->rd_partkey != NULL);
-
-            restart = true;
-        }
-
-        if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
-            relation->rd_partdesc == NULL)
-        {
-            RelationBuildPartitionDesc(relation);
-            Assert(relation->rd_partdesc != NULL);
-
-            restart = true;
-        }
-
+        /* Reload tableam data if needed */
         if (relation->rd_tableam == NULL &&
             (relation->rd_rel->relkind == RELKIND_RELATION ||
              relation->rd_rel->relkind == RELKIND_SEQUENCE ||
@@ -5133,6 +5112,36 @@ RelationGetExclusionInfo(Relation indexRelation,
 }

 /*
+ * RelationGetPartitionKey -- get partition key, if relation is partitioned
+ */
+struct PartitionKeyData *
+RelationGetPartitionKey(Relation rel)
+{
+    if (rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+        return NULL;
+
+    if (rel->rd_partkey == NULL)
+        RelationBuildPartitionKey(rel);
+
+    return rel->rd_partkey;
+}
+
+/*
+ * RelationGetPartitionDesc -- get partition descriptor, if relation is partitioned
+ */
+struct PartitionDescData *
+RelationGetPartitionDesc(Relation rel)
+{
+    if (rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+        return NULL;
+
+    if (rel->rd_partdesc == NULL)
+        RelationBuildPartitionDesc(rel);
+
+    return rel->rd_partdesc;
+}
+
+/*
  * Get publication actions for the given relation.
  */
 struct PublicationActions *
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 31d8a1a..aa8c625 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -596,20 +596,10 @@ typedef struct ViewOptions
      RelationNeedsWAL(relation) && \
      !IsCatalogRelation(relation))

-/*
- * RelationGetPartitionKey
- *        Returns the PartitionKey of a relation
- */
-#define RelationGetPartitionKey(relation) ((relation)->rd_partkey)
-
-/*
- * RelationGetPartitionDesc
- *        Returns partition descriptor for a relation.
- */
-#define RelationGetPartitionDesc(relation) ((relation)->rd_partdesc)
-
 /* routines in utils/cache/relcache.c */
 extern void RelationIncrementReferenceCount(Relation rel);
 extern void RelationDecrementReferenceCount(Relation rel);
+extern struct PartitionKeyData *RelationGetPartitionKey(Relation rel);
+extern struct PartitionDescData *RelationGetPartitionDesc(Relation rel);

 #endif                            /* REL_H */
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index f630168..9d6d1ba 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -501,6 +501,22 @@ Partition of: partitioned2 FOR VALUES FROM ('-1', 'aaaaa') TO (100, 'ccccc')
 Partition constraint: (((a + 1) IS NOT NULL) AND (substr(b, 1, 5) IS NOT NULL) AND (((a + 1) > '-1'::integer) OR (((a
+1) = '-1'::integer) AND (substr(b, 1, 5) >= 'aaaaa'::text))) AND (((a + 1) < 100) OR (((a + 1) = 100) AND (substr(b,
1,5) < 'ccccc'::text)))) 

 DROP TABLE partitioned, partitioned2;
+-- check reference to partitioned table's rowtype in partition descriptor
+create table partitioned (a int, b int)
+  partition by list ((row(a, b)::partitioned));
+create table partitioned1
+  partition of partitioned for values in ('(1,2)'::partitioned);
+create table partitioned2
+  partition of partitioned for values in ('(2,4)'::partitioned);
+explain (costs off)
+select * from partitioned where row(a,b)::partitioned = '(1,2)'::partitioned;
+                        QUERY PLAN
+-----------------------------------------------------------
+ Seq Scan on partitioned1 partitioned
+   Filter: (ROW(a, b)::partitioned = '(1,2)'::partitioned)
+(2 rows)
+
+drop table partitioned;
 -- check that dependencies of partition columns are handled correctly
 create domain intdom1 as int;
 create table partitioned (
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index e835b65..7ef646a 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -449,6 +449,17 @@ CREATE TABLE part2_1 PARTITION OF partitioned2 FOR VALUES FROM (-1, 'aaaaa') TO

 DROP TABLE partitioned, partitioned2;

+-- check reference to partitioned table's rowtype in partition descriptor
+create table partitioned (a int, b int)
+  partition by list ((row(a, b)::partitioned));
+create table partitioned1
+  partition of partitioned for values in ('(1,2)'::partitioned);
+create table partitioned2
+  partition of partitioned for values in ('(2,4)'::partitioned);
+explain (costs off)
+select * from partitioned where row(a,b)::partitioned = '(1,2)'::partitioned;
+drop table partitioned;
+
 -- check that dependencies of partition columns are handled correctly
 create domain intdom1 as int;


Re: unsupportable composite type partition keys

From
Tom Lane
Date:
I wrote:
> Now as far as point 1 goes, I think it's not really that awful to use
> CheckAttributeType() with a dummy attribute name.  The attached
> incomplete patch uses "partition key" which causes it to emit errors
> like
> regression=# create table fool (a int, b int) partition by list ((row(a, b)));
> ERROR:  column "partition key" has pseudo-type record
> I don't think that that's unacceptable.  But if we wanted to improve it,
> we could imagine adding another flag, say CHKATYPE_IS_PARTITION_KEY,
> that doesn't affect CheckAttributeType's semantics, just the wording of
> the error messages it throws.

Here's a fleshed-out patch that does it like that.

While poking at this, I also started to wonder why CheckAttributeType
wasn't recursing into ranges, since those are our other kind of
container type.  And the answer is that it must, because we allow
creation of ranges over composite types:

regression=# create table foo (f1 int, f2 int);
CREATE TABLE
regression=# create type foorange as range (subtype = foo);
CREATE TYPE
regression=# alter table foo add column r foorange;
ALTER TABLE

Simple things still work on table foo, but surely this is exactly
what CheckAttributeType is supposed to be preventing.  With the
second attached patch you get

regression=# alter table foo add column r foorange;
ERROR:  composite type foo cannot be made a member of itself

The second patch needs to go back all the way, the first one
only as far as we have partitions.

            regards, tom lane

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 8404904..82398da 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -572,6 +572,10 @@ CheckAttributeNamesTypes(TupleDesc tupdesc, char relkind,
  * are reliably identifiable only within a session, since the identity info
  * may use a typmod that is only locally assigned.  The caller is expected
  * to know whether these cases are safe.)
+ *
+ * flags can also control the phrasing of the error messages.  If
+ * CHKATYPE_IS_PARTKEY is specified, "attname" should be a partition key
+ * column number as text, not a real column name.
  * --------------------------------
  */
 void
@@ -598,10 +602,19 @@ CheckAttributeType(const char *attname,
         if (!((atttypid == ANYARRAYOID && (flags & CHKATYPE_ANYARRAY)) ||
               (atttypid == RECORDOID && (flags & CHKATYPE_ANYRECORD)) ||
               (atttypid == RECORDARRAYOID && (flags & CHKATYPE_ANYRECORD))))
-            ereport(ERROR,
-                    (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
-                     errmsg("column \"%s\" has pseudo-type %s",
-                            attname, format_type_be(atttypid))));
+        {
+            if (flags & CHKATYPE_IS_PARTKEY)
+                ereport(ERROR,
+                        (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+                /* translator: first %s is an integer not a name */
+                         errmsg("partition key column %s has pseudo-type %s",
+                                attname, format_type_be(atttypid))));
+            else
+                ereport(ERROR,
+                        (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+                         errmsg("column \"%s\" has pseudo-type %s",
+                                attname, format_type_be(atttypid))));
+        }
     }
     else if (att_typtype == TYPTYPE_DOMAIN)
     {
@@ -648,7 +661,7 @@ CheckAttributeType(const char *attname,
             CheckAttributeType(NameStr(attr->attname),
                                attr->atttypid, attr->attcollation,
                                containing_rowtypes,
-                               flags);
+                               flags & ~CHKATYPE_IS_PARTKEY);
         }

         relation_close(relation, AccessShareLock);
@@ -670,11 +683,21 @@ CheckAttributeType(const char *attname,
      * useless, and it cannot be dumped, so we must disallow it.
      */
     if (!OidIsValid(attcollation) && type_is_collatable(atttypid))
-        ereport(ERROR,
-                (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
-                 errmsg("no collation was derived for column \"%s\" with collatable type %s",
-                        attname, format_type_be(atttypid)),
-                 errhint("Use the COLLATE clause to set the collation explicitly.")));
+    {
+        if (flags & CHKATYPE_IS_PARTKEY)
+            ereport(ERROR,
+                    (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+            /* translator: first %s is an integer not a name */
+                     errmsg("no collation was derived for partition key column %s with collatable type %s",
+                            attname, format_type_be(atttypid)),
+                     errhint("Use the COLLATE clause to set the collation explicitly.")));
+        else
+            ereport(ERROR,
+                    (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+                     errmsg("no collation was derived for column \"%s\" with collatable type %s",
+                            attname, format_type_be(atttypid)),
+                     errhint("Use the COLLATE clause to set the collation explicitly.")));
+    }
 }

 /*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index e8e004e..845f010 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15096,12 +15096,24 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
         {
             /* Expression */
             Node       *expr = pelem->expr;
+            char        partattname[16];

             Assert(expr != NULL);
             atttype = exprType(expr);
             attcollation = exprCollation(expr);

             /*
+             * The expression must be of a storable type (e.g., not RECORD).
+             * The test is the same as for whether a table column is of a safe
+             * type (which is why we needn't check for the non-expression
+             * case).
+             */
+            snprintf(partattname, sizeof(partattname), "%d", attn + 1);
+            CheckAttributeType(partattname,
+                               atttype, attcollation,
+                               NIL, CHKATYPE_IS_PARTKEY);
+
+            /*
              * Strip any top-level COLLATE clause.  This ensures that we treat
              * "x COLLATE y" and "(x COLLATE y)" alike.
              */
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index eec71c2..2e63137 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -22,6 +22,7 @@
 /* flag bits for CheckAttributeType/CheckAttributeNamesTypes */
 #define CHKATYPE_ANYARRAY        0x01    /* allow ANYARRAY */
 #define CHKATYPE_ANYRECORD        0x02    /* allow RECORD and RECORD[] */
+#define CHKATYPE_IS_PARTKEY        0x04    /* attname is part key # not column */

 typedef struct RawColumnDefault
 {
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index f630168..50de4b3 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -375,7 +375,7 @@ CREATE TABLE partitioned (
 ERROR:  cannot use subquery in partition key expression
 CREATE TABLE partitioned (
     a int
-) PARTITION BY RANGE (('a'));
+) PARTITION BY RANGE ((42));
 ERROR:  cannot use constant expression as partition key
 CREATE FUNCTION const_func () RETURNS int AS $$ SELECT 1; $$ LANGUAGE SQL IMMUTABLE;
 CREATE TABLE partitioned (
@@ -402,6 +402,17 @@ CREATE TABLE partitioned (
 ERROR:  cannot use system column "xmin" in partition key
 LINE 3: ) PARTITION BY RANGE (xmin);
                               ^
+-- cannot use pseudotypes
+CREATE TABLE partitioned (
+    a int,
+    b int
+) PARTITION BY RANGE (((a, b)));
+ERROR:  partition key column 1 has pseudo-type record
+CREATE TABLE partitioned (
+    a int,
+    b int
+) PARTITION BY RANGE (a, ('unknown'));
+ERROR:  partition key column 2 has pseudo-type unknown
 -- functions in key must be immutable
 CREATE FUNCTION immut_func (a int) RETURNS int AS $$ SELECT a + random()::int; $$ LANGUAGE SQL;
 CREATE TABLE partitioned (
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index e835b65..9a40d7b 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -361,7 +361,7 @@ CREATE TABLE partitioned (

 CREATE TABLE partitioned (
     a int
-) PARTITION BY RANGE (('a'));
+) PARTITION BY RANGE ((42));

 CREATE FUNCTION const_func () RETURNS int AS $$ SELECT 1; $$ LANGUAGE SQL IMMUTABLE;
 CREATE TABLE partitioned (
@@ -384,6 +384,16 @@ CREATE TABLE partitioned (
     a int
 ) PARTITION BY RANGE (xmin);

+-- cannot use pseudotypes
+CREATE TABLE partitioned (
+    a int,
+    b int
+) PARTITION BY RANGE (((a, b)));
+CREATE TABLE partitioned (
+    a int,
+    b int
+) PARTITION BY RANGE (a, ('unknown'));
+
 -- functions in key must be immutable
 CREATE FUNCTION immut_func (a int) RETURNS int AS $$ SELECT a + random()::int; $$ LANGUAGE SQL;
 CREATE TABLE partitioned (
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 82398da..452a7f3 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -668,6 +668,15 @@ CheckAttributeType(const char *attname,
 
         containing_rowtypes = list_delete_last(containing_rowtypes);
     }
+    else if (att_typtype == TYPTYPE_RANGE)
+    {
+        /*
+         * If it's a range, recurse to check its subtype.
+         */
+        CheckAttributeType(attname, get_range_subtype(atttypid), attcollation,
+                           containing_rowtypes,
+                           flags);
+    }
     else if (OidIsValid((att_typelem = get_element_type(atttypid))))
     {
         /*

Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Sun, Dec 22, 2019 at 6:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > As far as point 2 goes, I think this is another outgrowth of the
> > fundamental design error that we load a partitioned rel's partitioning
> > info immediately when the relcache entry is created, rather than later
> > on-demand.  If we weren't doing that then it wouldn't be problematic
> > to inspect the rel's rowtype while constructing the partitioning info.
> > I've bitched about this before, if memory serves, but couldn't light
> > a fire under anyone about fixing it.  Now I think we have no choice.
> > It was never a great idea that minimal construction of a relcache
> > entry could result in running arbitrary user-defined code.
>
> Here's a draft patch for that.

Thanks for writing the patch.  This also came up recently on another thread [1].

>  There are a couple of secondary issues
> I didn't do anything about yet:
>
> * When rebuilding an open relcache entry for a partitioned table, this
> coding now always quasi-leaks the old rd_pdcxt, where before that happened
> only if the partdesc actually changed.  (Even if I'd kept the
> equalPartitionDescs call, it would always fail.)  I complained about the
> quasi-leak behavior before, but this probably pushes it up to the level of
> "must fix".  What I'm inclined to do is to hack
> RelationDecrementReferenceCount so that, when the refcount goes to zero,
> we delete any child contexts of rd_pdcxt.  That's pretty annoying but in
> the big scheme of things it's unlikely to matter.

Hacking RelationDecrementReferenceCount() like that sounds OK.

-        else if (rebuild && newrel->rd_pdcxt != NULL)
+        if (rebuild && newrel->rd_pdcxt != NULL)

Checking rebuild seems unnecessary in this block, although that's true
even without the patch.

+             * To ensure that it's not leaked completely, re-attach it to the
+             * new reldesc, or make it a child of the new reldesc's rd_pdcxt
+             * in the unlikely event that there is one already.  (See hack in
+             * RelationBuildPartitionDesc.)
...
+            if (relation->rd_pdcxt != NULL) /* probably never happens */
+                MemoryContextSetParent(newrel->rd_pdcxt, relation->rd_pdcxt);
+            else
+                relation->rd_pdcxt = newrel->rd_pdcxt;

While I can imagine that RelationBuildPartitionDesc() might encounter
an old partition descriptor making the re-parenting hack necessary
there, I don't see why it's needed here, because a freshly built
relation descriptor would not contain the partition descriptor after
this patch.

> * It'd be better to declare RelationGetPartitionKey and
> RelationGetPartitionDesc in relcache.h and get their callers out of the
> business of including rel.h, where possible.

Although I agree to declare them in relcache.h, that doesn't reduce
the need to include rel.h in their callers much, because anyplace that
uses RelationGetPartitionDesc() is also very likely to use
RelationGetRelid() which is in rel.h.

> * equalPartitionDescs is now dead code, should we remove it?

Don't see any problem with doing so.

> > Note that the end result of this would be to allow, not prohibit,
> > cases like your example.  I wonder whether we couldn't also lift
> > the restriction against whole-row Vars in partition expressions.
> > Doesn't seem like there is much difference between such a Var and
> > a row(...)::table_rowtype expression.
>
> I didn't look into that either.  I wouldn't propose back-patching that,
> but it'd be interesting to try to fix it in HEAD.

Agreed.

Thanks,
Amit

[1] https://www.postgresql.org/message-id/CA%2BHiwqFucUh7hYkfZ6x1MVcs_R24eUfNVuRwdE_FwuwK8XpSZg%40mail.gmail.com



Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Sun, Dec 22, 2019 at 6:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> +             * To ensure that it's not leaked completely, re-attach it to the
> +             * new reldesc, or make it a child of the new reldesc's rd_pdcxt
> +             * in the unlikely event that there is one already.  (See hack in
> +             * RelationBuildPartitionDesc.)

> While I can imagine that RelationBuildPartitionDesc() might encounter
> an old partition descriptor making the re-parenting hack necessary
> there, I don't see why it's needed here, because a freshly built
> relation descriptor would not contain the partition descriptor after
> this patch.

Well, as the comment says, that's probably unreachable today.  But
I could see it happening in the future, particularly if we ever allow
partitioned system catalogs.  There are a lot of paths through this
code that are not obvious to the naked eye, and some of them can cause
relcache entries to get populated behind-your-back.  Most of relcache.c
is careful about this; I do not see an excuse for the partition-data
code to be less so, even if we think it can't happen today.

(I notice that RelationBuildPartitionKey is making a similar assumption
that the partkey couldn't magically appear while it's working, and I
don't like it much there either.)

>> * It'd be better to declare RelationGetPartitionKey and
>> RelationGetPartitionDesc in relcache.h and get their callers out of the
>> business of including rel.h, where possible.

> Although I agree to declare them in relcache.h, that doesn't reduce
> the need to include rel.h in their callers much, because anyplace that
> uses RelationGetPartitionDesc() is also very likely to use
> RelationGetRelid() which is in rel.h.

Hm.  Well, we can try anyway.

> [1] https://www.postgresql.org/message-id/CA%2BHiwqFucUh7hYkfZ6x1MVcs_R24eUfNVuRwdE_FwuwK8XpSZg%40mail.gmail.com

Oh, interesting --- I hadn't been paying much attention to that thread.
I'll compare your PoC there to what I did.

            regards, tom lane



Re: unsupportable composite type partition keys

From
Tom Lane
Date:
BTW, I forgot to mention: while I think the patch to forbid pseudotypes
by using CheckAttributeType() can be back-patched, I'm leaning towards
not back-patching the other patch.  The situation where we get into
infinite recursion seems not very likely in practice, and it's not
going to cause any crash or data loss, so I think we can just say
"sorry that's not supported before v13".  The patch as I'm proposing
it seems rather invasive for a back-branch fix.  Also, changing
RelationGetPartitionKey/Desc from macros to functions is at least a
weak ABI break.  If there are extensions calling either, they might
still work without a recompile --- but if they have code paths that
are the first thing to touch either field since a relcache flush,
they'd crash.

            regards, tom lane



Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Mon, Dec 23, 2019 at 23:49 Tom Lane <tgl@sss.pgh.pa.us> wrote:
Amit Langote <amitlangote09@gmail.com> writes:
> [1] https://www.postgresql.org/message-id/CA%2BHiwqFucUh7hYkfZ6x1MVcs_R24eUfNVuRwdE_FwuwK8XpSZg%40mail.gmail.com

Oh, interesting --- I hadn't been paying much attention to that thread.
I'll compare your PoC there to what I did.

Actually, I should’ve said that your patch is much better attempt at getting this in order, so there’s not much to see in my patch really. :)

Thanks,
Amit

Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Mon, Dec 23, 2019 at 23:49 Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Oh, interesting --- I hadn't been paying much attention to that thread.
>> I'll compare your PoC there to what I did.

> Actually, I should’ve said that your patch is much better attempt at
> getting this in order, so there’s not much to see in my patch really. :)

One thing I see is that you chose to relocate RelationGetPartitionDesc's
declaration to partdesc.h, whereupon RelationBuildPartitionDesc doesn't
have to be exported at all anymore.  Perhaps that's a better factorization
than what I did.  It supposes that any caller of RelationGetPartitionDesc
is going to need partdesc.h, but that seems reasonable.  We could likewise
move RelationGetPartitionKey to partcache.h.

            regards, tom lane



Re: unsupportable composite type partition keys

From
Tom Lane
Date:
I wrote:
> One thing I see is that you chose to relocate RelationGetPartitionDesc's
> declaration to partdesc.h, whereupon RelationBuildPartitionDesc doesn't
> have to be exported at all anymore.  Perhaps that's a better factorization
> than what I did.  It supposes that any caller of RelationGetPartitionDesc
> is going to need partdesc.h, but that seems reasonable.  We could likewise
> move RelationGetPartitionKey to partcache.h.

I concluded that that is indeed a better solution; it does allow removing
some rel.h inclusions (though possibly those were just duplicative?), and
it also means that relcache.c itself doesn't need any partitioning
inclusions at all.

Here's a cleaned-up patch that does it like that and also fixes the
memory leakage issue.

I noticed along the way that with partkeys only being loaded on demand,
we no longer need the incredibly-unsafe hack in RelationBuildPartitionKey
whereby it just silently ignores failure to read the pg_partitioned_table
entry.  I also rearranged RelationBuildPartitionDesc so that it uses the
same context-reparenting trick as RelationBuildPartitionKey.  That doesn't
save us anything, but it makes the code considerably more robust, I think;
we don't need to assume as much about what partition_bounds_copy does.

One other thing worth noting is that I used unlikely() to try to
discourage the compiler from inlining RelationBuildPartitionDesc
into RelationGetPartitionDesc (and likewise for the Key functions).
Not sure how effective that is, but it can't hurt.

I haven't removed equalPartitionDescs here; that seems like material
for a separate patch (to make it easier to put it back if we need it).

            regards, tom lane

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 452a7f3..8b68fb7 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -83,7 +83,6 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/partcache.h"
-#include "utils/rel.h"
 #include "utils/ruleutils.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 8f242ae..a143998 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -812,7 +812,7 @@ DefineIndex(Oid relationId,
      */
     if (partitioned && (stmt->unique || stmt->primary))
     {
-        PartitionKey key = rel->rd_partkey;
+        PartitionKey key = RelationGetPartitionKey(rel);
         int            i;

         /*
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d8cbd6a..dc3d792 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -30,7 +30,6 @@
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
 #include "utils/partcache.h"
-#include "utils/rel.h"
 #include "utils/rls.h"
 #include "utils/ruleutils.h"

diff --git a/src/backend/partitioning/partbounds.c b/src/backend/partitioning/partbounds.c
index d160d6f..4c6ca89 100644
--- a/src/backend/partitioning/partbounds.c
+++ b/src/backend/partitioning/partbounds.c
@@ -35,7 +35,6 @@
 #include "utils/hashutils.h"
 #include "utils/lsyscache.h"
 #include "utils/partcache.h"
-#include "utils/rel.h"
 #include "utils/ruleutils.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
@@ -775,6 +774,11 @@ partition_bounds_equal(int partnatts, int16 *parttyplen, bool *parttypbyval,
 /*
  * Return a copy of given PartitionBoundInfo structure. The data types of bounds
  * are described by given partition key specification.
+ *
+ * Note: it's important that this function and its callees not do any catalog
+ * access, nor anything else that would result in allocating memory other than
+ * the returned data structure.  Since this is called in a long-lived context,
+ * that would result in unwanted memory leaks.
  */
 PartitionBoundInfo
 partition_bounds_copy(PartitionBoundInfo src,
diff --git a/src/backend/partitioning/partdesc.c b/src/backend/partitioning/partdesc.c
index 6ede084..e26e45e 100644
--- a/src/backend/partitioning/partdesc.c
+++ b/src/backend/partitioning/partdesc.c
@@ -47,17 +47,48 @@ typedef struct PartitionDirectoryEntry
     PartitionDesc pd;
 } PartitionDirectoryEntry;

+static void RelationBuildPartitionDesc(Relation rel);
+
+
+/*
+ * RelationGetPartitionDesc -- get partition descriptor, if relation is partitioned
+ *
+ * Note: we arrange for partition descriptors to not get freed until the
+ * relcache entry's refcount goes to zero (see hacks in RelationClose,
+ * RelationClearRelation, and RelationBuildPartitionDesc).  Therefore, even
+ * though we hand back a direct pointer into the relcache entry, it's safe
+ * for callers to continue to use that pointer as long as (a) they hold the
+ * relation open, and (b) they hold a relation lock strong enough to ensure
+ * that the data doesn't become stale.
+ */
+PartitionDesc
+RelationGetPartitionDesc(Relation rel)
+{
+    if (rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+        return NULL;
+
+    if (unlikely(rel->rd_partdesc == NULL))
+        RelationBuildPartitionDesc(rel);
+
+    return rel->rd_partdesc;
+}
+
 /*
  * RelationBuildPartitionDesc
  *        Form rel's partition descriptor, and store in relcache entry
  *
- * Note: the descriptor won't be flushed from the cache by
- * RelationClearRelation() unless it's changed because of
- * addition or removal of a partition.  Hence, code holding a lock
- * that's sufficient to prevent that can assume that rd_partdesc
- * won't change underneath it.
+ * Partition descriptor is a complex structure; to avoid complicated logic to
+ * free individual elements whenever the relcache entry is flushed, we give it
+ * its own memory context, a child of CacheMemoryContext, which can easily be
+ * deleted on its own.  To avoid leaking memory in that context in case of an
+ * error partway through this function, the context is initially created as a
+ * child of CurTransactionContext and only re-parented to CacheMemoryContext
+ * at the end, when no further errors are possible.  Also, we don't make this
+ * context the current context except in very brief code sections, out of fear
+ * that some of our callees allocate memory on their own which would be leaked
+ * permanently.
  */
-void
+static void
 RelationBuildPartitionDesc(Relation rel)
 {
     PartitionDesc partdesc;
@@ -65,10 +96,12 @@ RelationBuildPartitionDesc(Relation rel)
     List       *inhoids;
     PartitionBoundSpec **boundspecs = NULL;
     Oid           *oids = NULL;
+    bool       *is_leaf = NULL;
     ListCell   *cell;
     int            i,
                 nparts;
     PartitionKey key = RelationGetPartitionKey(rel);
+    MemoryContext new_pdcxt;
     MemoryContext oldcxt;
     int           *mapping;

@@ -81,10 +114,11 @@ RelationBuildPartitionDesc(Relation rel)
     inhoids = find_inheritance_children(RelationGetRelid(rel), NoLock);
     nparts = list_length(inhoids);

-    /* Allocate arrays for OIDs and boundspecs. */
+    /* Allocate working arrays for OIDs, leaf flags, and boundspecs. */
     if (nparts > 0)
     {
-        oids = palloc(nparts * sizeof(Oid));
+        oids = (Oid *) palloc(nparts * sizeof(Oid));
+        is_leaf = (bool *) palloc(nparts * sizeof(bool));
         boundspecs = palloc(nparts * sizeof(PartitionBoundSpec *));
     }

@@ -172,65 +206,73 @@ RelationBuildPartitionDesc(Relation rel)

         /* Save results. */
         oids[i] = inhrelid;
+        is_leaf[i] = (get_rel_relkind(inhrelid) != RELKIND_PARTITIONED_TABLE);
         boundspecs[i] = boundspec;
         ++i;
     }

-    /* Assert we aren't about to leak any old data structure */
-    Assert(rel->rd_pdcxt == NULL);
-    Assert(rel->rd_partdesc == NULL);
+    /*
+     * Create PartitionBoundInfo and mapping, working in the caller's context.
+     * This could fail, but we haven't done any damage if so.
+     */
+    if (nparts > 0)
+        boundinfo = partition_bounds_create(boundspecs, nparts, key, &mapping);

     /*
-     * Now build the actual relcache partition descriptor.  Note that the
-     * order of operations here is fairly critical.  If we fail partway
-     * through this code, we won't have leaked memory because the rd_pdcxt is
-     * attached to the relcache entry immediately, so it'll be freed whenever
-     * the entry is rebuilt or destroyed.  However, we don't assign to
-     * rd_partdesc until the cached data structure is fully complete and
-     * valid, so that no other code might try to use it.
+     * Now build the actual relcache partition descriptor, copying all the
+     * data into a new, small context.  As per above comment, we don't make
+     * this a long-lived context until it's finished.
      */
-    rel->rd_pdcxt = AllocSetContextCreate(CacheMemoryContext,
-                                          "partition descriptor",
-                                          ALLOCSET_SMALL_SIZES);
-    MemoryContextCopyAndSetIdentifier(rel->rd_pdcxt,
+    new_pdcxt = AllocSetContextCreate(CurTransactionContext,
+                                      "partition descriptor",
+                                      ALLOCSET_SMALL_SIZES);
+    MemoryContextCopyAndSetIdentifier(new_pdcxt,
                                       RelationGetRelationName(rel));

     partdesc = (PartitionDescData *)
-        MemoryContextAllocZero(rel->rd_pdcxt, sizeof(PartitionDescData));
+        MemoryContextAllocZero(new_pdcxt, sizeof(PartitionDescData));
     partdesc->nparts = nparts;
     /* If there are no partitions, the rest of the partdesc can stay zero */
     if (nparts > 0)
     {
-        /* Create PartitionBoundInfo, using the caller's context. */
-        boundinfo = partition_bounds_create(boundspecs, nparts, key, &mapping);
-
-        /* Now copy all info into relcache's partdesc. */
-        oldcxt = MemoryContextSwitchTo(rel->rd_pdcxt);
+        oldcxt = MemoryContextSwitchTo(new_pdcxt);
         partdesc->boundinfo = partition_bounds_copy(boundinfo, key);
         partdesc->oids = (Oid *) palloc(nparts * sizeof(Oid));
         partdesc->is_leaf = (bool *) palloc(nparts * sizeof(bool));
-        MemoryContextSwitchTo(oldcxt);

         /*
          * Assign OIDs from the original array into mapped indexes of the
          * result array.  The order of OIDs in the former is defined by the
          * catalog scan that retrieved them, whereas that in the latter is
          * defined by canonicalized representation of the partition bounds.
-         *
-         * Also record leaf-ness of each partition.  For this we use
-         * get_rel_relkind() which may leak memory, so be sure to run it in
-         * the caller's context.
+         * Also save leaf-ness of each partition.
          */
         for (i = 0; i < nparts; i++)
         {
             int            index = mapping[i];

             partdesc->oids[index] = oids[i];
-            partdesc->is_leaf[index] =
-                (get_rel_relkind(oids[i]) != RELKIND_PARTITIONED_TABLE);
+            partdesc->is_leaf[index] = is_leaf[i];
         }
+        MemoryContextSwitchTo(oldcxt);
     }

+    /*
+     * We have a fully valid partdesc ready to store into the relcache.
+     * Reparent it so it has the right lifespan.
+     */
+    MemoryContextSetParent(new_pdcxt, CacheMemoryContext);
+
+    /*
+     * But first, a kluge: if there's an old rd_pdcxt, it contains an old
+     * partition descriptor that may still be referenced somewhere.  Preserve
+     * it, while not leaking it, by reattaching it as a child context of the
+     * new rd_pdcxt.  Eventually it will get dropped by either RelationClose
+     * or RelationClearRelation.
+     */
+    if (rel->rd_pdcxt != NULL)
+        MemoryContextSetParent(rel->rd_pdcxt, new_pdcxt);
+    rel->rd_pdcxt = new_pdcxt;
     rel->rd_partdesc = partdesc;
 }

diff --git a/src/backend/utils/cache/partcache.c b/src/backend/utils/cache/partcache.c
index 342ab4d..e2144c8 100644
--- a/src/backend/utils/cache/partcache.c
+++ b/src/backend/utils/cache/partcache.c
@@ -37,9 +37,32 @@
 #include "utils/syscache.h"


+static void RelationBuildPartitionKey(Relation relation);
 static List *generate_partition_qual(Relation rel);

 /*
+ * RelationGetPartitionKey -- get partition key, if relation is partitioned
+ *
+ * Note: partition keys are not allowed to change after the partitioned rel
+ * is created.  RelationClearRelation knows this and preserves rd_partkey
+ * across relcache rebuilds, as long as the relation is open.  Therefore,
+ * even though we hand back a direct pointer into the relcache entry, it's
+ * safe for callers to continue to use that pointer as long as they hold
+ * the relation open.
+ */
+PartitionKey
+RelationGetPartitionKey(Relation rel)
+{
+    if (rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+        return NULL;
+
+    if (unlikely(rel->rd_partkey == NULL))
+        RelationBuildPartitionKey(rel);
+
+    return rel->rd_partkey;
+}
+
+/*
  * RelationBuildPartitionKey
  *        Build partition key data of relation, and attach to relcache
  *
@@ -54,7 +77,7 @@ static List *generate_partition_qual(Relation rel);
  * that some of our callees allocate memory on their own which would be leaked
  * permanently.
  */
-void
+static void
 RelationBuildPartitionKey(Relation relation)
 {
     Form_pg_partitioned_table form;
@@ -74,12 +97,9 @@ RelationBuildPartitionKey(Relation relation)
     tuple = SearchSysCache1(PARTRELID,
                             ObjectIdGetDatum(RelationGetRelid(relation)));

-    /*
-     * The following happens when we have created our pg_class entry but not
-     * the pg_partitioned_table entry yet.
-     */
     if (!HeapTupleIsValid(tuple))
-        return;
+        elog(ERROR, "cache lookup failed for partition key of relation %u",
+             RelationGetRelid(relation));

     partkeycxt = AllocSetContextCreate(CurTransactionContext,
                                        "partition key",
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 50f8912..969cb5a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -43,7 +43,6 @@
 #include "catalog/catalog.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
-#include "catalog/partition.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_amproc.h"
 #include "catalog/pg_attrdef.h"
@@ -53,7 +52,6 @@
 #include "catalog/pg_database.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_opclass.h"
-#include "catalog/pg_partitioned_table.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_rewrite.h"
@@ -71,8 +69,6 @@
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
-#include "partitioning/partbounds.h"
-#include "partitioning/partdesc.h"
 #include "rewrite/rewriteDefine.h"
 #include "rewrite/rowsecurity.h"
 #include "storage/lmgr.h"
@@ -84,7 +80,6 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
-#include "utils/partcache.h"
 #include "utils/relmapper.h"
 #include "utils/resowner_private.h"
 #include "utils/snapmgr.h"
@@ -1165,20 +1160,11 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
     relation->rd_fkeylist = NIL;
     relation->rd_fkeyvalid = false;

-    /* if a partitioned table, initialize key and partition descriptor info */
-    if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-    {
-        RelationBuildPartitionKey(relation);
-        RelationBuildPartitionDesc(relation);
-    }
-    else
-    {
-        relation->rd_partkey = NULL;
-        relation->rd_partkeycxt = NULL;
-        relation->rd_partdesc = NULL;
-        relation->rd_pdcxt = NULL;
-    }
-    /* ... but partcheck is not loaded till asked for */
+    /* partitioning data is not loaded till asked for */
+    relation->rd_partkey = NULL;
+    relation->rd_partkeycxt = NULL;
+    relation->rd_partdesc = NULL;
+    relation->rd_pdcxt = NULL;
     relation->rd_partcheck = NIL;
     relation->rd_partcheckvalid = false;
     relation->rd_partcheckcxt = NULL;
@@ -2090,6 +2076,16 @@ RelationClose(Relation relation)
     /* Note: no locking manipulations needed */
     RelationDecrementReferenceCount(relation);

+    /*
+     * If the relation is no longer open in this session, we can clean up any
+     * stale partition descriptors it has.  This is unlikely, so check to see
+     * if there are child contexts before expending a call to mcxt.c.
+     */
+    if (RelationHasReferenceCountZero(relation) &&
+        relation->rd_pdcxt != NULL &&
+        relation->rd_pdcxt->firstchild != NULL)
+        MemoryContextDeleteChildren(relation->rd_pdcxt);
+
 #ifdef RELCACHE_FORCE_RELEASE
     if (RelationHasReferenceCountZero(relation) &&
         relation->rd_createSubid == InvalidSubTransactionId &&
@@ -2527,7 +2523,6 @@ RelationClearRelation(Relation relation, bool rebuild)
         bool        keep_rules;
         bool        keep_policies;
         bool        keep_partkey;
-        bool        keep_partdesc;

         /* Build temporary entry, but don't link it into hashtable */
         newrel = RelationBuildDesc(save_relid, false);
@@ -2560,9 +2555,6 @@ RelationClearRelation(Relation relation, bool rebuild)
         keep_policies = equalRSDesc(relation->rd_rsdesc, newrel->rd_rsdesc);
         /* partkey is immutable once set up, so we can always keep it */
         keep_partkey = (relation->rd_partkey != NULL);
-        keep_partdesc = equalPartitionDescs(relation->rd_partkey,
-                                            relation->rd_partdesc,
-                                            newrel->rd_partdesc);

         /*
          * Perform swapping of the relcache entry contents.  Within this
@@ -2617,34 +2609,45 @@ RelationClearRelation(Relation relation, bool rebuild)
         SWAPFIELD(Oid, rd_toastoid);
         /* pgstat_info must be preserved */
         SWAPFIELD(struct PgStat_TableStatus *, pgstat_info);
-        /* preserve old partitioning info if no logical change */
+        /* preserve old partition key if we have one */
         if (keep_partkey)
         {
             SWAPFIELD(PartitionKey, rd_partkey);
             SWAPFIELD(MemoryContext, rd_partkeycxt);
         }
-        if (keep_partdesc)
-        {
-            SWAPFIELD(PartitionDesc, rd_partdesc);
-            SWAPFIELD(MemoryContext, rd_pdcxt);
-        }
-        else if (rebuild && newrel->rd_pdcxt != NULL)
+        if (newrel->rd_pdcxt != NULL)
         {
             /*
              * We are rebuilding a partitioned relation with a non-zero
-             * reference count, so keep the old partition descriptor around,
-             * in case there's a PartitionDirectory with a pointer to it.
-             * Attach it to the new rd_pdcxt so that it gets cleaned up
-             * eventually.  In the case where the reference count is 0, this
-             * code is not reached, which should be OK because in that case
-             * there should be no PartitionDirectory with a pointer to the old
-             * entry.
+             * reference count, so we must keep the old partition descriptor
+             * around, in case there's a PartitionDirectory with a pointer to
+             * it.  This means we can't free the old rd_pdcxt yet.  (This is
+             * necessary because RelationGetPartitionDesc hands out direct
+             * pointers to the relcache's data structure, unlike our usual
+             * practice which is to hand out copies.  We'd have the same
+             * problem with rd_partkey, except that we always preserve that
+             * once created.)
+             *
+             * To ensure that it's not leaked completely, re-attach it to the
+             * new reldesc, or make it a child of the new reldesc's rd_pdcxt
+             * in the unlikely event that there is one already.  (Compare hack
+             * in RelationBuildPartitionDesc.)  RelationClose will clean up
+             * any such contexts once the reference count reaches zero.
+             *
+             * In the case where the reference count is zero, this code is not
+             * reached, which should be OK because in that case there should
+             * be no PartitionDirectory with a pointer to the old entry.
              *
              * Note that newrel and relation have already been swapped, so the
              * "old" partition descriptor is actually the one hanging off of
              * newrel.
              */
-            MemoryContextSetParent(newrel->rd_pdcxt, relation->rd_pdcxt);
+            relation->rd_partdesc = NULL;    /* ensure rd_partdesc is invalid */
+            if (relation->rd_pdcxt != NULL) /* probably never happens */
+                MemoryContextSetParent(newrel->rd_pdcxt, relation->rd_pdcxt);
+            else
+                relation->rd_pdcxt = newrel->rd_pdcxt;
+            /* drop newrel's pointers so we don't destroy it below */
             newrel->rd_partdesc = NULL;
             newrel->rd_pdcxt = NULL;
         }
@@ -3907,27 +3910,7 @@ RelationCacheInitializePhase3(void)
             restart = true;
         }

-        /*
-         * Reload the partition key and descriptor for a partitioned table.
-         */
-        if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
-            relation->rd_partkey == NULL)
-        {
-            RelationBuildPartitionKey(relation);
-            Assert(relation->rd_partkey != NULL);
-
-            restart = true;
-        }
-
-        if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
-            relation->rd_partdesc == NULL)
-        {
-            RelationBuildPartitionDesc(relation);
-            Assert(relation->rd_partdesc != NULL);
-
-            restart = true;
-        }
-
+        /* Reload tableam data if needed */
         if (relation->rd_tableam == NULL &&
             (relation->rd_rel->relkind == RELKIND_RELATION ||
              relation->rd_rel->relkind == RELKIND_SEQUENCE ||
diff --git a/src/include/partitioning/partdesc.h b/src/include/partitioning/partdesc.h
index 38712c1..825bdd7 100644
--- a/src/include/partitioning/partdesc.h
+++ b/src/include/partitioning/partdesc.h
@@ -29,7 +29,8 @@ typedef struct PartitionDescData
     PartitionBoundInfo boundinfo;    /* collection of partition bounds */
 } PartitionDescData;

-extern void RelationBuildPartitionDesc(Relation rel);
+
+extern PartitionDesc RelationGetPartitionDesc(Relation rel);

 extern PartitionDirectory CreatePartitionDirectory(MemoryContext mcxt);
 extern PartitionDesc PartitionDirectoryLookup(PartitionDirectory, Relation);
diff --git a/src/include/utils/partcache.h b/src/include/utils/partcache.h
index 823ad2e..e446423 100644
--- a/src/include/utils/partcache.h
+++ b/src/include/utils/partcache.h
@@ -46,7 +46,8 @@ typedef struct PartitionKeyData
     Oid           *parttypcoll;
 }            PartitionKeyData;

-extern void RelationBuildPartitionKey(Relation relation);
+
+extern PartitionKey RelationGetPartitionKey(Relation rel);
 extern List *RelationGetPartitionQual(Relation rel);
 extern Expr *get_partition_qual_relid(Oid relid);

diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 31d8a1a..bb79bbd 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -20,6 +20,7 @@
 #include "catalog/pg_index.h"
 #include "catalog/pg_publication.h"
 #include "nodes/bitmapset.h"
+#include "partitioning/partdefs.h"
 #include "rewrite/prs2lock.h"
 #include "storage/block.h"
 #include "storage/relfilenode.h"
@@ -94,9 +95,9 @@ typedef struct RelationData
     List       *rd_fkeylist;    /* list of ForeignKeyCacheInfo (see below) */
     bool        rd_fkeyvalid;    /* true if list has been computed */

-    struct PartitionKeyData *rd_partkey;    /* partition key, or NULL */
+    PartitionKey rd_partkey;    /* partition key, or NULL */
     MemoryContext rd_partkeycxt;    /* private context for rd_partkey, if any */
-    struct PartitionDescData *rd_partdesc;    /* partitions, or NULL */
+    PartitionDesc rd_partdesc;    /* partition descriptor, or NULL */
     MemoryContext rd_pdcxt;        /* private context for rd_partdesc, if any */
     List       *rd_partcheck;    /* partition CHECK quals */
     bool        rd_partcheckvalid;    /* true if list has been computed */
@@ -596,18 +597,6 @@ typedef struct ViewOptions
      RelationNeedsWAL(relation) && \
      !IsCatalogRelation(relation))

-/*
- * RelationGetPartitionKey
- *        Returns the PartitionKey of a relation
- */
-#define RelationGetPartitionKey(relation) ((relation)->rd_partkey)
-
-/*
- * RelationGetPartitionDesc
- *        Returns partition descriptor for a relation.
- */
-#define RelationGetPartitionDesc(relation) ((relation)->rd_partdesc)
-
 /* routines in utils/cache/relcache.c */
 extern void RelationIncrementReferenceCount(Relation rel);
 extern void RelationDecrementReferenceCount(Relation rel);
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index 50de4b3..5236038 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -512,6 +512,22 @@ Partition of: partitioned2 FOR VALUES FROM ('-1', 'aaaaa') TO (100, 'ccccc')
 Partition constraint: (((a + 1) IS NOT NULL) AND (substr(b, 1, 5) IS NOT NULL) AND (((a + 1) > '-1'::integer) OR (((a
+1) = '-1'::integer) AND (substr(b, 1, 5) >= 'aaaaa'::text))) AND (((a + 1) < 100) OR (((a + 1) = 100) AND (substr(b,
1,5) < 'ccccc'::text)))) 

 DROP TABLE partitioned, partitioned2;
+-- check reference to partitioned table's rowtype in partition descriptor
+create table partitioned (a int, b int)
+  partition by list ((row(a, b)::partitioned));
+create table partitioned1
+  partition of partitioned for values in ('(1,2)'::partitioned);
+create table partitioned2
+  partition of partitioned for values in ('(2,4)'::partitioned);
+explain (costs off)
+select * from partitioned where row(a,b)::partitioned = '(1,2)'::partitioned;
+                        QUERY PLAN
+-----------------------------------------------------------
+ Seq Scan on partitioned1 partitioned
+   Filter: (ROW(a, b)::partitioned = '(1,2)'::partitioned)
+(2 rows)
+
+drop table partitioned;
 -- check that dependencies of partition columns are handled correctly
 create domain intdom1 as int;
 create table partitioned (
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index 9a40d7b..ab424dc 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -459,6 +459,17 @@ CREATE TABLE part2_1 PARTITION OF partitioned2 FOR VALUES FROM (-1, 'aaaaa') TO

 DROP TABLE partitioned, partitioned2;

+-- check reference to partitioned table's rowtype in partition descriptor
+create table partitioned (a int, b int)
+  partition by list ((row(a, b)::partitioned));
+create table partitioned1
+  partition of partitioned for values in ('(1,2)'::partitioned);
+create table partitioned2
+  partition of partitioned for values in ('(2,4)'::partitioned);
+explain (costs off)
+select * from partitioned where row(a,b)::partitioned = '(1,2)'::partitioned;
+drop table partitioned;
+
 -- check that dependencies of partition columns are handled correctly
 create domain intdom1 as int;


Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Tue, Dec 24, 2019 at 12:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> BTW, I forgot to mention: while I think the patch to forbid pseudotypes
> by using CheckAttributeType() can be back-patched, I'm leaning towards
> not back-patching the other patch.  The situation where we get into
> infinite recursion seems not very likely in practice, and it's not
> going to cause any crash or data loss, so I think we can just say
> "sorry that's not supported before v13".  The patch as I'm proposing
> it seems rather invasive for a back-branch fix.

It is indeed.

Just to be sure, by going with "unsupported before v13", which one do you mean:

* documenting it as so
* giving an error in such cases, like the patch in the first email on
this thread did
* doing nothing really

Thanks,
Amit



Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Tue, Dec 24, 2019 at 6:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > One thing I see is that you chose to relocate RelationGetPartitionDesc's
> > declaration to partdesc.h, whereupon RelationBuildPartitionDesc doesn't
> > have to be exported at all anymore.  Perhaps that's a better factorization
> > than what I did.  It supposes that any caller of RelationGetPartitionDesc
> > is going to need partdesc.h, but that seems reasonable.  We could likewise
> > move RelationGetPartitionKey to partcache.h.
>
> I concluded that that is indeed a better solution; it does allow removing
> some rel.h inclusions (though possibly those were just duplicative?), and
> it also means that relcache.c itself doesn't need any partitioning
> inclusions at all.
>
> Here's a cleaned-up patch that does it like that and also fixes the
> memory leakage issue.

Thanks for the updated patch.  I didn't find anything to complain about.

> I haven't removed equalPartitionDescs here; that seems like material
> for a separate patch (to make it easier to put it back if we need it).

Seems like a good idea.

Btw, does the memory leakage fix in this patch address any of the
pending concerns that were discussed on the "hyrax vs.
RelationBuildPartitionDesc" thread earlier this year[1]?

Thanks,
Amit

[1] https://www.postgresql.org/message-id/flat/3800.1560366716%40sss.pgh.pa.us#092b6b4f6bf75d2f3f90ef6a3b3eab5b



Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Tue, Dec 24, 2019 at 10:59 AM Amit Langote <amitlangote09@gmail.com> wrote:
> Btw, does the memory leakage fix in this patch address any of the
> pending concerns that were discussed on the "hyrax vs.
> RelationBuildPartitionDesc" thread earlier this year[1]?
>
> [1] https://www.postgresql.org/message-id/flat/3800.1560366716%40sss.pgh.pa.us#092b6b4f6bf75d2f3f90ef6a3b3eab5b

I thought about this a little and I think it *does* address the main
complaint in the above thread.

The main complaint as I understand is that receiving repeated
invalidations due to partitions being concurrently added while a
PartitionDirectory is holding a pointer to PartitionDesc causes many
copies of PartitionDesc to pile up due to the parent table being
rebuilt upon processing of each invalidation.

Now because we don't build the PartitionDesc in the
RelationClearRelation path, that can't happen.  Although it still
seems possible for the piling up to occur if it's
RelationBuildPartitionDesc that is run repeatedly via
RelationGetParttionDesc while partitions are being concurrently added,
but I couldn't find anything in the partitioning code that does that.

Thanks,
Amit



Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Mon, Dec 23, 2019 at 6:42 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Sun, Dec 22, 2019 at 6:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > I wonder whether we couldn't also lift
> > > the restriction against whole-row Vars in partition expressions.
> > > Doesn't seem like there is much difference between such a Var and
> > > a row(...)::table_rowtype expression.
> >
> > I didn't look into that either.  I wouldn't propose back-patching that,
> > but it'd be interesting to try to fix it in HEAD.
>
> Agreed.

I gave that a try and ended up with attached that applies on top of
your delay-loading-relcache-partition-data-2.patch.

Thanks,
Amit

Attachment

Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Tue, Dec 24, 2019 at 12:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, I forgot to mention: while I think the patch to forbid pseudotypes
>> by using CheckAttributeType() can be back-patched, I'm leaning towards
>> not back-patching the other patch.  The situation where we get into
>> infinite recursion seems not very likely in practice, and it's not
>> going to cause any crash or data loss, so I think we can just say
>> "sorry that's not supported before v13".  The patch as I'm proposing
>> it seems rather invasive for a back-branch fix.

> It is indeed.

> Just to be sure, by going with "unsupported before v13", which one do you mean:

> * documenting it as so
> * giving an error in such cases, like the patch in the first email on
> this thread did
> * doing nothing really

I was thinking "do nothing in the back branches".  I don't believe we
can detect such cases reliably (at least not without complicated logic,
which'd defeat the point), so I don't think giving an error is actually
feasible, and I doubt that documenting it would be useful.  If we get
some field complaints about this, it'd be time enough to reconsider.

            regards, tom lane



Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Wed, Dec 25, 2019 at 2:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Tue, Dec 24, 2019 at 12:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> BTW, I forgot to mention: while I think the patch to forbid pseudotypes
> >> by using CheckAttributeType() can be back-patched, I'm leaning towards
> >> not back-patching the other patch.  The situation where we get into
> >> infinite recursion seems not very likely in practice, and it's not
> >> going to cause any crash or data loss, so I think we can just say
> >> "sorry that's not supported before v13".  The patch as I'm proposing
> >> it seems rather invasive for a back-branch fix.
>
> > It is indeed.
>
> > Just to be sure, by going with "unsupported before v13", which one do you mean:
>
> > * documenting it as so
> > * giving an error in such cases, like the patch in the first email on
> > this thread did
> > * doing nothing really
>
> I was thinking "do nothing in the back branches".  I don't believe we
> can detect such cases reliably (at least not without complicated logic,
> which'd defeat the point), so I don't think giving an error is actually
> feasible, and I doubt that documenting it would be useful.  If we get
> some field complaints about this, it'd be time enough to reconsider.

Sure, thanks for the reply.

Regards,
Amit



Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Tue, Dec 24, 2019 at 10:59 AM Amit Langote <amitlangote09@gmail.com> wrote:
>> Btw, does the memory leakage fix in this patch address any of the
>> pending concerns that were discussed on the "hyrax vs.
>> RelationBuildPartitionDesc" thread earlier this year[1]?
>> [1] https://www.postgresql.org/message-id/flat/3800.1560366716%40sss.pgh.pa.us#092b6b4f6bf75d2f3f90ef6a3b3eab5b

> I thought about this a little and I think it *does* address the main
> complaint in the above thread.

I experimented with the test shown in [1].  This patch does prevent that
case from accumulating copies of the partition descriptor.

(The performance of that test case is still awful, more or less O(N^2)
in the number of repetitions.  But I think what's going on is that it
repeatedly creates and deletes the same catalog entries, and we're not
smart enough to recognize that the dead row versions are fully dead,
so lots of time is wasted by unique-index checks.  It's not clear
that that's of any interest for real-world cases.)

I remain of the opinion that this is a pretty crummy, ad-hoc way to manage
the partition descriptor caching.  It's less bad than before, but I'm
still concerned that holding a relcache entry open for any long period
could result in bloat if the cache entry is rebuilt many times meanwhile
--- and there's no strong reason to think that can't happen.  Still,
maybe we can wait to solve that until there's some evidence that it
does happen in useful cases.

I also poked at the test case mentioned in the other thread about foreign
keys across lots of partitions [2].  Again, this patch gets rid of the
memory bloat, though the performance is still pretty awful with lots of
partitions for other reasons.

            regards, tom lane

[1] https://www.postgresql.org/message-id/10797.1552679128%40sss.pgh.pa.us
[2]
https://www.postgresql.org/message-id/OSAPR01MB374809E8DE169C8BF2B82CBD9F6B0%40OSAPR01MB3748.jpnprd01.prod.outlook.com



Re: unsupportable composite type partition keys

From
Tom Lane
Date:
I wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
>> On Tue, Dec 24, 2019 at 10:59 AM Amit Langote <amitlangote09@gmail.com> wrote:
>>> Btw, does the memory leakage fix in this patch address any of the
>>> pending concerns that were discussed on the "hyrax vs.
>>> RelationBuildPartitionDesc" thread earlier this year[1]?
>>> [1] https://www.postgresql.org/message-id/flat/3800.1560366716%40sss.pgh.pa.us#092b6b4f6bf75d2f3f90ef6a3b3eab5b

>> I thought about this a little and I think it *does* address the main
>> complaint in the above thread.

It occurred to me to also recheck the original complaint in that thread,
which was poor behavior in CLOBBER_CACHE_ALWAYS builds.  I didn't have
the patience to run a full CCA test, but I did run update.sql, which
we previously established was sufficient to show the problem.  There's
no apparent memory bloat, either with HEAD or with the patch.  I also
see the runtime (for update.sql on its own) dropping from about
474 sec in HEAD to 457 sec with the patch.  So that indicates that we're
actually saving a noticeable amount of work, not just postponing it,
at least under CCA scenarios where relcache entries get flushed a lot.

I also tried to measure update.sql's runtime in a regular debug build
(not CCA).  I get pretty repeatable results of 279ms on HEAD vs 273ms
with patch, or about a 2% overall savings.  That's at the very limit of
what I'd consider a reproducible difference, but still it seems to be
real.  So that seems like evidence that forcing the partition data to be
loaded immediately rather than on-demand is a loser from a performance
standpoint as well as the recursion concerns that prompted this patch.

Which naturally leads one to wonder whether forcing other relcache
substructures (triggers, rules, etc) to be loaded immediately isn't
a loser as well.  I'm still feeling like we're overdue to redesign how
all of this works and come up with a more uniform, less fragile/ad-hoc
approach.  But I don't have the time or interest to do that right now.

Anyway, I've run out of reasons not to commit this patch, so I'll
go do that.

            regards, tom lane



Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
>> On Sun, Dec 22, 2019 at 6:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I wonder whether we couldn't also lift
>>> the restriction against whole-row Vars in partition expressions.
>>> Doesn't seem like there is much difference between such a Var and
>>> a row(...)::table_rowtype expression.

> I gave that a try and ended up with attached that applies on top of
> your delay-loading-relcache-partition-data-2.patch.

Pushed with minor fixes.

            regards, tom lane



Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Thu, Dec 26, 2019 at 5:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> >> On Sun, Dec 22, 2019 at 6:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> I wonder whether we couldn't also lift
> >>> the restriction against whole-row Vars in partition expressions.
> >>> Doesn't seem like there is much difference between such a Var and
> >>> a row(...)::table_rowtype expression.
>
> > I gave that a try and ended up with attached that applies on top of
> > your delay-loading-relcache-partition-data-2.patch.
>
> Pushed with minor fixes.

Thank you.

Regards,
Amit



Re: unsupportable composite type partition keys

From
Amit Langote
Date:
On Thu, Dec 26, 2019 at 3:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > Amit Langote <amitlangote09@gmail.com> writes:
> >> On Tue, Dec 24, 2019 at 10:59 AM Amit Langote <amitlangote09@gmail.com> wrote:
> >>> Btw, does the memory leakage fix in this patch address any of the
> >>> pending concerns that were discussed on the "hyrax vs.
> >>> RelationBuildPartitionDesc" thread earlier this year[1]?
> >>> [1] https://www.postgresql.org/message-id/flat/3800.1560366716%40sss.pgh.pa.us#092b6b4f6bf75d2f3f90ef6a3b3eab5b
>
> >> I thought about this a little and I think it *does* address the main
> >> complaint in the above thread.
>
> It occurred to me to also recheck the original complaint in that thread,
> which was poor behavior in CLOBBER_CACHE_ALWAYS builds.

Thanks for taking the time to do that.

>  I didn't have
> the patience to run a full CCA test, but I did run update.sql, which
> we previously established was sufficient to show the problem.  There's
> no apparent memory bloat, either with HEAD or with the patch.  I also
> see the runtime (for update.sql on its own) dropping from about
> 474 sec in HEAD to 457 sec with the patch.  So that indicates that we're
> actually saving a noticeable amount of work, not just postponing it,
> at least under CCA scenarios where relcache entries get flushed a lot.

Yeah, as long as nothing in between those flushes needs to look at the
partition descriptor.

> I also tried to measure update.sql's runtime in a regular debug build
> (not CCA).  I get pretty repeatable results of 279ms on HEAD vs 273ms
> with patch, or about a 2% overall savings.  That's at the very limit of
> what I'd consider a reproducible difference, but still it seems to be
> real.  So that seems like evidence that forcing the partition data to be
> loaded immediately rather than on-demand is a loser from a performance
> standpoint as well as the recursion concerns that prompted this patch.

Agreed.

> Which naturally leads one to wonder whether forcing other relcache
> substructures (triggers, rules, etc) to be loaded immediately isn't
> a loser as well.  I'm still feeling like we're overdue to redesign how
> all of this works and come up with a more uniform, less fragile/ad-hoc
> approach.  But I don't have the time or interest to do that right now.

I suppose if on-demand loading of partition descriptors can result in
up to 2% savings, we can perhaps expect slightly more by doing the
same for other substructures.  Also, the more different substructures
are accessed similarly the better.

> Anyway, I've run out of reasons not to commit this patch, so I'll
> go do that.

Thank you.  I noticed that there are comments suggesting that certain
RelationData members are to be accessed using their RelationGet*
functions, but partitioning members do not have such comments.  How
about the attached?

Regards,
Amit

Attachment

Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> Thank you.  I noticed that there are comments suggesting that certain
> RelationData members are to be accessed using their RelationGet*
> functions, but partitioning members do not have such comments.  How
> about the attached?

Good idea, done.

            regards, tom lane



Re: unsupportable composite type partition keys

From
Julien Rouhaud
Date:
On Sun, Dec 22, 2019 at 10:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > Now as far as point 1 goes, I think it's not really that awful to use
> > CheckAttributeType() with a dummy attribute name.  The attached
> > incomplete patch uses "partition key" which causes it to emit errors
> > like
> > regression=# create table fool (a int, b int) partition by list ((row(a, b)));
> > ERROR:  column "partition key" has pseudo-type record
> > I don't think that that's unacceptable.  But if we wanted to improve it,
> > we could imagine adding another flag, say CHKATYPE_IS_PARTITION_KEY,
> > that doesn't affect CheckAttributeType's semantics, just the wording of
> > the error messages it throws.
>
> Here's a fleshed-out patch that does it like that.
>
> While poking at this, I also started to wonder why CheckAttributeType
> wasn't recursing into ranges, since those are our other kind of
> container type.  And the answer is that it must, because we allow
> creation of ranges over composite types:
>
> regression=# create table foo (f1 int, f2 int);
> CREATE TABLE
> regression=# create type foorange as range (subtype = foo);
> CREATE TYPE
> regression=# alter table foo add column r foorange;
> ALTER TABLE
>
> Simple things still work on table foo, but surely this is exactly
> what CheckAttributeType is supposed to be preventing.  With the
> second attached patch you get
>
> regression=# alter table foo add column r foorange;
> ERROR:  composite type foo cannot be made a member of itself
>
> The second patch needs to go back all the way, the first one
> only as far as we have partitions.

While working on regression tests for index collation versioning [1],
I noticed that the 2nd patch apparently broke the ability to create a
table using a range over collatable datatype attribute, which we
apparently don't test anywhere.  Simple example to reproduce:

CREATE TYPE myrange_text AS range (subtype = text);
CREATE TABLE test_text(
    meh myrange_text
);
ERROR:  42P16: no collation was derived for column "meh" with
collatable type text
HINT:  Use the COLLATE clause to set the collation explicitly.

AFAICT, this is only a thinko in CheckAttributeType(), where the range
collation should be provided rather than the original tuple desc one,
as per attached.  I also added a create/drop table in an existing
regression test that was already creating range over collatable type.

[1] https://www.postgresql.org/message-id/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com

Attachment

Re: unsupportable composite type partition keys

From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Sun, Dec 22, 2019 at 10:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> While poking at this, I also started to wonder why CheckAttributeType
>> wasn't recursing into ranges, since those are our other kind of
>> container type.  And the answer is that it must, because we allow
>> creation of ranges over composite types:

> While working on regression tests for index collation versioning [1],
> I noticed that the 2nd patch apparently broke the ability to create a
> table using a range over collatable datatype attribute, which we
> apparently don't test anywhere.

Ugh.

> AFAICT, this is only a thinko in CheckAttributeType(), where the range
> collation should be provided rather than the original tuple desc one,
> as per attached.  I also added a create/drop table in an existing
> regression test that was already creating range over collatable type.

Looks good, although I think maybe we'd better test the case a little
harder than this.  Will tweak that and push -- thanks!

            regards, tom lane



Re: unsupportable composite type partition keys

From
Julien Rouhaud
Date:
On Fri, Jan 31, 2020 at 04:20:36PM -0500, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > On Sun, Dec 22, 2019 at 10:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> While poking at this, I also started to wonder why CheckAttributeType
> >> wasn't recursing into ranges, since those are our other kind of
> >> container type.  And the answer is that it must, because we allow
> >> creation of ranges over composite types:
> 
> > While working on regression tests for index collation versioning [1],
> > I noticed that the 2nd patch apparently broke the ability to create a
> > table using a range over collatable datatype attribute, which we
> > apparently don't test anywhere.
> 
> Ugh.
> 
> > AFAICT, this is only a thinko in CheckAttributeType(), where the range
> > collation should be provided rather than the original tuple desc one,
> > as per attached.  I also added a create/drop table in an existing
> > regression test that was already creating range over collatable type.
> 
> Looks good, although I think maybe we'd better test the case a little
> harder than this.  Will tweak that and push -- thanks!

Ah, I wasn't sure that additional tests on a table would be worthwhile enough.
Thanks for tweaking and pushing!



Re: unsupportable composite type partition keys

From
Jobin Augustine
Date:
Is there a way out if someone accidentally executes the same test case against PG12?

testdb=# create table partitioned (a int, b int)
testdb-#   partition by list ((row(a, b)::partitioned));
CREATE TABLE
testdb=# DROP TABLE partitioned;
ERROR:  cache lookup failed for type 18269
 

Ah, I wasn't sure that additional tests on a table would be worthwhile enough.
Thanks for tweaking and pushing!


Re: unsupportable composite type partition keys

From
Julien Rouhaud
Date:
On Wed, Sep 9, 2020 at 4:17 PM Jobin Augustine <jobinau@gmail.com> wrote:
>
> Is there a way out if someone accidentally executes the same test case against PG12?
>
> testdb=# create table partitioned (a int, b int)
> testdb-#   partition by list ((row(a, b)::partitioned));
> CREATE TABLE
> testdb=# DROP TABLE partitioned;
> ERROR:  cache lookup failed for type 18269

AFAICT this is only a side effect of that particular use case if you
try to drop it without having a relcache entry.  Do any access before
dropping it and it should be fine, for instance:

rjuju=# create table partitioned (a int, b int)
rjuju-# partition by list ((row(a, b)::partitioned));
CREATE TABLE
rjuju=# DROP TABLE partitioned;
ERROR:  cache lookup failed for type 144845
rjuju=# \d partitioned
      Partitioned table "public.partitioned"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           |          |
 b      | integer |           |          |
Partition key: LIST ((ROW(a, b)::partitioned))
Number of partitions: 0

rjuju=# DROP TABLE partitioned;
DROP TABLE