Thread: ALTER TABLE SET ACCESS METHOD on partitioned tables

ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Soumyadeep Chakraborty
Date:
Hello,

This is a fresh thread to continue the discussion on ALTER TABLE SET
ACCESS METHOD when applied to partition roots, as requested.

Current behavior (HEAD):

CREATE TABLE am_partitioned(x INT, y INT)
   PARTITION BY hash (x);
ALTER TABLE am_partitioned SET ACCESS METHOD heap2;
ERROR:  cannot change access method of a partitioned table

Potential behavior options:

(A) Don't recurse to existing children and ensure that the new am gets
inherited by any new children. (ALTER TABLE SET TABLESPACE behavior)

(B) Recurse to existing children and modify their am. Also, ensure that
any new children inherit the new am.

A patch [1] was introduced earlier by Justin to implement
(A). v1-0001-Allow-ATSETAM-on-partition-roots.patch contains a rebase
of that patch against latest HEAD, with minor updates on comments and
some additional test coverage.

I think that (B) is necessary for partition hierarchies with a high
number of partitions. One typical use case in Greenplum, for
instance, is to convert heap tables containing cold data to append-only
storage at the root or subroot level of partition hierarchies consisting
of thousands of partitions. Asking users to ALTER individual partitions
is cumbersome and error-prone.

Furthermore, I believe that (B) should be the default and (A) can be
chosen by using the ONLY clause. This would give us the best of both
worlds and would make the use of ONLY consistent. The patch
v1-0002-Make-ATSETAM-recurse-by-default.patch achieves that.

Thoughts?

Regards,
Soumyadeep (VMware)


[1] https://www.postgresql.org/message-id/20210308010707.GA29832%40telsasoft.com
Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Zhihong Yu
Date:
Hi,

+       tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+       accessMethodId = ((Form_pg_class) GETSTRUCT(tup))->relam;

-   /* look up the access method, verify it is for a table */
-   if (accessMethod != NULL)
-       accessMethodId = get_table_am_oid(accessMethod, false);
+       if (!HeapTupleIsValid(tup))
+           elog(ERROR, "cache lookup failed for relation %u", relid);

Shouldn't the validity of tup be checked before relam field is accessed ?

Cheers

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
Thanks for copying me.

I didn't look closely yet, but this comment is wrong:

+ * Since these have no storage the tablespace can be updated with a simple
                                                                                                          
 
+ * metadata only operation to update the tablespace.
                                                                                                          
 

As I see it, AMs are a strong parallel to tablespaces.  The default tablespace
is convenient: 1) explicitly specified tablespace; 2) tablespace of parent,
partitioned table; 3) DB tablespace; 4) default_tablespace:
https://www.postgresql.org/message-id/20190423222633.GA8364%40alvherre.pgsql

It'd be convenient if AMs worked the same way (and a bit odd that they don't).
Note that in v15, pg_dump/restore now allow --no-table-am, an exact parallel to
--no-tablespace.

-- 
Justin



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Soumyadeep Chakraborty
Date:

On Wed, May 18, 2022 at 4:14 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

> I didn't look closely yet, but this comment is wrong:
>
> + * Since these have no storage the tablespace can be updated with a simple                                                                                                                                                      
> + * metadata only operation to update the tablespace.                                                                                                                                                                            

Good catch. Fixed.

> It'd be convenient if AMs worked the same way (and a bit odd that they don't).
> Note that in v15, pg_dump/restore now allow --no-table-am, an exact parallel to
> --no-tablespace.

I agree that ATSET AM should behave in a similar fashion to ATSET tablespaces.
However, the way that ATSET tablespace currently behaves is not consistent with
the ONLY clause.

On a given partition root:
ALTER TABLE ONLY am_partitioned SET TABLESPACE ts;
has the same effect as:
ALTER TABLE am_partitioned SET TABLESPACE ts;

We are missing out on the feature to set the AM/tablespace throughout the
partition hierarchy, with one command.

Regards,
Soumyadeep (VMware)


Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Zhihong Yu
Date:


On Wed, May 18, 2022 at 5:49 PM Soumyadeep Chakraborty <soumyadeep2007@gmail.com> wrote:

On Wed, May 18, 2022 at 4:14 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

> I didn't look closely yet, but this comment is wrong:
>
> + * Since these have no storage the tablespace can be updated with a simple                                                                                                                                                      
> + * metadata only operation to update the tablespace.                                                                                                                                                                            

Good catch. Fixed.

> It'd be convenient if AMs worked the same way (and a bit odd that they don't).
> Note that in v15, pg_dump/restore now allow --no-table-am, an exact parallel to
> --no-tablespace.

I agree that ATSET AM should behave in a similar fashion to ATSET tablespaces.
However, the way that ATSET tablespace currently behaves is not consistent with
the ONLY clause.

On a given partition root:
ALTER TABLE ONLY am_partitioned SET TABLESPACE ts;
has the same effect as:
ALTER TABLE am_partitioned SET TABLESPACE ts;

We are missing out on the feature to set the AM/tablespace throughout the
partition hierarchy, with one command.

Regards,
Soumyadeep (VMware)

Hi,

+       accessMethodId = ((Form_pg_class) GETSTRUCT(tup))->relam;

-   /* look up the access method, verify it is for a table */
-   if (accessMethod != NULL)
-       accessMethodId = get_table_am_oid(accessMethod, false);
+       if (!HeapTupleIsValid(tup))
+           elog(ERROR, "cache lookup failed for relation %u", relid);

The validity check of tup should be done before fetching the value of relam field.

Cheers

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Soumyadeep Chakraborty
Date:
On Wed, May 18, 2022 at 6:26 PM Zhihong Yu <zyu@yugabyte.com> wrote:

> +       accessMethodId = ((Form_pg_class) GETSTRUCT(tup))->relam;
>
> -   /* look up the access method, verify it is for a table */
> -   if (accessMethod != NULL)
> -       accessMethodId = get_table_am_oid(accessMethod, false);
> +       if (!HeapTupleIsValid(tup))
> +           elog(ERROR, "cache lookup failed for relation %u", relid);
>
> The validity check of tup should be done before fetching the value of relam field.

Thanks. Fixed and rebased.

Regards,
Soumyadeep (VMware)
Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Thu, Jun 09, 2022 at 11:21:58AM -0700, Soumyadeep Chakraborty wrote:
> Thanks. Fixed and rebased.

I think that I am OK with the concept of this patch to use a
partitioned table's relam as a reference when creating a partition
rather than relying on the default GUC, in a way similar to
tablespaces.

One worry I see is that forcing a recursion on the leaves on ALTER
TABLE could silently break partitions where multiple table AMs are
used across separate partitions if ALTER TABLE SET ACCESS METHOD is
used on one of the parents, though it seems like this is not something
I would much worry about as now the command is an error.

A second worry is that we would just break existing creation flows
that rely on the GUC defining the default AM.  This is worth a close
lookup at pg_dump to make sure that we do things correctly with this
patch in place..  Did you check dump and restore flows with partition
trees and --no-table-access-method?  Perhaps there should be
some regression tests with partitioned tables?

+    /*
+     * For partitioned tables, when no access method is specified, we
+     * default to the parent table's AM.
+     */
+    Assert(list_length(inheritOids) == 1);
+    /* XXX: should implement get_rel_relam? */
+    relid = linitial_oid(inheritOids);
+    tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+    if (!HeapTupleIsValid(tup))
+            elog(ERROR, "cache lookup failed for relation %u", relid);

Having a wrapper for that could be useful, yes.  We don't have any
code paths that would directly need that now, from what I can see,
though.  This patch gives one reason to have one.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Mon, Mar 20, 2023 at 09:30:50AM +0900, Michael Paquier wrote:
> Did you check dump and restore flows with partition
> trees and --no-table-access-method?  Perhaps there should be
> some regression tests with partitioned tables?

I was looking at the patch, and as I suspected the dumps generated
are forgetting to apply the AM to the partitioned tables.  For
example, assuming that the default AM is heap, the patch creates
child_0_10 with heap2 as AM, which is what we want:
CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler;
CREATE TABLE parent_tab (id int) PARTITION BY RANGE (id) USING heap2;
CREATE TABLE child_0_10 PARTITION OF parent_tab
  FOR VALUES FROM (0) TO (10);

However a dump produces that (content cut except for its most relevant
parts):
CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler;
SET default_tablespace = '';
CREATE TABLE public.parent_tab (
    id integer
)
PARTITION BY RANGE (id);
SET default_table_access_method = heap2;
CREATE TABLE public.child_0_10 (
    id integer
);

This would restore the previous contents incorrectly, where parent_tab
would use heap and child_0_10 would use heap2, causing any partitions
created after the restore to use silently heap.  This is going to
require a logic similar to tablespaces, where generate SET commands
on default_table_access_method so as --no-table-access-method in
pg_dump and pg_restore are able to work correctly.  Having tests to
cover all that is a must-have.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Tue, Mar 28, 2023 at 09:13:10AM +0900, Michael Paquier wrote:
> On Mon, Mar 20, 2023 at 09:30:50AM +0900, Michael Paquier wrote:
> > Did you check dump and restore flows with partition
> > trees and --no-table-access-method?  Perhaps there should be
> > some regression tests with partitioned tables?
> 
> I was looking at the patch, and as I suspected the dumps generated
> are forgetting to apply the AM to the partitioned tables.

The patch said:

+       else if (RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE)

pg_dump was missing a similar change that's conditional on
RELKIND_HAS_TABLE_AM().  It looks like those are the only two places
that need be be specially handled.

I dug up my latest patch from 2021 and incorporated the changes from the
0001 patch here, and added a test case.

I realized that one difference with tablespaces is that, as written,
partitioned tables will *always* have an AM specified,  and partitions
will never use default_table_access_method.  Is that what's intended ?

Or do we need logic similar tablespaces, such that the relam of a
partitioned table is set only if it differs from default_table_am ?

-- 
Justin

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Mon, Mar 27, 2023 at 11:34:35PM -0500, Justin Pryzby wrote:
> I realized that one difference with tablespaces is that, as written,
> partitioned tables will *always* have an AM specified,  and partitions
> will never use default_table_access_method.  Is that what's intended ?
>
> Or do we need logic similar tablespaces, such that the relam of a
> partitioned table is set only if it differs from default_table_am ?

Hmm.  This is a good point.  It is true that the patch feels
incomplete on this side.  I don't see why we could not be flexible,
and allow a value of 0 in a partitioned table's relam to mean that we
would pick up the database default in this case when a partition is
is created on it.  This would have the advantage to be consistent with
older versions where we fallback on the default.  We cannot be
completely consistent with the reltablespace of the leaf partitions
unfortunately, as relam should always be set if a relation has
storage.  And allowing a value of 0 means that there are likely other
tricky cases with dumps?

Another thing: would it make sense to allow an empty string in
default_table_access_method so as we'd always fallback to a database
default?
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Tue, Mar 28, 2023 at 02:56:28PM +0900, Michael Paquier wrote:
> Hmm.  This is a good point.  It is true that the patch feels
> incomplete on this side.  I don't see why we could not be flexible,
> and allow a value of 0 in a partitioned table's relam to mean that we
> would pick up the database default in this case when a partition is
> is created on it.  This would have the advantage to be consistent with
> older versions where we fallback on the default.  We cannot be
> completely consistent with the reltablespace of the leaf partitions
> unfortunately, as relam should always be set if a relation has
> storage.  And allowing a value of 0 means that there are likely other
> tricky cases with dumps?
>
> Another thing: would it make sense to allow an empty string in
> default_table_access_method so as we'd always fallback to a database
> default?

FYI, I am not sure that I will be able to look more at this patch by
the end of the commit fest, and there are quite more points to
consider.  Perhaps at this stage we'd better mark it as returned with
feedback?  I understand that I've arrived late at this party :/
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Mon, Mar 27, 2023 at 11:34:36PM -0500, Justin Pryzby wrote:
> On Tue, Mar 28, 2023 at 09:13:10AM +0900, Michael Paquier wrote:
> > On Mon, Mar 20, 2023 at 09:30:50AM +0900, Michael Paquier wrote:
> > > Did you check dump and restore flows with partition
> > > trees and --no-table-access-method?  Perhaps there should be
> > > some regression tests with partitioned tables?
> > 
> > I was looking at the patch, and as I suspected the dumps generated
> > are forgetting to apply the AM to the partitioned tables.
> 
> The patch said:
> 
> +       else if (RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE)
> 
> pg_dump was missing a similar change that's conditional on
> RELKIND_HAS_TABLE_AM().  It looks like those are the only two places
> that need be be specially handled.
> 
> I dug up my latest patch from 2021 and incorporated the changes from the
> 0001 patch here, and added a test case.
> 
> I realized that one difference with tablespaces is that, as written,
> partitioned tables will *always* have an AM specified,  and partitions
> will never use default_table_access_method.  Is that what's intended ?
> 
> Or do we need logic similar tablespaces, such that the relam of a
> partitioned table is set only if it differs from default_table_am ?

Actually .. I think it'd be a mistake if the relam needed to be
interpretted differently for partitioned tables than other relkinds.

I updated the patch to allow intermediate partitioned tables to inherit
relam from their parent.

Michael wrote:
> .. and there are quite more points to consider.

What more points ?  There's nothing that's been raised here.  In fact,
your message last week is the first comment since last June ..

-- 
Justin

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Thu, Mar 30, 2023 at 12:07:58AM -0500, Justin Pryzby wrote:
> On Mon, Mar 27, 2023 at 11:34:36PM -0500, Justin Pryzby wrote:
> > On Tue, Mar 28, 2023 at 09:13:10AM +0900, Michael Paquier wrote:
> > > On Mon, Mar 20, 2023 at 09:30:50AM +0900, Michael Paquier wrote:
> > > > Did you check dump and restore flows with partition
> > > > trees and --no-table-access-method?  Perhaps there should be
> > > > some regression tests with partitioned tables?
> > > 
> > > I was looking at the patch, and as I suspected the dumps generated
> > > are forgetting to apply the AM to the partitioned tables.
> > 
> > The patch said:
> > 
> > +       else if (RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE)
> > 
> > pg_dump was missing a similar change that's conditional on
> > RELKIND_HAS_TABLE_AM().  It looks like those are the only two places
> > that need be be specially handled.
> > 
> > I dug up my latest patch from 2021 and incorporated the changes from the
> > 0001 patch here, and added a test case.
> > 
> > I realized that one difference with tablespaces is that, as written,
> > partitioned tables will *always* have an AM specified,  and partitions
> > will never use default_table_access_method.  Is that what's intended ?
> > 
> > Or do we need logic similar tablespaces, such that the relam of a
> > partitioned table is set only if it differs from default_table_am ?
> 
> Actually .. I think it'd be a mistake if the relam needed to be
> interpretted differently for partitioned tables than other relkinds.
> 
> I updated the patch to allow intermediate partitioned tables to inherit
> relam from their parent.
> 
> Michael wrote:
> > .. and there are quite more points to consider.
> 
> What more points ?  There's nothing that's been raised here.  In fact,
> your message last week is the first comment since last June ..

Michael ?



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Thu, Mar 30, 2023 at 12:07:58AM -0500, Justin Pryzby wrote:
> What more points ?  There's nothing that's been raised here.  In fact,
> your message last week is the first comment since last June ..

When I wrote this message, I felt like this may still be missing
something in the area of dump/restore.  Perhaps my feeling on the
matter is wrong, so consider this as a self-reminder not to be taken
seriously until I can have a closer look at what's proposed here for
v17.  :p
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Mon, Apr 24, 2023 at 07:18:54PM -0500, Justin Pryzby wrote:
> On Thu, Mar 30, 2023 at 12:07:58AM -0500, Justin Pryzby wrote:
>> Actually .. I think it'd be a mistake if the relam needed to be
>> interpretted differently for partitioned tables than other relkinds.
>>
>> I updated the patch to allow intermediate partitioned tables to inherit
>> relam from their parent.
>
> Michael ?

Sorry for dropping the subject for so long.  I have spent some time
looking at the patch.  Here are a few comments.

pg_class.h includes the following:

/*
 * Relation kinds that support tablespaces: All relation kinds with storage
 * support tablespaces, except that we don't support moving sequences around
 * into different tablespaces.  Partitioned tables and indexes don't have
 * physical storage, but they have a tablespace settings so that their
 * children can inherit it.
 */
#define RELKIND_HAS_TABLESPACE(relkind) \
    ((RELKIND_HAS_STORAGE(relkind) || RELKIND_HAS_PARTITIONS(relkind)) \
     && (relkind) != RELKIND_SEQUENCE)
[...]
/*
 * Relation kinds with a table access method (rd_tableam).  Although sequences
 * use the heap table AM, they are enough of a special case in most uses that
 * they are not included here.
 */
#define RELKIND_HAS_TABLE_AM(relkind) \
    ((relkind) == RELKIND_RELATION || \
     (relkind) == RELKIND_TOASTVALUE || \
     (relkind) == RELKIND_MATVIEW)

It would look much more consistent with the tablespace case if we
included partitioned tables in this case, but this refers to
rd_tableam for the relcache which we surely don't want to fill for
partitioned tables.  I guess that at minimum a comment is in order?
RELKIND_HAS_TABLE_AM() is much more spread than
RELKIND_HAS_TABLESPACE().

         * No need to add an explicit dependency for the toast table, as the
         * main table depends on it.
         */
-       if (RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE)
+       if ((RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE) ||
+               relkind == RELKIND_PARTITIONED_TABLE)

The comment at the top of this code block needs an update.

    if (stmt->accessMethod != NULL)
+       accessMethodId = get_table_am_oid(stmt->accessMethod, false);
    else if (stmt->partbound &&
+           (RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE))
    {
+       /*
+        * For partitions, if no access method is specified, default to the AM
+        * of the parent table.
+        */
+       Assert(list_length(inheritOids) == 1);
+       accessMethodId = get_rel_relam(linitial_oid(inheritOids));
+       if (!OidIsValid(accessMethodId))
+           accessMethodId = get_table_am_oid(default_table_access_method, false);
    }
+   else if (RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE)
+       accessMethodId = get_table_am_oid(default_table_access_method, false);

This structure seems a bit weird.  Could it be cleaner to group the
second and third blocks together?  I would imagine:
if (accessMethod != NULL)
{
    //Extract the AM defined in the statement
}
else
{
    //This is a relkind that can use a default table AM.
    if (RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE)
    {
        if (stmt->partbound)
    {
        //This is a partition, so look at what its partitioned
        //table holds.
    }
    else
    {
        //No partition, grab the default.
    }
    }
}

+           /*
+            * Only do this for partitioned tables, for which this is just a
+            * catalog change.  Tables with storage are handled by Phase 3.
+            */
+           if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+               ATExecSetAccessMethodNoStorage(rel, tab->newAccessMethod);

Okay, there is a parallel with tablespaces in this logic.

Speaking of which, ATExecSetAccessMethodNoStorage() does a catalog
update even if ALTER TABLE is defined to use the same table AM as what
is currently set.  There is no need to update the relation's pg_class
entry in this case.  Be careful that InvokeObjectPostAlterHook() still
needs to be checked in this case.  Perhaps some tests should be added
in test_oat_hooks.sql?  It would be tempted to add a new SQL file for
that.

+   else if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+   {
+       /* Do nothing: it's a catalog settings for partitions to inherit */
+   }
Actually, we could add an assertion telling that rd_rel->relam will
always be valid.

-       if (RELKIND_HAS_TABLE_AM(tbinfo->relkind))
+       if (RELKIND_HAS_TABLE_AM(tbinfo->relkind) ||
+           tbinfo->relkind == RELKIND_PARTITIONED_TABLE)
            tableam = tbinfo->amname;
I have spent some time pondering on this particular change, concluding
that it should be OK.  It passes my tests, as well.

+-- partition hierarchies
+-- upon ALTER, new children will inherit the new am, whereas the existing
+-- children will remain untouched
 CREATE TABLE am_partitioned(x INT, y INT)
   PARTITION BY hash (x);
+CREATE TABLE am_partitioned_1 PARTITION OF am_partitioned FOR VALUES WITH (MODULUS 3,REMAINDER 0);
+CREATE TABLE am_partitioned_2 PARTITION OF am_partitioned FOR VALUES WITH (MODULUS 3,REMAINDER 1);
+ALTER TABLE am_partitioned_1 SET ACCESS METHOD heap2;
 ALTER TABLE am_partitioned SET ACCESS METHOD heap2;

Hmm.  I think that we should rewrite a bit this test rather than just
adding contents on top of it.  There is also an extra test I would be
interesting in here: a partition tree with 2 levels of depth, an ALTER
TABLE SET ACCESS METHOD run at level 1 on a partitioned table, and
some new partitions attached to it to check that the new partitions
inherit from the level 1 partitioned table, not the top-parent.

alter_table.sgml should be updated to explain what happens when SET
ACCESS METHOD is applied on a partitioned table.  See for example SET
TABLESPACE that explains what happens to partitions created
afterwards, telling that there is no rewrite in this case.

The regression test added to check pg_dump with a partition tree and
the two table AMs was mixed with an existing one, but it seems to me
that it should be independent of the rest?  I have tweaked that as in
the attached, on the way, using one partition that relies on the
default defined by the parent, and a second that has a USING clause on
heap.  I did not touch the rest of the code.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Thu, May 25, 2023 at 03:49:12PM +0900, Michael Paquier wrote:
> looking at the patch.  Here are a few comments.

...
>          * No need to add an explicit dependency for the toast table, as the
>          * main table depends on it.
>          */
> -       if (RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE)
> +       if ((RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE) ||
> +               relkind == RELKIND_PARTITIONED_TABLE)
> 
> The comment at the top of this code block needs an update.

What do you think the comment ought to say ?  It already says:

src/backend/catalog/heap.c-              * Make a dependency link to force the relation to be deleted if its
src/backend/catalog/heap.c-              * access method is.

> Speaking of which, ATExecSetAccessMethodNoStorage() does a catalog
> update even if ALTER TABLE is defined to use the same table AM as what
> is currently set.  There is no need to update the relation's pg_class
> entry in this case.  Be careful that InvokeObjectPostAlterHook() still
> needs to be checked in this case.  Perhaps some tests should be added
> in test_oat_hooks.sql?  It would be tempted to add a new SQL file for
> that.

Are you suggesting to put this in a conditional: if oldrelam!=newAccessMethod ?

+       ((Form_pg_class) GETSTRUCT(tuple))->relam = newAccessMethod;
+       CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
+
+       /* Update dependency on new AM */
+       changeDependencyFor(RelationRelationId, relid, AccessMethodRelationId,
+                                               oldrelam, newAccessMethod);

Why is that desirable ?  I'd prefer to see it written with fewer
conditionals, not more; then, it hits the same path every time.

This ought to address your other comments.

-- 
Justin

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Wed, May 31, 2023 at 06:35:34PM -0500, Justin Pryzby wrote:
> What do you think the comment ought to say ?  It already says:
>
> src/backend/catalog/heap.c-              * Make a dependency link to force the relation to be deleted if its
> src/backend/catalog/heap.c-              * access method is.

This is the third location where we rely on the fact that
RELKIND_HAS_TABLE_AM() does not include RELKIND_PARTITIONED_TABLE, so
it seems worth documenting what we are relying on as a comment?  Say:
 * Make a dependency link to force the relation to be deleted if its
 * access method is.
 *
 * No need to add an explicit dependency for the toast table, as the
 * main table depends on it.  Partitioned tables have a table access
 * method defined, and RELKIND_HAS_TABLE_AM ignores them.

>> Speaking of which, ATExecSetAccessMethodNoStorage() does a catalog
>> update even if ALTER TABLE is defined to use the same table AM as what
>> is currently set.  There is no need to update the relation's pg_class
>> entry in this case.  Be careful that InvokeObjectPostAlterHook() still
>> needs to be checked in this case.  Perhaps some tests should be added
>> in test_oat_hooks.sql?  It would be tempted to add a new SQL file for
>> that.
>
> Are you suggesting to put this in a conditional: if oldrelam!=newAccessMethod ?

Yes, that's what I would add with a few lines close to the beginning
of ATExecSetTableSpaceNoStorage() to save from catalog updates if
these are not needed.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Daniel Gustafsson
Date:
Have you had a chance to address the comments raised by Michael in his last
review such that a new patch revision can be submitted?

--
Daniel Gustafsson




Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Thu, Jun 01, 2023 at 08:50:50AM -0400, Michael Paquier wrote:
> >> Speaking of which, ATExecSetAccessMethodNoStorage() does a catalog
> >> update even if ALTER TABLE is defined to use the same table AM as what
> >> is currently set.  There is no need to update the relation's pg_class
> >> entry in this case.  Be careful that InvokeObjectPostAlterHook() still
> >> needs to be checked in this case.  Perhaps some tests should be added
> >> in test_oat_hooks.sql?  It would be tempted to add a new SQL file for
> >> that.
> >
> > Are you suggesting to put this in a conditional: if oldrelam!=newAccessMethod ?
> 
> Yes, that's what I would add with a few lines close to the beginning
> of ATExecSetTableSpaceNoStorage() to save from catalog updates if
> these are not needed.

I understand that it's possible for it to be conditional, but I don't
undertand why it's desirable/important ?

It still seems preferable to be unconditional.

On Wed, May 31, 2023 at 06:35:34PM -0500, Justin Pryzby wrote:
>> Why is that desirable ?  I'd prefer to see it written with fewer
>> conditionals, not more; then, it hits the same path every time.

It's not conditional in the tablespace code that this follows, nor
others like ATExecSetStatistics(), ATExecForceNoForceRowSecurity(),
ATExecSetCompression(), etc, some of which are recently-added.  If it's
important to do here, isn't it also important to do everywhere else ?

-- 
Justin



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Sun, Jul 16, 2023 at 09:37:28PM -0500, Justin Pryzby wrote:
> I understand that it's possible for it to be conditional, but I don't
> undertand why it's desirable/important ?

Because it's cheaper on repeated commands, like no CCI necessary.

> It's not conditional in the tablespace code that this follows, nor
> others like ATExecSetStatistics(), ATExecForceNoForceRowSecurity(),
> ATExecSetCompression(), etc, some of which are recently-added.  If it's
> important to do here, isn't it also important to do everywhere else ?

Good point here.  I am OK to discard this point.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Thu, Jun 01, 2023 at 08:50:50AM -0400, Michael Paquier wrote:
> On Wed, May 31, 2023 at 06:35:34PM -0500, Justin Pryzby wrote:
> > What do you think the comment ought to say ?  It already says:
> >
> > src/backend/catalog/heap.c-              * Make a dependency link to force the relation to be deleted if its
> > src/backend/catalog/heap.c-              * access method is.
> 
> This is the third location where we rely on the fact that
> RELKIND_HAS_TABLE_AM() does not include RELKIND_PARTITIONED_TABLE, so
> it seems worth documenting what we are relying on as a comment?  Say:
>  * Make a dependency link to force the relation to be deleted if its
>  * access method is.
>  *
>  * No need to add an explicit dependency for the toast table, as the
>  * main table depends on it.  Partitioned tables have a table access
>  * method defined, and RELKIND_HAS_TABLE_AM ignores them.

You said that this location "relies on" the macro not including
partitioned tables, but I would say the opposite: the places that use
RELKIND_HAS_TABLE_AM() and do *not* say "or relkind==PARTITIONED_TABLE"
are the ones that "rely on" that...

Anyway, this updates various comments.  No other changes.

-- 
Justin

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Peter Smith
Date:
2024-01 Commitfest.

Hi, this patch was marked in CF as "Needs Review" [1], but there has
been no activity on this thread for 6+ months.

Is anything else planned, or can you post something to elicit more
interest in reviews for the latest patch? Otherwise, if nothing
happens then the CF entry will be closed ("Returned with feedback") at
the end of this CF.

======
[1] https://commitfest.postgresql.org/46/3727/

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
> @@ -947,20 +947,22 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
>       * a type of relation that needs one, use the default.
>       */
>      if (stmt->accessMethod != NULL)
> +        accessMethodId = get_table_am_oid(stmt->accessMethod, false);
> +    else if (RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE)
>      {
> -        accessMethod = stmt->accessMethod;
> -
> -        if (partitioned)
> -            ereport(ERROR,
> -                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> -                     errmsg("specifying a table access method is not supported on a partitioned table")));
> +        if (stmt->partbound)
> +        {
> +            /*
> +             * For partitions, if no access method is specified, use the AM of
> +             * the parent table.
> +             */
> +            Assert(list_length(inheritOids) == 1);
> +            accessMethodId = get_rel_relam(linitial_oid(inheritOids));
> +            Assert(OidIsValid(accessMethodId));
> +        }
> +        else
> +            accessMethodId = get_table_am_oid(default_table_access_method, false);
>      }

I think this works similarly but not identically to tablespace defaults,
and the difference could be confusing.  You seem to have made it so that
the partitioned table _always_ have a table AM, so the partitions can
always inherit from it.  I think it would be more sensible to _allow_
partitioned tables to have one, but not mandatory; if they don't have
it, then a partition created from it would use default_table_access_method.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree.              (Don Knuth)



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Thu, Feb 01, 2024 at 04:50:49PM +0100, Alvaro Herrera wrote:
> I think this works similarly but not identically to tablespace defaults,
> and the difference could be confusing.  You seem to have made it so that
> the partitioned table _always_ have a table AM, so the partitions can
> always inherit from it.  I think it would be more sensible to _allow_
> partitioned tables to have one, but not mandatory; if they don't have
> it, then a partition created from it would use default_table_access_method.

You mean to allow a value of 0 in pg_class.relam on a partitioned
table to allow any partitions created on it to use the default AM in
the GUC when the partition is created?  Yes, this inconsistency was
bothering me as well in the patch.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Robert Haas
Date:
On Thu, Feb 1, 2024 at 10:51 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> I think this works similarly but not identically to tablespace defaults,
> and the difference could be confusing.  You seem to have made it so that
> the partitioned table _always_ have a table AM, so the partitions can
> always inherit from it.  I think it would be more sensible to _allow_
> partitioned tables to have one, but not mandatory; if they don't have
> it, then a partition created from it would use default_table_access_method.

I agree that we don't want this feature to invent any new behavior. If
it's clearly and fully parallel to what we do for tablespaces, then I
think it's probably OK, but anything less than that would be a cause
for concern for me.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Peter Eisentraut
Date:
On 19.07.23 20:13, Justin Pryzby wrote:
> On Thu, Jun 01, 2023 at 08:50:50AM -0400, Michael Paquier wrote:
>> On Wed, May 31, 2023 at 06:35:34PM -0500, Justin Pryzby wrote:
>>> What do you think the comment ought to say ?  It already says:
>>>
>>> src/backend/catalog/heap.c-              * Make a dependency link to force the relation to be deleted if its
>>> src/backend/catalog/heap.c-              * access method is.
>>
>> This is the third location where we rely on the fact that
>> RELKIND_HAS_TABLE_AM() does not include RELKIND_PARTITIONED_TABLE, so
>> it seems worth documenting what we are relying on as a comment?  Say:
>>   * Make a dependency link to force the relation to be deleted if its
>>   * access method is.
>>   *
>>   * No need to add an explicit dependency for the toast table, as the
>>   * main table depends on it.  Partitioned tables have a table access
>>   * method defined, and RELKIND_HAS_TABLE_AM ignores them.
> 
> You said that this location "relies on" the macro not including
> partitioned tables, but I would say the opposite: the places that use
> RELKIND_HAS_TABLE_AM() and do *not* say "or relkind==PARTITIONED_TABLE"
> are the ones that "rely on" that...
> 
> Anyway, this updates various comments.  No other changes.

It would be helpful if this patch could more extensively document in its 
commit message what semantic changes it makes.  Various options of 
possible behaviors were discussed in this thread, but it's not clear 
which behaviors were chosen in this particular patch version.

The general idea is that you can set an access method on a partitioned 
table.  That much seems very agreeable.  But then what happens with this 
setting, how can you override it, how can you change it, what happens 
when you change it, what happens with existing partitions and new 
partitions, etc. -- and which of these behaviors are new and old.  Many 
things to specify.




Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Tue, Feb 20, 2024 at 03:47:46PM +0100, Peter Eisentraut wrote:
> It would be helpful if this patch could more extensively document in its
> commit message what semantic changes it makes.  Various options of possible
> behaviors were discussed in this thread, but it's not clear which behaviors
> were chosen in this particular patch version.
>
> The general idea is that you can set an access method on a partitioned
> table.  That much seems very agreeable.  But then what happens with this
> setting, how can you override it, how can you change it, what happens when
> you change it, what happens with existing partitions and new partitions,
> etc. -- and which of these behaviors are new and old.  Many things to
> specify.

The main point in this patch is the following code block in
DefineRelation(), that defines the semantics about the AM set for a
partitioned table:
+    else if (RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE)
     {
+        if (stmt->partbound)
+        {
+            /*
+             * For partitions, if no access method is specified, use the AM of
+             * the parent table.
+             */
+            Assert(list_length(inheritOids) == 1);
+            accessMethodId = get_rel_relam(linitial_oid(inheritOids));
+            Assert(OidIsValid(accessMethodId));
+        }
+        else
+            accessMethodId = get_table_am_oid(default_table_access_method, false);
     }

This means that all partitioned tables would have pg_class.relam set,
and that relam would never be 0:
- The USING clause takes priority over default_table_access_method.
- If no USING clause, default_table_access_method is the AM used

Any partitions created from this partitioned table would inherit the
AM set, ignoring default_table_access_method.

Alvaro has made a very good point a couple of days ago at [1] where we
should try to make the behavior stick closer to tablespaces, where it
could be possible to set relam to 0 for a partitioned table, where a
partition would inherit the AM set in the GUC when a USING clause is
not defined (if USING specifies the AM, we'd just use it).

Existing partitions should not be changed if the AM of their
partitioned table changes, so you can think of the AM as a hint for
the creation of new partitions.

[1]: https://www.postgresql.org/message-id/202402011550.sfszd46247zi@alvherre.pgsql
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Peter Eisentraut
Date:
On 21.02.24 07:40, Michael Paquier wrote:
> This means that all partitioned tables would have pg_class.relam set,
> and that relam would never be 0:
> - The USING clause takes priority over default_table_access_method.
> - If no USING clause, default_table_access_method is the AM used
> 
> Any partitions created from this partitioned table would inherit the
> AM set, ignoring default_table_access_method.
> 
> Alvaro has made a very good point a couple of days ago at [1] where we
> should try to make the behavior stick closer to tablespaces, where it
> could be possible to set relam to 0 for a partitioned table, where a
> partition would inherit the AM set in the GUC when a USING clause is
> not defined (if USING specifies the AM, we'd just use it).

Yes, I think most people agreed that that would be the preferred behavior.




Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Wed, Feb 21, 2024 at 08:46:48AM +0100, Peter Eisentraut wrote:
> Yes, I think most people agreed that that would be the preferred behavior.

Challenge accepted.  As of the patch attached.

Tablespaces rely MyDatabaseTableSpace to fallback to the database's
default if not specified, but we cannot do that for table AMs as there
is no equivalent to dattablespace.

I have implemented that so as we keep the default, historical
behavior: if pg_class.relam is 0 for a partitioned table, use the AM
defined by default_table_access_method.  The patch only adds a path to
switch to a different AM than the GUC when creating a new partition if
and only if a partitioned table has been manipulated with ALTER TABLE
SET ACCESS METHOD to update its AM to something else than the GUC.
Similarly to tablespaces, CREATE TABLE USING is *not* supported for
partitioned tables, same behavior as previously.

There is a bit more regarding the handling of the entries in
pg_depend, but nothing really complicated, knowing that there can be
three possible patterns:
- Add a new dependency if changing the AM to be something different
than the GUC.
- Remove the dependency if changing the AM to the value of the GUC,
when something existing previously.
- Update the dependency if switching between AMs that don't refer to
the GUC at all.

If the AM of a partitioned table is not changed, there is no need to
update the catalogs at all.  The prep phase of the sub-command is
already aware of that, setting the new AM OID to InvalidOid in this
case.

The attached includes regression tests that check all the dependency
entries, the contents of pg_class for partitioned tables, as well as
the creation of partitions when pg_class.relam is not 0.  I'd welcome
more eyes regarding these changes.  pg_dump needs to be tweaked to
save the AM information of a partitioned table, like the previous
versions.  There are tests for these dump patterns, that needed a
slight tweak to work.  Docs have been refreshed.

Thoughts, comments?
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Wed, Feb 28, 2024 at 05:08:49PM +0900, Michael Paquier wrote:
> On Wed, Feb 21, 2024 at 08:46:48AM +0100, Peter Eisentraut wrote:
> > Yes, I think most people agreed that that would be the preferred behavior.
> 
> Challenge accepted.  As of the patch attached.

Thanks for picking it up.  I find it pretty hard to switch back to
put the needed effort into a patch after a long period.

> I have implemented that so as we keep the default, historical
> behavior: if pg_class.relam is 0 for a partitioned table, use the AM
> defined by default_table_access_method.  The patch only adds a path to
> switch to a different AM than the GUC when creating a new partition if
> and only if a partitioned table has been manipulated with ALTER TABLE
> SET ACCESS METHOD to update its AM to something else than the GUC.
> Similarly to tablespaces, CREATE TABLE USING is *not* supported for
> partitioned tables, same behavior as previously.

This patch allows resetting relam=0 by running ALTER TABLE SET AM to the
same value as the GUC.  Maybe it'd be better to have an explicit SET
DEFAULT (as in b9424d01 and 4f622503).

-- 
Justin



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Thu, Feb 29, 2024 at 08:51:31AM -0600, Justin Pryzby wrote:
> On Wed, Feb 28, 2024 at 05:08:49PM +0900, Michael Paquier wrote:
>> I have implemented that so as we keep the default, historical
>> behavior: if pg_class.relam is 0 for a partitioned table, use the AM
>> defined by default_table_access_method.  The patch only adds a path to
>> switch to a different AM than the GUC when creating a new partition if
>> and only if a partitioned table has been manipulated with ALTER TABLE
>> SET ACCESS METHOD to update its AM to something else than the GUC.
>> Similarly to tablespaces, CREATE TABLE USING is *not* supported for
>> partitioned tables, same behavior as previously.
>
> This patch allows resetting relam=0 by running ALTER TABLE SET AM to the
> same value as the GUC.  Maybe it'd be better to have an explicit SET
> DEFAULT (as in b9424d01 and 4f622503).

Outside the scope of this patch's thread, this looks like a good idea
even for tables/matviews.  And the semantics are pretty easy: if DEFAULT
is specified, just set the access method to NULL in the parser and let
tablecmds.c go the AM OID lookup in the prep phase if set to NULL.
See 0001 attached.  This one looks pretty good taken as an independent
piece.

When it comes to partitioned tables, there is a still a tricky case:
what should we do when a user specifies a non-default value in the SET
ACCESS METHOD clause and it matches default_table_access_method?
Should the relam be 0 or should we force relam to be the OID of the
given value given by the query?  Implementation-wise, forcing the
value to 0 is simpler, but I can get why it could be confusing as
well, because the state of the catalogs does not reflect what was
provided in the query.  At the same time, the user has explicitly set
the access method to be the same as the default, so perhaps 0 makes
sense anyway in this case.

0002 does that, as that's simpler.  I'm not sure if there is a case
for forcing a value in relam if the query has the same value as the
default.  Thoughts?
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Jelte Fennema-Nio
Date:
On Fri, 1 Mar 2024 at 02:57, Michael Paquier <michael@paquier.xyz> wrote:
> When it comes to partitioned tables, there is a still a tricky case:
> what should we do when a user specifies a non-default value in the SET
> ACCESS METHOD clause and it matches default_table_access_method?
> Should the relam be 0 or should we force relam to be the OID of the
> given value given by the query?  Implementation-wise, forcing the
> value to 0 is simpler, but I can get why it could be confusing as
> well, because the state of the catalogs does not reflect what was
> provided in the query.  At the same time, the user has explicitly set
> the access method to be the same as the default, so perhaps 0 makes
> sense anyway in this case.

I think we should set the AM OID explicitly. Because an important
thing to consider is: What behaviour makes sense when later
default_table_access_method is changed?
I think if someone sets it explicitly on the partitioned table, they
would want the AM of the partitioned table to stay the same when
default_table_access_method is changed. Which requires storing the AM
OID afaict.



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Fri, Mar 01, 2024 at 05:43:25AM +0100, Jelte Fennema-Nio wrote:
> I think we should set the AM OID explicitly. Because an important
> thing to consider is: What behaviour makes sense when later
> default_table_access_method is changed?

Per the latest discussion of the thread, we've kind of reached a
consensus that we should keep the current historical bevahior on
default, where relam remains at 0, causing new partitions to grab the
GUC as AM.  If we create a partitioned table attached to a partitioned
table, it should be 0 as well.  If the partitioned table has a non-0
relam, a new partitioned table created on it will inherit the same
non-0 value.

> I think if someone sets it explicitly on the partitioned table, they
> would want the AM of the partitioned table to stay the same when
> default_table_access_method is changed. Which requires storing the AM
> OID afaict.

If we allow relam to be non-0 for a partitioned table, it is equally
important to give users a way to reset it at will.  My point was a bit
more subtle than that.  For example, this sequence is clear to me:
SET default_table_access_method = 'foo';
ALTER TABLE part SET ACCESS METHOD DEFAULT;

The user wants to rely on the GUC, so relam should be 0, new
partitions created on it will use the GUC.

Now, what should this sequence mean?  See:
SET default_table_access_method = 'foo';
ALTER TABLE part SET ACCESS METHOD foo;

Should the relam be 0 because the user requested a match with the GUC,
or use the OID of the AM?  There has to be some difference with
tablespaces, because relations with physical storage (tables,
matviews) can use a reltablespace of 0, but AMs have to be set for
tables and matviews.

Fun topic, especially once coupled with the internals of tablecmds.c
that uses InvalidOid for the new access AM as a special value to work
as a no-op.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Fri, Mar 01, 2024 at 05:43:25AM +0100, Jelte Fennema-Nio wrote:
> I think we should set the AM OID explicitly. Because an important
> thing to consider is: What behaviour makes sense when later
> default_table_access_method is changed?
> I think if someone sets it explicitly on the partitioned table, they
> would want the AM of the partitioned table to stay the same when
> default_table_access_method is changed. Which requires storing the AM
> OID afaict.

Oops, I think I misread that.  You just mean to always set relam when
using an AM in the SET ACCESS METHOD clause.  Apologies for the noise.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Jelte Fennema-Nio
Date:
On Fri, 1 Mar 2024 at 06:15, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Mar 01, 2024 at 05:43:25AM +0100, Jelte Fennema-Nio wrote:
> > I think we should set the AM OID explicitly. Because an important
> > thing to consider is: What behaviour makes sense when later
> > default_table_access_method is changed?
> > I think if someone sets it explicitly on the partitioned table, they
> > would want the AM of the partitioned table to stay the same when
> > default_table_access_method is changed. Which requires storing the AM
> > OID afaict.
>
> Oops, I think I misread that.  You just mean to always set relam when
> using an AM in the SET ACCESS METHOD clause.  Apologies for the noise.

Correct, I intended to say that "SET ACCESS METHOD heap" on a
partitioned table should store heap its OID. Because while storing 0
might be simpler, it will result in (imho) wrong behaviour when later
the default_table_access_method is changed. behavior won't result in
the (imho) intended. i.e. it's not simply a small detail in what the
catolog looks like, but there's an actual behavioural change.



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Fri, Mar 01, 2024 at 10:56:50AM +0900, Michael Paquier wrote:
> On Thu, Feb 29, 2024 at 08:51:31AM -0600, Justin Pryzby wrote:
> > On Wed, Feb 28, 2024 at 05:08:49PM +0900, Michael Paquier wrote:
> >> I have implemented that so as we keep the default, historical
> >> behavior: if pg_class.relam is 0 for a partitioned table, use the AM
> >> defined by default_table_access_method.  The patch only adds a path to
> >> switch to a different AM than the GUC when creating a new partition if
> >> and only if a partitioned table has been manipulated with ALTER TABLE
> >> SET ACCESS METHOD to update its AM to something else than the GUC.
> >> Similarly to tablespaces, CREATE TABLE USING is *not* supported for
> >> partitioned tables, same behavior as previously.
> > 
> > This patch allows resetting relam=0 by running ALTER TABLE SET AM to the
> > same value as the GUC.  Maybe it'd be better to have an explicit SET
> > DEFAULT (as in b9424d01 and 4f622503).
> 
> Outside the scope of this patch's thread, this looks like a good idea
> even for tables/matviews.  And the semantics are pretty easy: if DEFAULT
> is specified, just set the access method to NULL in the parser and let
> tablecmds.c go the AM OID lookup in the prep phase if set to NULL.
> See 0001 attached.  This one looks pretty good taken as an independent
> piece.
> 
> When it comes to partitioned tables, there is a still a tricky case:
> what should we do when a user specifies a non-default value in the SET
> ACCESS METHOD clause and it matches default_table_access_method?

I don't think it's tricky - it seems more like a weird hack in the
previous patch version to make AMs behave like tablespaces, despite not
being completely parallel, due to the absence of a pg_default AM.

With the new 001, the hack can go away, and so it should.

> Should the relam be 0 or should we force relam to be the OID of the
> given value given by the query?

You said "force" it to be the user-specified value, but I think that's
not "forcing", it's respecting (but to take the user's desired value,
and conditionally store 0 instead, that could be described as
"forcing.")

> Implementation-wise, forcing the value to 0 is simpler, but I can get
> why it could be confusing as well, because the state of the catalogs
> does not reflect what was provided in the query.

> At the same time, the user has explicitly set the access method to be
> the same as the default, so perhaps 0 makes sense anyway in this case.

I think if the user sets something "explicitly", the catalog should
reflect what they set.  Tablespaces have dattablespace, but AMs don't --
it's a simpler case.

For 001: we don't *need* to support "ALTER SET AM default" for leaf
tables.  It doesn't do anything that's not already possible.  But, if
AMs for partitioned tables are optional rather than required, then seems
to be needed to allow (re)settinng relam=0.

But for partitioned tables, I think it should set relam=0 directly.
Currently it 1) falls through to default_table_am; and 2) detects that
it's the default, so then sets relam to 0.  Since InvalidOid 

On Fri, Mar 01, 2024 at 02:03:48PM +0900, Michael Paquier wrote:
> Fun topic, especially once coupled with the internals of tablecmds.c
> that uses InvalidOid for the new access AM as a special value to work
> as a no-op.

Since InvalidOid is already taken, I guess you might need to introduce a
boolean flag, like set_relam, indicating that the statement has an
ACCESS METHOD clause.

> + * method defined so as their children can inherit it; however, this is handled

so that

> +         * Do nothing: access methods is a setting that partitions can

method (singular), or s/is/are/

On Wed, Feb 28, 2024 at 05:08:49PM +0900, Michael Paquier wrote:
> Similarly to tablespaces, CREATE TABLE USING is *not* supported for
> partitioned tables, same behavior as previously.

Maybe I misunderstood what you're trying to say, but CREATE..TABLESPACE
*is* supported for partitioned tables.  I'm not sure why it wouldn't be
supported to set the AM, too.

In any case, it'd be a bit confusing for the error message to still say:

postgres=# CREATE TABLE a(i int) PARTITION BY RANGE(a) USING heap2;
ERROR:  specifying a table access method is not supported on a partitioned table

-- 
Justin



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Fri, Mar 01, 2024 at 03:03:14PM -0600, Justin Pryzby wrote:
> I think if the user sets something "explicitly", the catalog should
> reflect what they set.  Tablespaces have dattablespace, but AMs don't --
> it's a simpler case.

Okay.

> For 001: we don't *need* to support "ALTER SET AM default" for leaf
> tables.  It doesn't do anything that's not already possible.  But, if
> AMs for partitioned tables are optional rather than required, then seems
> to be needed to allow (re)settinng relam=0.

Indeed, for non-partitioned tables DEFAULT is a sugar flavor.  Not
mandatory, still it's nice to have to not have to type an AM.

> But for partitioned tables, I think it should set relam=0 directly.
> Currently it 1) falls through to default_table_am; and 2) detects that
> it's the default, so then sets relam to 0.
>
> Since InvalidOid is already taken, I guess you might need to introduce a
> boolean flag, like set_relam, indicating that the statement has an
> ACCESS METHOD clause.

Yes, I don't see an alternative.  The default needs a different field
to be tracked down to the execution.

>> + * method defined so as their children can inherit it; however, this is handled
>
> so that
>
>> +         * Do nothing: access methods is a setting that partitions can
>
> method (singular), or s/is/are/

Indeed.  Fixed both.

> In any case, it'd be a bit confusing for the error message to still say:
>
> postgres=# CREATE TABLE a(i int) PARTITION BY RANGE(a) USING heap2;
> ERROR:  specifying a table access method is not supported on a partitioned table

I was looking at this one as well and I don't see why we could not
remove it, so you are right (missed the tablespace part last week).  A
partitioned table created as a partition of a partitioned table would
inherit the relam of its parent (0 if default is set, or non-0 is
something is set).  I have added some regression tests for that.

And I'm finishing with the attached.  To summarize SET ACCESS METHOD
on a partitioned table, the semantics are:
- DEFAULT sets the relam to 0, any partitions with storage would use
the GUC at creation time.  Partitioned tables use a relam of 0.
- If a value is set for the am, relam becomes non-0.  Any partitions
created on it inherit it (partitioned as well as non-partitioned
tables).
- No USING clause means to set its relam to 0.

0001 seems OK here, 0002 needs more eyes.  The bulk of the changes is
in the regression tests to cover all the cases I could think of.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Mon, Mar 04, 2024 at 05:46:56PM +0900, Michael Paquier wrote:
> > Since InvalidOid is already taken, I guess you might need to introduce a
> > boolean flag, like set_relam, indicating that the statement has an
> > ACCESS METHOD clause.
> 
> Yes, I don't see an alternative.  The default needs a different field
> to be tracked down to the execution.

The data structure you used (defaultAccessMethod) allows this, which
is intended to be prohibited:

postgres=# ALTER TABLE a SET access method default, SET access method default;
ALTER TABLE

As you wrote it, you pass the "defaultAccessMethod" bool to
ATExecSetAccessMethodNoStorage(), which seems odd.  Why not just pass
the target amoid as newAccessMethod ?

When I fooled with this on my side, I called it "chgAccessMethod" to
follow "chgPersistence".  I think "is default" isn't the right data
structure.

Attached a relative patch with my version.

Also: I just realized that instead of adding a bool, we could test
(tab->rewrite & AT_REWRITE_ACCESS_METHOD) != 0

+-- Default and AM set in in clause are the same, relam should be set.

in in?

-- 
Justin

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Thu, Mar 07, 2024 at 08:02:00PM -0600, Justin Pryzby wrote:
> As you wrote it, you pass the "defaultAccessMethod" bool to
> ATExecSetAccessMethodNoStorage(), which seems odd.  Why not just pass
> the target amoid as newAccessMethod ?

+    /*
+     * Check that the table access method exists.
+     * Use the access method, if specified, otherwise (when not specified) 0
+     * for partitioned tables or the configured default AM.
+     */
+    if (amname != NULL)
+        amoid = get_table_am_oid(amname, false);
+    else if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+        amoid = 0;
+    else
+        amoid = get_table_am_oid(default_table_access_method, false);

..  While using this flow to choose the AM oid, that's neater than the
versions I could come up with, pretty cool.

> When I fooled with this on my side, I called it "chgAccessMethod" to
> follow "chgPersistence".  I think "is default" isn't the right data
> structure.
>
> Attached a relative patch with my version.

Thanks.  I've applied the patch to add the DEFAULT clause for now, to
ease the work on this patch.

> Also: I just realized that instead of adding a bool, we could test
> (tab->rewrite & AT_REWRITE_ACCESS_METHOD) != 0

Hmm.  I've considered that, but the boolean style is able to do the
work, while being more consistent, so I'm OK with what you are doing
in your 0002.

> +-- Default and AM set in in clause are the same, relam should be set.
>
> in in?

Oops, fixed.

I have spent more time reviewing the whole and the tests (I didn't see
much value in testing the DEFAULT clause twice for the partitioned
table case and there is a test in d61a6cad6418), tweaked a few
comments and the documentation, did an indentation and a commit
message draft.

How does that look to you?  The test coverage and the semantics do
what we want them to do, so that looks rather reasonable here.  A
second or even third pair of eyes would not hurt.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
On 2024-Mar-08, Michael Paquier wrote:

> I have spent more time reviewing the whole and the tests (I didn't see
> much value in testing the DEFAULT clause twice for the partitioned
> table case and there is a test in d61a6cad6418), tweaked a few
> comments and the documentation, did an indentation and a commit
> message draft.
> 
> How does that look to you?  The test coverage and the semantics do
> what we want them to do, so that looks rather reasonable here.  A
> second or even third pair of eyes would not hurt.

I gave this a look.  I found some of the comments a bit confusing or
overly long, so I propose to reword them.  I also propose a small doc
change (during writing which I noticed that the docs for tablespace had
been neglected and one comment too many; patch to be committed
separately soon).  I ended up also moving code in tablecmds.c so that
all the AT*SetAccessMethod* routines appear together rather than mixed
with the ones for tablespaces, and removing one CCI that seems
unnecessary, at the bottom of ATExecSetAccessMethodNoStorage.

0001 is Michaël's patch, 0002 are my proposed changes.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Hay que recordar que la existencia en el cosmos, y particularmente la
elaboración de civilizaciones dentro de él no son, por desgracia,
nada idílicas" (Ijon Tichy)

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
On 2024-Mar-19, Alvaro Herrera wrote:

> 0001 is Michaël's patch, 0002 are my proposed changes.

Doh, I sent the wrong set of attachments.  But I see no reason to post
again: what I attached as 0001 is what I wrote was going to be 0002,
Michaël's patch is already in archives, and the CI tests with both
applied on current master are running here:
https://cirrus-ci.com/build/6404370015715328

Michaël, I'll leave this for you to push ...

Thanks!

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
Given that Michaël is temporarily gone, I propose to push the attached
tomorrow.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
<inflex> really, I see PHP as like a strange amalgamation of C, Perl, Shell
<crab> inflex: you know that "amalgam" means "mixture with mercury",
       more or less, right?
<crab> i.e., "deadly poison"

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael P
Date:
On Mar 21, 2024, at 13:07, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Given that Michaël is temporarily gone, I propose to push the attached
> tomorrow.

Thanks for doing so. I’m wondering whether I should be an author of this patch at this stage, tbh. I wrote all the
testsand rewrote most of the internals to adjust with the consensus reached ;) 
--
Michael



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alexander Lakhin
Date:
Hello Alvaro,

21.03.2024 15:07, Alvaro Herrera wrote:
> Given that Michaël is temporarily gone, I propose to push the attached
> tomorrow.

Please look at a new anomaly introduced with 374c7a229.
Starting from that commit, the following erroneous query:
CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;

triggers an assertion failure:
TRAP: failed Assert("relation->rd_rel->relam == InvalidOid"), File: "relcache.c", Line: 1219, PID: 3706301

with the stack trace:
...
#4  0x00007fe53ced67f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x000055f28555951e in ExceptionalCondition (conditionName=conditionName@entry=0x55f285744788 
"relation->rd_rel->relam == InvalidOid", fileName=fileName@entry=0x55f285743f1c "relcache.c", 
lineNumber=lineNumber@entry=1219)
     at assert.c:66
#6  0x000055f285550450 in RelationBuildDesc (targetRelId=targetRelId@entry=16385, insertIt=insertIt@entry=false) at 
relcache.c:1219
#7  0x000055f285550769 in RelationClearRelation (relation=relation@entry=0x7fe5310dd178, rebuild=rebuild@entry=true) at

relcache.c:2667
#8  0x000055f285550c41 in RelationFlushRelation (relation=0x7fe5310dd178) at relcache.c:2850
#9  0x000055f285550ca0 in RelationCacheInvalidateEntry (relationId=<optimized out>) at relcache.c:2921
#10 0x000055f285542551 in LocalExecuteInvalidationMessage (msg=0x55f2861b3160) at inval.c:738
#11 0x000055f28554159b in ProcessInvalidationMessages (group=0x55f2861b2e6c, func=func@entry=0x55f2855424a8 
<LocalExecuteInvalidationMessage>) at inval.c:518
#12 0x000055f285542740 in CommandEndInvalidationMessages () at inval.c:1180
#13 0x000055f28509cbbd in AtCCI_LocalCache () at xact.c:1550
#14 0x000055f28509e88e in CommandCounterIncrement () at xact.c:1116
#15 0x000055f2851d0c8b in DefineRelation (stmt=stmt@entry=0x55f2861803b0, relkind=relkind@entry=102 'f', ownerId=10, 
ownerId@entry=0, typaddress=typaddress@entry=0x0,
     queryString=queryString@entry=0x55f28617f870 "CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;") at 
tablecmds.c:1008
#16 0x000055f28540945d in ProcessUtilitySlow (pstate=pstate@entry=0x55f2861a9dc0, pstmt=pstmt@entry=0x55f286180510, 
queryString=queryString@entry=0x55f28617f870 "CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;",
     context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0, queryEnv=queryEnv@entry=0x0, 
dest=0x55f2861807d0, qc=0x7fff15b5d7c0) at utility.c:1203
#17 0x000055f28540911f in standard_ProcessUtility (pstmt=0x55f286180510, queryString=0x55f28617f870 "CREATE FOREIGN 
TABLE fp PARTITION OF pg_am DEFAULT SERVER x;", readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL,
     params=0x0, queryEnv=0x0, dest=0x55f2861807d0, qc=0x7fff15b5d7c0) at utility.c:1067
...

On 374c7a229~1 it fails with
ERROR:  "pg_am" is not partitioned

Best regards,
Alexander



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
On 2024-Mar-26, Alexander Lakhin wrote:

> Hello Alvaro,
> 
> 21.03.2024 15:07, Alvaro Herrera wrote:
> > Given that Michaël is temporarily gone, I propose to push the attached
> > tomorrow.
> 
> Please look at a new anomaly introduced with 374c7a229.
> Starting from that commit, the following erroneous query:
> CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;
> 
> triggers an assertion failure:
> TRAP: failed Assert("relation->rd_rel->relam == InvalidOid"), File: "relcache.c", Line: 1219, PID: 3706301

Hmm, yeah, we're setting relam for relations that shouldn't have it.
I propose the attached.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Thu, Mar 21, 2024 at 01:07:01PM +0100, Alvaro Herrera wrote:
> Given that Michaël is temporarily gone, I propose to push the attached
> tomorrow.

Thanks.

On Tue, Mar 26, 2024 at 12:05:47PM +0100, Alvaro Herrera wrote:
> On 2024-Mar-26, Alexander Lakhin wrote:
> 
> > Hello Alvaro,
> > 
> > 21.03.2024 15:07, Alvaro Herrera wrote:
> > > Given that Michaël is temporarily gone, I propose to push the attached
> > > tomorrow.
> > 
> > Please look at a new anomaly introduced with 374c7a229.
> > Starting from that commit, the following erroneous query:
> > CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;
> > 
> > triggers an assertion failure:
> > TRAP: failed Assert("relation->rd_rel->relam == InvalidOid"), File: "relcache.c", Line: 1219, PID: 3706301
> 
> Hmm, yeah, we're setting relam for relations that shouldn't have it.
> I propose the attached.

Looks right.  That's how I originally wrote it, except for the
"stmt->accessMethod != NULL" case.

I prefered my way - the grammar should refuse to set stmt->accessMethod
for inappropriate relkinds.  And you could assert that.

I also prefered to set "accessMethodId = InvalidOid" once, rather than
twice.

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8a02c5b05b6..050be89728f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -962,18 +962,21 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
      * case of a partitioned table) the parent's, if it has one.
      */
     if (stmt->accessMethod != NULL)
-        accessMethodId = get_table_am_oid(stmt->accessMethod, false);
-    else if (stmt->partbound)
     {
-        Assert(list_length(inheritOids) == 1);
-        accessMethodId = get_rel_relam(linitial_oid(inheritOids));
+        Assert(RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE);
+        accessMethodId = get_table_am_oid(stmt->accessMethod, false);
     }
-    else
-        accessMethodId = InvalidOid;
+    else if (RELKIND_HAS_TABLE_AM(relkind) || relkind == RELKIND_PARTITIONED_TABLE)
+    {
+        if (stmt->partbound)
+        {
+            Assert(list_length(inheritOids) == 1);
+            accessMethodId = get_rel_relam(linitial_oid(inheritOids));
+        }
 
-    /* still nothing? use the default */
-    if (RELKIND_HAS_TABLE_AM(relkind) && !OidIsValid(accessMethodId))
-        accessMethodId = get_table_am_oid(default_table_access_method, false);
+        if (RELKIND_HAS_TABLE_AM(relkind) && !OidIsValid(accessMethodId))
+            accessMethodId = get_table_am_oid(default_table_access_method, false);
+    }
 
     /*
      * Create the relation.  Inherited defaults and constraints are passed in



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
On 2024-Mar-26, Justin Pryzby wrote:

> Looks right.  That's how I originally wrote it, except for the
> "stmt->accessMethod != NULL" case.
> 
> I prefered my way - the grammar should refuse to set stmt->accessMethod
> for inappropriate relkinds.  And you could assert that.

Hmm, I didn't like this at first sight, because it looked convoluted and
baroque, but I compared both versions for a while and I ended up liking
yours more than mine, so I adopted it.

> I also prefered to set "accessMethodId = InvalidOid" once, rather than
> twice.

Grumble.  I don't like initialization at declare time, so far from the
code that depends on the value.  But the alternative would have been to
assign right where this blocks starts, an additional line.  I pushed it
like you had it.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"This is what I like so much about PostgreSQL.  Most of the surprises
are of the "oh wow!  That's cool" Not the "oh shit!" kind.  :)"
Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alexander Lakhin
Date:
Hello Alvaro,

28.03.2024 18:58, Alvaro Herrera wrote:
> Grumble.  I don't like initialization at declare time, so far from the
> code that depends on the value.  But the alternative would have been to
> assign right where this blocks starts, an additional line.  I pushed it
> like you had it.

I've stumbled upon a test failure caused by the test query added in that
commit:
--- .../src/test/regress/expected/create_am.out      2024-03-28 12:14:11.700764888 -0400
+++ .../src/test/recovery/tmp_check/results/create_am.out 2024-03-31 03:10:28.172244122 -0400
@@ -549,7 +549,10 @@
  ERROR:  access method "btree" is not of type TABLE
  -- Other weird invalid cases that cause problems
  CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;
-ERROR:  "pg_am" is not partitioned
+ERROR:  deadlock detected
+DETAIL:  Process 3076180 waits for AccessShareLock on relation 1259 of database 16386; blocked by process 3076181.
+Process 3076181 waits for AccessShareLock on relation 2601 of database 16386; blocked by process 3076180.
+HINT:  See server log for query details.
  -- Drop table access method, which fails as objects depends on it
  DROP ACCESS METHOD heap2;
  ERROR:  cannot drop access method heap2 because other objects depend on it

027_stream_regress_primary.log contains:
2024-03-31 03:10:26.728 EDT [3076181] pg_regress/vacuum LOG: statement: VACUUM FULL pg_class;
...
2024-03-31 03:10:26.797 EDT [3076180] pg_regress/create_am LOG: statement: CREATE FOREIGN TABLE fp PARTITION OF pg_am 
DEFAULT SERVER x;
...
2024-03-31 03:10:28.183 EDT [3076181] pg_regress/vacuum LOG: statement: VACUUM FULL pg_database;

This simple demo confirms the issue:
for ((i=1;i<=20;i++)); do
echo "iteration $i"
echo "VACUUM FULL pg_class;" | psql >psql-1.log &
echo "CREATE FOREIGN TABLE fp PARTITION OF pg_am DEFAULT SERVER x;" | psql >psql-2.log &
wait
done

...
iteration 15
ERROR:  "pg_am" is not partitioned
iteration 16
ERROR:  deadlock detected
DETAIL:  Process 2556377 waits for AccessShareLock on relation 1259 of database 16384; blocked by process 2556378.
Process 2556378 waits for AccessShareLock on relation 2601 of database 16384; blocked by process 2556377.
HINT:  See server log for query details.
...

Best regards,
Alexander



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Justin Pryzby
Date:
On Sun, Mar 31, 2024 at 12:00:00PM +0300, Alexander Lakhin wrote:
> Hello Alvaro,
> 
> 28.03.2024 18:58, Alvaro Herrera wrote:
> > Grumble.  I don't like initialization at declare time, so far from the
> > code that depends on the value.  But the alternative would have been to
> > assign right where this blocks starts, an additional line.  I pushed it
> > like you had it.
> 
> I've stumbled upon a test failure caused by the test query added in that
> commit:
> +ERROR:  deadlock detected
> +DETAIL:  Process 3076180 waits for AccessShareLock on relation 1259 of database 16386; blocked by process 3076181.
> +Process 3076181 waits for AccessShareLock on relation 2601 of database 16386; blocked by process 3076180.

I think means that, although it was cute to use pg_am in the reproducer
given in the problem report, it's not a good choice to use here in the
sql regression tests.

-- 
Justin



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Tom Lane
Date:
Justin Pryzby <pryzby@telsasoft.com> writes:
> On Sun, Mar 31, 2024 at 12:00:00PM +0300, Alexander Lakhin wrote:
>> I've stumbled upon a test failure caused by the test query added in that
>> commit:
>> +ERROR:  deadlock detected
>> +DETAIL:  Process 3076180 waits for AccessShareLock on relation 1259 of database 16386; blocked by process 3076181.
>> +Process 3076181 waits for AccessShareLock on relation 2601 of database 16386; blocked by process 3076180.

> I think means that, although it was cute to use pg_am in the reproducer
> given in the problem report, it's not a good choice to use here in the
> sql regression tests.

Another case here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sevengill&dt=2024-04-02%2001%3A32%3A17

AFAICS, e2395cdbe posits that taking exclusive lock on pg_am in the
middle of a bunch of concurrent regression scripts couldn't possibly
cause any problems.  Really?

            regards, tom lane



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Tue, Apr 02, 2024 at 01:06:06AM -0400, Tom Lane wrote:
> AFAICS, e2395cdbe posits that taking exclusive lock on pg_am in the
> middle of a bunch of concurrent regression scripts couldn't possibly
> cause any problems.  Really?

There is no need for a catalog here to trigger the failure, and it
would have happened as long as a foreign table is used.  The problem
introduced in 374c7a229042 fixed by e2395cdbe83a comes from a thinko
on my side, my apologies for that and the delay in replying.  Thanks
for the extra fix done in 13b3b62746ec, Alvaro.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Mon, Apr 15, 2024 at 10:46:00AM +0900, Michael Paquier wrote:
> There is no need for a catalog here to trigger the failure, and it
> would have happened as long as a foreign table is used.  The problem
> introduced in 374c7a229042 fixed by e2395cdbe83a comes from a thinko
> on my side, my apologies for that and the delay in replying.  Thanks
> for the extra fix done in 13b3b62746ec, Alvaro.

While doing more tests with this feature, among other things, I've
spotted an incorrect behavior with dump/restore with the handling of
the GUC default_table_access_method when it comes to partitions.
Imagine the following in database "a":
CREATE TABLE parent_tab (id int) PARTITION BY RANGE (id);
CREATE TABLE parent_tab_2 (id int) PARTITION BY RANGE (id) USING heap;
CREATE TABLE parent_tab_3 (id int) PARTITION BY RANGE (id);

This leads to the following in pg_class:
=# SELECT relname, relam FROM pg_class WHERE oid > 16000;
   relname    | relam
--------------+-------
 parent_tab   |     0
 parent_tab_2 |     2
 parent_tab_3 |     0
(3 rows)

Now, let's do the following:
$ createdb b
$ pg_dump | psql b
$ psql b
=# SELECT relname, relam FROM pg_class WHERE oid > 16000;
   relname    | relam
--------------+-------
 parent_tab   |     0
 parent_tab_2 |     0
 parent_tab_3 |     0
(3 rows)

And parent_tab_2 would now rely on the default GUC when creating new
partitions rather than enforce heap.

It seems to me that we are going to extend the GUC
default_table_access_method with a "default" mode to be able to force
relam to 0 and make a difference with the non-0 case, in the same way
as ALTER TABLE SET ACCESS METHOD DEFAULT.  The thing is that, like
tablespaces, we have to rely on a GUC and not a USING clause to be
able to handle --no-table-access-method.

An interesting point comes to what we should do for
default_table_access_method set to "default" when dealing with
something else than a partitioned table, where an error may be
adapted.  Still, I'm wondering if there are more flavors I lack
imagination for.  This requires more careful design work.

Perhaps somebody has a good idea?
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Tue, Apr 16, 2024 at 02:14:21PM +0900, Michael Paquier wrote:
> It seems to me that we are going to extend the GUC
> default_table_access_method with a "default" mode to be able to force
> relam to 0 and make a difference with the non-0 case, in the same way
> as ALTER TABLE SET ACCESS METHOD DEFAULT.  The thing is that, like
> tablespaces, we have to rely on a GUC and not a USING clause to be
> able to handle --no-table-access-method.
>
> An interesting point comes to what we should do for
> default_table_access_method set to "default" when dealing with
> something else than a partitioned table, where an error may be
> adapted.  Still, I'm wondering if there are more flavors I lack
> imagination for.  This requires more careful design work.
>
> Perhaps somebody has a good idea?

Actually, I've come up with an idea just after hitting the send
button: let's use an extra ALTER TABLE SET ACCESS METHOD rather than
rely on the GUC to set the AM of the partitioned table correctly.
This extra command should be optional, depending on
--no-table-access-method.  If a partitioned table has 0 as relam,
let's not add this extra ALTER TABLE at all.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Tue, Apr 16, 2024 at 02:19:56PM +0900, Michael Paquier wrote:
> Actually, I've come up with an idea just after hitting the send
> button: let's use an extra ALTER TABLE SET ACCESS METHOD rather than
> rely on the GUC to set the AM of the partitioned table correctly.
> This extra command should be optional, depending on
> --no-table-access-method.  If a partitioned table has 0 as relam,
> let's not add this extra ALTER TABLE at all.

I have explored this idea, and while this is tempting this faces a
couple of challenges:
1) Binary upgrades would fail because the table rewrite created by
ALTER TABLE SET ACCESS METHOD for relkinds with physical storage
expects heap_create_with_catalog to have a fixed OID, but the rewrite
would require extra steps to be able to handle that, and I am not
convinced that more binary_upgrade_set_next_heap_relfilenode() is a
good idea.
2) We could limit these extra ALTER TABLE commands to be generated for
partitioned tables.  This is kind of confusing as resulting dumps
would mix SET commands for default_table_access_method that would
affect tables with physical storage, while partitioned tables would
have their own extra ALTER TABLE commands.  Another issue that needs
more consideration is that TocEntrys don't hold any relkind
information so pg_backup_archiver.c cannot make a difference with
tables and partitioned tables to select if SET or ALTER TABLE should
be generated.

Several designs are possible, like:
- Mix SET and ALTER TABLE commands in the dumps to set the AM, SET for
tables and matviews, ALTER TABLE for relations without storage.  This
would bypass the binary upgrade problem with the fixed relid.
- Use only SET, requiring a new "default" value for
default_table_access_method that would force a partitioned table's
relam to be 0.  Be stricter with the "current" table AM tracked in
pg_dump's backup archiver.
- Use only ALTER TABLE commands, with extra binary upgrade tweaks to
force relation OIDs for the second heap_create_with_catalog() done
with the rewrite to update a relation's AM.

With all that in mind, it may be better to revert 374c7a229042 and
e2395cdbe83a from HEAD and reconsider how to tackle the dump issues in
v18 or newer versions as all of the approaches I can think of lead to
more complications of their own.

Please see attached a non-polished POC that switches dumps to use
ALTER TABLE, that I've used to detect the upgrade problems.

Thoughts or comments are welcome.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
On 2024-Apr-17, Michael Paquier wrote:

> 2) We could limit these extra ALTER TABLE commands to be generated for
> partitioned tables.  This is kind of confusing as resulting dumps
> would mix SET commands for default_table_access_method that would
> affect tables with physical storage, while partitioned tables would
> have their own extra ALTER TABLE commands.

Hmm, cannot we simply add a USING clause to the CREATE TABLE command for
partitioned tables?  That would override the
default_table_access_method, so it should give the correct result, no?

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
On 2024-Apr-17, Alvaro Herrera wrote:

> Hmm, cannot we simply add a USING clause to the CREATE TABLE command for
> partitioned tables?  That would override the
> default_table_access_method, so it should give the correct result, no?

Ah, upthread you noted that pg_restore-time --no-table-access-method
needs to be able to elide it, so a dump-time USING clause doesn't work.

I think it's easy enough to add a "bool ispartitioned" to TableInfo and
use an ALTER TABLE or rely on the GUC depending on that -- seems easy
enough.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Wed, Apr 17, 2024 at 09:50:02AM +0200, Alvaro Herrera wrote:
> I think it's easy enough to add a "bool ispartitioned" to TableInfo and
> use an ALTER TABLE or rely on the GUC depending on that -- seems easy
> enough.

Yeah, that would be easy enough to track but I was wondering about
adding the relkind instead.  Still, one thing that I found confusing
is the dump generated in this case, as it would mix the SET and the
ALTER TABLE commands so one reading the dumps may wonder why the SET
has no effect for a CREATE TABLE PARTITION OF without USING.  Perhaps
that's fine and I just worry too much ;)

The extra ALTER commands need to be generated after the object
definitions, so we'd need a new subroutine similar to
_selectTableAccessMethod() like a _selectTableAccessMethodNoStorage().
Or grouping both together is just simpler?
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
On 2024-Apr-17, Michael Paquier wrote:

> Yeah, that would be easy enough to track but I was wondering about
> adding the relkind instead.  Still, one thing that I found confusing
> is the dump generated in this case, as it would mix the SET and the
> ALTER TABLE commands so one reading the dumps may wonder why the SET
> has no effect for a CREATE TABLE PARTITION OF without USING.  Perhaps
> that's fine and I just worry too much ;)

Hmm, maybe we should do a RESET of default_table_access_method before
printing the CREATE TABLE to avoid the confusion.

> The extra ALTER commands need to be generated after the object
> definitions, so we'd need a new subroutine similar to
> _selectTableAccessMethod() like a _selectTableAccessMethodNoStorage().
> Or grouping both together is just simpler?

I think there should be two routines, since _select* routines just print
a SET command; maybe the new one would be _printAlterTableAM() or
something like that.  Having _select() print an ALTER TABLE command
depending on relkind (or the boolean flag) would be confusing, I think.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
BTW if nothing else, this thread led me to discover a 18-month-old typo
in the Spanish translation of pg_dump:

-msgstr "  --no-tablespaces             no volcar métodos de acceso de tablas\n"
+msgstr "  --no-table-access-method     no volcar métodos de acceso de tablas\n"

Oops.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Los dioses no protegen a los insensatos.  Éstos reciben protección de
otros insensatos mejor dotados" (Luis Wu, Mundo Anillo)



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Wed, Apr 17, 2024 at 10:31:52AM +0200, Alvaro Herrera wrote:
> On 2024-Apr-17, Michael Paquier wrote:
>> Yeah, that would be easy enough to track but I was wondering about
>> adding the relkind instead.  Still, one thing that I found confusing
>> is the dump generated in this case, as it would mix the SET and the
>> ALTER TABLE commands so one reading the dumps may wonder why the SET
>> has no effect for a CREATE TABLE PARTITION OF without USING.  Perhaps
>> that's fine and I just worry too much ;)
>
> Hmm, maybe we should do a RESET of default_table_access_method before
> printing the CREATE TABLE to avoid the confusion.

A hard reset would make the business around currTableAM that decides
when to generate the SET default_table_access_method queries slightly
more complicated, while increasing the number of queries run on the
server.

>> The extra ALTER commands need to be generated after the object
>> definitions, so we'd need a new subroutine similar to
>> _selectTableAccessMethod() like a _selectTableAccessMethodNoStorage().
>> Or grouping both together is just simpler?
>
> I think there should be two routines, since _select* routines just print
> a SET command; maybe the new one would be _printAlterTableAM() or
> something like that.  Having _select() print an ALTER TABLE command
> depending on relkind (or the boolean flag) would be confusing, I think.

Fine by me to use two routines to generate the two different commands.
I am finishing with the attached for now, making dumps, restores and
upgrades work happily as far as I've tested.

I was also worrying about a need to dump the protocol version to be
able to track the relkind in the toc entries, but a45c78e3284b has
already done one.  The difference in AM handling between relations
without storage and relations with storage pushes the relkind logic
more within the internals of pg_backup_archiver.c.

What do you think?
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Alvaro Herrera
Date:
On 2024-Apr-18, Michael Paquier wrote:

> On Wed, Apr 17, 2024 at 10:31:52AM +0200, Alvaro Herrera wrote:

> > Hmm, maybe we should do a RESET of default_table_access_method before
> > printing the CREATE TABLE to avoid the confusion.
> 
> A hard reset would make the business around currTableAM that decides
> when to generate the SET default_table_access_method queries slightly
> more complicated, while increasing the number of queries run on the
> server.

Hmm, okay.  (I don't think we really care too much about the number of
queries, do we?)

> Fine by me to use two routines to generate the two different commands.
> I am finishing with the attached for now, making dumps, restores and
> upgrades work happily as far as I've tested.

Great.

> I was also worrying about a need to dump the protocol version to be
> able to track the relkind in the toc entries, but a45c78e3284b has
> already done one.  The difference in AM handling between relations
> without storage and relations with storage pushes the relkind logic
> more within the internals of pg_backup_archiver.c.

Hmm, does this mean that every dump taking since a45c78e3284b (April
1st) and before this commit will be unrestorable?  This doesn't worry me
too much, because we aren't even in beta yet ... and I think we don't
have a strict policy about it.

> --- a/src/bin/pg_dump/t/002_pg_dump.pl
> +++ b/src/bin/pg_dump/t/002_pg_dump.pl
> @@ -4591,11 +4591,9 @@ my %tests = (
>              CREATE TABLE dump_test.regress_pg_dump_table_am_child_2
>                PARTITION OF dump_test.regress_pg_dump_table_am_parent FOR VALUES IN (2);',
>          regexp => qr/^
> -            \QSET default_table_access_method = regress_table_am;\E
> -            (\n(?!SET[^;]+;)[^\n]*)*
> -            \n\QCREATE TABLE dump_test.regress_pg_dump_table_am_parent (\E
> -            (.*\n)*
>              \QSET default_table_access_method = heap;\E
> +            (.*\n)*
> +            \QALTER TABLE dump_test.regress_pg_dump_table_am_parent SET ACCESS METHOD regress_table_am;\E
>              (\n(?!SET[^;]+;)[^\n]*)*
>              \n\QCREATE TABLE dump_test.regress_pg_dump_table_am_child_1 (\E
>              (.*\n)*

This looks strange -- why did you remove matching for the CREATE TABLE
of the parent table?  That line should appear shortly before the ALTER
TABLE SET ACCESS METHOD for the same table, shouldn't it?  Maybe your
intention was to remove only the SET default_table_access_method
= regress_table_am line ... but it's not clear to me why we have the
"SET default_table_access_method = heap" line before the ALTER TABLE SET
ACCESS METHOD.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"If it is not right, do not do it.
If it is not true, do not say it." (Marcus Aurelius, Meditations)



Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Thu, Apr 18, 2024 at 06:17:56PM +0200, Alvaro Herrera wrote:
> On 2024-Apr-18, Michael Paquier wrote:
>> I was also worrying about a need to dump the protocol version to be
>> able to track the relkind in the toc entries, but a45c78e3284b has
>> already done one.  The difference in AM handling between relations
>> without storage and relations with storage pushes the relkind logic
>> more within the internals of pg_backup_archiver.c.
>
> Hmm, does this mean that every dump taking since a45c78e3284b (April
> 1st) and before this commit will be unrestorable?  This doesn't worry me
> too much, because we aren't even in beta yet ... and I think we don't
> have a strict policy about it.

I've been scanning the history of K_VERS_1_* in the recent years, and
it does not seem that we have a case where we would have needed to
bump the version twice in the same release cycle.  Anyway, yes, any
dump taken since 1_16 has been bumped would fail to restore with this
patch in place.  For an unreleased not-yet-in-beta branch, why should
we care?  Things are not set in stone, like extensions.  If others
have comments about this point, feel free of course.

>> --- a/src/bin/pg_dump/t/002_pg_dump.pl
>> +++ b/src/bin/pg_dump/t/002_pg_dump.pl
>> @@ -4591,11 +4591,9 @@ my %tests = (
>>              CREATE TABLE dump_test.regress_pg_dump_table_am_child_2
>>                PARTITION OF dump_test.regress_pg_dump_table_am_parent FOR VALUES IN (2);',
>>          regexp => qr/^
>> -            \QSET default_table_access_method = regress_table_am;\E
>> -            (\n(?!SET[^;]+;)[^\n]*)*
>> -            \n\QCREATE TABLE dump_test.regress_pg_dump_table_am_parent (\E
>> -            (.*\n)*
>>              \QSET default_table_access_method = heap;\E
>> +            (.*\n)*
>> +            \QALTER TABLE dump_test.regress_pg_dump_table_am_parent SET ACCESS METHOD regress_table_am;\E
>>              (\n(?!SET[^;]+;)[^\n]*)*
>>              \n\QCREATE TABLE dump_test.regress_pg_dump_table_am_child_1 (\E
>>              (.*\n)*
>
> This looks strange -- why did you remove matching for the CREATE TABLE
> of the parent table?  That line should appear shortly before the ALTER
> TABLE SET ACCESS METHOD for the same table, shouldn't it?

Yeah, with the ALTER in place that did not seem that mandatory but I
don't mind keeping it, as well.

> Maybe your
> intention was to remove only the SET default_table_access_method
> = regress_table_am line ... but it's not clear to me why we have the
> "SET default_table_access_method = heap" line before the ALTER TABLE SET
> ACCESS METHOD.

This comes from the contents of the dump for
regress_pg_dump_table_am_2, that uses heap as table AM.  A SET is
issued for it before dumping regress_pg_dump_table_am_parent and its
partitions.  One trick that I can think of to make the output parsing
of the test more palatable is to switch the AMs used by the two
partitions, so as we finish with two SET queries before each partition
rather than one before the partitioned table.  See the attached for
the idea.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Fri, Apr 19, 2024 at 10:41:30AM +0900, Michael Paquier wrote:
> This comes from the contents of the dump for
> regress_pg_dump_table_am_2, that uses heap as table AM.  A SET is
> issued for it before dumping regress_pg_dump_table_am_parent and its
> partitions.  One trick that I can think of to make the output parsing
> of the test more palatable is to switch the AMs used by the two
> partitions, so as we finish with two SET queries before each partition
> rather than one before the partitioned table.  See the attached for
> the idea.

FYI, as this is an open item, I am planning to wrap that at the
beginning of next week after a second lookup.  If there are any
comments, feel free.
--
Michael

Attachment

Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From
Michael Paquier
Date:
On Sat, Apr 20, 2024 at 11:55:46AM +0900, Michael Paquier wrote:
> FYI, as this is an open item, I am planning to wrap that at the
> beginning of next week after a second lookup.  If there are any
> comments, feel free.

This one is now fixed with f46bee346c3b, and the open item is marked
as fixed.
--
Michael

Attachment