Thread: Re: alter table set TABLE ACCESS METHOD

Re: alter table set TABLE ACCESS METHOD

From
Jeff Davis
Date:
On Mon, 2021-03-08 at 16:30 +0900, Michael Paquier wrote:
> This toast issue is a kind of interesting one, and it seems rather
> right to rely on toast_build_flattened_tuple() to decompress things
> if
> both table AMs support toast with the internals of toast knowing what
> kind of compression has been applied to the stored tuple, rather than
> have the table AM try to extra a toast tuple by itself.  I wonder
> whether we should have a table AM API to know what kind of
> compression
> is supported for a given table AMs at hand, because there is no need
> to flatten things if both the origin and the target match their
> compression algos, which would be on HEAD to make sure that both the
> origin and table AMs have toast (relation_toast_am).  Your patch,
> here, would flatten each tuples as long as the table AMs don't 
> match.  That can be made cheaper in some cases.

I am confused by this. The toast-related table AM API functions are:
relation_needs_toast_table(), relation_toast_am(), and
relation_fetch_toast_slice().

What cases are we trying to solve here?

1. target of conversion is tableam1 main table, heap toast table
2. target of conversion is tableam1 main table, no toast table
3. target of conversion is tableam1 main table, tableam1 toast table
4. target of conversion is tableam1 main table, tableam2 toast table

Or does the problem apply to all of these cases?

And if tableam1 can't handle some case, why can't it just detoast the
data itself? Shouldn't that be able to decompress anything?

For example, in columnar[1], we just always detoast/decompress because
we want to compress it ourselves (along with other values from the same
column), and we never have a separate toast table. Is that code
incorrect, or will it break in v14?

Regards,
    Jeff Davis



https://github.com/citusdata/citus/blob/6b1904d37a18e2975b46f0955076f84c8a387cc6/src/backend/columnar/columnar_tableam.c#L1433




Re: alter table set TABLE ACCESS METHOD

From
Justin Pryzby
Date:
On Thu, May 06, 2021 at 01:10:53PM -0700, Jeff Davis wrote:
> On Mon, 2021-03-08 at 16:30 +0900, Michael Paquier wrote:
> > This toast issue is a kind of interesting one, and it seems rather
> > right to rely on toast_build_flattened_tuple() to decompress things
> > if
> > both table AMs support toast with the internals of toast knowing what
> > kind of compression has been applied to the stored tuple, rather than
> > have the table AM try to extra a toast tuple by itself.  I wonder
> > whether we should have a table AM API to know what kind of
> > compression
> > is supported for a given table AMs at hand, because there is no need
> > to flatten things if both the origin and the target match their
> > compression algos, which would be on HEAD to make sure that both the
> > origin and table AMs have toast (relation_toast_am).  Your patch,
> > here, would flatten each tuples as long as the table AMs don't 
> > match.  That can be made cheaper in some cases.
> 
> I am confused by this. The toast-related table AM API functions are:
> relation_needs_toast_table(), relation_toast_am(), and
> relation_fetch_toast_slice().

I wrote this shortly after looking at one of Dilip's LZ4 patches.

At one point in February/March, pg_attribute.attcompression defined the
compression used by *all* tuples in the table, rather than the compression used
for new tuples, and ALTER SET COMPRESSION would rewrite the table with the new
compression (rather than being only a catalog update).


> What cases are we trying to solve here?
> 
> 1. target of conversion is tableam1 main table, heap toast table
> 2. target of conversion is tableam1 main table, no toast table
> 3. target of conversion is tableam1 main table, tableam1 toast table
> 4. target of conversion is tableam1 main table, tableam2 toast table

I think ALTER TABLE SET ACCESS METHOD should move all data off the old AM,
including its toast table.  The optimization Michael suggests is when the new
AM and old AM use the same toast AM, then the data doesn't need to be
de/re/toasted.

Thanks for looking.

-- 
Justin



Re: alter table set TABLE ACCESS METHOD

From
Jeff Davis
Date:
On Thu, 2021-05-06 at 15:23 -0500, Justin Pryzby wrote:
> I think ALTER TABLE SET ACCESS METHOD should move all data off the
> old AM,
> including its toast table.

Can you explain what you mean, and why? I'm still confused.

Let's say there are 4 table AMs: A, AT, B, and BT. A's
relation_toast_am() returns AT, and B's relation_toast_am() returns BT.
AT or BT are invalid if A or B have relation_needs_toast_table() return
false.

Here are the cases that I see:

If A = B, then AT = BT, and it's all a no-op.

If A != B and BT is invalid (e.g. converting heap to columnar), then A
should detoast (and perhaps decompress, as in the case of columnar)
whatever it gets as input and do whatever it wants. That's what
columnar does and I don't see why ATRewriteTable needs to handle it.

If A != B and AT != BT, then B needs to detoast whatever it gets (but
should not decompress, as that would just be wasted effort), and then
re-toast using BT. Again, I don't see a need for ATRewriteTable to do
anything, B can handle it.

The only case I can really see where ATRewriteTable might be helpful is
if A != B but AT = BT. In that case, in theory, you don't need to do
anything to the toast table, just leave it where it is. But then the
responsibilities get a little confusing to me -- how is B supposed to
know that it doesn't need to toast anything? Is this the problem you
are trying to solve?

Regards,
    Jeff Davis





Re: alter table set TABLE ACCESS METHOD

From
Justin Pryzby
Date:
On Thu, May 06, 2021 at 02:11:31PM -0700, Jeff Davis wrote:
> On Thu, 2021-05-06 at 15:23 -0500, Justin Pryzby wrote:
> > I think ALTER TABLE SET ACCESS METHOD should move all data off the
> > old AM,
> > including its toast table.
> 
> Can you explain what you mean, and why? I'm still confused.
> 
> Let's say there are 4 table AMs: A, AT, B, and BT. A's
> relation_toast_am() returns AT, and B's relation_toast_am() returns BT.
> AT or BT are invalid if A or B have relation_needs_toast_table() return
> false.
> 
> Here are the cases that I see:
> 
> If A = B, then AT = BT, and it's all a no-op.
> 
> If A != B and BT is invalid (e.g. converting heap to columnar), then A
> should detoast (and perhaps decompress, as in the case of columnar)
> whatever it gets as input and do whatever it wants. That's what
> columnar does and I don't see why ATRewriteTable needs to handle it.
> 
> If A != B and AT != BT, then B needs to detoast whatever it gets (but
> should not decompress, as that would just be wasted effort), and then
> re-toast using BT. Again, I don't see a need for ATRewriteTable to do
> anything, B can handle it.
> 
> The only case I can really see where ATRewriteTable might be helpful is
> if A != B but AT = BT. In that case, in theory, you don't need to do
> anything to the toast table, just leave it where it is. But then the
> responsibilities get a little confusing to me -- how is B supposed to
> know that it doesn't need to toast anything? Is this the problem you
> are trying to solve?

That's the optimization Michael is suggesting.

I was approaching this after having just looked at Dilip's patch (which was
originally written using pg_am to support "pluggable" compression "AM"s, but
not otherwise related to table AM).

Once a table is converted to a new AM, its tuples had better not reference the
old AM - it could be dropped.

The new table AM (B) shouldn't need to know anything about the old one (A).  It
should just process incoming tuples.  It makes more to me that ATRewriteTable
would handle this, rather than every acccess method having the same logic (at
best) or different logic (more likely).  If ATRewriteTable didn't handle this,
data would become inaccessible if an AM failed to de-toast tuples.

-- 
Justin



Re: alter table set TABLE ACCESS METHOD

From
Jeff Davis
Date:
On Thu, 2021-05-06 at 17:19 -0500, Justin Pryzby wrote:
> If ATRewriteTable didn't
> handle this,
> data would become inaccessible if an AM failed to de-toast tuples.

If the AM fails to detoast tuples, it's got bigger problems than ALTER
TABLE. What about INSERT INTO ... SELECT?

It's the table AM's responsibility to detoast out-of-line datums and
toast any values that are too large (see
heapam.c:heap_prepare_insert()).

Regards,
    Jeff Davis





Re: alter table set TABLE ACCESS METHOD

From
Jeff Davis
Date:
On Thu, 2021-05-06 at 17:24 -0700, Jeff Davis wrote:
> It's the table AM's responsibility to detoast out-of-line datums and
> toast any values that are too large (see
> heapam.c:heap_prepare_insert()).

Do we have general agreement on this point? Did I miss another purpose
of detoasting in tablecmds.c, or can we just remove that part of the
patch?

Regards,
    Jeff Davis





Re: alter table set TABLE ACCESS METHOD

From
Michael Paquier
Date:
On Thu, Jun 03, 2021 at 02:36:15PM -0700, Jeff Davis wrote:
> Do we have general agreement on this point? Did I miss another purpose
> of detoasting in tablecmds.c, or can we just remove that part of the
> patch?

Catching up with this thread..  So, what you are suggesting here is
that we have no need to let ATRewriteTable() do anything about the
detoasting, and just push down the responsability of detoasting the
tuple, if necessary, down to the AM layer where the tuple insertion is
handled, right?

In short, a table AMs would receive on a rewrite with ALTER TABLE
tuples which may be toasted, still table_insert_tuple() should be able
to handle both:
- the case where this tuple was already toasted.
- the case where this tuple has been already detoasted.

You are right that this would be more consistent with what heap does
with heap_prepare_insert().
--
Michael

Attachment

Re: alter table set TABLE ACCESS METHOD

From
Jeff Davis
Date:
On Fri, 2021-06-04 at 14:58 +0900, Michael Paquier wrote:
> In short, a table AMs would receive on a rewrite with ALTER TABLE
> tuples which may be toasted, still table_insert_tuple() should be
> able
> to handle both:
> - the case where this tuple was already toasted.
> - the case where this tuple has been already detoasted.

Yes. That's a current requirement, and any AM that doesn't do that is
already broken (e.g. for INSERT INTO ... SELECT).

Regards,
    Jeff Davis





Re: alter table set TABLE ACCESS METHOD

From
Michael Paquier
Date:
On Fri, Jun 04, 2021 at 11:26:28AM -0700, Jeff Davis wrote:
> Yes. That's a current requirement, and any AM that doesn't do that is
> already broken (e.g. for INSERT INTO ... SELECT).

Makes sense.  I was just looking at the patch, and this was the only
part of it that made my spidey sense react.

One thing I am wondering is if we should have a dummy_table_am in
src/test/modules/ to be able to stress more this feature.  That does
not seem like a hard requirement, but relying only on heap limits a
bit the coverage of this feature even if one changes
default_table_access_method.
--
Michael

Attachment

Re: alter table set TABLE ACCESS METHOD

From
Jeff Davis
Date:
On Sat, 2021-06-05 at 08:45 +0900, Michael Paquier wrote:
> One thing I am wondering is if we should have a dummy_table_am in
> src/test/modules/ to be able to stress more this feature.  That does
> not seem like a hard requirement, but relying only on heap limits a
> bit the coverage of this feature even if one changes
> default_table_access_method.

I agree that a dummy AM would be good, but implementing even a dummy AM
is a fair amount of work. Also, there are many potential variations, so
we'd probably need several.

The table AM API is a work in progress, and I think it will be a few
releases (and require a few more table AMs in the wild) to really nail
down the API.

Regards,
    Jeff Davis





Re: alter table set TABLE ACCESS METHOD

From
Michael Paquier
Date:
On Fri, Jun 04, 2021 at 05:34:36PM -0700, Jeff Davis wrote:
> I agree that a dummy AM would be good, but implementing even a dummy AM
> is a fair amount of work.

Not much, honestly, the largest part being to document that properly
so as it could be used as a template:
https://www.postgresql.org/message-id/YEXm2nh/5j5P2jEl@paquier.xyz

> Also, there are many potential variations, so
> we'd probably need several.

Not so sure here.  GUCs or reloptions could be used to control some of
the behaviors.  Now this really depends on the use-cases we are
looking to support here and the low-level facilities that could
benefit from this module (dummy_index_am tests reloptions for
example).  I agree that this thread is perhaps not enough to justify
adding this module for now.

> The table AM API is a work in progress, and I think it will be a few
> releases (and require a few more table AMs in the wild) to really nail
> down the API.

Hard to say, we'll see.  I'd like to believe that it could be a good
to not set something in stone for that forever.
--
Michael

Attachment

Re: alter table set TABLE ACCESS METHOD

From
Jeff Davis
Date:
On Thu, 2021-06-03 at 14:36 -0700, Jeff Davis wrote:
> Do we have general agreement on this point? Did I miss another
> purpose
> of detoasting in tablecmds.c, or can we just remove that part of the
> patch?

Per discussion with Justin, I'll drive this patch. I merged the CF
entries into https://commitfest.postgresql.org/33/3110/

New version attached, with the detoasting code removed. Could use
another round of validation/cleanup in case I missed something during
the merge.

Regards,
    Jeff Davis


Attachment

Re: alter table set TABLE ACCESS METHOD

From
Michael Paquier
Date:
On Tue, Jun 08, 2021 at 05:33:31PM -0700, Jeff Davis wrote:
> New version attached, with the detoasting code removed. Could use
> another round of validation/cleanup in case I missed something during
> the merge.

This looks rather sane to me, thanks.

    /* Create the transient table that will receive the re-ordered data */
    OIDNewHeap = make_new_heap(tableOid, tableSpace,
+                              accessMethod
It strikes me that we could extend CLUSTER/VACUUM FULL to support this
option, in the same vein as TABLESPACE.  Perhaps that's not something to
implement or have, just wanted to mention it.

+ALTER TABLE heaptable SET ACCESS METHOD heap2;
+explain (analyze, costs off, summary off, timing off) SELECT * FROM heaptable;
+SELECT COUNT(a), COUNT(1) FILTER(WHERE a=1) FROM heaptable;
+DROP TABLE heaptable;
There is a mix of upper and lower-case characters here.  It could be
more consistent.  It seems to me that this test should actually check
that pg_class.relam reflects the new value.

+   /* Save info for Phase 3 to do the real work */
+   if (OidIsValid(tab->newAccessMethod))
+       ereport(ERROR,
+               (errcode(ERRCODE_SYNTAX_ERROR),
+                errmsg("cannot have multiple SET ACCESS METHOD subcommands")));
Worth adding a test?

- * with the specified persistence, which might differ from the original's.
+ * NewTableSpace/accessMethod/persistence, which might differ from those
Nit: the name of the variable looks inconsistent with this comment.
The original is weird as well.

I am wondering if it would be a good idea to set the new tablespace
and new access method fields to InvalidOid within ATGetQueueEntry.  We
do that for the persistence.  Not critical at all, still..

+           pass = AT_PASS_MISC;
Maybe add a comment here?
--
Michael

Attachment

Re: alter table set TABLE ACCESS METHOD

From
Jeff Davis
Date:
On Wed, 2021-06-09 at 13:47 +0900, Michael Paquier wrote:
> There is a mix of upper and lower-case characters here.  It could be
> more consistent.  It seems to me that this test should actually check
> that pg_class.relam reflects the new value.

Done. I also added a (negative) test for changing the AM of a
partitioned table, which wasn't caught before.

> +                errmsg("cannot have multiple SET ACCESS METHOD
> subcommands")));
> Worth adding a test?

Done.

> Nit: the name of the variable looks inconsistent with this comment.
> The original is weird as well.

Tried to improve wording.

> I am wondering if it would be a good idea to set the new tablespace
> and new access method fields to InvalidOid within
> ATGetQueueEntry.  We
> do that for the persistence.  Not critical at all, still..

Done.

> +           pass = AT_PASS_MISC;
> Maybe add a comment here?

Done. In that case, it doesn't matter because there's no work to be
done in Phase 2.

Regards,
    Jeff Davis


Attachment

Re: alter table set TABLE ACCESS METHOD

From
Zhihong Yu
Date:


On Wed, Jun 9, 2021 at 12:31 PM Jeff Davis <pgsql@j-davis.com> wrote:
On Wed, 2021-06-09 at 13:47 +0900, Michael Paquier wrote:
> There is a mix of upper and lower-case characters here.  It could be
> more consistent.  It seems to me that this test should actually check
> that pg_class.relam reflects the new value.

Done. I also added a (negative) test for changing the AM of a
partitioned table, which wasn't caught before.

> +                errmsg("cannot have multiple SET ACCESS METHOD
> subcommands")));
> Worth adding a test?

Done.

> Nit: the name of the variable looks inconsistent with this comment.
> The original is weird as well.

Tried to improve wording.

> I am wondering if it would be a good idea to set the new tablespace
> and new access method fields to InvalidOid within
> ATGetQueueEntry.  We
> do that for the persistence.  Not critical at all, still..

Done.

> +           pass = AT_PASS_MISC;
> Maybe add a comment here?

Done. In that case, it doesn't matter because there's no work to be
done in Phase 2.

Regards,
        Jeff Davis

Hi,

+           /* check if another access method change was already requested */
+           if (tab->newAccessMethod)
+               ereport(ERROR,
+                       (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                        errmsg("cannot change access method setting twice")));

I think the error message can be refined - changing  access method twice is supported, as long as the two changes don't overlap.

Cheers

Re: alter table set TABLE ACCESS METHOD

From
Michael Paquier
Date:
On Wed, Jun 09, 2021 at 01:45:52PM -0700, Zhihong Yu wrote:
> +           /* check if another access method change was already requested
> */
> +           if (tab->newAccessMethod)
> +               ereport(ERROR,
> +                       (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +                        errmsg("cannot change access method setting
> twice")));
>
> I think the error message can be refined - changing  access method twice is
> supported, as long as the two changes don't overlap.

Hmm.  Do we actually need this one?  ATPrepSetAccessMethod() checks
already that this command cannot be specified multiple times, with an
error message consistent with what SET TABLESPACE does.
--
Michael

Attachment

Re: alter table set TABLE ACCESS METHOD

From
Justin Pryzby
Date:
On Wed, Jun 09, 2021 at 01:47:18PM +0900, Michael Paquier wrote:
> On Tue, Jun 08, 2021 at 05:33:31PM -0700, Jeff Davis wrote:
> > New version attached, with the detoasting code removed. Could use
> > another round of validation/cleanup in case I missed something during
> > the merge.
> 
> This looks rather sane to me, thanks.
> 
>     /* Create the transient table that will receive the re-ordered data */
>     OIDNewHeap = make_new_heap(tableOid, tableSpace,
> +                              accessMethod
> It strikes me that we could extend CLUSTER/VACUUM FULL to support this
> option, in the same vein as TABLESPACE.  Perhaps that's not something to
> implement or have, just wanted to mention it.

It's a good thought.  But remember that that c5b28604 handled REINDEX
(TABLESPACE) but not CLUSTER/VACUUM FULL (TABLESPACE).  You wrote:
https://www.postgresql.org/message-id/YBuWbzoW6W7AaMv0%40paquier.xyz
> Regarding the VACUUM and CLUSTER cases, I am not completely sure if
> going through these for a tablespace case is the best move we can do,
> as ALTER TABLE is able to mix multiple operations and all of them
> require already an AEL to work.  REINDEX was different thanks to the
> case of CONCURRENTLY.  Anyway, as a lot of work has been done here
> already, I would recommend to create new threads about those two
> topics.  I am also closing this patch in the CF app.

In any case, I think we really want to have an ALTER .. SET ACCESS METHOD.
Supporting it also in CLUSTER/VACUUM is an optional, additional feature.

-- 
Justin



Re: alter table set TABLE ACCESS METHOD

From
Justin Pryzby
Date:
On Wed, Jun 09, 2021 at 01:45:52PM -0700, Zhihong Yu wrote:
> +           /* check if another access method change was already requested
> */
> +           if (tab->newAccessMethod)
> +               ereport(ERROR,
> +                       (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +                        errmsg("cannot change access method setting twice")));
> 
> I think the error message can be refined - changing  access method twice is
> supported, as long as the two changes don't overlap.

That language is consistent wtih existing errors.

src/backend/commands/tablecmds.c:                                                errmsg("cannot change persistence
settingtwice")));
 
src/backend/commands/tablecmds.c:                                                errmsg("cannot change persistence
settingtwice")));
 
                                 errmsg("cannot alter type of column \"%s\" twice",

However, I think that SET TABLESPACE is a better template to follow:
                                 errmsg("cannot have multiple SET TABLESPACE subcommands")));

Michael pointed out that there's two, redundant checks:

+       if (rel->rd_rel->relam == amoid)
+               return;
+  
+       /* Save info for Phase 3 to do the real work */
+       if (OidIsValid(tab->newAccessMethod))
+               ereport(ERROR,
+                               (errcode(ERRCODE_SYNTAX_ERROR),
+                                errmsg("cannot have multiple SET ACCESS METHOD subcommands")));

I think that the "multiple subcommands" test should be before the "no-op" test.

@Jeff: In my original patch, I also proposed patches 2,3:

Subject: [PATCH v2 2/3] Allow specifying acccess method of partitioned tables..
Subject: [PATCH v2 3/3] Implement lsyscache get_rel_relam()
                        
 



Re: alter table set TABLE ACCESS METHOD

From
vignesh C
Date:
On Thu, Jun 10, 2021 at 1:01 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Wed, 2021-06-09 at 13:47 +0900, Michael Paquier wrote:
> > There is a mix of upper and lower-case characters here.  It could be
> > more consistent.  It seems to me that this test should actually check
> > that pg_class.relam reflects the new value.
>
> Done. I also added a (negative) test for changing the AM of a
> partitioned table, which wasn't caught before.
>
> > +                errmsg("cannot have multiple SET ACCESS METHOD
> > subcommands")));
> > Worth adding a test?
>
> Done.
>
> > Nit: the name of the variable looks inconsistent with this comment.
> > The original is weird as well.
>
> Tried to improve wording.
>
> > I am wondering if it would be a good idea to set the new tablespace
> > and new access method fields to InvalidOid within
> > ATGetQueueEntry.  We
> > do that for the persistence.  Not critical at all, still..
>
> Done.
>
> > +           pass = AT_PASS_MISC;
> > Maybe add a comment here?
>
> Done. In that case, it doesn't matter because there's no work to be
> done in Phase 2.
>

There are few compilation issues:
tablecmds.c:4629:52: error: too few arguments to function call,
expected 3, have 2
ATSimplePermissions(rel, ATT_TABLE | ATT_MATVIEW);
~~~~~~~~~~~~~~~~~~~ ^
tablecmds.c:402:1: note: 'ATSimplePermissions' declared here
static void ATSimplePermissions(AlterTableType cmdtype, Relation rel,
int allowed_targets);
^
tablecmds.c:5983:10: warning: enumeration value 'AT_SetAccessMethod'
not handled in switch [-Wswitch]
switch (cmdtype)
^
1 warning and 1 error generated.

Also few comments need to be addressed, based on that I'm changing the
status to "Waiting for Author".

Regards,
Vignesh



Re: alter table set TABLE ACCESS METHOD

From
vignesh C
Date:
On Fri, Jul 16, 2021 at 9:19 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> rebased.
>
> Also, there were two redundant checks for multiple SET ACCESS METHOD commands.
> But one of them wasn't hit if the ALTER was setting the current AM due to the
> no-op test.
>
> I think it's better to fail in every case, and not just sometimes (especially
> if we were to use ERRCODE_SYNTAX_ERROR).
>
> I included my 2ndary patch allowing to set the AM of partitioned table, same as
> for a tablespace.

One of the tests is failing, please post an updated patch for this:
create_am.out       2021-07-22 10:34:56.234654166 +0530
@@ -177,6 +177,7 @@
 (1 row)

 -- CREATE TABLE ..  PARTITION BY supports USING
+-- new partitions will inherit from the current default, rather the
partition root
 CREATE TABLE tableam_parted_heap2 (a text, b int) PARTITION BY list
(a) USING heap2;
 SET default_table_access_method = 'heap';
 CREATE TABLE tableam_parted_a_heap2 PARTITION OF tableam_parted_heap2
FOR VALUES IN ('a');

Regards,
Vignesh