Re: [PATCH] Logical decoding of TRUNCATE - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCH] Logical decoding of TRUNCATE
Date
Msg-id 20180401200107.gjpehuqy63efechn@alap3.anarazel.de
Whole thread Raw
In response to Re: [PATCH] Logical decoding of TRUNCATE  (Marco Nenciarini <marco.nenciarini@2ndquadrant.it>)
Responses Re: [PATCH] Logical decoding of TRUNCATE  (Simon Riggs <simon@2ndquadrant.com>)
Re: [PATCH] Logical decoding of TRUNCATE  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Re: [PATCH] Logical decoding of TRUNCATE  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On 2018-01-25 14:21:15 +0100, Marco Nenciarini wrote:
> +     if (SessionReplicationRole != SESSION_REPLICATION_ROLE_REPLICA)
> +     {
> + 
> +         /*
> +          * Check foreign key references.  In CASCADE mode, this should be
> +          * unnecessary since we just pulled in all the references; but as a
> +          * cross-check, do it anyway if in an Assert-enabled build.
> +          */
>   #ifdef USE_ASSERT_CHECKING
>           heap_truncate_check_FKs(rels, false);
> + #else
> +         if (stmt->behavior == DROP_RESTRICT)
> +             heap_truncate_check_FKs(rels, false);
>   #endif
> +     }

That *can't* be right.

> +         case REORDER_BUFFER_CHANGE_TRUNCATE:
> +             appendStringInfoString(ctx->out, " TRUNCATE:");
> + 
> +             if (change->data.truncate_msg.restart_seqs
> +                 || change->data.truncate_msg.cascade)
> +             {
> +                 if (change->data.truncate_msg.restart_seqs)
> +                     appendStringInfo(ctx->out, " restart_seqs");
> +                 if (change->data.truncate_msg.cascade)
> +                     appendStringInfo(ctx->out, " cascade");
> +             }
> +             else
> +                 appendStringInfoString(ctx->out, " (no-flags)");
> +             break;
>           default:
>               Assert(false);
>       }

I know this has been discussed in the thread already, but it really
strikes me as wrong to basically do some mini DDL replication feature
via per-command WAL records.
> ***************
> *** 111,116 **** CREATE PUBLICATION <replaceable class="parameter">name</replaceable>
> --- 111,121 ----
>             and so the default value for this option is
>             <literal>'insert, update, delete'</literal>.
>            </para>
> +          <para>
> +            <command>TRUNCATE</command> is treated as a form of
> +            <command>DELETE</command> for the purpose of deciding whether
> +            to publish, or not.
> +          </para>
>           </listitem>
>          </varlistentry>
>         </variablelist>

Why is this a good idea?


Hm, it seems logicaldecoding.sgml hasn't been updated?

> + void
> + ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
> +                             DropBehavior behavior, bool restart_seqs)
> + {
> +     List       *rels = list_copy(explicit_rels);

Why is this copied?


> +      * Write a WAL record to allow this set of actions to be logically decoded.
> +      * We could optimize this away when !RelationIsLogicallyLogged(rel)
> +      * but that doesn't save much space or time.

What you're saying isn't that you're not logging anything, but that
you're allocating the header regardless? Because this certainly sounds
like you unconditionally log a WAL record.

> +      * Assemble an array of relids, then an array of seqrelids so we can write
> +      * a single WAL record for the whole action.
> +      */
> +     logrelids = palloc(maxrelids * sizeof(Oid));
> +     foreach (cell, relids_logged)
> +     {
> +         nrelids++;
> +         if (nrelids > maxrelids)
> +         {
> +             maxrelids *= 2;
> +             logrelids = repalloc(logrelids, maxrelids * sizeof(Oid));
> +         }
> +         logrelids[nrelids - 1] = lfirst_oid(cell);
> +     }
> + 
> +     foreach (cell, seq_relids_logged)
> +     {
> +         nseqrelids++;
> +         if ((nrelids + nseqrelids) > maxrelids)
> +         {
> +             maxrelids *= 2;
> +             logrelids = repalloc(logrelids, maxrelids * sizeof(Oid));
> +         }
> +         logrelids[nrelids + nseqrelids - 1] = lfirst_oid(cell);
> +     }

I'm confused. Why do we need the resizing here, when we know the max
upfront?

> + /*
> +  * For truncate we list all truncated relids in an array, followed by all
> +  * sequence relids that need to be restarted, if any.
> +  * All rels are always within the same database, so we just list dbid once.
> +  */
> + typedef struct xl_heap_truncate
> + {
> +     Oid            dbId;
> +     uint32        nrelids;
> +     uint32        nseqrelids;
> +     uint8        flags;
> +     Oid relids[FLEXIBLE_ARRAY_MEMBER];
> + } xl_heap_truncate;

Given that the space is used anyway due to padding, I'd just make flags
32bit.

Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Dmitry Ivanov
Date:
Subject: Re: new function for tsquery creartion
Next
From: Andres Freund
Date:
Subject: Re: Optimizing nested ConvertRowtypeExpr execution