Re: Single transaction in the tablesync worker? - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Single transaction in the tablesync worker?
Date
Msg-id CAHut+Ps2nVDzMuk-qyyOUJ1L5At_wBM2CMvO1O0Rse77gm_Uag@mail.gmail.com
Whole thread Raw
In response to Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Sat, Jan 9, 2021 at 5:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 8, 2021 at 2:55 PM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Fri, Jan 8, 2021 at 1:02 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
> > >
> >
> > > 3.
> > > +       /*
> > > +        * To build a slot name for the sync work, we are limited to NAMEDATALEN -
> > > +        * 1 characters.
> > > +        *
> > > +        * The name is calculated as pg_%u_sync_%u (3 + 10 + 6 + 10 + '\0'). (It's
> > > +        * actually the NAMEDATALEN on the remote that matters, but this scheme
> > > +        * will also work reasonably if that is different.)
> > > +        */
> > > +       StaticAssertStmt(NAMEDATALEN >= 32, "NAMEDATALEN too small");   /* for sanity */
> > > +
> > > +       syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
> > >
> > > The comments says syncslotname is limit to NAMEDATALEN - 1 characters.
> > > But the actual size of it is (3 + 10 + 6 + 10 + '\0') = 30,which seems not NAMEDATALEN - 1.
> > > Should we change the comment here?
> > >
> >
> > The comment wording is a remnant from older code which had a
> > differently format slot name.
> > I think the comment is still valid, albeit maybe unnecessary since in
> > the current code the tablesync slot
> > name length is fixed. But I left the older comment here as a safety reminder
> > in case some future change would want to modify the slot name. What do
> > you think?
> >
>
> I find it quite confusing. The comments should reflect the latest
> code. You can probably say in some form that the length of slotname
> shouldn't exceed NAMEDATALEN because of remote node constraints on
> slot name length. Also, probably the StaticAssert on NAMEDATALEN is
> not required.

Modified comment in latest patch [v14]

>
> 1.
> +   <para>
> +    Additional table synchronization slots are normally transient, created
> +    internally and dropped automatically when they are no longer needed.
> +    These table synchronization slots have generated names:
> +   <quote><literal>pg_%u_sync_%u</literal></quote> (parameters:
> Subscription <parameter>oid</parameter>, Table
> <parameter>relid</parameter>)
> +   </para>
>
> The last line seems too long. I think we are not strict for 80 char
> limit in docs but it is good to be close to that, however, this
> appears quite long.

Fixed in latest patch [v14]

>
> 2.
> + /*
> + * Cleanup any remaining tablesync resources.
> + */
> + {
> + char originname[NAMEDATALEN];
> + RepOriginId originid;
> + char state;
> + XLogRecPtr statelsn;
>
> I have already mentioned previously that let's not use this new style
> of code (start using { to localize the scope of variables). I don't
> know about others but I find it difficult to read such a code. You
> might want to consider moving this whole block to a separate function.
>

Removed extra code block in latest patch [v14]

> 3.
> /*
> + * XXX - Should optimize this to avoid multiple
> + * connect/disconnect.
> + */
> + wrconn = walrcv_connect(sub->conninfo, true, sub->name, &err);
>
> I think it is better to avoid multiple connect/disconnect here. In
> this same function, we have connected to the publisher, we should be
> able to use the same connection.
>

Fixed in latest patch [v14]

> 4.
> process_syncing_tables_for_sync()
> {
> ..
> + /*
> + * Cleanup the tablesync slot.
> + */
> + syncslotname = ReplicationSlotNameForTablesync(
> +    MySubscription->oid,
> +    MyLogicalRepWorker->relid);
> + PG_TRY();
> + {
> + elog(DEBUG1, "process_syncing_tables_for_sync: dropping the
> tablesync slot \"%s\".", syncslotname);
> + ReplicationSlotDropAtPubNode(wrconn, syncslotname);
> + }
> + PG_FINALLY();
> + {
> + pfree(syncslotname);
> + }
> + PG_END_TRY();
> ..
> }
>
> Both here and in DropSubscription(), it seems we are using
> PG_TRY..PG_FINALLY just to free the memory even though
> ReplicationSlotDropAtPubNode already has try..finally. Can we arrange
> code to move allocation of syncslotname inside
> ReplicationSlotDropAtPubNode to avoid additional try..finaly? BTW, if
> the usage of try..finally here is only to free the memory, I am not
> sure if it is required because I think we will anyway Reset the memory
> context where this memory is allocated as part of error handling.
>

Eliminated need for TRY/FINALLY to free syncslotname in latest patch [v14]

> 5.
>  #define SUBREL_STATE_DATASYNC 'd' /* data is being synchronized (sublsn
>   * NULL) */
> +#define SUBREL_STATE_TCOPYDONE 't' /* tablesync copy phase is completed
> + * (sublsn NULL) */
>  #define SUBREL_STATE_SYNCDONE 's' /* synchronization finished in front of
>   * apply (sublsn set) */
>
> I am not very happy with the new state name SUBREL_STATE_TCOPYDONE as
> it is quite different from other adjoining state names and somehow not
> going well with the code. How about SUBREL_STATE_ENDCOPY 'e' or
> SUBREL_STATE_FINISHEDCOPY 'f'?
>

Using SUBREL_STATE_FINISHEDCOPY in latest patch [v14]

---
[v14] = https://www.postgresql.org/message-id/CAHut%2BPsPO2vOp%2BP7U2szROMy15PKKGanhUrCYQ0ffpy9zG1V1A%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Single transaction in the tablesync worker?
Next
From: torikoshia
Date:
Subject: Re: Get memory contexts of an arbitrary backend process