RE: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: Perform streaming logical transactions by background workers and parallel apply
Msg-id OS0PR01MB5716C663C85687E76672327094FD9@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to Re: Perform streaming logical transactions by background workers and parallel apply  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Perform streaming logical transactions by background workers and parallel apply  (Peter Smith <smithpb2250@gmail.com>)
Re: Perform streaming logical transactions by background workers and parallel apply  (Masahiko Sawada <sawada.mshk@gmail.com>)
Re: Perform streaming logical transactions by background workers and parallel apply  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Thursday, January 12, 2023 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Thu, Jan 12, 2023 at 4:21 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Jan 12, 2023 at 10:34 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Jan 12, 2023 at 9:54 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > >
> > > > doc/src/sgml/monitoring.sgml
> > > >
> > > > 5. pg_stat_subscription
> > > >
> > > > @@ -3198,11 +3198,22 @@ SELECT pid, wait_event_type, wait_event
> > > > FROM pg_stat_activity WHERE wait_event i
> > > >
> > > >       <row>
> > > >        <entry role="catalog_table_entry"><para
> > > > role="column_definition">
> > > > +       <structfield>apply_leader_pid</structfield>
> <type>integer</type>
> > > > +      </para>
> > > > +      <para>
> > > > +       Process ID of the leader apply worker, if this process is a apply
> > > > +       parallel worker. NULL if this process is a leader apply worker or a
> > > > +       synchronization worker.
> > > > +      </para></entry>
> > > > +     </row>
> > > > +
> > > > +     <row>
> > > > +      <entry role="catalog_table_entry"><para
> > > > + role="column_definition">
> > > >         <structfield>relid</structfield> <type>oid</type>
> > > >        </para>
> > > >        <para>
> > > >         OID of the relation that the worker is synchronizing; null for the
> > > > -       main apply worker
> > > > +       main apply worker and the parallel apply worker
> > > >        </para></entry>
> > > >       </row>
> > > >
> > > > 5a.
> > > >
> > > > (Same as general comment #1 about terminology)
> > > >
> > > > "apply_leader_pid" --> "leader_apply_pid"
> > > >
> > >
> > > How about naming this as just leader_pid? I think it could be
> > > helpful in the future if we decide to parallelize initial sync (aka
> > > parallel copy) because then we could use this for the leader PID of
> > > parallel sync workers as well.
> > >
> > > --
> >
> > I still prefer leader_apply_pid.
> > leader_pid does not tell which 'operation' it belongs to. 'apply'
> > gives the clarity that it is apply related process.
> >
> 
> But then do you suggest that tomorrow if we allow parallel sync workers then
> we have a separate column leader_sync_pid? I think that doesn't sound like a
> good idea and moreover one can refer to docs for clarification.

I agree that leader_pid is the better name: it would also suit a future parallel
copy (sync) feature, and it is consistent with the leader_pid column in pg_stat_activity.
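
To illustrate the consistency point, a query along the following lines could list
parallel apply workers next to their leader. This is only a sketch: it assumes the
new pg_stat_subscription column ends up being named leader_pid as proposed, and it
is not output or a test taken from the patch.

```sql
-- Assumption: pg_stat_subscription exposes leader_pid as proposed in this thread.
-- pg_stat_activity is joined only to show the backend type of each worker.
SELECT s.subname,
       s.pid,
       s.leader_pid,      -- NULL for leader apply workers and sync workers
       a.backend_type
FROM pg_stat_subscription s
JOIN pg_stat_activity a ON a.pid = s.pid
WHERE s.leader_pid IS NOT NULL;
```

With the same column name in both views, a filter such as "leader_pid IS NOT NULL"
reads the same way whether one is looking at parallel query workers or parallel
apply workers.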

Attached is the new version of the patch, which addresses Peter's comments and
renames all the related names to leader_pid.

Best Regards,
Hou zj