Thread: PATCH: Add 'pid' column to pg_replication_slots
The attached patch adds a 'pid' column to pg_replication_slots, so it's possible to associate an active slot with the pg_stat_replication entry that describes the walsender using the slot.
--
If a user backend (or bgworker) is using the slot over the SQL interface, the 'pid' column will correspond to the pg_stat_activity entry for that backend instead. After all, both it and pg_stat_replication are views over pg_stat_get_activity() anyway.
Detailed rationale in patch.
Please consider this for 9.5. It's a pretty light patch, and it'd be good to have it in place to ease monitoring of slot-based replication.
Note that logical replication walsenders are broken in HEAD so testing this using the test_decoding module and pg_recvlogical will crash the walsender. This is a pre-existing bug; see http://www.postgresql.org/message-id/CAB7nPqQSdx7coHk0D6G=mkJntGYjXPDw+PWisKKSsAeZFTskvg@mail.gmail.com and http://www.postgresql.org/message-id/CAMsr+YEh50r70+hP+w=rCzEuenoQRCNMDA7PmRSK06Ro9r=9sA@mail.gmail.com)
--
Attachment
Hi, On 2015-04-07 18:41:59 +0800, Craig Ringer wrote: > The attached patch adds a 'pid' column to pg_replication_slots, so it's > possible to associate an active slot with the pg_stat_replication entry > that describes the walsender using the slot. This really should have been done that way from the get go. So I'm inclined to just apply this soon. Will do unless somebody protests. Greetings, Andres Freund
On 2015-04-07 18:41:59 +0800, Craig Ringer wrote: > @@ -331,8 +331,8 @@ ReplicationSlotAcquire(const char *name) > volatile ReplicationSlot *vslot = s; > > SpinLockAcquire(&s->mutex); > - active = vslot->active; > - vslot->active = true; > + active = vslot->active_pid != 0; > + vslot->active_pid = MyProcPid; > SpinLockRelease(&s->mutex); > slot = s; > break; Uh. You're overwriting the existing pid here. Not good if the slot is currently in use. > namecpy(&plugin, &slot->data.plugin); > > - active = slot->active; > + active_pid = slot->active_pid != 0; That doesn't look right. > --- a/src/include/replication/slot.h > +++ b/src/include/replication/slot.h > @@ -84,13 +84,15 @@ typedef struct ReplicationSlot > /* is this slot defined */ > bool in_use; > > - /* is somebody streaming out changes for this slot */ > - bool active; > + /* field 'active' removed in 9.5; see 'active_pid' instead */ > > /* any outstanding modifications? */ > bool just_dirtied; > bool dirty; > > + /* Who is streaming out changes for this slot? 0 for nobody */ > + pid_t active_pid; > + That's a horrible idea. That way we end up with dozens of indirections over time. I don't really like the 'pid' field for pg_replication_slots. About naming it 'active_in' or such? Other than these I plan to push this soon. Greetings, Andres Freund
On 21 April 2015 at 15:19, Andres Freund <andres@anarazel.de> wrote:
On 2015-04-07 18:41:59 +0800, Craig Ringer wrote:
> @@ -331,8 +331,8 @@ ReplicationSlotAcquire(const char *name)
> volatile ReplicationSlot *vslot = s;
>
> SpinLockAcquire(&s->mutex);
> - active = vslot->active;
> - vslot->active = true;
> + active = vslot->active_pid != 0;
> + vslot->active_pid = MyProcPid;
> SpinLockRelease(&s->mutex);
> slot = s;
> break;
Uh. You're overwriting the existing pid here. Not good if the slot is
currently in use.
Isn't that the point? We're acquiring the slot there, per the comment:
"Find a previously created slot and mark it as used by this backend."
> namecpy(&plugin, &slot->data.plugin);
>
> - active = slot->active;
> + active_pid = slot->active_pid != 0;
That doesn't look right.
No, that's certainly not right. I also could've sworn I sorted it out, but that must've been another site, because sure enough it's still there.
I don't really like the 'pid' field for pg_replication_slots. About
naming it 'active_in' or such?
It was originally named active_pid, but changed based on feedback from others that 'pid' would be consistent with pg_stat_activity and pg_replication_slots. I have no strong opinion on the name, though I'd prefer it reflect that the field does in fact represent a process ID.
On April 21, 2015 1:17:32 PM GMT+03:00, Craig Ringer <craig@2ndquadrant.com> wrote: >On 21 April 2015 at 15:19, Andres Freund <andres@anarazel.de> wrote: > >> On 2015-04-07 18:41:59 +0800, Craig Ringer wrote: >> > @@ -331,8 +331,8 @@ ReplicationSlotAcquire(const char *name) >> > volatile ReplicationSlot *vslot = s; >> > >> > SpinLockAcquire(&s->mutex); >> > - active = vslot->active; >> > - vslot->active = true; >> > + active = vslot->active_pid != 0; >> > + vslot->active_pid = MyProcPid; >> > SpinLockRelease(&s->mutex); >> > slot = s; >> > break; >> >> Uh. You're overwriting the existing pid here. Not good if the slot is >> currently in use. >> > >Isn't that the point? We're acquiring the slot there, per the comment: Read the rest of the function. We're checking for conflicts... --- Please excuse brevity and formatting - I am writing this on my mobile phone.
On Tue, Apr 21, 2015 at 6:17 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >> I don't really like the 'pid' field for pg_replication_slots. About >> naming it 'active_in' or such? > > It was originally named active_pid, but changed based on feedback from > others that 'pid' would be consistent with pg_stat_activity and > pg_replication_slots. I have no strong opinion on the name, though I'd > prefer it reflect that the field does in fact represent a process ID. Agreed. I don't like the as-committed name of active_in either. It's not at all clear what that means. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2015-04-21 10:53:08 -0400, Robert Haas wrote: > On Tue, Apr 21, 2015 at 6:17 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > >> I don't really like the 'pid' field for pg_replication_slots. About > >> naming it 'active_in' or such? > > > > It was originally named active_pid, but changed based on feedback from > > others that 'pid' would be consistent with pg_stat_activity and > > pg_replication_slots. I have no strong opinion on the name, though I'd > > prefer it reflect that the field does in fact represent a process ID. > > Agreed. I don't like the as-committed name of active_in either. It's > not at all clear what that means. I like it being called active_*, that makes the correlation to active clear. active_pid then? Greetings, Andres Freund
On Tue, Apr 21, 2015 at 04:54:57PM +0200, Andres Freund wrote: > On 2015-04-21 10:53:08 -0400, Robert Haas wrote: > > On Tue, Apr 21, 2015 at 6:17 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > > >> I don't really like the 'pid' field for pg_replication_slots. About > > >> naming it 'active_in' or such? > > > > > > It was originally named active_pid, but changed based on feedback from > > > others that 'pid' would be consistent with pg_stat_activity and > > > pg_replication_slots. I have no strong opinion on the name, though I'd > > > prefer it reflect that the field does in fact represent a process ID. > > > > Agreed. I don't like the as-committed name of active_in either. It's > > not at all clear what that means. > > I like it being called active_*, that makes the correlation to active > clear. active_pid then? Let's call it active_procpid. (Runs for cover!) ---- (For background, see 9.2 release note item: Rename pg_stat_activity.procpid to pid, to match other system tables (MagnusHagander) The 'p' in 'pid' stands for 'proc', so 'procpid' is redundant.) -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Tue, Apr 21, 2015 at 10:54 AM, Andres Freund <andres@anarazel.de> wrote: > On 2015-04-21 10:53:08 -0400, Robert Haas wrote: >> On Tue, Apr 21, 2015 at 6:17 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >> >> I don't really like the 'pid' field for pg_replication_slots. About >> >> naming it 'active_in' or such? >> > >> > It was originally named active_pid, but changed based on feedback from >> > others that 'pid' would be consistent with pg_stat_activity and >> > pg_replication_slots. I have no strong opinion on the name, though I'd >> > prefer it reflect that the field does in fact represent a process ID. >> >> Agreed. I don't like the as-committed name of active_in either. It's >> not at all clear what that means. > > I like it being called active_*, that makes the correlation to active > clear. active_pid then? wfm -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company