Thread: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

From
Bharath Rupireddy
Date:
Hi,

The active_pid of ReplicationSlot structure, which tells whether a
replication slot is active or inactive, isn't persisted to the disk
i.e has no entry in ReplicationSlotPersistentData structure. Isn't it
better if we add that info to ReplicationSlotPersistentData structure
and persist to the disk? This will help to know what were the inactive
replication slots in case the server goes down or crashes for some
reason. Currently, we don't have a way to interpret the replication
slot info in the disk but there's a patch for pg_replslotdata tool at
[1]. This way, one can figure out the reasons for the server
down/crash and figure out which replication slots to remove to bring
the server up and running without touching the other replication
slots.

Thoughts?

[1] - https://www.postgresql.org/message-id/CALj2ACW0rV5gWK8A3m6_X62qH%2BVfaq5hznC%3Di0R5Wojt5%2Byhyw%40mail.gmail.com

Regards,
Bharath Rupireddy.



Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

From
Alvaro Herrera
Date:
On 2021-Dec-01, Bharath Rupireddy wrote:

> The active_pid of ReplicationSlot structure, which tells whether a
> replication slot is active or inactive, isn't persisted to the disk
> i.e has no entry in ReplicationSlotPersistentData structure. Isn't it
> better if we add that info to ReplicationSlotPersistentData structure
> and persist to the disk? This will help to know what were the inactive
> replication slots in case the server goes down or crashes for some
> reason. Currently, we don't have a way to interpret the replication
> slot info in the disk but there's a patch for pg_replslotdata tool at
> [1]. This way, one can figure out the reasons for the server
> down/crash and figure out which replication slots to remove to bring
> the server up and running without touching the other replication
> slots.

I think the PIDs are log-worthy for sure, but it's not clear to me that
it is desirable to write them to the persistent state file.  In case of
crashes, the log should serve just fine to aid root cause investigation
-- in fact even better than the persistent file, where the data would be
lost as soon as the next client acquires that slot.

-- 
Álvaro Herrera              Valdivia, Chile  —  https://www.EnterpriseDB.com/
"The Postgresql hackers have what I call a "NASA space shot" mentality.
 Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)



Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

From
Bharath Rupireddy
Date:
On Wed, Dec 1, 2021 at 9:50 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Dec-01, Bharath Rupireddy wrote:
>
> > The active_pid of ReplicationSlot structure, which tells whether a
> > replication slot is active or inactive, isn't persisted to the disk
> > i.e has no entry in ReplicationSlotPersistentData structure. Isn't it
> > better if we add that info to ReplicationSlotPersistentData structure
> > and persist to the disk? This will help to know what were the inactive
> > replication slots in case the server goes down or crashes for some
> > reason. Currently, we don't have a way to interpret the replication
> > slot info in the disk but there's a patch for pg_replslotdata tool at
> > [1]. This way, one can figure out the reasons for the server
> > down/crash and figure out which replication slots to remove to bring
> > the server up and running without touching the other replication
> > slots.
>
> I think the PIDs are log-worthy for sure, but it's not clear to me that
> it is desirable to write them to the persistent state file.  In case of
> crashes, the log should serve just fine to aid root cause investigation
> -- in fact even better than the persistent file, where the data would be
> lost as soon as the next client acquires that slot.

Thanks. +1 to log a message at LOG level whenever a replication slot
becomes active (gets assigned a valid pid to active_pid) and becomes
inactive(gets assigned 0 to active_pid). Having said that, isn't it
also helpful if we write a bool (1 byte character) whenever the slot
becomes active and inactive to the disk?

Regards,
Bharath Rupireddy.



Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

From
Bharath Rupireddy
Date:
On Fri, Dec 3, 2021 at 7:39 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Dec 1, 2021 at 9:50 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2021-Dec-01, Bharath Rupireddy wrote:
> >
> > > The active_pid of ReplicationSlot structure, which tells whether a
> > > replication slot is active or inactive, isn't persisted to the disk
> > > i.e has no entry in ReplicationSlotPersistentData structure. Isn't it
> > > better if we add that info to ReplicationSlotPersistentData structure
> > > and persist to the disk? This will help to know what were the inactive
> > > replication slots in case the server goes down or crashes for some
> > > reason. Currently, we don't have a way to interpret the replication
> > > slot info in the disk but there's a patch for pg_replslotdata tool at
> > > [1]. This way, one can figure out the reasons for the server
> > > down/crash and figure out which replication slots to remove to bring
> > > the server up and running without touching the other replication
> > > slots.
> >
> > I think the PIDs are log-worthy for sure, but it's not clear to me that
> > it is desirable to write them to the persistent state file.  In case of
> > crashes, the log should serve just fine to aid root cause investigation
> > -- in fact even better than the persistent file, where the data would be
> > lost as soon as the next client acquires that slot.
>
> Thanks. +1 to log a message at LOG level whenever a replication slot
> becomes active (gets assigned a valid pid to active_pid) and becomes
> inactive(gets assigned 0 to active_pid). Having said that, isn't it
> also helpful if we write a bool (1 byte character) whenever the slot
> becomes active and inactive to the disk?

Here's the patch that adds a LOG message whenever a replication slot
becomes active and inactive. These logs will be extremely useful on
production servers to debug and analyze inactive replication slot
issues.

Thoughts?

Regards,
Bharath Rupireddy.

Attachment

Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

From
Kyotaro Horiguchi
Date:
At Tue, 14 Dec 2021 19:04:09 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> On Fri, Dec 3, 2021 at 7:39 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Wed, Dec 1, 2021 at 9:50 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > > I think the PIDs are log-worthy for sure, but it's not clear to me that
> > > it is desirable to write them to the persistent state file.  In case of
> > > crashes, the log should serve just fine to aid root cause investigation
> > > -- in fact even better than the persistent file, where the data would be
> > > lost as soon as the next client acquires that slot.
> >
> > Thanks. +1 to log a message at LOG level whenever a replication slot
> > becomes active (gets assigned a valid pid to active_pid) and becomes
> > inactive(gets assigned 0 to active_pid). Having said that, isn't it
> > also helpful if we write a bool (1 byte character) whenever the slot
> > becomes active and inactive to the disk?
> 
> Here's the patch that adds a LOG message whenever a replication slot
> becomes active and inactive. These logs will be extremely useful on
> production servers to debug and analyze inactive replication slot
> issues.
> 
> Thoughts?

If I create a replication slot, I saw the following lines in server log.

[22054:client backend] LOG:  replication slot "s1" becomes active
[22054:client backend] DETAIL:  The process with PID 22054 acquired it.
[22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');
[22054:client backend] LOG:  replication slot "s1" becomes inactive
[22054:client backend] DETAIL:  The process with PID 22054 released it.
[22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');

They are apparently too much as if they were DEBUG3 lines.  The
process PID shown is of the process the slot operations took place so
I think it conveys no information.  The STATEMENT lines are also noisy
for non-ERROR emssages.  Couldn't we hide that line?

That is, how about making the log lines as simple as the follows?

[17156:walsender]  LOG:  acquired replication slot "s1"
[17156:walsender]  LOG:  released replication slot "s1"

I think in the first place we don't need this log lines at slot
creation since it is actually not acquirement nor releasing of a slot.


It behaves in a strange way when executing pg_basebackup.

[22864:walsender] LOG:  replication slot "pg_basebackup_22864" becomes active
[22864:walsender] DETAIL:  The process with PID 22864 acquired it.
[22864:walsender] STATEMENT:  CREATE_REPLICATION_SLOT "pg_basebackup_22864" TEMPORARY PHYSICAL ( RESERVE_WAL)
[22864:walsender] LOG:  replication slot "pg_basebackup_22864" becomes active
[22864:walsender] DETAIL:  The process with PID 22864 acquired it.
[22864:walsender] STATEMENT:  START_REPLICATION SLOT "pg_basebackup_22864" 0/6000000 TIMELINE 1
[22864:walsender] LOG:  replication slot "pg_basebackup_22864" becomes inactive
[22864:walsender] DETAIL:  The process with PID 22864 released it.

The slot is acquired twice then released once.  It is becuase the
patch doesn't emit "becomes inactive" line when releasing a temporary
slot. However, I'd rather think we don't need the first 'become
active' line like the previous example.


@@ -658,6 +690,13 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
     slot->active_pid = 0;
     slot->in_use = false;
     LWLockRelease(ReplicationSlotControlLock);
+
+    if (pid > 0)
+        ereport(LOG,
+                (errmsg("replication slot \"%s\" becomes inactive",
+                        NameStr(slot->data.name)),
+                 errdetail("The process with PID %d released it.", pid)));
+

This is wrong.  I see a "become inactive" message if I droped an
"inactive" replication slot.  The reason the inactive slot looks as if
it were acquired is it is temporarily aquired as a preparing step of
dropping.


Even assuming that the log lines are simplified to this extent, I
still see it a bit strange that the "becomes acitve (or acruied)"
message shows alone without having a message like "replication
connection accepted".  But that would be another issue even if it is
true.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

From
Bharath Rupireddy
Date:
On Wed, Dec 15, 2021 at 8:32 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> > Here's the patch that adds a LOG message whenever a replication slot
> > becomes active and inactive. These logs will be extremely useful on
> > production servers to debug and analyze inactive replication slot
> > issues.
> >
> > Thoughts?
>
> If I create a replication slot, I saw the following lines in server log.
>
> [22054:client backend] LOG:  replication slot "s1" becomes active
> [22054:client backend] DETAIL:  The process with PID 22054 acquired it.
> [22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');
> [22054:client backend] LOG:  replication slot "s1" becomes inactive
> [22054:client backend] DETAIL:  The process with PID 22054 released it.
> [22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');
>
> They are apparently too much as if they were DEBUG3 lines.  The
> process PID shown is of the process the slot operations took place so
> I think it conveys no information.  The STATEMENT lines are also noisy
> for non-ERROR emssages.  Couldn't we hide that line?
>
> That is, how about making the log lines as simple as the follows?
>
> [17156:walsender]  LOG:  acquired replication slot "s1"
> [17156:walsender]  LOG:  released replication slot "s1"

Thanks for taking a look at the patch. Here's what I've come up with:

for drops:
2021-12-28 06:39:34.963 UTC [2541600] LOG:  acquired persistent
physical replication slot "myslot1"
2021-12-28 06:39:34.980 UTC [2541600] LOG:  dropped persistent
physical replication slot "myslot1"
2021-12-28 06:47:39.994 UTC [2544153] LOG:  acquired persistent
logical replication slot "myslot2"
2021-12-28 06:47:40.003 UTC [2544153] LOG:  dropped persistent logical
replication slot "myslot2"

for creates:
2021-12-28 06:39:46.859 UTC [2541600] LOG:  created persistent
physical replication slot "myslot1"
2021-12-28 06:39:46.859 UTC [2541600] LOG:  released persistent
physical replication slot "myslot1"
2021-12-28 06:45:20.037 UTC [2544153] LOG:  created persistent logical
replication slot "myslot2"
2021-12-28 06:45:20.058 UTC [2544153] LOG:  released persistent
logical replication slot "myslot2"

for pg_basebackup:
2021-12-28 06:41:04.601 UTC [2542686] LOG:  created temporary physical
replication slot "pg_basebackup_2542686"
2021-12-28 06:41:04.602 UTC [2542686] LOG:  released temporary
physical replication slot "pg_basebackup_2542686"
2021-12-28 06:41:04.602 UTC [2542686] LOG:  acquired temporary
physical replication slot "pg_basebackup_2542686"
2021-12-28 06:41:04.867 UTC [2542686] LOG:  released temporary
physical replication slot "pg_basebackup_2542686"
2021-12-28 06:41:04.954 UTC [2542686] LOG:  dropped temporary physical
replication slot "pg_basebackup_2542686"

The amount of logs may seem noisy, but they do help a lot given the
fact that the server generates much more noise, for instance, the
server logs the syntax error statements. And, the replication slots
don't get created/dropped every now and then (at max, the
pg_basebackup if at all used and configured to take the backups, say,
every 24hrs or so). With the above set of logs debugging for inactive
replication slots becomes easier. One can find the root cause and
answer questions like "why there was a huge WAL file growth at some
point or when did a replication slot become inactive or how much time
a replication slot was inactive or when did a standby disconnected and
connected again or when did a pg_receivewal or pg_recvlogical
connected and disconnected so on.".

Here's the v2 patch. Please provide your thoughts.

Regards,
Bharath Rupireddy.

Attachment

Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

From
Kyotaro Horiguchi
Date:
At Tue, 28 Dec 2021 12:28:07 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> On Wed, Dec 15, 2021 at 8:32 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > > Here's the patch that adds a LOG message whenever a replication slot
> > > becomes active and inactive. These logs will be extremely useful on
> > > production servers to debug and analyze inactive replication slot
> > > issues.
> > >
> > > Thoughts?
> >
> > If I create a replication slot, I saw the following lines in server log.
> >
> > [22054:client backend] LOG:  replication slot "s1" becomes active
> > [22054:client backend] DETAIL:  The process with PID 22054 acquired it.
> > [22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');
> > [22054:client backend] LOG:  replication slot "s1" becomes inactive
> > [22054:client backend] DETAIL:  The process with PID 22054 released it.
> > [22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');
> >
> > They are apparently too much as if they were DEBUG3 lines.  The
> > process PID shown is of the process the slot operations took place so
> > I think it conveys no information.  The STATEMENT lines are also noisy
> > for non-ERROR emssages.  Couldn't we hide that line?
> >
> > That is, how about making the log lines as simple as the follows?
> >
> > [17156:walsender]  LOG:  acquired replication slot "s1"
> > [17156:walsender]  LOG:  released replication slot "s1"
> 
> Thanks for taking a look at the patch. Here's what I've come up with:
> 
> for drops:
(two log lines per slot: acquire->drop)
> 
> for creates:
(two log lines per slot: create->release)
..

Theses are still needlessly verbose.  Even for those who want slot
activities to be logged are not interested in this detail.  This is
still debug logs in that sense.

> The amount of logs may seem noisy, but they do help a lot given the
> fact that the server generates much more noise, for instance, the
> server logs the syntax error statements. And, the replication slots
> don't get created/dropped every now and then (at max, the
> pg_basebackup if at all used and configured to take the backups, say,
> every 24hrs or so). With the above set of logs debugging for inactive

In a nearby thread, there was a discussion that checkpoint logs are
too noisy to defaultly turn on.  Finally it is committed but I don't
think this is committed as is as it is more verbose (IMV) than the
checkpoint logs. Thus this logs need to be muteable.  Meanwhile I
don't think we willingly add a new knob for this feature. I think we
can piggy-back on log_replication_commands for the purpose, changing
its meaning slightly to "log replication commands and related
activity".

> replication slots becomes easier. One can find the root cause and
> answer questions like "why there was a huge WAL file growth at some
> point or when did a replication slot become inactive or how much time
> a replication slot was inactive or when did a standby disconnected and
> connected again or when did a pg_receivewal or pg_recvlogical
> connected and disconnected so on.".

I don't deny it is useful in that cases.  If you are fine that the
logs are debug-only, making them DEBUG1 would work.  But I don't think
you are fine with that since I think you are going to turn on them on
production systems.

> Here's the v2 patch. Please provide your thoughts.

Thanks!  I have three comments on this version.

- I still think "acquire/release" logs on creation/dropping should be
  silenced.  Adding the third parameter to ReplicationSlotAcquire()
  that can mute the acquiring (and as well as the corresponding
  releasing) log will work.

- Need to mute the logs by log_replication_commands.  (We could add
  another choice for the variable for this purpose but I think we
  don't need it.)

- The messages are not translatable as the variable parts are
  adjectives. They need to consist of static sentences.  The
  combinations of the two properties are 6 (note that persistence is
  tristate) but I don't think we accept that complexity for the
  information.  So I recommend to just remove the variable parts from
  the messages.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

From
Bharath Rupireddy
Date:
On Wed, Jan 5, 2022 at 12:13 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > Here's the v2 patch. Please provide your thoughts.
>
> Thanks!  I have three comments on this version.

Thanks for your review.

> - I still think "acquire/release" logs on creation/dropping should be
>   silenced.  Adding the third parameter to ReplicationSlotAcquire()
>   that can mute the acquiring (and as well as the corresponding
>   releasing) log will work.

Done.

> can piggy-back on log_replication_commands for the purpose, changing
> its meaning slightly to "log replication commands and related
> activity".
> - Need to mute the logs by log_replication_commands.  (We could add
>   another choice for the variable for this purpose but I think we
>   don't need it.)

Done.

> - The messages are not translatable as the variable parts are
>   adjectives. They need to consist of static sentences.  The
>   combinations of the two properties are 6 (note that persistence is
>   tristate) but I don't think we accept that complexity for the
>   information.  So I recommend to just remove the variable parts from
>   the messages.

Done.

Here's v3, please review it further.

Regards,
Bharath Rupireddy.
Attachment

Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

From
Bharath Rupireddy
Date:
On Mon, Jan 10, 2022 at 6:50 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Jan 5, 2022 at 12:13 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > > Here's the v2 patch. Please provide your thoughts.
> >
> > Thanks!  I have three comments on this version.
>
> Thanks for your review.
>
> > - I still think "acquire/release" logs on creation/dropping should be
> >   silenced.  Adding the third parameter to ReplicationSlotAcquire()
> >   that can mute the acquiring (and as well as the corresponding
> >   releasing) log will work.
>
> Done.
>
> > can piggy-back on log_replication_commands for the purpose, changing
> > its meaning slightly to "log replication commands and related
> > activity".
> > - Need to mute the logs by log_replication_commands.  (We could add
> >   another choice for the variable for this purpose but I think we
> >   don't need it.)
>
> Done.
>
> > - The messages are not translatable as the variable parts are
> >   adjectives. They need to consist of static sentences.  The
> >   combinations of the two properties are 6 (note that persistence is
> >   tristate) but I don't think we accept that complexity for the
> >   information.  So I recommend to just remove the variable parts from
> >   the messages.
>
> Done.
>
> Here's v3, please review it further.

Here's the rebased v4 patch.

Regards,
Bharath Rupireddy.

Attachment
On Sun, Feb 27, 2022 at 3:16 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Jan 10, 2022 at 6:50 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Wed, Jan 5, 2022 at 12:13 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > > > Here's the v2 patch. Please provide your thoughts.
> > >
> > > Thanks!  I have three comments on this version.
> >
> > Thanks for your review.
> >
> > > - I still think "acquire/release" logs on creation/dropping should be
> > >   silenced.  Adding the third parameter to ReplicationSlotAcquire()
> > >   that can mute the acquiring (and as well as the corresponding
> > >   releasing) log will work.
> >
> > Done.
> >
> > > can piggy-back on log_replication_commands for the purpose, changing
> > > its meaning slightly to "log replication commands and related
> > > activity".
> > > - Need to mute the logs by log_replication_commands.  (We could add
> > >   another choice for the variable for this purpose but I think we
> > >   don't need it.)
> >
> > Done.
> >
> > > - The messages are not translatable as the variable parts are
> > >   adjectives. They need to consist of static sentences.  The
> > >   combinations of the two properties are 6 (note that persistence is
> > >   tristate) but I don't think we accept that complexity for the
> > >   information.  So I recommend to just remove the variable parts from
> > >   the messages.
> >
> > Done.
> >
> > Here's v3, please review it further.
>
> Here's the rebased v4 patch.

Here's the rebased v5 patch.

Regards,
Bharath Rupireddy.

Attachment
If I read the code right, this patch emits logs when
pg_logical_slot_get_changes and pg_replication_slot_advance SQL
functions are called.  Is this desirable/useful, for the use case you
stated at start of thread?  I think it is most likely pointless.  If you
get rid of those, then the only acquisitions that would log messages are
those in StartReplication and StartLogicalReplication.  So I wonder if
it would be better to leave the API of ReplicationSlotAcquire() alone,
and instead make StartReplication and StartLogicalReplication
responsible for those messages.

I didn't look at the release-side messages you're adding, but I suppose
it should be symmetrical.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)



On Fri, Apr 29, 2022 at 4:02 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> If I read the code right, this patch emits logs when
> pg_logical_slot_get_changes and pg_replication_slot_advance SQL
> functions are called.  Is this desirable/useful, for the use case you
> stated at start of thread?  I think it is most likely pointless.  If you
> get rid of those, then the only acquisitions that would log messages are
> those in StartReplication and StartLogicalReplication.  So I wonder if
> it would be better to leave the API of ReplicationSlotAcquire() alone,
> and instead make StartReplication and StartLogicalReplication
> responsible for those messages.
>
> I didn't look at the release-side messages you're adding, but I suppose
> it should be symmetrical.

Adding the messages right after ReplicationSlotAcquire in the
StartReplication and StartLogicalReplication looks okay, but, if we
just add "released replication slot \"%s\" for logical/physical
replication". in StartReplication and StartLogicalReplication right
after ReplicationSlotRelease, how about ReplicationSlotRelease in
WalSndErrorCleanup and ReplicationSlotShmemExit?

The whole idea is to get to know when and how much time a slot was
inactive/unused when log_replication_commands is set to true.

We can think of adding a timestamp column to on-disk replication slot
data but that will be too much.

Regards,
Bharath Rupireddy.



On Thu, May 5, 2022 at 2:57 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Fri, Apr 29, 2022 at 4:02 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > If I read the code right, this patch emits logs when
> > pg_logical_slot_get_changes and pg_replication_slot_advance SQL
> > functions are called.  Is this desirable/useful, for the use case you
> > stated at start of thread?  I think it is most likely pointless.  If you
> > get rid of those, then the only acquisitions that would log messages are
> > those in StartReplication and StartLogicalReplication.  So I wonder if
> > it would be better to leave the API of ReplicationSlotAcquire() alone,
> > and instead make StartReplication and StartLogicalReplication
> > responsible for those messages.
> >
> > I didn't look at the release-side messages you're adding, but I suppose
> > it should be symmetrical.

Here's the v6 patch, a much simpler one - no changes to any of the
existing function APIs. Please see the sample logs at [1]. There's a
bit of duplicate code in the v6 patch, if the overall approach looks
okay, I can remove that too in the next version of the patch.

Thoughts?

[1]
2022-07-25 12:30:14.847 UTC [152873] LOG:  acquired physical
replication slot "foo"
2022-07-25 12:30:20.878 UTC [152873] LOG:  released physical
replication slot "foo"

2022-07-25 12:49:18.023 UTC [168738] LOG:  acquired logical
replication slot "bar"
2022-07-25 12:49:28.105 UTC [168738] LOG:  released logical
replication slot "bar"

Regards,
Bharath Rupireddy.

Attachment
On Mon, Jul 25, 2022 at 6:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Here's the v6 patch, a much simpler one - no changes to any of the
> existing function APIs. Please see the sample logs at [1]. There's a
> bit of duplicate code in the v6 patch, if the overall approach looks
> okay, I can remove that too in the next version of the patch.

I modified the log_replication_commands description in guc_tables.c.
Please review the v7 patch further.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
On Thu, Sep 15, 2022 at 10:39 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Jul 25, 2022 at 6:31 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Here's the v6 patch, a much simpler one - no changes to any of the
> > existing function APIs. Please see the sample logs at [1]. There's a
> > bit of duplicate code in the v6 patch, if the overall approach looks
> > okay, I can remove that too in the next version of the patch.
>
> I modified the log_replication_commands description in guc_tables.c.
> Please review the v7 patch further.
>

I see that you have modified the patch to address the comments from
Alvaro. Personally, I feel it would be better to add such a message at
a centralized location instead of spreading these in different callers
of slot acquire/release functionality to avoid getting these missed in
the new callers in the future. However, if Alvaro and others think
that the current style is better then we should go ahead and do it
that way. I hope that we should be able to decide on this and get it
into PG16. Anyone else would like to weigh in here?

--
With Regards,
Amit Kapila.



Hi, I had a quick look at the v7 patch.

You might consider to encapsulate some of this logic in new functions like:

void
LogReplicationSlotAquired(bool is_physical, char *slotname)
{
    loglevel = log_replication_commands ? LOG : DEBUG3;

    if (is_physical)
        ereport(loglevel,
            (errmsg("acquired physical replication slot \"%s\"", slotname)));
    else
        ereport(loglevel,
            (errmsg("acquired logical replication slot \"%s\"", slotname)));
}

void
LogReplicationSlotReleased(bool is_physical, char *slotname)
{
    loglevel = log_replication_commands ? LOG : DEBUG3;

    if (is_physical)
        ereport(loglevel,
            (errmsg("released physical replication slot \"%s\"", slotname)));
    else
        ereport(loglevel,
            (errmsg("released logical replication slot \"%s\"", slotname)));
}

~~

THEN

ReplicationSlotShmemExit and WalSndErrorCleanup can call it like:
if (MyReplicationSlot != NULL)
{
    bool is_physical = SlotIsPhysical(MyReplicationSlot);
    char *slotname = pstrdup(NameStr(MyReplicationSlot->data.name));

    ReplicationSlotRelease();
    LogReplicationSlotReleased(is_physical, slotname);
}

~

StartlReplication can call like:
LogReplicationSlotAquired(true, cmd->slotname);
...
LogReplicationSlotReleased(true, cmd->slotname);

~

StartLogicalReplication can call like:
LogReplicationSlotAquired(false, cmd->slotname);
...
LogReplicationSlotReleased(false, cmd->slotname);


~~~

TBH, I am not sure for the *current* code if the encapsulation is
worth the trouble or not. But maybe at least it helps message
consistency and will make it easier if future callers are needed. I
guess those functions could also give you some central point to
comment the intent of this logging? Feel free to take or leave this
code suggestion.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



On 2023-Mar-22, Amit Kapila wrote:

> I see that you have modified the patch to address the comments from
> Alvaro. Personally, I feel it would be better to add such a message at
> a centralized location instead of spreading these in different callers
> of slot acquire/release functionality to avoid getting these missed in
> the new callers in the future. However, if Alvaro and others think
> that the current style is better then we should go ahead and do it
> that way. I hope that we should be able to decide on this and get it
> into PG16. Anyone else would like to weigh in here?

I like Peter Smith's suggestion downthread.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



On Thu, Mar 23, 2023 at 3:37 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Mar-22, Amit Kapila wrote:
>
> > I see that you have modified the patch to address the comments from
> > Alvaro. Personally, I feel it would be better to add such a message at
> > a centralized location instead of spreading these in different callers
> > of slot acquire/release functionality to avoid getting these missed in
> > the new callers in the future. However, if Alvaro and others think
> > that the current style is better then we should go ahead and do it
> > that way. I hope that we should be able to decide on this and get it
> > into PG16. Anyone else would like to weigh in here?
>
> I like Peter Smith's suggestion downthread.

+1. Please review the attached v8 patch further.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
On Fri, Mar 24, 2023 at 4:33 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Mar 23, 2023 at 3:37 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2023-Mar-22, Amit Kapila wrote:
> >
> > > I see that you have modified the patch to address the comments from
> > > Alvaro. Personally, I feel it would be better to add such a message at
> > > a centralized location instead of spreading these in different callers
> > > of slot acquire/release functionality to avoid getting these missed in
> > > the new callers in the future. However, if Alvaro and others think
> > > that the current style is better then we should go ahead and do it
> > > that way. I hope that we should be able to decide on this and get it
> > > into PG16. Anyone else would like to weigh in here?
> >
> > I like Peter Smith's suggestion downthread.
>
> +1. Please review the attached v8 patch further.
>

Patch v8 applied OK, and builds/renders the HTML docs OK, and passes
the regression and subscription TAP tests OK.

(Note - I didn't do any additional manual testing, and I've assumed it
to be covering all the necessary acquire/related logging that you
wanted).

~~

Here are some minor comments:

1.
+ ereport(log_replication_commands ? LOG : DEBUG3,
+ (errmsg("acquired physical replication slot \"%s\"",
+ slotname)));

AFAIK those extra parentheses wrapping the "errmsg" part are not necessary.

~~

2.
extern void LogReplicationSlotAquired(bool is_physical, char *slotname);
extern void LogReplicationSlotReleased(bool is_physical, char *slotname);

The "char *slotname" params of those helper functions should probably
be declared and defined as "const char *slotname".

~~

Otherwise, from a code review perspective the patch v8 LGTM.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



On Thu, Mar 23, 2023, at 2:33 PM, Bharath Rupireddy wrote:
On Thu, Mar 23, 2023 at 3:37 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Mar-22, Amit Kapila wrote:
>
> > I see that you have modified the patch to address the comments from
> > Alvaro. Personally, I feel it would be better to add such a message at
> > a centralized location instead of spreading these in different callers
> > of slot acquire/release functionality to avoid getting these missed in
> > the new callers in the future. However, if Alvaro and others think
> > that the current style is better then we should go ahead and do it
> > that way. I hope that we should be able to decide on this and get it
> > into PG16. Anyone else would like to weigh in here?
>
> I like Peter Smith's suggestion downthread.

+1. Please review the attached v8 patch further.
If you are adding separate functions as suggested, you should add a comment at
the top of ReplicationSlotAcquire() and ReplicationSlotRelease() functions
saying that LogReplicationSlotAquired() and LogReplicationSlotReleased()
functions should be called  respectively after it.

My suggestion is that the functions should have the same name with a "Log"
prefix. On of them has a typo "Aquired" in its name. Hence,
LogReplicationSlotAcquire() and LogReplicationSlotRelease() as names. It is
easier to find if someone is grepping by the origin function.

I prefer a sentence that includes a verb.

  physical replication slot \"%s\" is acquired
  logical replication slot \"%s\" is released

Isn't the PID important for this use case? If so, of course, you can rely on
log_line_prefix (%p) but if the PID is crucial for an investigation then it
should also be included in the message.


--
Euler Taveira

On Fri, Mar 24, 2023 at 3:11 AM Euler Taveira <euler@eulerto.com> wrote:
>
> If you are adding separate functions as suggested, you should add a comment at
> the top of ReplicationSlotAcquire() and ReplicationSlotRelease() functions
> saying that LogReplicationSlotAquired() and LogReplicationSlotReleased()
> functions should be called  respectively after it.

Done.

> My suggestion is that the functions should have the same name with a "Log"
> prefix. On of them has a typo "Aquired" in its name. Hence,
> LogReplicationSlotAcquire() and LogReplicationSlotRelease() as names. It is
> easier to find if someone is grepping by the origin function.

Done.

> I prefer a sentence that includes a verb.
>
>   physical replication slot \"%s\" is acquired
>   logical replication slot \"%s\" is released

Hm, changed for now. But I'll leave it to the committer's discretion.

> Isn't the PID important for this use case? If so, of course, you can rely on
> log_line_prefix (%p) but if the PID is crucial for an investigation then it
> should also be included in the message.

On Fri, Mar 24, 2023 at 3:10 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Patch v8 applied OK, and builds/renders the HTML docs OK, and passes
> the regression and subscription TAP tests OK.
>
> Here are some minor comments:
>
> 1.
> + ereport(log_replication_commands ? LOG : DEBUG3,
> + (errmsg("acquired physical replication slot \"%s\"",
> + slotname)));
>
> AFAIK those extra parentheses wrapping the "errmsg" part are not necessary.

Done

> 2.
> extern void LogReplicationSlotAquired(bool is_physical, char *slotname);
> extern void LogReplicationSlotReleased(bool is_physical, char *slotname);
>
> The "char *slotname" params of those helper functions should probably
> be declared and defined as "const char *slotname".

Done.

> Otherwise, from a code review perspective the patch v8 LGTM.

Thanks. Please have a look at the v9 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
On Thu, 23 Mar 2023 at 23:30, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> > + ereport(log_replication_commands ? LOG : DEBUG3,
> > + (errmsg("acquired physical replication slot \"%s\"",
> > + slotname)));

So this is just a bit of bike-shedding but I don't feel like these log
messages really meet the standard we set for our logging. Like what
did the acquiring? What does "acquired" actually mean for a
replication slot? Is there not any meta information about the
acquisition that can give more context to the reader to make this
message more meaningful?

I would expect a log message like this to say, I dunno, something like
"physical replication slot \"%s\" acquired by streaming TCP connection
to 192.168.0.1:999 at LSN ... with xxxMB of logs to read"

I even would be wondering if the other end shouldn't also be logging a
corresponding log and we shouldn't be going out of our way to ensure
there's enough information to match them up and presenting them in a
way that makes that easy.

-- 
greg



On 2023-Mar-23, Greg Stark wrote:

> On Thu, 23 Mar 2023 at 23:30, Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > > + ereport(log_replication_commands ? LOG : DEBUG3,
> > > + (errmsg("acquired physical replication slot \"%s\"",
> > > + slotname)));
> 
> So this is just a bit of bike-shedding but I don't feel like these log
> messages really meet the standard we set for our logging. Like what
> did the acquiring? What does "acquired" actually mean for a
> replication slot? Is there not any meta information about the
> acquisition that can give more context to the reader to make this
> message more meaningful?
> 
> I would expect a log message like this to say, I dunno, something like
> "physical replication slot \"%s\" acquired by streaming TCP connection
> to 192.168.0.1:999 at LSN ... with xxxMB of logs to read"

Hmm, I don't disagree with your argument in principle, but I think this
proposal is going too far.  I think stating the PID is more than
sufficient.  And I don't think we need this patch to go great lengths to
explain what acquisition is, either; I mean, maybe that's a good thing
to have, but then that's a different patch.

> I even would be wondering if the other end shouldn't also be logging a
> corresponding log and we shouldn't be going out of our way to ensure
> there's enough information to match them up and presenting them in a
> way that makes that easy.

Hmm, you should be able to match things using the connection
information.  I don't think the slot acquisition operation in itself is
that important.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Use it up, wear it out, make it do, or do without"



On Fri, Mar 24, 2023 at 3:05 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Mar-23, Greg Stark wrote:
>
> > On Thu, 23 Mar 2023 at 23:30, Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > > + ereport(log_replication_commands ? LOG : DEBUG3,
> > > > + (errmsg("acquired physical replication slot \"%s\"",
> > > > + slotname)));
> >
> > So this is just a bit of bike-shedding but I don't feel like these log
> > messages really meet the standard we set for our logging. Like what
> > did the acquiring? What does "acquired" actually mean for a
> > replication slot? Is there not any meta information about the
> > acquisition that can give more context to the reader to make this
> > message more meaningful?
> >
> > I would expect a log message like this to say, I dunno, something like
> > "physical replication slot \"%s\" acquired by streaming TCP connection
> > to 192.168.0.1:999 at LSN ... with xxxMB of logs to read"
>
> Hmm, I don't disagree with your argument in principle, but I think this
> proposal is going too far.  I think stating the PID is more than
> sufficient.

Do you mean to have something like "physical/logical replication slot
\"%s\" is released/acquired by PID %d", MyProcPid? If yes, the
log_line_prefix already contains PID right? Or do we want to cover the
cases when someone changes log_line_prefix to not contain PID?

> And I don't think we need this patch to go great lengths to
> explain what acquisition is, either; I mean, maybe that's a good thing
> to have, but then that's a different patch.
>
> > I even would be wondering if the other end shouldn't also be logging a
> > corresponding log and we shouldn't be going out of our way to ensure
> > there's enough information to match them up and presenting them in a
> > way that makes that easy.
>
> Hmm, you should be able to match things using the connection
> information.  I don't think the slot acquisition operation in itself is
> that important.

Yeah, the intention of the patch is to track the patterns of slot
acquisitions and releases to aid analysis. Of course, this information
alone may not help but when matched with others in the logs, it will.

The v9 patch was failing because I was using MyReplicationSlot after
it got reset by slot release, attached v10 patch fixes it.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
On Mon, Mar 27, 2023 at 11:08 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> The v9 patch was failing because I was using MyReplicationSlot after
> it got reset by slot release, attached v10 patch fixes it.
>

+ *
+ * Note: use LogReplicationSlotAcquire() if needed, to log a message after
+ * acquiring the replication slot.
  */
 void
 ReplicationSlotAcquire(const char *name, bool nowait)
@@ -542,6 +554,9 @@ retry:

When does it need to be logged? For example, recently, we added one
more slot acquisition/release call in
binary_upgrade_logical_slot_has_caught_up(); it is not clear from the
comments whether we need to LOG it or not. I guess at some place like
atop LogReplicationSlotAcquire() we should document in a bit more
specific way as to when is this expected to be called.

--
With Regards,
Amit Kapila.



On Wed, Nov 1, 2023 at 2:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Mar 27, 2023 at 11:08 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > The v9 patch was failing because I was using MyReplicationSlot after
> > it got reset by slot release, attached v10 patch fixes it.
> >
>
> + *
> + * Note: use LogReplicationSlotAcquire() if needed, to log a message after
> + * acquiring the replication slot.
>   */
>  void
>  ReplicationSlotAcquire(const char *name, bool nowait)
> @@ -542,6 +554,9 @@ retry:
>
> When does it need to be logged? For example, recently, we added one
> more slot acquisition/release call in
> binary_upgrade_logical_slot_has_caught_up(); it is not clear from the
> comments whether we need to LOG it or not. I guess at some place like
> atop LogReplicationSlotAcquire() we should document in a bit more
> specific way as to when is this expected to be called.
>

I agree. Just saying "if needed" in those function comments doesn't
help with knowing how to judge when logging is needed or not.

~

Looking back at the thread history it seems the function comment was
added after Euler [1] suggested ("... you should add a comment at the
top of ReplicationSlotAcquire() and ReplicationSlotRelease() functions
saying that LogReplicationSlotAquired() and
LogReplicationSlotReleased() functions should be called respectively
after it.")

But that's not quite compatible with what Alvaro [2] had written long
back ("... the only acquisitions that would log messages are those in
StartReplication and StartLogicalReplication.").

In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
called by more places than you care to log from.

~

Adding a better explanatory comment than "if needed" will be good, and
maybe that is all that is necessary. I'm not sure.

OTOH, if you have to explain that logging is only wanted for a couple
of scenarios, then it raises some doubts about the usefulness of
having a common function in the first place. I had the same doubts
back in March [3] ("I am not sure for the *current* code if the
encapsulation is worth the trouble or not.").

======
[1] Euler - https://www.postgresql.org/message-id/c42d5634-ca9b-49a7-85cd-9eff9feb33b4%40app.fastmail.com
[2] Alvaro - https://www.postgresql.org/message-id/202204291032.qfvyuqxkjnjw%40alvherre.pgsql
[3] Peter -
https://www.postgresql.org/message-id/CAHut%2BPu6Knwooc_NckMxszGrAJnytgpMadtoJ-OA-SFWT%2BGFYw%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Austalia



On Thu, Nov 2, 2023 at 7:19 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> But that's not quite compatible with what Alvaro [2] had written long
> back ("... the only acquisitions that would log messages are those in
> StartReplication and StartLogicalReplication.").
>
> In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> called by more places than you care to log from.

I refreshed my thoughts for this patch and I think it's enough if
walsenders alone log messages when slots become active and inactive.
How about something like the attached v11 patch?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
On Sun, Nov 5, 2023 at 4:01 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Nov 2, 2023 at 7:19 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > But that's not quite compatible with what Alvaro [2] had written long
> > back ("... the only acquisitions that would log messages are those in
> > StartReplication and StartLogicalReplication.").
> >
> > In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> > called by more places than you care to log from.
>
> I refreshed my thoughts for this patch and I think it's enough if
> walsenders alone log messages when slots become active and inactive.
> How about something like the attached v11 patch?
>

+ * This function is currently used only in walsender.
+ */
+void
+ReplicationSlotAcquireAndLog(const char *name, bool nowait)

BTW, is the reason for using it only in walsender is that it is a
background process and it is not very apparent whether the slot is
created, and for foreground processes, it is a bit clear when the
command is executed. If so, the other alternative is to either use a
parameter to the existing function or directly use am_walsender flag
to distinguish when to print the message in acquire/release calls.

Can you please tell me the use case of this additional message?

A few other minor comments:
1.
+        Causes each replication command and related activity to be logged in
+        the server log.

Can we be bit more specific by changing to something like: "Causes
each replication command and slot acquisition/release to be logged in
the server log."

2.
+ ereport(log_replication_commands ? LOG : DEBUG1,
+ (errmsg("walsender process with PID %d acquired %s replication slot \"%s\"",

It seems PID and process name is quite unlike what we print in other
similar messages. For example, see below messages when we start
replication via pub/sub :

2023-11-06 08:41:57.867 IST [24400] LOG:  received replication
command: CREATE_REPLICATION_SLOT "sub1" LOGICAL pgoutput (SNAPSHOT
'nothing')
2023-11-06 08:41:57.867 IST [24400] STATEMENT:
CREATE_REPLICATION_SLOT "sub1" LOGICAL pgoutput (SNAPSHOT 'nothing')
...
...
2023-11-06 08:41:57.993 IST [22332] LOG:  walsender process with PID
22332 acquired logical replication slot "sub1"
2023-11-06 08:41:57.993 IST [22332] STATEMENT:  START_REPLICATION SLOT
"sub1" LOGICAL 0/0 (proto_version '4', origin 'any', publication_names
'"pub1"')
...
...
2023-11-06 08:41:58.015 IST [22332] LOG:  starting logical decoding
for slot "sub1"
2023-11-06 08:41:58.015 IST [22332] DETAIL:  Streaming transactions
committing after 0/1522730, reading WAL from 0/15226F8.
2023-11-06 08:41:58.015 IST [22332] STATEMENT:  START_REPLICATION SLOT
"sub1" LOGICAL 0/0 (proto_version '4', origin 'any', publication_names
'"pub1"')

We can get the PID from the log line as for other logs and I don't see
the process name printed anywhere else.

--
With Regards,
Amit Kapila.



Hi,

FWIW, I used a small script to set up the following environment to
help observe the log messages written by this patch.

Node1 = primary; it creates a PUBLICATION
Node2 = physical standby; the publication is replicated here
Node3 = subscriber; it subscribes to the publication now on the standby Node2

~

In the log files you can see:

1. The "physical" message from the walsender of the primary (Node1)
e.g. 2023-11-06 18:59:58.094 AEDT [17091] LOG:  walsender process with
PID 17091 acquired physical replication slot "pg_basebackup_17091"

2. The "logical" message from the pub/sub walsender of the standby (Node2)
e.g. 2023-11-06 19:16:29.053 AEDT [12774] LOG:  walsender process with
PID 12774 acquired logical replication slot "sub1"

~

PSA the test script and logs

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment
On Mon, Nov 6, 2023 at 9:09 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Nov 5, 2023 at 4:01 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Thu, Nov 2, 2023 at 7:19 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > But that's not quite compatible with what Alvaro [2] had written long
> > > back ("... the only acquisitions that would log messages are those in
> > > StartReplication and StartLogicalReplication.").
> > >
> > > In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> > > called by more places than you care to log from.
> >
> > I refreshed my thoughts for this patch and I think it's enough if
> > walsenders alone log messages when slots become active and inactive.
> > How about something like the attached v11 patch?
> >
>
> + * This function is currently used only in walsender.
> + */
> +void
> +ReplicationSlotAcquireAndLog(const char *name, bool nowait)
>
> BTW, is the reason for using it only in walsender is that it is a
> background process and it is not very apparent whether the slot is
> created, and for foreground processes, it is a bit clear when the
> command is executed.
>
> Can you please tell me the use case of this additional message?

Replication slot acquisitions and releases by backends say when
running pg_replication_slot_advance or pg_logical_slot_get_changes or
pg_drop_replication_slot or pg_create_{physical,
logical}_replication_slot are transient unlike walsenders which
comparatively hold slots for longer durations. Therefore, I've added
them only for walsenders. These messages help to know the lifetime of
a replication slot - one can know how long a streaming standby or
logical subscriber is down, IOW, how long a replication slot is
inactive in production. For instance, the time between released and
acquired slots in the below messages is the inactive replication slot
duration.

2023-11-13 11:06:34.338 UTC [470262] LOG:  acquired physical
replication slot "sb_repl_slot"
2023-11-13 11:06:34.338 UTC [470262] STATEMENT:  START_REPLICATION
SLOT "sb_repl_slot" 0/3000000 TIMELINE 1
2023-11-13 11:09:24.918 UTC [470262] LOG:  released physical
replication slot "sb_repl_slot"
2023-11-13 12:01:40.530 UTC [470967] LOG:  acquired physical
replication slot "sb_repl_slot"
2023-11-13 12:01:40.530 UTC [470967] STATEMENT:  START_REPLICATION
SLOT "sb_repl_slot" 0/3000000 TIMELINE 1

> If so, the other alternative is to either use a
> parameter to the existing function or directly use am_walsender flag
> to distinguish when to print the message in acquire/release calls.

Done that way. PSA v12.

> A few other minor comments:
> 1.
> +        Causes each replication command and related activity to be logged in
> +        the server log.
>
> Can we be bit more specific by changing to something like: "Causes
> each replication command and slot acquisition/release to be logged in
> the server log."

Done.

> 2.
> + ereport(log_replication_commands ? LOG : DEBUG1,
> + (errmsg("walsender process with PID %d acquired %s replication slot \"%s\"",
>
> It seems PID and process name is quite unlike what we print in other
> similar messages. For example, see below messages when we start
> replication via pub/sub :
>
> We can get the PID from the log line as for other logs and I don't see
> the process name printed anywhere else.

There was a comment upthread to have PID printed, but I agree to be
consistent and changed the messages to be: acquired physical/logical
replication slot "foo" and released physical/logical replication slot
"foo".

PSA v12 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
On Mon, Nov 13, 2023 at 5:43 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Nov 6, 2023 at 9:09 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sun, Nov 5, 2023 at 4:01 AM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > On Thu, Nov 2, 2023 at 7:19 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > But that's not quite compatible with what Alvaro [2] had written long
> > > > back ("... the only acquisitions that would log messages are those in
> > > > StartReplication and StartLogicalReplication.").
> > > >
> > > > In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> > > > called by more places than you care to log from.
> > >
> > > I refreshed my thoughts for this patch and I think it's enough if
> > > walsenders alone log messages when slots become active and inactive.
> > > How about something like the attached v11 patch?
> > >
> >
> > + * This function is currently used only in walsender.
> > + */
> > +void
> > +ReplicationSlotAcquireAndLog(const char *name, bool nowait)
> >
> > BTW, is the reason for using it only in walsender is that it is a
> > background process and it is not very apparent whether the slot is
> > created, and for foreground processes, it is a bit clear when the
> > command is executed.
> >
> > Can you please tell me the use case of this additional message?
>
> Replication slot acquisitions and releases by backends say when
> running pg_replication_slot_advance or pg_logical_slot_get_changes or
> pg_drop_replication_slot or pg_create_{physical,
> logical}_replication_slot are transient unlike walsenders which
> comparatively hold slots for longer durations. Therefore, I've added
> them only for walsenders. These messages help to know the lifetime of
> a replication slot - one can know how long a streaming standby or
> logical subscriber is down, IOW, how long a replication slot is
> inactive in production. For instance, the time between released and
> acquired slots in the below messages is the inactive replication slot
> duration.
>
> 2023-11-13 11:06:34.338 UTC [470262] LOG:  acquired physical
> replication slot "sb_repl_slot"
> 2023-11-13 11:06:34.338 UTC [470262] STATEMENT:  START_REPLICATION
> SLOT "sb_repl_slot" 0/3000000 TIMELINE 1
> 2023-11-13 11:09:24.918 UTC [470262] LOG:  released physical
> replication slot "sb_repl_slot"
> 2023-11-13 12:01:40.530 UTC [470967] LOG:  acquired physical
> replication slot "sb_repl_slot"
> 2023-11-13 12:01:40.530 UTC [470967] STATEMENT:  START_REPLICATION
> SLOT "sb_repl_slot" 0/3000000 TIMELINE 1
>
> > If so, the other alternative is to either use a
> > parameter to the existing function or directly use am_walsender flag
> > to distinguish when to print the message in acquire/release calls.
>
> Done that way. PSA v12.
>
> > A few other minor comments:
> > 1.
> > +        Causes each replication command and related activity to be logged in
> > +        the server log.
> >
> > Can we be bit more specific by changing to something like: "Causes
> > each replication command and slot acquisition/release to be logged in
> > the server log."
>
> Done.
>
> > 2.
> > + ereport(log_replication_commands ? LOG : DEBUG1,
> > + (errmsg("walsender process with PID %d acquired %s replication slot \"%s\"",
> >
> > It seems PID and process name is quite unlike what we print in other
> > similar messages. For example, see below messages when we start
> > replication via pub/sub :
> >
> > We can get the PID from the log line as for other logs and I don't see
> > the process name printed anywhere else.
>
> There was a comment upthread to have PID printed, but I agree to be
> consistent and changed the messages to be: acquired physical/logical
> replication slot "foo" and released physical/logical replication slot
> "foo".
>
> PSA v12 patch.

Compiler isn't happy with v12
https://cirrus-ci.com/task/5543061376204800?logs=gcc_warning#L405. PSA
v13 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
Here are some review comments for patch v13-0001.

======
doc/src/sgml/config.sgml

1.
        <para>
-        Causes each replication command to be logged in the server log.
-        See <xref linkend="protocol-replication"/> for more information about
-        replication command. The default value is <literal>off</literal>.
-        Only superusers and users with the appropriate <literal>SET</literal>
-        privilege can change this setting.
+        Causes each replication command and slot acquisition/release to be
+        logged in the server log. See <xref linkend="protocol-replication"/>
+        for more information about replication command. The default value is
+        <literal>off</literal>. Only superusers and users with the appropriate
+        <literal>SET</literal> privilege can change this setting.
        </para>

Should that also mention about walsender?

e.g.
"and slot acquisition/release" ==> "and <literal>walsender</literal>
slot acquisition/release"

======
src/backend/replication/slot.c

2. ReplicationSlotAcquire

  if (SlotIsLogical(s))
  pgstat_acquire_replslot(s);
+
+ if (am_walsender)
+ ereport(log_replication_commands ? LOG : DEBUG1,
+ errmsg_internal("acquired %s replication slot \"%s\"",
+ SlotIsPhysical(MyReplicationSlot) ? "physical" : "logical",
+ NameStr(MyReplicationSlot->data.name)));

2a.
Instead of calling SlotIsLogical() and then again calling
SlotIsPhysical(), it might be better to assign this one time to a
local variable.

~

2b.
IMO it is better to continue using variable 's' here instead of
'MyReplicationSlot'. Code is not only shorter but is also consistent
with the rest of the function which never uses MyReplicationSlot, even
in the places where it could have.

~

SUGGESTION (for #2a and #2b)
is_logical = SlotIsLogical(s);
if (is_logical)
  pgstat_acquire_replslot(s);

if (am_walsender)
  ereport(log_replication_commands ? LOG : DEBUG1,
    errmsg_internal("acquired %s replication slot \"%s\"",
      is_logical ? "logical" : "physical", NameStr(s->data.name)));

~~~

3. ReplicationSlotRelease

 ReplicationSlotRelease(void)
 {
  ReplicationSlot *slot = MyReplicationSlot;
+ char    *slotname = NULL; /* keep compiler quiet */
+ bool is_physical = false; /* keep compiler quiet */

  Assert(slot != NULL && slot->active_pid != 0);

+ if (am_walsender)
+ {
+ slotname = pstrdup(NameStr(MyReplicationSlot->data.name));
+ is_physical = SlotIsPhysical(MyReplicationSlot);
+ }
+

3a.
Notice 'MyReplicationSlot' is already assigned to the local 'slot'
variable, so IMO it is better if this new code also uses that 'slot'
variable for consistency with the rest of the function.

~

3b.
Consider flipping the flag to be 'is_logical' instead of
'is_physical', so the ereport substitution will match the other
ReplicationSlotAcquirecode suggested above (#2a).

~

SUGGESTION (For #3a and #3b)
if (am_walsender)
{
  slotname = pstrdup(NameStr(slot->data.name));
  is_logical = SlotIsLogical(slot);
}

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Tue, Nov 14, 2023 at 4:50 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v13-0001.

Thanks.

> ======
> doc/src/sgml/config.sgml
>
> 1.
>
> Should that also mention about walsender?
>
> e.g.
> "and slot acquisition/release" ==> "and <literal>walsender</literal>
> slot acquisition/release"

Changed.

> 2a.
> Instead of calling SlotIsLogical() and then again calling
> SlotIsPhysical(), it might be better to assign this one time to a
> local variable.
>
> 2b.
> IMO it is better to continue using variable 's' here instead of
> 'MyReplicationSlot'. Code is not only shorter but is also consistent
> with the rest of the function which never uses MyReplicationSlot, even
> in the places where it could have.
>
> SUGGESTION (for #2a and #2b)
> is_logical = SlotIsLogical(s);
> if (is_logical)
>   pgstat_acquire_replslot(s);
>
> if (am_walsender)
>   ereport(log_replication_commands ? LOG : DEBUG1,
>     errmsg_internal("acquired %s replication slot \"%s\"",
>       is_logical ? "logical" : "physical", NameStr(s->data.name)));

Use of a separate variable isn't good IMO, I used SlotIsLogical(s); directly.

> 3a.
> Notice 'MyReplicationSlot' is already assigned to the local 'slot'
> variable, so IMO it is better if this new code also uses that 'slot'
> variable for consistency with the rest of the function.
>
> 3b.
> Consider flipping the flag to be 'is_logical' instead of
> 'is_physical', so the ereport substitution will match the other
> ReplicationSlotAcquirecode suggested above (#2a).
>
> SUGGESTION (For #3a and #3b)
> if (am_walsender)
> {
>   slotname = pstrdup(NameStr(slot->data.name));
>   is_logical = SlotIsLogical(slot);
> }

Done.

PSA v14 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
Hi,

Thanks for addressing my previous comments. Patch v14-0001 looks good
to me, except I have one question:

The patch uses errmsg_internal() for the logging, but I noticed the
only other code using GUC 'log_replication_commands' has errmsg()
instead of errmsg_internal(). Isn't it better to be consistent with
the existing code?

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Wed, Nov 15, 2023 at 4:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Thanks for addressing my previous comments. Patch v14-0001 looks good
> to me, except I have one question:
>
> The patch uses errmsg_internal() for the logging, but I noticed the
> only other code using GUC 'log_replication_commands' has errmsg()
> instead of errmsg_internal(). Isn't it better to be consistent with
> the existing code?
>

I agree that we should errmsg here. If we read the description of
errmsg_internal() [1], it is recommended to be used for "cannot
happen" cases where we don't want to spend translation effort which is
not the case here. Also, similar to the below message, we should add a
comment for a translator.

ereport(LOG,
/* translator: %s is SIGKILL or SIGABRT */
(errmsg("issuing %s to recalcitrant children",
send_abort_for_kill ? "SIGABRT" : "SIGKILL")));

Another minor comment:
+        Causes each replication command and <literal>walsender</literal>
+        process replication slot acquisition/release to be logged in the server
+        log.

Isn't it better to use process's instead of process in the above sentence?

[1] -https://www.postgresql.org/docs/devel/error-message-reporting.html

--
With Regards,
Amit Kapila.



On Wed, Nov 15, 2023 at 9:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Nov 15, 2023 at 4:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Thanks for addressing my previous comments. Patch v14-0001 looks good
> > to me, except I have one question:
> >
> > The patch uses errmsg_internal() for the logging, but I noticed the
> > only other code using GUC 'log_replication_commands' has errmsg()
> > instead of errmsg_internal(). Isn't it better to be consistent with
> > the existing code?
> >
>
> I agree that we should errmsg here. If we read the description of
> errmsg_internal() [1], it is recommended to be used for "cannot
> happen" cases where we don't want to spend translation effort which is
> not the case here.

I chose not to translate the newly added messages as they are only
written to server logs not sent to the client. However, I've changed
to errmsg, after looking at the errmsg_internal docs.

> Also, similar to the below message, we should add a
> comment for a translator.
>
> ereport(LOG,
> /* translator: %s is SIGKILL or SIGABRT */
> (errmsg("issuing %s to recalcitrant children",
> send_abort_for_kill ? "SIGABRT" : "SIGKILL")));

Added.

> Another minor comment:
> +        Causes each replication command and <literal>walsender</literal>
> +        process replication slot acquisition/release to be logged in the server
> +        log.
>
> Isn't it better to use process's instead of process in the above sentence?

Changed.

PSA v15 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
Patch v15-0001 LGTM.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Wed, Nov 15, 2023 at 11:00 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> PSA v15 patch.
>

The patch looks good to me. I have slightly modified the translator
message and commit message in the attached. I'll push this tomorrow
unless there are any comments.

--
With Regards,
Amit Kapila.

Attachment
Translation-wise, this doesn't work, because you're building a string.
There's no reason to think that the words "logical" and "physical"
should stay untranslated; the message would make no sense, or at least
would be very ugly.

You should do something like

if (am_walsender)
{
    ereport(log_replication_commands ? LOG : DEBUG1,
        SlotIsLogical(s) ? errmsg("acquired logical replication slot \"%s\"", NameStr(s->data.name)) :
        errmsg("acquired physical replication slot \"%s\"", NameStr(s->data.name)));
}

(Obviously, lose the "translator:" comments since they are unnecessary)


If you really want to keep the "logical"/"physical" word untranslated,
you need to split it out of the sentence somehow.  But it would be
really horrible IMO.  Like

errmsg("acquired replication slot \"%s\" of type \"%s\"",
       NameStr(s->data.name), SlotIsLogical(s) ? "logical" : "physical")

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Use it up, wear it out, make it do, or do without"



On Wed, Nov 15, 2023 at 3:58 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> Translation-wise, this doesn't work, because you're building a string.
> There's no reason to think that the words "logical" and "physical"
> should stay untranslated; the message would make no sense, or at least
> would be very ugly.
>
> You should do something like
>
> if (am_walsender)
> {
>         ereport(log_replication_commands ? LOG : DEBUG1,
>                 SlotIsLogical(s) ? errmsg("acquired logical replication slot \"%s\"", NameStr(s->data.name)) :
>                 errmsg("acquired physical replication slot \"%s\"", NameStr(s->data.name)));
> }

This seems better, so done that way.

> (Obviously, lose the "translator:" comments since they are unnecessary)

The translator message now indicates that the remaining %s denotes the
replication slot name.

PSA v17 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
Some minor comments for v17-0001.

======

1.
+       ereport(log_replication_commands ? LOG : DEBUG1,
+               SlotIsLogical(s)
+       /* translator: %s is name of the replication slot */
+               ? errmsg("acquired logical replication slot \"%s\"",
+                        NameStr(s->data.name))
+               : errmsg("acquired physical replication slot \"%s\"",
+                        NameStr(s->data.name)));

1a.
FWIW, if the ternary was inside the errmsg, there would be less code
duplication.

~

1b.
I searched HEAD code and did not find any "translator:" comments for
just ordinary slot name substitutions like these; AFAICT that comment
is not necessary anymore.

~

SUGGESTION (#1a and #1b)

ereport(log_replication_commands ? LOG : DEBUG1,
        errmsg(SlotIsLogical(s)
               ? "acquired logical replication slot \"%s\""
               : "acquired physical replication slot \"%s\"",
               NameStr(s->data.name)));

~~~

2.
Ditto for the other ereport.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Thu, Nov 16, 2023 at 3:48 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> ~
>
> SUGGESTION (#1a and #1b)
>
> ereport(log_replication_commands ? LOG : DEBUG1,
>         errmsg(SlotIsLogical(s)
>                ? "acquired logical replication slot \"%s\""
>                : "acquired physical replication slot \"%s\"",
>                NameStr(s->data.name)));
>
> ~~~
>

Personally, I prefer the way Bharath had in his patch. Did you see any
preferred way in the existing code?

--
With Regards,
Amit Kapila.



On Wed, Nov 15, 2023 at 3:58 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> Translation-wise, this doesn't work, because you're building a string.
> There's no reason to think that the words "logical" and "physical"
> should stay untranslated; the message would make no sense, or at least
> would be very ugly.
>
> You should do something like
>
> if (am_walsender)
> {
>         ereport(log_replication_commands ? LOG : DEBUG1,
>                 SlotIsLogical(s) ? errmsg("acquired logical replication slot \"%s\"", NameStr(s->data.name)) :
>                 errmsg("acquired physical replication slot \"%s\"", NameStr(s->data.name)));
> }
>
> (Obviously, lose the "translator:" comments since they are unnecessary)
>
>
> If you really want to keep the "logical"/"physical" word untranslated,
> you need to split it out of the sentence somehow.  But it would be
> really horrible IMO.  Like
>
> errmsg("acquired replication slot \"%s\" of type \"%s\"",
>        NameStr(s->data.name), SlotIsLogical(s) ? "logical" : "physical")
>

Thanks for the suggestion. I would like to clarify on this a bit. What
do exactly mean by splitting out of the sentence? For example, in one
of the existing messages:

ereport(LOG,
/* translator: %s is SIGKILL or SIGABRT */
(errmsg("issuing %s to recalcitrant children",
send_abort_for_kill ? "SIGABRT" : "SIGKILL")));

Do here words SIGABRT/SIGKILL remain untranslated due to the
translator's comment? I thought this was similar to the message being
proposed but seems like this message construction follows translation
rules better.

--
With Regards,
Amit Kapila.



On Thu, Nov 16, 2023 at 12:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Nov 15, 2023 at 3:58 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > Translation-wise, this doesn't work, because you're building a string.
> > There's no reason to think that the words "logical" and "physical"
> > should stay untranslated; the message would make no sense, or at least
> > would be very ugly.
> >
> > You should do something like
> >
> > if (am_walsender)
> > {
> >         ereport(log_replication_commands ? LOG : DEBUG1,
> >                 SlotIsLogical(s) ? errmsg("acquired logical replication slot \"%s\"", NameStr(s->data.name)) :
> >                 errmsg("acquired physical replication slot \"%s\"", NameStr(s->data.name)));
> > }
> >
> > (Obviously, lose the "translator:" comments since they are unnecessary)
> >
> >
> > If you really want to keep the "logical"/"physical" word untranslated,
> > you need to split it out of the sentence somehow.  But it would be
> > really horrible IMO.  Like
> >
> > errmsg("acquired replication slot \"%s\" of type \"%s\"",
> >        NameStr(s->data.name), SlotIsLogical(s) ? "logical" : "physical")
> >
>
> Thanks for the suggestion. I would like to clarify on this a bit. What
> do exactly mean by splitting out of the sentence? For example, in one
> of the existing messages:
>
> ereport(LOG,
> /* translator: %s is SIGKILL or SIGABRT */
> (errmsg("issuing %s to recalcitrant children",
> send_abort_for_kill ? "SIGABRT" : "SIGKILL")));
>
> Do here words SIGABRT/SIGKILL remain untranslated due to the
> translator's comment? I thought this was similar to the message being
> proposed but seems like this message construction follows translation
> rules better.
>

IIUC, that example is different because "SIGABRT" / "SIGKILL" are not
real words, so you don't want the translator to attempt to translate
them.You want them to appear in the message as-is.

OTOH in this patch "logical" and "physical" are just normal English
words that should be translated as part of the original message.
e.g. like in these similar messages:
- msgid "database \"%s\" is used by an active logical replication slot"
- msgstr "la base de données « %s » est utilisée par un slot de
réplication logique actif"

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Thu, Nov 16, 2023 at 12:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 16, 2023 at 3:48 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > ~
> >
> > SUGGESTION (#1a and #1b)
> >
> > ereport(log_replication_commands ? LOG : DEBUG1,
> >         errmsg(SlotIsLogical(s)
> >                ? "acquired logical replication slot \"%s\""
> >                : "acquired physical replication slot \"%s\"",
> >                NameStr(s->data.name)));
> >
> > ~~~
> >
>
> Personally, I prefer the way Bharath had in his patch. Did you see any
> preferred way in the existing code?

Not really. I think the errmsg combined with ternary is not so common.
I couldn't find many examples, so I wouldn't try to claim anything is
a "preferred" way

There are some existing examples, like Bharath had:

ereport(NOTICE,
        (errcode(ERRCODE_DUPLICATE_OBJECT),
         collencoding == -1
         ? errmsg("collation \"%s\" already exists, skipping",
                  collname)
         : errmsg("collation \"%s\" for encoding \"%s\" already
exists, skipping",
                  collname, pg_encoding_to_char(collencoding))));

OTOH, when there are different numbers of substitution parameters in
each of the errmsg like that, you don't have much choice but to do it
that way.

I am fine with whatever is chosen -- I was only offering an alternative.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On 2023-Nov-16, Peter Smith wrote:

> I searched HEAD code and did not find any "translator:" comments for
> just ordinary slot name substitutions like these; AFAICT that comment
> is not necessary anymore.

True.  Lose that.

The rationale I have is to consider whether a translator looking at the
original message message in isolation is going to understand what the %s
means.  If it's possible to tell what it is without having to go read
the source code that leads to the message, then you don't need a
"translator:" comment.  Otherwise you do.

You also need to assume the translator is not stupid, but that seems an
OK assumption.

> SUGGESTION (#1a and #1b)
> 
> ereport(log_replication_commands ? LOG : DEBUG1,
>         errmsg(SlotIsLogical(s)
>                ? "acquired logical replication slot \"%s\""
>                : "acquired physical replication slot \"%s\"",
>                NameStr(s->data.name)));

The bad thing about this is that gettext() is not going to pick up these
strings into the translation catalog.  You could fix that by adding
gettext_noop() calls around them:

 ereport(log_replication_commands ? LOG : DEBUG1,
         errmsg(SlotIsLogical(s)
                ? gettext_noop("acquired logical replication slot \"%s\"")
                : gettext_noop("acquired physical replication slot \"%s\""),
                NameStr(s->data.name)));

but at that point it's not clear that it's really better than putting
the ternary in the outer scope.

You can verify this by doing "make update-po" and then searching for the
messages in postgres.pot.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Cada quien es cada cual y baja las escaleras como quiere" (JMSerrat)



On Thu, Nov 16, 2023 at 4:01 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Nov-16, Peter Smith wrote:
>
> > I searched HEAD code and did not find any "translator:" comments for
> > just ordinary slot name substitutions like these; AFAICT that comment
> > is not necessary anymore.
>
> True.  Lose that.

Removed.

> The rationale I have is to consider whether a translator looking at the
> original message message in isolation is going to understand what the %s
> means.  If it's possible to tell what it is without having to go read
> the source code that leads to the message, then you don't need a
> "translator:" comment.  Otherwise you do.

Agree. I think it's easy for one to guess the slot name follows "....
replication slot %s", so I removed the translator message.

> > SUGGESTION (#1a and #1b)
> >
> > ereport(log_replication_commands ? LOG : DEBUG1,
> >         errmsg(SlotIsLogical(s)
> >                ? "acquired logical replication slot \"%s\""
> >                : "acquired physical replication slot \"%s\"",
> >                NameStr(s->data.name)));
>
> The bad thing about this is that gettext() is not going to pick up these
> strings into the translation catalog.  You could fix that by adding
> gettext_noop() calls around them:
>
>  ereport(log_replication_commands ? LOG : DEBUG1,
>          errmsg(SlotIsLogical(s)
>                 ? gettext_noop("acquired logical replication slot \"%s\"")
>                 : gettext_noop("acquired physical replication slot \"%s\""),
>                 NameStr(s->data.name)));
>
> but at that point it's not clear that it's really better than putting
> the ternary in the outer scope.

I retained wrapping messages in errmsg("...").

> You can verify this by doing "make update-po" and then searching for the
> messages in postgres.pot.

Translation gives me [1] with v18 patch

PSA v18 patch.

[1]
#: replication/slot.c:545
#, c-format
msgid "acquired logical replication slot \"%s\""
msgstr ""

#: replication/slot.c:547
#, c-format
msgid "acquired physical replication slot \"%s\""
msgstr ""

#: replication/slot.c:622
#, c-format
msgid "released logical replication slot \"%s\""
msgstr ""

#: replication/slot.c:624
#, c-format
msgid "released physical replication slot \"%s\""
msgstr ""

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment
On Thu, Nov 16, 2023 at 6:09 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Nov 16, 2023 at 4:01 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2023-Nov-16, Peter Smith wrote:
> >
> > > I searched HEAD code and did not find any "translator:" comments for
> > > just ordinary slot name substitutions like these; AFAICT that comment
> > > is not necessary anymore.
> >
> > True.  Lose that.
>
> Removed.
>
> > The rationale I have is to consider whether a translator looking at the
> > original message message in isolation is going to understand what the %s
> > means.  If it's possible to tell what it is without having to go read
> > the source code that leads to the message, then you don't need a
> > "translator:" comment.  Otherwise you do.
>
> Agree. I think it's easy for one to guess the slot name follows "....
> replication slot %s", so I removed the translator message.
>
> > > SUGGESTION (#1a and #1b)
> > >
> > > ereport(log_replication_commands ? LOG : DEBUG1,
> > >         errmsg(SlotIsLogical(s)
> > >                ? "acquired logical replication slot \"%s\""
> > >                : "acquired physical replication slot \"%s\"",
> > >                NameStr(s->data.name)));
> >
> > The bad thing about this is that gettext() is not going to pick up these
> > strings into the translation catalog.  You could fix that by adding
> > gettext_noop() calls around them:
> >
> >  ereport(log_replication_commands ? LOG : DEBUG1,
> >          errmsg(SlotIsLogical(s)
> >                 ? gettext_noop("acquired logical replication slot \"%s\"")
> >                 : gettext_noop("acquired physical replication slot \"%s\""),
> >                 NameStr(s->data.name)));
> >
> > but at that point it's not clear that it's really better than putting
> > the ternary in the outer scope.
>
> I retained wrapping messages in errmsg("...").
>
> > You can verify this by doing "make update-po" and then searching for the
> > messages in postgres.pot.
>
> Translation gives me [1] with v18 patch
>
> PSA v18 patch.
>

LGTM. I'll push this early next week unless there are further
suggestions or comments.

--
With Regards,
Amit Kapila.



On Sat, Nov 18, 2023 at 4:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 16, 2023 at 6:09 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > PSA v18 patch.
> >
>
> LGTM. I'll push this early next week unless there are further
> suggestions or comments.
>

Pushed.

--
With Regards,
Amit Kapila.