Thread: Add lookup table for replication slot invalidation causes

Add lookup table for replication slot invalidation causes

From
Bharath Rupireddy
Date:
Hi,

Presently, replication slot invalidation causes and their text are
scattered into ReplicationSlotInvalidationCause enum and a bunch of
macros. This is making the code to get invalidation cause text given
the cause as enum and vice-versa unreadable, longer and inextensible.
The attached patch adds a lookup table for all invalidation causes for
better readability and extensibility. FWIW, another patch in
discussion https://www.postgresql.org/message-id/CALj2ACWgACB4opnbqi=x7Hc4aqcgkXoLsh1VB+gfidXaDQNu_Q@mail.gmail.com
adds a couple of other invalidation reasons, this lookup table makes
the life easier and code shorter.

Thoughts?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Add lookup table for replication slot invalidation causes

From
Jelte Fennema-Nio
Date:
On Tue, 20 Feb 2024 at 12:11, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> Thoughts?

Seems like a good improvement overall. But I'd prefer the definition
of the lookup table to use this syntax:

const char *const SlotInvalidationCauses[] = {
    [RS_INVAL_NONE] = "none",
    [RS_INVAL_WAL_REMOVED] = "wal_removed",
    [RS_INVAL_HORIZON] = "rows_removed",
    [RS_INVAL_WAL_LEVEL] = "wal_level_sufficient",
};


Regarding the actual patch:

-   Assert(conflict_reason);

Probably we should keep this Assert. As well as the Assert(0)


+   for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)

Strictly speaking this is a slight change in behaviour, since now
"none" is also parsed. That seems fine to me though.



Re: Add lookup table for replication slot invalidation causes

From
Michael Paquier
Date:
On Tue, Feb 20, 2024 at 05:53:03PM +0100, Jelte Fennema-Nio wrote:
> Seems like a good improvement overall. But I'd prefer the definition
> of the lookup table to use this syntax:
>
> const char *const SlotInvalidationCauses[] = {
>     [RS_INVAL_NONE] = "none",
>     [RS_INVAL_WAL_REMOVED] = "wal_removed",
>     [RS_INVAL_HORIZON] = "rows_removed",
>     [RS_INVAL_WAL_LEVEL] = "wal_level_sufficient",
> };

+1.

> Regarding the actual patch:
>
> -   Assert(conflict_reason);
>
> Probably we should keep this Assert. As well as the Assert(0)

The assert(0) at the end of the routine, likely so.  I don't see a
huge point for the assert on conflict_reason as we'd crash anyway  on
strcmp, no?

> +   for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
>
> Strictly speaking this is a slight change in behaviour, since now
> "none" is also parsed. That seems fine to me though.

Yep.  This does not strike me as an issue.  We only use
GetSlotInvalidationCause() in synchronize_slots(), mapping to NULL in
the case of "none".

Agreed that this is an improvement.

+/* Maximum number of invalidation causes */
+#define    RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL

There is no need to add that to slot.h: it is only used in slot.c.
--
Michael

Attachment

Re: Add lookup table for replication slot invalidation causes

From
Bharath Rupireddy
Date:
On Wed, Feb 21, 2024 at 5:04 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Feb 20, 2024 at 05:53:03PM +0100, Jelte Fennema-Nio wrote:
> > Seems like a good improvement overall. But I'd prefer the definition
> > of the lookup table to use this syntax:
> >
> > const char *const SlotInvalidationCauses[] = {
> >     [RS_INVAL_NONE] = "none",
> >     [RS_INVAL_WAL_REMOVED] = "wal_removed",
> >     [RS_INVAL_HORIZON] = "rows_removed",
> >     [RS_INVAL_WAL_LEVEL] = "wal_level_sufficient",
> > };
>
> +1.

Done that way. I'm fine with the designated initialization [1] that an
ISO C99 compliant compiler offers. PostgreSQL installation guide
https://www.postgresql.org/docs/current/install-requirements.html says
that we need an at least C99-compliant ISO/ANSI C compiler.

[1] https://open-std.org/JTC1/SC22/WG14/www/docs/n494.pdf
https://en.cppreference.com/w/c/99
https://www.ibm.com/docs/en/zos/2.4.0?topic=initializers-designated-aggregate-types-c-only

> > Regarding the actual patch:
> >
> > -   Assert(conflict_reason);
> >
> > Probably we should keep this Assert. As well as the Assert(0)
>
> The assert(0) at the end of the routine, likely so.  I don't see a
> huge point for the assert on conflict_reason as we'd crash anyway  on
> strcmp, no?

Right, but an assertion isn't a bad idea there as it can generate a
backtrace as opposed to the crash generating just SEGV note (and
perhaps a crash dump) in server logs.

With these two asserts, the behavior (asserts on null and non-existent
inputs) is the same as what GetSlotInvalidationCause has right now.

> +/* Maximum number of invalidation causes */
> +#define    RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
>
> There is no need to add that to slot.h: it is only used in slot.c.

Right, but it needs to be updated whenever a new cause is added to
enum ReplicationSlotInvalidationCause. Therefore, I think it's better
to be closer there in slot.h.

Please see the attached v2 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Add lookup table for replication slot invalidation causes

From
Michael Paquier
Date:
On Wed, Feb 21, 2024 at 09:49:37AM +0530, Bharath Rupireddy wrote:
> On Wed, Feb 21, 2024 at 5:04 AM Michael Paquier <michael@paquier.xyz> wrote:
> Done that way. I'm fine with the designated initialization [1] that an
> ISO C99 compliant compiler offers. PostgreSQL installation guide
> https://www.postgresql.org/docs/current/install-requirements.html says
> that we need an at least C99-compliant ISO/ANSI C compiler.

Note the recent commit 74a730631065 where Alvaro has changed for the
lwlock tranche names.  That's quite elegant.

> Right, but an assertion isn't a bad idea there as it can generate a
> backtrace as opposed to the crash generating just SEGV note (and
> perhaps a crash dump) in server logs.
>
> With these two asserts, the behavior (asserts on null and non-existent
> inputs) is the same as what GetSlotInvalidationCause has right now.

Well, I won't fight you over that.

>> +/* Maximum number of invalidation causes */
>> +#define    RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
>>>> There is no need to add that to slot.h: it is only used in slot.c.

>
> Right, but it needs to be updated whenever a new cause is added to
> enum ReplicationSlotInvalidationCause. Therefore, I think it's better
> to be closer there in slot.h.

A new cause would require an update of SlotInvalidationCause, so if
you keep RS_INVAL_MAX_CAUSES close to it that's impossible to miss.
IMO, it makes just more sense to keep that in slot.c because of the
static assert as well.

+ * If you add a new invalidation cause here, remember to add its name in
+ * SlotInvalidationCauses in the same order as that of the cause.

The order does not matter with the way v2 does things with
SlotInvalidationCauses[], no?
--
Michael

Attachment

Re: Add lookup table for replication slot invalidation causes

From
Bharath Rupireddy
Date:
On Wed, Feb 21, 2024 at 11:56 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> Note the recent commit 74a730631065 where Alvaro has changed for the
> lwlock tranche names.  That's quite elegant.

Yes, that's absolutely neat. FWIW, designated initializer syntax can
be used in a few more places though. I'm not sure how much worth it
will be but I'll see if I can quickly put up a patch for it.

> > With these two asserts, the behavior (asserts on null and non-existent
> > inputs) is the same as what GetSlotInvalidationCause has right now.
>
> Well, I won't fight you over that.

Haha :)

> A new cause would require an update of SlotInvalidationCause, so if
> you keep RS_INVAL_MAX_CAUSES close to it that's impossible to miss.
> IMO, it makes just more sense to keep that in slot.c because of the
> static assert as well.

Hm, okay. Moved that to slot.c but left a note in the comment atop
enum to update it.

> + * If you add a new invalidation cause here, remember to add its name in
> + * SlotInvalidationCauses in the same order as that of the cause.
>
> The order does not matter with the way v2 does things with
> SlotInvalidationCauses[], no?

Ugh. Corrected that now.

Please see the attached v3 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Add lookup table for replication slot invalidation causes

From
Michael Paquier
Date:
On Wed, Feb 21, 2024 at 12:50:00PM +0530, Bharath Rupireddy wrote:
> Please see the attached v3 patch.

Seems globally OK, so applied.  I've simplified a bit the comments,
painted some extra const, and kept variable name as conflict_reason as
the other routines of slot.h use "name" already to refer to the slot
names, and that was a bit confusing IMO.
--
Michael

Attachment

Re: Add lookup table for replication slot invalidation causes

From
Peter Smith
Date:
Hi, Sorry for the late comment but isn't the pushed logic now
different to what it was there before?

IIUC previously (in a non-debug build) if the specified
conflict_reason was not found, it returned RS_INVAL_NONE -- now it
seems to return whatever enum happens to be last.

How about something more like below:

----------
ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *conflict_reason)
{
  ReplicationSlotInvalidationCause cause;
  bool        found  = false;

  for (cause = 0; !found && cause <= RS_INVAL_MAX_CAUSES; cause++)
    found = strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0;

  Assert(found);
  return found ? cause : RS_INVAL_NONE;
}
----------

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Add lookup table for replication slot invalidation causes

From
Peter Smith
Date:
On Thu, Feb 22, 2024 at 5:19 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi, Sorry for the late comment but isn't the pushed logic now
> different to what it was there before?
>
> IIUC previously (in a non-debug build) if the specified
> conflict_reason was not found, it returned RS_INVAL_NONE -- now it
> seems to return whatever enum happens to be last.
>
> How about something more like below:
>
> ----------
> ReplicationSlotInvalidationCause
> GetSlotInvalidationCause(const char *conflict_reason)
> {
>   ReplicationSlotInvalidationCause cause;
>   bool        found  = false;
>
>   for (cause = 0; !found && cause <= RS_INVAL_MAX_CAUSES; cause++)
>     found = strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0;
>
>   Assert(found);
>   return found ? cause : RS_INVAL_NONE;
> }
> ----------
>

Oops. Perhaps I meant more like below -- in any case, the point was
the same -- to ensure RS_INVAL_NONE is what returns if something
unexpected happens.

ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *conflict_reason)
{
    ReplicationSlotInvalidationCause cause;

    for (cause = 0; cause <= RS_INVAL_MAX_CAUSES; cause++)
    {
        if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
            return cause;
    }

    Assert(0);
    return RS_INVAL_NONE;
}

----------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Add lookup table for replication slot invalidation causes

From
Michael Paquier
Date:
On Thu, Feb 22, 2024 at 05:30:08PM +1100, Peter Smith wrote:
> Oops. Perhaps I meant more like below -- in any case, the point was
> the same -- to ensure RS_INVAL_NONE is what returns if something
> unexpected happens.

You are right that this could be a bit confusing, even if we should
never reach this state.  How about avoiding to return the index of the
loop as result, as of the attached?  Would you find that cleaner?
--
Michael

Attachment

Re: Add lookup table for replication slot invalidation causes

From
Bharath Rupireddy
Date:
On Thu, Feb 22, 2024 at 12:26 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Feb 22, 2024 at 05:30:08PM +1100, Peter Smith wrote:
> > Oops. Perhaps I meant more like below -- in any case, the point was
> > the same -- to ensure RS_INVAL_NONE is what returns if something
> > unexpected happens.
>
> You are right that this could be a bit confusing, even if we should
> never reach this state.  How about avoiding to return the index of the
> loop as result, as of the attached?  Would you find that cleaner?

Looks neat!

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add lookup table for replication slot invalidation causes

From
Peter Smith
Date:
On Thu, Feb 22, 2024 at 5:56 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Feb 22, 2024 at 05:30:08PM +1100, Peter Smith wrote:
> > Oops. Perhaps I meant more like below -- in any case, the point was
> > the same -- to ensure RS_INVAL_NONE is what returns if something
> > unexpected happens.
>
> You are right that this could be a bit confusing, even if we should
> never reach this state.  How about avoiding to return the index of the
> loop as result, as of the attached?  Would you find that cleaner?
> --

Hi, yes, it should never happen, but thanks for making the changes.

I would've just removed every local variable instead of adding more of
them. I also felt the iteration starting from RS_INVAL_NONE instead of
0 is asserting RS_INVAL_NONE must always be the first enum and can't
be rearranged. Probably it will never happen, but why require it?

------
ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *conflict_reason)
{
    for (ReplicationSlotInvalidationCause cause = 0; cause <=
RS_INVAL_MAX_CAUSES; cause++)
        if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
            return cause;

    Assert(0);
    return RS_INVAL_NONE;
}
------

But maybe those nits are a matter of personal choice. Your patch code
addressed my main concern, so it LGTM.

----------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Add lookup table for replication slot invalidation causes

From
Michael Paquier
Date:
On Fri, Feb 23, 2024 at 09:04:04AM +1100, Peter Smith wrote:
> I would've just removed every local variable instead of adding more of
> them. I also felt the iteration starting from RS_INVAL_NONE instead of
> 0 is asserting RS_INVAL_NONE must always be the first enum and can't
> be rearranged. Probably it will never happen, but why require it?

FWIW, I think that the code is OK as-is, so I'd just let it be for
now.
--
Michael

Attachment