Thread: [PATCH] Refactor pqformat.{c,h} and protocol.h

[PATCH] Refactor pqformat.{c,h} and protocol.h

From
Aleksander Alekseev
Date:
Hi,

While investigating a bug report [1] I wanted to find all the pieces
of code that form PqMsg_DataRow messages and couldn't easily do it.
This is because one authors prefer writing:

pq_beginmessage(buf, 'D');

.. while others:

pq_beginmessage(buf, PqMsg_DataRow);

The proposed patchset fixes this.

- Patch 1 replaces all the char's with PqMsg's
- Patch 2 makes PqMsg an enum. This ensures that the problem will not
appear again in the future and also gives us a bit more type-safety.
- Patch 3 rearranges the order of the functions in pqformat.{c,h} a
bit to make the code easier to read.

[1]: https://www.postgresql.org/message-id/flat/1df84daa-7d0d-e8cc-4762-85523e45e5e7%40mailbox.org

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Aleksander Alekseev
Date:
Hi,

> The proposed patchset fixes this.

In v1 I mistakenly named the enum PgMsg while it should have been
PqMsg. Here is the corrected patchset.

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Nathan Bossart
Date:
On Tue, Jul 16, 2024 at 04:09:44PM +0300, Aleksander Alekseev wrote:
> - Patch 1 replaces all the char's with PqMsg's

Thanks.  I'll look into committing this one in the near future.

> - Patch 2 makes PqMsg an enum. This ensures that the problem will not
> appear again in the future and also gives us a bit more type-safety.

This was briefly brought up in the discussion that ultimately led to
protocol.h [0].  I don't have a particularly strong opinion on the matter,
but I will note that protocol.h was intended to be easily usable in
third-party code, and changing it from characters to an enum from v17 to
v18 might cause some extra code churn.

[0] https://postgr.es/m/20230807091044.jjgsl2rgheazaung%40alvherre.pgsql

-- 
nathan



Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> This was briefly brought up in the discussion that ultimately led to
> protocol.h [0].  I don't have a particularly strong opinion on the matter,
> but I will note that protocol.h was intended to be easily usable in
> third-party code, and changing it from characters to an enum from v17 to
> v18 might cause some extra code churn.

We could avoid that issue by back-patching into 17; I don't think
it's quite too late for that, and the header is new as of 17.

However, I'm generally -1 on the proposal independently of that,
because I think it is the wrong thing from an ABI standpoint.
The message type codes in the protocol are chars, full stop.
If we declare the data type as an enum, we have basically zero
control over how wide the compiler will choose to make that ---
although it's pretty likely that it *won't* choose char width.
So you could not represent the message layout accurately with
an enum.  Perhaps nobody is doing that, but it seems to me like
a foot-gun of roughly the same caliber as not using an enum.
Also, going in this direction would require adding casts between
char and enum in assorted places, which might be error-prone or
warning-inducing.

So on the whole, "leave it alone" seems like the right answer.

(This applies only to the s/char/enum/ proposal; I've not read
the patchset further than that.)

            regards, tom lane



Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Aleksander Alekseev
Date:
Hi,

> > This was briefly brought up in the discussion that ultimately led to
> > protocol.h [0].  I don't have a particularly strong opinion on the matter,
> > but I will note that protocol.h was intended to be easily usable in
> > third-party code, and changing it from characters to an enum from v17 to
> > v18 might cause some extra code churn.
>
> We could avoid that issue by back-patching into 17; I don't think
> it's quite too late for that, and the header is new as of 17.
>
> However, I'm generally -1 on the proposal independently of that,
>
> [...]
>
> (This applies only to the s/char/enum/ proposal; I've not read
> the patchset further than that.)

That's fair enough. Also it's not clear how much type safety enums
give us exactly. E.g. after applying 0002 pq_putmessage() a.k.a.
PQcommMethods->putmessage() silently casts its first argument from
`enum PqMsg` to `char` (shrug).

-- 
Best regards,
Aleksander Alekseev



Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Nathan Bossart
Date:
I took a closer look at 0001.

diff --git a/src/include/libpq/protocol.h b/src/include/libpq/protocol.h
index 4b8d440365..8c0f095edf 100644
--- a/src/include/libpq/protocol.h
+++ b/src/include/libpq/protocol.h
@@ -47,6 +47,7 @@
 #define PqMsg_EmptyQueryResponse    'I'
 #define PqMsg_BackendKeyData        'K'
 #define PqMsg_NoticeResponse        'N'
+#define PqMsg_Progress              'P'
 #define PqMsg_AuthenticationRequest 'R'
 #define PqMsg_ParameterStatus        'S'
 #define PqMsg_RowDescription        'T'

As discussed elsewhere [0], we can add the leader/worker protocol
characters to protocol.h, but they should probably go in a separate
section.  I'd recommend breaking that part out to a separate patch, too.

[0] https://postgr.es/m/20230829161555.GB2136737%40nathanxps13

-- 
nathan



Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Aleksander Alekseev
Date:
Hi,

> As discussed elsewhere [0], we can add the leader/worker protocol
> characters to protocol.h, but they should probably go in a separate
> section.  I'd recommend breaking that part out to a separate patch, too.

OK, here is the updated patchset. This time I chose not to include this patch:

> - Patch 3 rearranges the order of the functions in pqformat.{c,h} a
> bit to make the code easier to read.

... since arguably there is not much value in it. Please let me know
if you think it's actually needed.

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Nathan Bossart
Date:
On Tue, Jul 16, 2024 at 09:14:35PM +0300, Aleksander Alekseev wrote:
>> As discussed elsewhere [0], we can add the leader/worker protocol
>> characters to protocol.h, but they should probably go in a separate
>> section.  I'd recommend breaking that part out to a separate patch, too.
> 
> OK, here is the updated patchset. This time I chose not to include this patch:

Thanks.  The only thing that stands out to me is the name of the parallel
leader/worker protocol message.  In the original thread for protocol
characters, some early versions of the patch called it a "parallel
progress" message, but this new one just calls it PqMsg_Progress.  I guess
PqMsg_ParallelProgress might be a tad more descriptive and less likely to
cause naming collisions with new frontend/backend messages, but I'm not
tremendously worried about either of those things.  Thoughts?

-- 
nathan



Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Aleksander Alekseev
Date:
Hi,

> Thanks.  The only thing that stands out to me is the name of the parallel
> leader/worker protocol message.  In the original thread for protocol
> characters, some early versions of the patch called it a "parallel
> progress" message, but this new one just calls it PqMsg_Progress.  I guess
> PqMsg_ParallelProgress might be a tad more descriptive and less likely to
> cause naming collisions with new frontend/backend messages, but I'm not
> tremendously worried about either of those things.  Thoughts?

Personally I'm fine with either option.

-- 
Best regards,
Aleksander Alekseev



Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Nathan Bossart
Date:
On Tue, Jul 16, 2024 at 10:58:37PM +0300, Aleksander Alekseev wrote:
>> Thanks.  The only thing that stands out to me is the name of the parallel
>> leader/worker protocol message.  In the original thread for protocol
>> characters, some early versions of the patch called it a "parallel
>> progress" message, but this new one just calls it PqMsg_Progress.  I guess
>> PqMsg_ParallelProgress might be a tad more descriptive and less likely to
>> cause naming collisions with new frontend/backend messages, but I'm not
>> tremendously worried about either of those things.  Thoughts?
> 
> Personally I'm fine with either option.

Alright.  Well, I guess I'll flip a coin tomorrow unless someone else
chimes in with an opinion.

-- 
nathan



Re: [PATCH] Refactor pqformat.{c,h} and protocol.h

From
Nathan Bossart
Date:
On Tue, Jul 16, 2024 at 04:38:06PM -0500, Nathan Bossart wrote:
> Alright.  Well, I guess I'll flip a coin tomorrow unless someone else
> chimes in with an opinion.

Committed and back-patched to v17.  I left it as PqMsg_Progress.

-- 
nathan