Thread: [PATCH] Refactor pqformat.{c,h} and protocol.h
Hi, While investigating a bug report [1] I wanted to find all the pieces of code that form PqMsg_DataRow messages and couldn't easily do it. This is because one authors prefer writing: pq_beginmessage(buf, 'D'); .. while others: pq_beginmessage(buf, PqMsg_DataRow); The proposed patchset fixes this. - Patch 1 replaces all the char's with PqMsg's - Patch 2 makes PqMsg an enum. This ensures that the problem will not appear again in the future and also gives us a bit more type-safety. - Patch 3 rearranges the order of the functions in pqformat.{c,h} a bit to make the code easier to read. [1]: https://www.postgresql.org/message-id/flat/1df84daa-7d0d-e8cc-4762-85523e45e5e7%40mailbox.org -- Best regards, Aleksander Alekseev
Attachment
Hi, > The proposed patchset fixes this. In v1 I mistakenly named the enum PgMsg while it should have been PqMsg. Here is the corrected patchset. -- Best regards, Aleksander Alekseev
Attachment
On Tue, Jul 16, 2024 at 04:09:44PM +0300, Aleksander Alekseev wrote: > - Patch 1 replaces all the char's with PqMsg's Thanks. I'll look into committing this one in the near future. > - Patch 2 makes PqMsg an enum. This ensures that the problem will not > appear again in the future and also gives us a bit more type-safety. This was briefly brought up in the discussion that ultimately led to protocol.h [0]. I don't have a particularly strong opinion on the matter, but I will note that protocol.h was intended to be easily usable in third-party code, and changing it from characters to an enum from v17 to v18 might cause some extra code churn. [0] https://postgr.es/m/20230807091044.jjgsl2rgheazaung%40alvherre.pgsql -- nathan
Nathan Bossart <nathandbossart@gmail.com> writes: > This was briefly brought up in the discussion that ultimately led to > protocol.h [0]. I don't have a particularly strong opinion on the matter, > but I will note that protocol.h was intended to be easily usable in > third-party code, and changing it from characters to an enum from v17 to > v18 might cause some extra code churn. We could avoid that issue by back-patching into 17; I don't think it's quite too late for that, and the header is new as of 17. However, I'm generally -1 on the proposal independently of that, because I think it is the wrong thing from an ABI standpoint. The message type codes in the protocol are chars, full stop. If we declare the data type as an enum, we have basically zero control over how wide the compiler will choose to make that --- although it's pretty likely that it *won't* choose char width. So you could not represent the message layout accurately with an enum. Perhaps nobody is doing that, but it seems to me like a foot-gun of roughly the same caliber as not using an enum. Also, going in this direction would require adding casts between char and enum in assorted places, which might be error-prone or warning-inducing. So on the whole, "leave it alone" seems like the right answer. (This applies only to the s/char/enum/ proposal; I've not read the patchset further than that.) regards, tom lane
Hi, > > This was briefly brought up in the discussion that ultimately led to > > protocol.h [0]. I don't have a particularly strong opinion on the matter, > > but I will note that protocol.h was intended to be easily usable in > > third-party code, and changing it from characters to an enum from v17 to > > v18 might cause some extra code churn. > > We could avoid that issue by back-patching into 17; I don't think > it's quite too late for that, and the header is new as of 17. > > However, I'm generally -1 on the proposal independently of that, > > [...] > > (This applies only to the s/char/enum/ proposal; I've not read > the patchset further than that.) That's fair enough. Also it's not clear how much type safety enums give us exactly. E.g. after applying 0002 pq_putmessage() a.k.a. PQcommMethods->putmessage() silently casts its first argument from `enum PqMsg` to `char` (shrug). -- Best regards, Aleksander Alekseev
I took a closer look at 0001. diff --git a/src/include/libpq/protocol.h b/src/include/libpq/protocol.h index 4b8d440365..8c0f095edf 100644 --- a/src/include/libpq/protocol.h +++ b/src/include/libpq/protocol.h @@ -47,6 +47,7 @@ #define PqMsg_EmptyQueryResponse 'I' #define PqMsg_BackendKeyData 'K' #define PqMsg_NoticeResponse 'N' +#define PqMsg_Progress 'P' #define PqMsg_AuthenticationRequest 'R' #define PqMsg_ParameterStatus 'S' #define PqMsg_RowDescription 'T' As discussed elsewhere [0], we can add the leader/worker protocol characters to protocol.h, but they should probably go in a separate section. I'd recommend breaking that part out to a separate patch, too. [0] https://postgr.es/m/20230829161555.GB2136737%40nathanxps13 -- nathan
Hi, > As discussed elsewhere [0], we can add the leader/worker protocol > characters to protocol.h, but they should probably go in a separate > section. I'd recommend breaking that part out to a separate patch, too. OK, here is the updated patchset. This time I chose not to include this patch: > - Patch 3 rearranges the order of the functions in pqformat.{c,h} a > bit to make the code easier to read. ... since arguably there is not much value in it. Please let me know if you think it's actually needed. -- Best regards, Aleksander Alekseev
Attachment
On Tue, Jul 16, 2024 at 09:14:35PM +0300, Aleksander Alekseev wrote: >> As discussed elsewhere [0], we can add the leader/worker protocol >> characters to protocol.h, but they should probably go in a separate >> section. I'd recommend breaking that part out to a separate patch, too. > > OK, here is the updated patchset. This time I chose not to include this patch: Thanks. The only thing that stands out to me is the name of the parallel leader/worker protocol message. In the original thread for protocol characters, some early versions of the patch called it a "parallel progress" message, but this new one just calls it PqMsg_Progress. I guess PqMsg_ParallelProgress might be a tad more descriptive and less likely to cause naming collisions with new frontend/backend messages, but I'm not tremendously worried about either of those things. Thoughts? -- nathan
Hi, > Thanks. The only thing that stands out to me is the name of the parallel > leader/worker protocol message. In the original thread for protocol > characters, some early versions of the patch called it a "parallel > progress" message, but this new one just calls it PqMsg_Progress. I guess > PqMsg_ParallelProgress might be a tad more descriptive and less likely to > cause naming collisions with new frontend/backend messages, but I'm not > tremendously worried about either of those things. Thoughts? Personally I'm fine with either option. -- Best regards, Aleksander Alekseev
On Tue, Jul 16, 2024 at 10:58:37PM +0300, Aleksander Alekseev wrote: >> Thanks. The only thing that stands out to me is the name of the parallel >> leader/worker protocol message. In the original thread for protocol >> characters, some early versions of the patch called it a "parallel >> progress" message, but this new one just calls it PqMsg_Progress. I guess >> PqMsg_ParallelProgress might be a tad more descriptive and less likely to >> cause naming collisions with new frontend/backend messages, but I'm not >> tremendously worried about either of those things. Thoughts? > > Personally I'm fine with either option. Alright. Well, I guess I'll flip a coin tomorrow unless someone else chimes in with an opinion. -- nathan
On Tue, Jul 16, 2024 at 04:38:06PM -0500, Nathan Bossart wrote: > Alright. Well, I guess I'll flip a coin tomorrow unless someone else > chimes in with an opinion. Committed and back-patched to v17. I left it as PqMsg_Progress. -- nathan