Re: Request for comment on setting binary format output per session - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Re: Request for comment on setting binary format output per session |
Date | |
Msg-id | dcd25c5b805735378cf846f0178bb635716a5ed1.camel@j-davis.com |
In response to | Re: Request for comment on setting binary format output per session (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Request for comment on setting binary format output per session
Re: Request for comment on setting binary format output per session |
List | pgsql-hackers |
On Wed, 2023-10-04 at 15:10 -0400, Robert Haas wrote:
> I hadn't really considered client_encoding as a precedent for this
> setting. A lot of my discomfort with the proposed mechanism also
> applies to client_encoding, namely, suppose you call some function or
> procedure or whatever and it changes client_encoding on your behalf
> and now your communication with the server is all screwed up.

This may have some security implications, but we've had lots of discussion about the general topic of executing malicious code, and the ability to mess with the on-the-wire formats might not be any worse than what can already happen. (Though expanding it to binary formats might slightly increase the attack surface area.)

> That
> seems very unpleasant. Yet it's also existing behavior.

The binary format setting is better in some ways and worse in other ways. For text encoding, usually it's expecting a single encoding and so a single setting at the start of the session makes sense. For binary formats, the client is likely to support some values in binary and others not; and user-defined types make it even messier.

On the other hand, at least the results are marked as being binary format, so if something unexpected happens, a well-written client is more likely to see that something went wrong. For text encoding, the client would have to be a bit more defensive.

Another thing to consider is that using a GUC for binary formats is a protocol change in a way that client_encoding is not. The existing documentation for the protocol already specifies when binary formats will be used, and a GUC would change that behavior. We absolutely would need to update the documentation, and clients (like psql) really should be updated.
> I think one
> could conclude on these facts either that (a) client_encoding is fine
> and the problems with controlling behavior using that kind of
> mechanism are mostly theoretical or

I'm not clear on the exact rules for a protocol version bump and why a GUC helps us avoid one. If we have a binary_formats GUC, the client would need to know the server version and check that it's >=17 before sending the "SET binary_formats='...'" command, right? What's the difference between that and making it an explicit protocol message that only >=17 understand?

In any case, I think clients and connection poolers can work around the problems, and they are mostly minor in practice, but I wouldn't call them "theoretical". If there's enough utility in the binary_formats parameter, we can decide to put up with the problems; which is different than saying there aren't any.

> (b) that we messed up with
> client_encoding and shouldn't add any more mistakes of the same ilk
> or
> (c) that we should really be looking at redesigning the way
> client_encoding works, too.

(b) doesn't seem like a very helpful perspective without some ideas toward (c). I think (c) is worth discussing but we don't have to block on it.

Regards,
Jeff Davis
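
As an illustration only, the client-side version check raised in the message (confirm the server is >=17 before sending "SET binary_formats='...'") might look roughly like the sketch below. Everything here is an assumption made for the example: binary_formats is the proposed GUC under discussion, not an existing setting; the function name is hypothetical; the value is treated as an opaque string because its format is not specified in the message; and the cutoff uses the integer form of the server version (170000 for version 17).

```python
from typing import Optional

# Assumption: the ">=17" cutoff from the message, in integer
# server-version form (major * 10000 for PostgreSQL 10 and later).
MIN_VERSION_NUM = 170000


def binary_formats_command(server_version_num: int, value: str) -> Optional[str]:
    """Return the SET command a client would send, or None for older servers.

    Hypothetical helper: `binary_formats` is the proposed GUC, and `value`
    is passed through as an opaque string since its format is undecided.
    """
    if server_version_num < MIN_VERSION_NUM:
        # Older servers would reject the unknown parameter, so send nothing.
        return None
    # Double any single quotes so the value is a valid SQL string literal.
    escaped = value.replace("'", "''")
    return "SET binary_formats = '{}'".format(escaped)
```

A client written this way still has to learn the server version before it can decide, which is the same dependency an explicit protocol message understood only by >=17 would carry.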