Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs |
Date | |
Msg-id | CA+TgmoYZQ4N6aJwtaoCUTfjniqvZohgOh9R=EkyUVB+oN413vQ@mail.gmail.com |
In response to | Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs (Jelte Fennema-Nio <me@jeltef.nl>) |
Responses | Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs |
List | pgsql-hackers |
On Mon, Apr 22, 2024 at 5:19 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> On Mon, 22 Apr 2024 at 16:26, Robert Haas <robertmhaas@gmail.com> wrote:
> > That's a fair point, but I'm still not seeing much practical
> > advantage. It's unlikely that a client is going to set a random bit in
> > a format parameter for no reason.
>
> I think you're missing an important point of mine here. The client
> wouldn't be "setting a random bit in a format parameter for no
> reason". The client would decide it is allowed to set this bit,
> because the PG version it connected to supports column encryption
> (e.g. PG18). But this completely breaks protocol and application layer
> separation.

I can't see what the problem is here. If the client is connected to a database that contains encrypted columns, and its response to seeing an encrypted column is to set this bit, that's fine and nothing should break. If a client doesn't know about encrypted columns and sets that bit at random, that will break things, and formally I think that's a risk, because I don't believe we document anywhere that you shouldn't set unused bits in the format mask. But practically, it's not likely. (And also, maybe we should document that you shouldn't do that.)

> It doesn't seem completely outside of the realm of possibility for a
> pooler to gather some statistics on the amount of Bind messages that
> use text vs binary query parameters. That's very easily doable now,
> while looking only at the protocol layer. If a client then sets the
> new format parameter bit, this pooler could then get confused and
> close the connection.

Right, this is the kind of risk I was worried about. I think it's similar to my example of a client setting an unused bit for no reason and breaking everything. Here, you've hypothesized a pooler that tries to interpret the bit and just errors out when it sees something it doesn't understand.
I agree that *formally* this is enough to justify bumping the protocol version, but I think *practically* it isn't, because the incompatibility is so minor as to inconvenience almost nobody, whereas changing the protocol version affects everybody.

Let's consider a hypothetical country much like Canada except that there are three official languages rather than two: English, French, and Robertish. Robertish is just like English except that the meanings of the words cabbage and rutabaga are reversed. Shall we mandate that all signs in the country be printed in three languages rather than two? Formally, we ought, because the substantial minority of our hypothetical country that proudly speaks Robertish as their mother tongue will not want to feel that they are second class citizens. But practically, there are very few situations where the differences between the two languages are going to inconvenience anyone. Indeed, the French speakers might be a bit put out if English is effectively represented twice on every sign while their mother tongue is there only once. Of course, people are entitled to organize their countries politically in any way that works for the people who live in them, but as a practical matter, English and Robertish are mutually intelligible.

And so here. If someone codes a connection pooler in the way you suppose, then it will break. But, first of all, they probably won't do that, both because it's not particularly likely that someone wants to gather that particular set of statistics and also because erroring out seems like an overreaction. And secondly, let's imagine that we do bump the protocol version and think about whether and how that solves the problem. A client will request from the pooler a version 3.1 connection and the pooler will say, sorry, no can do, I only understand 3.0. So the client will now say, oh ok, no problem, I'm going to refrain from setting that parameter format bit. Cool, right? Well, no, not really.
First, now the client application is probably broken. If the client is varying its behavior based on the server's protocol version, that must mean that it cares about accessing encrypted columns, and that means that the bit in question is not an optional feature. So actually, the fact that the pooler can force the client to downgrade hasn't fixed anything at all.

Second, if the connection pooler were written to do something other than close the connection, like say mask out the one bit that it knows how to deal with or have an "unknown" bucket to count values that it doesn't recognize, then it wouldn't have needed to care about the protocol version in the first place. It would have been better off not even knowing, because then it wouldn't have forced a downgrade onto the client application for no real reason. Throwing an error wasn't a wrong decision on the part of the person writing the pooler, but there are other things they could have done that would have been less brittle.

Third, applications, drivers, and connection poolers now all need to worry about handling downgrades smoothly. If a connection pooler requests a v3.1 connection to the server and gets v3.0, it had better make sure that it only advertises 3.0 to the client. If the client requests v3.0, the pooler had better make sure either to request v3.0 from the server or, alternatively, to be prepared to translate between 3.0 and 3.1 wherever that's needed, in either direction. But it's not at all clear what that would look like for something like TCE. Will the pooler arrange to encrypt parameters destined for encrypted tables if the client doesn't do so? Will it arrange to decrypt values coming from encrypted tables if the client doesn't understand encryption? It's possible someone will code that sort of thing, but I bet a lot of people won't bother.
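
As a rough illustration of the less brittle pooler behavior described in the second point above, here is a minimal sketch of a statistics collector that counts text and binary parameters in a Bind message and files anything else under an "unknown" bucket instead of closing the connection. The function names and the `0x11` sample value are hypothetical and chosen only for illustration; the thread does not say which bit a column-encryption feature would use. The layout assumed for the message body (portal name, statement name, Int16 count, Int16 format codes) is the documented Bind layout.

```python
import struct
from collections import Counter

TEXT, BINARY = 0, 1  # the two parameter format codes documented today


def skip_cstring(buf, pos):
    """Return the offset just past the NUL-terminated string starting at pos."""
    return buf.index(b"\x00", pos) + 1


def param_format_codes(body):
    """Extract the parameter format codes from a Bind message body.

    `body` is the payload after the 'B' type byte and the Int32 length:
    portal name, statement name, an Int16 count, then that many Int16 codes.
    """
    pos = skip_cstring(body, 0)    # destination portal name
    pos = skip_cstring(body, pos)  # prepared statement name
    (count,) = struct.unpack_from("!H", body, pos)
    return list(struct.unpack_from(f"!{count}H", body, pos + 2))


def tally(codes, stats=None):
    """Count codes per bucket; unrecognized values are counted, not rejected."""
    stats = Counter() if stats is None else stats
    for code in codes:
        if code == TEXT:
            stats["text"] += 1
        elif code == BINARY:
            stats["binary"] += 1
        else:
            stats["unknown"] += 1  # e.g. a format code with a newer bit set
    return stats


if __name__ == "__main__":
    # Unnamed portal, statement "stmt", three parameters; 0x11 is a made-up
    # value standing in for "binary plus some bit this pooler doesn't know".
    body = b"\x00" + b"stmt\x00" + struct.pack("!H3H", 3, TEXT, BINARY, 0x11)
    print(tally(param_format_codes(body)))
```

A pooler written this way never needs to know which protocol version introduced the extra bit; it simply reports that it saw a value it could not classify.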
In general, I think we'll quickly end up with a bunch of different protocol versions -- say, 3.0 through 3.4 -- but people will thoroughly test with only one or two of them and support for the others will either be buggy because it wasn't tested or work anyway because the differences didn't really matter in the first place.

> 1. I strongly believe minor protocol version bumps after the initial
> 3.1 one can be made painless for clients/poolers (so the ones to
> 3.2/3.3/etc). Similar to how TLS 1.3 can be safely introduced, and not
> having to worry about breaking TLS 1.2 communication. Once clients and
> poolers implement version negotiation support for 3.1, there's no
> reason for version negation support to work for 3.0 and 3.1 to then
> suddenly break on the 3.2 bump. To be clear, I'm talking about the act
> of bumping the version here, not the actual protocol changes. So
> assuming zero/near-zero client implementation effort for the new
> features (like never setting the newly supported bit in a format
> parameter), then bumping the protocol version for these new features
> can never have negative consequences.

I do like the idea of being able to introduce new versions without breaking things, but I think that if the TLS folks bumped the protocol version for something as minor as what we're talking about here, there would quickly be so many TLS versions that the result would be unmanageable. I suspect that they either never make small changes and batch everything up for the next rev, or they slip small changes into existing protocol versions as I propose that we do here.

I have zero objection to bumping the protocol version when there is a real question of mutual intelligibility, and zero objection to trying to reduce friction around version bumps. But my current view, which I reserve the right to revise at a later time, is that a change that 99.99+% of people can safely ignore is not a sufficient reason for a version bump.

> 2. I very much want to keep a clear split between the protocol layer
> and the application layer of our communication. And these layers merge
> whenever (like you say) "the wire protocol has changed from one
> release to another", but no protocol version bump or protocol
> extension is used to indicate that. When that happens the only way for
> a client to know what valid wire protocol messages are according to
> the server, is by checking the server version. This completely breaks
> the separation between layers. So, while checking the server version
> indeed works for direct client to postgres communication, it starts to
> break down whenever you put a pooler inbetween (as explained in the
> example earlier in this email). And it breaks down even more when
> connecting to servers that implement the Postgres wire protocol, but
> are not postgres at all, like CockroachDB. Right now libpq and other
> postgres drivers can be used to talk to these other servers and
> poolers, but if we start mixing protocol and application layer stuff
> then eventually that will stop being the case.

In practice, it's already the case. If such databases don't share code with PostgreSQL, it seems impossible that the replication subprotocol works in any meaningful way. It seems very likely that there are other dark corners of the protocol where things don't work either. And TCE will be another one, but bumping the protocol version doesn't fix that.

I kind of feel bad arguing so much about this - I don't think the urge to bump the protocol version when we change the protocol is a bad one in concept. And it sounds like you've done more work with software that cares about the protocol outside of PostgreSQL itself than I have. So maybe you're right and I'm all wet. But I can't understand why you don't see practical problems with frequent version bumps. It's not just about the one-time effort of getting everything that doesn't currently understand how to negotiate a version to do so. It's about how everyone acts on that information, or doesn't, and whether the end result of all of those individual decisions is better or worse for the community as a whole.

--
Robert Haas
EDB: http://www.enterprisedb.com