Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs |
Date | |
Msg-id | CA+TgmobCj-+BQH7Bd1NNB3p-Z1iz4q4HdvNkBHO0EfD3BgYgqQ@mail.gmail.com Whole thread Raw |
In response to | Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs (Jelte Fennema-Nio <me@jeltef.nl>) |
Responses |
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs |
List | pgsql-hackers |
On Thu, Mar 14, 2024 at 1:54 PM Jelte Fennema-Nio <me@jeltef.nl> wrote: > In my view there can be, **by definition,** only two general types of > protocol changes: > 1. A "simple protocol change": This is one that requires agreement by > both the client and server, that they understand the new message types > involved in this change. e.g. the ParameterSet message proposal (this > message type is either supported or it's not) > 2. A "parameterized protocol change": This requires the same as 1, but > should also allow some parameterization from the client, e.g. for the > compression proposal, the client should specify what compression > algorithm the server should use to compress data when sending it to > the client. You seem to be supposing here that all protocol changes will consist of adding new message types. While I think that will be a common pattern, I do not think it will be a universal one. I do agree, however, that every possible variation of the protocol is either Boolean-valued (i.e. are we doing X or not?) or something more complicated (i.e. how exactly are doing X?). > Client and Server can agree that a "simple protocol change" is > supported by both advertising a minimum minor protocol version. And > for a "parameterized protocol change" the client can send a _pq_ > parameter in the startup packet. > > So, new _pq_ parameters should only ever be introduced for > parameterized protocol changes. They are not meant to advertise > support, they are meant to configure protocol features. For a > non-configurable protocol feature, we'd simply bump the protocol > version. And since _pq_ parameters thus always control some setting at > the session level, we can simply store it as a GUC, because that's how > we store all our parameters at a session level. This is definitely not how I was thinking about it. I was imagining that we wanted to reserve bumping the protocol version for more significant changes, and that we'd use _pq_ parameters for relatively minor new functionality whether Boolean-valued or otherwise. > I think given the automatic downgrade supported by the > NegotiateProtocolVersion, there's no real down-side to requesting the > newest version by default. The only downside I can see is when > connecting to other applications (i.e. non PostgreSQL servers) that > implement the postgres protocol, but don't implement > NegotiateProtocolVersion. But for that I added the > max_protocol_version connection option in 0006 (of my latest > patchset). You might be right. This is a minor point that's not worth arguing about too much. > > I'm really unhappy with 0002-0004. > > Just to be clear, (afaict) your argument below seems to only really be > about 0004, not about 0002 or 0003. Was there anything about 0002 & > 0003 that you were unhappy with? 0002 & 0003 are not dependent an 0004 > imho. Because even when not making _pq_ parameters map to GUCs, we'd > still need to change libpq to not instantly close the connection > whenever a _pq_ parameter (or new protocol version) is not supported > by the server (which is what 0002 & 0003 do). I completely agree that we need to avoid slamming the connection shut. What I don't agree with is taking the list of protocol extensions that the server knows about and putting it into an array of strings that the user can see. I don't want the user to know or care so much about what's happening at the wire protocol level. The user is entitled to know whether PQsetProtocolParameter() will work or not, and the user is entitled to know whether it has a chance of working next time if it didn't work this time, and when it fails, the user is entitled to a good error message explaining the reason for the failure. But the user is not entitled to know what negotiation took place over the wire to figure that out. They shouldn't need to know that the _pq_ namespace exists, and they shouldn't need to know whether we negotiated the availability or unavailability of PQsetProtocolParameter() using [a] the protocol minor version number, [b] the protocol major version number, [c] a protocol option called parameter_set, or [d] a protocol option called bananas_foster. Those are all things that might change in the future. Just as a for instance, I had a thought that we might accumulate a few new message types controlled by protocol options (ParameterSet, AlwaysSendTypeInBinaryFormat, whatever else) while keeping the protocol version as 3.0, and then eventually bump the protocol version to 3.1 where all of that would be mandatory functionality. So the protocol parameters wouldn't be specified any more when using 3.1, but they would be specified when talking to older 3.0 servers. That difference shouldn't be visible to the user. The user should only know the result of the negotiation. > Okay, our interpretation is very different here. From my perspective > introducing a non-GUC namespace is NOT AT ALL the benefit of the _pq_ > prefix. The main benefit (imho) is that it allows sending an > "optional" parameter (i.e GUC) in the startup packet. So, one where if > the server doesn't recognize it the connection attempt still succeeds. > If you specify a normal GUC in the connection parameters and the > server doesn't know about it, the server will close the connection. But this is another example of a problem that can *easily* be fixed without using up the entirety of the _pq_ namespace. You can introduce _pq_.dont_error_on_unknown_gucs=1 or _pq_.dont_error_on_these_gucs='foo, bar, baz'. The distinction between the startup packet containing whizzbang=frob and instead containing _pq_.whizzbang=frob shouldn't be just whether an error is thrown if we don't know anything about whizzbang. > > ... but no matter what we do exactly, I don't want the very first > > protocol extension we ever add to eat up all of _pq_. I intended that > > to support decades worth of extensibility, not just one patch. > > This seems to be the core of your argument. But honestly, I don't > understand this logic at all. Why do you think that assigning _pq_ > parameters to GUCs **for the ones that match an existing GUC** would > have such a far reaching effect into the future. There's only a > handful of _pq_ parameters being proposed on the mailinglist at the > moment. Even if we implement all of those as GUCs, and in the future, > we'd want a _pq_ parameter that does not map to GUC (which I > personally doubt will ever be the case). Then we can simply change the > server code like so and do something special for that parameter: I guess I'm in the same position as you are -- your argument doesn't really make any sense to me. That also has the unfortunate disadvantage of making it difficult for me to explain why I don't agree with you, but let me just tick off a few things that I'm thinking about here: 1. Connection poolers. If I'm talking to pgpool and pgpool is talking to the server, and pgpool and I agree to use compression, that's completely separate from whether pgpool and the server are using compression. If I have to interrogate the compression state by executing "SHOW some_compression_guc", then I'm going to get the wrong answer unless pgpool runs a full SQL parser on every command that I execute and intercepts the ones that touch protocol parameters. That's bound to be expensive an unreliable -- consider something like SELECT current_setting('some_compression_guc') || ' ' | current_setting('some_other_guc') which isn't half as pathological as it first looks. I want to be able to know the state of my protocol parameters by calling libpq functions that answer my questions definitively based on libpq's own internal state. libpq itself *must* know what compression I'm using on my connection; the server's answer may be different. 2. Clarity of meaning across versions. Let's say we add a protocol option in the future that expands the message-type byte into a two-byte word. Failure of the two sides to agree on the value of this protocol option is, fairly obviously, a catastrophe. I assume that if we actually did something like this, there's a fair chance it would be protocol version 4.0 rather than an option to 3.whatever, but it's a good example of something that might someday need to be changed that is not just a new message type and about which the communicating parties must absolutely agree. Let's say we do as you propose and have a GUC wide_message_types = {true | false}. Now, what happens when a sneaky user of an older libpq, which does not know about this option, tries to connect to a newer server? As I see it, in your proposal, the client thinks they're just setting a GUC, but the server thinks we're completely changing up the wire protocol. Disaster ensues. From my point of view, the problem is created by the fact that you're mixing together two things which ought to be kept well-separated -- the act of negotiating what protocol variant we're using, on the one hand, and the setting of particular GUCs to particular values, on the other. 3. Generally, and maybe this is just an expansion of the previous point, it feels to me like you've conflated the thing you want to do right now with what everybody who wants to modify the protocol will ever want to do in the future. It's just all GUCs, all the time! But the GUC model is actually a poor fit in all kinds of scenarios, which is why we have all kinds of other ways to configure things too, like connection parameters for instance. Now, to be fair, it's often useful to expose values that are configured through some other means as read-only GUCs, so the dividing line between GUCs and other things does get a bit sloppy. And we're already using client_encoding, which is a GUC, for something that really ought not to have been handled that way ... because to take the connection pooler example again, there's no reason -- other than bad wire-protocol design -- why the encoding being used between the client and the pooler needs to match the encoding being used between the pooler and the server. But in my view, this isn't evidence that we should continue to muddy the distinction between things that ought to be protocol parameters and things that ought to be GUCs; rather, it's evidence of the need to make the distinction between the two as crisp as we possibly can. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: