Thread: Re: protocol-level wait-for-LSN
> The patch adds a protocol extension called _pq_.wait_for_lsn as well > as a libpq connection option wait_for_lsn to activate the same. (Use > e.g., psql -d 'wait_for_lsn=1'.) > > With this protocol extension, two things are changed: > > - The ReadyForQuery message sends back the current LSN. If other protocol extension X tries to add something to the ReadyForQuery message too, what would happen? Currently ReadyForQuery message is like this: Byte1('Z') Int32 Byte1 With the wait_for_lsn extension, It becomes: Byte1('Z') Int32 Byte1 String Suppose the X extension wants to extend like this: Byte1('Z') Int32 Byte1 Int32 It seems impossible to coexist both. Does this mean once the wait_for_lsn extension is brought into the frontend/backend protocol specification, no other extensions that touch ReadyForQuery cannot be defined? Best reagards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
On 29.10.24 06:06, Tatsuo Ishii wrote: >> The patch adds a protocol extension called _pq_.wait_for_lsn as well >> as a libpq connection option wait_for_lsn to activate the same. (Use >> e.g., psql -d 'wait_for_lsn=1'.) >> >> With this protocol extension, two things are changed: >> >> - The ReadyForQuery message sends back the current LSN. > > If other protocol extension X tries to add something to the > ReadyForQuery message too, what would happen? I think one would have to define that somehow. If it's useful, the additional fields of both extensions could be appended, in some defined order. But this is an interesting question to think about.
>>> With this protocol extension, two things are changed: >>> >>> - The ReadyForQuery message sends back the current LSN. >> If other protocol extension X tries to add something to the >> ReadyForQuery message too, what would happen? > > I think one would have to define that somehow. If it's useful, the > additional fields of both extensions could be appended, in some > defined order. But this is an interesting question to think about. I think this kind of extension, which adds new filed to an existing message type, should be implemented as v4 protocol. Best reagards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
On Wed, 30 Oct 2024 at 07:49, Tatsuo Ishii <ishii@postgresql.org> wrote: > > I think one would have to define that somehow. If it's useful, the > > additional fields of both extensions could be appended, in some > > defined order. But this is an interesting question to think about. > > I think this kind of extension, which adds new filed to an existing > message type, should be implemented as v4 protocol. Could you explain why you think a major version bump is needed? In what situation do you care about this. Because for my usecases (client implementations & pgbouncer) I don't think that would be necessary. If a client doesn't send the _pq_.wait_for_lsn protocol parameter, it will never receive this new version. I don't really see a problem with having two protocol parameters change the same message. Yes, you have to define what the result of their combination is, but that seems trivial to do for additions of fields. You either define the first protocol parameter that was added to the spec, to add its field before the second. Or you could do it based on something non-time-dependent, like the alphabetic order of the protocol parameter, or the alphabetic order of the fields that they add. The main guarantees I'd like to uphold are listed here: https://www.postgresql.org/message-id/CAGECzQR5PMud4q8Atyz0gOoJ1xNH33g7g-MLXFML1_Vrhbzs6Q@mail.gmail.com
> On Wed, 30 Oct 2024 at 07:49, Tatsuo Ishii <ishii@postgresql.org> wrote: >> > I think one would have to define that somehow. If it's useful, the >> > additional fields of both extensions could be appended, in some >> > defined order. But this is an interesting question to think about. >> >> I think this kind of extension, which adds new filed to an existing >> message type, should be implemented as v4 protocol. > > Could you explain why you think a major version bump is needed? In > what situation do you care about this. Because for my usecases (client > implementations & pgbouncer) I don't think that would be necessary. If > a client doesn't send the _pq_.wait_for_lsn protocol parameter, it > will never receive this new version. Yes, if there's only one extension for a message type, it would not be a big problem. But if there's more than one extensions that want to change the same type, problem arises as I have already discussed them upthread. > I don't really see a problem with having two protocol parameters > change the same message. Yes, you have to define what the result of > their combination is, but that seems trivial to do for additions of > fields. You either define the first protocol parameter that was added > to the spec, to add its field before the second. Or you could do it > based on something non-time-dependent, like the alphabetic order of > the protocol parameter, or the alphabetic order of the fields that > they add. That sounds far from trivial. So each extension needs to check if any other extension which modifies the same message type is activated? That requires each extension implementation to have built-in knowledge about any conflicting extension. Moreover each extension may not be added at once. If extension Y is added after extension X is defined, then implementation of X needs to be changed because at the time when X is defined, it did not need to care about Y. Another way to deal with the problem could be defining a new protocol message which describes those conflict information so that each extensions do not need to have such information built-in, but maybe it is too complex. Best reagards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
On 30/10/2024 13:34, Tatsuo Ishii wrote: >> On Wed, 30 Oct 2024 at 07:49, Tatsuo Ishii <ishii@postgresql.org> wrote: >>>> I think one would have to define that somehow. If it's useful, the >>>> additional fields of both extensions could be appended, in some >>>> defined order. But this is an interesting question to think about. >>> >>> I think this kind of extension, which adds new filed to an existing >>> message type, should be implemented as v4 protocol. >> >> Could you explain why you think a major version bump is needed? In >> what situation do you care about this. Because for my usecases (client >> implementations & pgbouncer) I don't think that would be necessary. If >> a client doesn't send the _pq_.wait_for_lsn protocol parameter, it >> will never receive this new version. > > Yes, if there's only one extension for a message type, it would not be > a big problem. But if there's more than one extensions that want to > change the same type, problem arises as I have already discussed them > upthread. > >> I don't really see a problem with having two protocol parameters >> change the same message. Yes, you have to define what the result of >> their combination is, but that seems trivial to do for additions of >> fields. You either define the first protocol parameter that was added >> to the spec, to add its field before the second. Or you could do it >> based on something non-time-dependent, like the alphabetic order of >> the protocol parameter, or the alphabetic order of the fields that >> they add. > > That sounds far from trivial. So each extension needs to check if any > other extension which modifies the same message type is activated? > That requires each extension implementation to have built-in knowledge > about any conflicting extension. Moreover each extension may not be > added at once. If extension Y is added after extension X is defined, > then implementation of X needs to be changed because at the time when > X is defined, it did not need to care about Y. Another way to deal > with the problem could be defining a new protocol message which > describes those conflict information so that each extensions do not > need to have such information built-in, but maybe it is too complex. Note that the "protocol extension" mechanism is *not* meant for user-defined extensions. That's not the primary purpose anyway. It allows evolving the protocol in core code in a backwards compatible way, but indeed the different extensions will need to be coordinated so that they don't clash with each other. If they introduced new message types for example, they better use different message type codes. We might have made a mistake by calling this mechanism "protocol extensions", because it makes people think of user-defined extensions. With user-defined extensions, yes, you have exactly the problem you describe.We have no rules on how a protocol extension is allowed to change the protocol. It might add fields, it might add messages, or it might change the meaning of existing messages. Or encapsulate the whole protocol in XML. So yes, each protocol extension needs to know about all the other protocol extensions that it can be used with. In practice we'll avoid doing crazy stuff so that the protocol extensions are orthogonal, but if user-defined extensions get involved, there's not much we can do to ensure that. -- Heikki Linnakangas Neon (https://neon.tech)
On Wed, 30 Oct 2024 at 12:53, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > We might have made a mistake by calling this mechanism "protocol > extensions", because it makes people think of user-defined extensions. I think this is a real problem, that's probably worth fixing. I created a separate thread to address this[1] > So yes, each protocol extension needs to know about all the other > protocol extensions that it can be used with. In practice we'll avoid > doing crazy stuff so that the protocol extensions are orthogonal Just as an example, let's say we add a server-side query time to the protocol (which honestly seems like a pretty useful feature). So that ReadyForQuery now returns the query time if the query_time protocol. For clients it isn't difficult at all to support any combination of query_time & wait_for_lsn options. As long as we define that the wait_for_lsn field is before the query_time field if both exist, then two simple if statements like this would do the trick: if (wait_for_lsn_enabled) { // interpret next field as LSN } if (query_time_enabled) { // interpret next field as query time } [1]: https://www.postgresql.org/message-id/CAGECzQQoc%2BV94TrF-5cMikCMaf-uUnU52euwSCtQBeDYqXnXyA%40mail.gmail.com
>> So yes, each protocol extension needs to know about all the other >> protocol extensions that it can be used with. In practice we'll avoid >> doing crazy stuff so that the protocol extensions are orthogonal > > Just as an example, let's say we add a server-side query time to the > protocol (which honestly seems like a pretty useful feature). So that > ReadyForQuery now returns the query time if the query_time protocol. > For clients it isn't difficult at all to support any combination of > query_time & wait_for_lsn options. As long as we define that the > wait_for_lsn field is before the query_time field if both exist, then > two simple if statements like this would do the trick: > > if (wait_for_lsn_enabled) { > // interpret next field as LSN > } > if (query_time_enabled) { > // interpret next field as query time > } But if (query_time_enabled) { // interpret next field as query time } if (wait_for_lsn_enabled) { // interpret next field as LSN } doesn't work, right? I don't like clients need to know the exact order of each protocol extensions. BTW, > Just as an example, let's say we add a server-side query time to the > protocol (which honestly seems like a pretty useful feature). So that > ReadyForQuery now returns the query time if the query_time protocol. Probaby it's better CommandComplete returns the query time because there could be multiple query-time in multi-statement query or extended query protocol. Best reagards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
On Wed, Oct 30, 2024 at 6:45 PM Tatsuo Ishii <ishii@postgresql.org> wrote: > doesn't work, right? I don't like clients need to know the exact order > of each protocol extensions. I agree with this criticism, at least for the most part. Years and years ago, the only way to specify EXPLAIN options was to say EXPLAIN [ANALYZE] [VERBOSE] query. So, if you said, EXPLAIN VERBOSE ANALYZE query, it didn't work. Actually, it still doesn't, but now you can say EXPLAIN (VERBOSE, ANALYZE) query and that will work, because the new options syntax allows for options to be specified in any order. And a really key point here is that for quite a while we were resistant to adding any new EXPLAIN options precisely because everyone knew the requirement to mention the options in a specific order did not scale. We could reasonably ask users to remember that ANALYZE had to come before VERBOSE, but asking people to remember the correct order of three or six or ten options would end up being quite annoying. And I think the problem here is the same. When you want to add the first set of optional fields to a protocol message, it seems perfectly reasonable to decide that _pq_.tde=1 or _pq_.wait_for_lsn=1 turns them on. When you add the second set of fields, it probably still feels reasonable. But when you get up to half a dozen or so protocol extensions that affect the same underlying set of messages, it's going to start to be pretty annoying. Parsing that protocol message is going to require pretty complicated code. Even if you don't care about the contents of the extra fields, you still potentially need code to understand and ignore them, unless you refuse support for the protocol extension altogether. Now, what makes this case less of a problem than the EXPLAIN case mentioned above is that people are not typically going to construct protocol messages by hand. As long as the protocol documentation is clear about the ordering of fields and which fields are controlled by which options, maybe it's not too horrible if everybody has to go through and write a bunch of if-statements. But even so, wouldn't it be easier if protocol extensions only added new message types instead of redefining existing ones? Then you could just ignore message types you don't care about. To be clear, I'm not saying that we should never, ever extend an existing message type. I'm just saying that the design of cramming a bunch of new fields into a message type doesn't seem entirely scalable, and therefore I believe we should consider whether there are reasonable alternatives. -- Robert Haas EDB: http://www.enterprisedb.com