Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements |
Date | |
Msg-id | CA+TgmoYmM1wgN4Qpmh4qeBCuC68OHFnVUBjfFx7erUwXZCSiqg@mail.gmail.com |
In response to | Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements (Merlin Moncure <mmoncure@gmail.com>) |
Responses | Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements |
List | pgsql-hackers |
On Tue, Jan 24, 2012 at 11:16 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> Our current protocol allocates a 2-byte integer for the purposes of
>> specifying the type of each parameter, and another 2-byte integer for
>> the purpose of specifying the result type... but only one bit is
>> really needed at present: text or binary. If we revise the protocol
>> version at some point, we might want to use some of that bit space to
>> allow some more fine-grained negotiation of the protocol version. So,
>> for example, we might define the top 5 bits as reserved (always pass
>> zero), the next bit as a text/binary flag, and the remaining 10 bits
>> as a 10-bit "format version number". When a change like this comes
>> along, we can bump the highest binary format version recognized by
>> the server, and clients who request the new version can get it.
>>
>> Alternatively, we might conclude that a 2-byte integer for each
>> parameter is overkill and try to cut back... but the point is there's
>> a bunch of unused bit space there now. In theory we could even do
>> something like this without bumping the protocol version, since the
>> documentation seems clear that any value other than 0 and 1 yields
>> undefined behavior, but in practice that seems like it might be a bit
>> too edgy.
>
> Yeah. But again, this isn't a contract between libpq and the server,
> but between the application and the server...

I don't see how this is relevant. The text/binary format flag is there in both libpq and the underlying protocol.

> So I'd vote against any format code beyond the text/binary switch
> that currently exists (which, by the way, while useful, is one of the
> great sins of libpq that we have to deal with basically forever).
> While wire formatting is granular down to the type level,
> applications should not have to deal with that. They should Just
> Work. So who decides what format code to stuff into the protocol?
> Where are the codes defined?
>
> I'm very much in the camp that sometime, presumably during connection
> startup, the protocol accepts a non-#defined-in-libpq token (database
> version?) from the application that describes to the server what wire
> formats can be used and the server sends one back. There probably has
> to be some additional facilities for non-core types, but let's put
> that aside for the moment. Those two tokens allow the server to pick
> the highest supported wire format (text and binary!) that everybody
> understands. The server's token is useful if we're being fancy and we
> want libpq to translate an older server's wire format to a newer one
> for the application. This of course means moving some of the type
> system into the client, which is something we might not want to do
> since among other things it puts a heavy burden on non-libpq driver
> authors (but then again, they can always stay on the v3 protocol,
> which can benefit from being frozen in terms of wire formats).

I think it's sensible for the server to advertise a version to the client, but I don't see how you can dismiss add-on types so blithely. The format used to represent any given type is logically a property of that type, and only for built-in types is that associated with the server version.
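To make the bit split from the quoted text above a little more concrete, here is roughly what that layout could look like in code. To be clear, none of these names or masks exist in libpq or the backend today; they're invented purely for illustration:

#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout for the existing 2-byte per-column format code:
 *
 *   bits 15-11   reserved, always sent as zero
 *   bit  10      text (0) / binary (1) flag
 *   bits 9-0     format version number
 */
#define FORMAT_RESERVED_MASK   0xF800
#define FORMAT_BINARY_FLAG     0x0400
#define FORMAT_VERSION_MASK    0x03FF

static inline uint16_t
make_format_code(bool binary, uint16_t version)
{
    /* Reserved bits stay zero; the version is truncated to 10 bits. */
    return (uint16_t) ((binary ? FORMAT_BINARY_FLAG : 0) |
                       (version & FORMAT_VERSION_MASK));
}

static inline bool
format_code_is_binary(uint16_t code)
{
    return (code & FORMAT_BINARY_FLAG) != 0;
}

static inline uint16_t
format_code_version(uint16_t code)
{
    return (uint16_t) (code & FORMAT_VERSION_MASK);
}

Under a scheme like that, a client that understood, say, binary format version 2 would pass make_format_code(true, 2) for the columns it cares about, and the server would honor it or fall back to the highest version it recognizes. Whether the existing codes 0 and 1 could be squeezed into such a scheme without a protocol version bump is exactly the "too edgy" question above.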
I do wonder whether we are making a mountain out of a mole-hill here, though. If I properly understand the proposal on the table - and it's possible that I don't - the new format is self-identifying: when the optimization is in use, it sets a bit that previously would always have been clear.

So if we just go ahead and change this, clients that have been updated to understand the new format will work just fine. The server uses the proposed optimization only for arrays that meet certain criteria, so any properly updated client must still be able to handle the case where that bit isn't set. On the flip side, clients that aren't expecting the new optimization might break. But that's, again, no different than what happened when we changed the default bytea output format. If you get bit, you either update your client or shut off the optimization and deal with the performance consequences of so doing. In fact, the cases are almost perfectly analogous, because in each case the proposal was based on the size of the output format being larger than necessary, and wanting to squeeze it down to a smaller size for compactness.

And more generally, does anyone really expect that we're never going to change the output format of any type we support ever again, without retaining infinite backward compatibility? I didn't hear any screams of outrage when we updated the hyphenation rules for contrib/isbn - well, ok, there were some howls, but that was because the rules were still incomplete and US-centric, not so much because people thought it was unacceptable for the hyphenation rules to be different in major release N+1 than they were in major release N. If the IETF goes and defines a new standard for formatting IPv6 addresses, we're likely to eventually support it via the inet and cidr datatypes. The only things that seem reasonably immune to future changes are text and numeric, but even with numeric it's not impossible that the maximum available precision or scale could eventually be different than what it is now.

I think it's unrealistic to suppose that new major releases won't ever require drivers or applications to make any updates. My first experience with this was an application that got broken by the addition of attisdropped, and sure, I spent a day cursing, but would I be happier if PostgreSQL didn't support dropping columns? No, not really.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company