Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements
Msg-id CAHyXU0yO=uxLyABa2+xtDnOVd0dhAxG-auvJ2-mX-3w46RA_pA@mail.gmail.com
In response to Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements
List pgsql-hackers
On Tue, Jan 24, 2012 at 8:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jan 23, 2012 at 5:49 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> I'm not sure that you're getting anything with that user-facing
>> complexity.  The only realistic case I can see for explicit control of
>> wire formats chosen is to defend your application from format changes
>> in the server when upgrading the server and/or libpq.   This isn't a
>> "let's get better compression" problem, this is an "I upgraded my
>> database and my application broke" problem.
>>
>> Fixing this problem in a non-documentation fashion is going to require a
>> full protocol change, period.
>
> Our current protocol allocates a 2-byte integer for the purposes of
> specifying the type of each parameter, and another 2-byte integer for
> the purpose of specifying the result type... but only one bit is
> really needed at present: text or binary.  If we revise the protocol
> version at some point, we might want to use some of that bit space to
> allow some more fine-grained negotiation of the protocol version.  So,
> for example, we might define the top 5 bits as reserved (always pass
> zero), the next bit as a text/binary flag, and the remaining 10 bits
> as a 10-bit "format version number".  When a change like this comes
> along, we can bump the highest binary format version recognized by the
> server, and clients who request the new version can get it.
>
> Alternatively, we might conclude that a 2-byte integer for each
> parameter is overkill and try to cut back... but the point is there's
> a bunch of unused bitspace there now.  In theory we could even do
> something like this without bumping the protocol version since the
> documentation seems clear that any value other than 0 and 1 yields
> undefined behavior, but in practice that seems like it might be a bit
> too edgy.
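
For concreteness, the 16-bit layout described in the quoted text (top 5
bits reserved, one text/binary bit, 10 bits of format version) could be
sketched in C roughly as below.  This is only an illustration of the
proposal: the macro and function names are hypothetical and exist in
neither libpq nor the v3 protocol, which today only defines the values 0
(text) and 1 (binary).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of a 16-bit format code, per the quoted proposal:
 *   bits 15..11  reserved, must be zero
 *   bit  10      text/binary flag (1 = binary)
 *   bits  9..0   format version number
 * None of these names are real libpq or protocol definitions. */
#define FMT_RESERVED_MASK 0xF800u
#define FMT_BINARY_FLAG   0x0400u
#define FMT_VERSION_MASK  0x03FFu

/* Build a format code from a binary flag and a 10-bit version. */
static uint16_t fmt_pack(int binary, unsigned version)
{
    assert(version <= FMT_VERSION_MASK);
    return (uint16_t) ((binary ? FMT_BINARY_FLAG : 0u) | version);
}

/* A code is acceptable only if the reserved bits are all zero. */
static int fmt_is_valid(uint16_t code)
{
    return (code & FMT_RESERVED_MASK) == 0;
}

static int fmt_is_binary(uint16_t code)
{
    return (code & FMT_BINARY_FLAG) != 0;
}

static unsigned fmt_version(uint16_t code)
{
    return code & FMT_VERSION_MASK;
}
```

Under this sketch a client asking for binary format version 3 would send
fmt_pack(1, 3), and a server would reject any code for which
fmt_is_valid() returns 0.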

Yeah.  But again, this isn't a contract between libpq and the server,
but between the application and the server...unless you want libpq to
do format translation to something the application can understand (but
even then the application is still involved).  I'm not very
enthusiastic about encouraging libpq application authors to pass
format #defines for every single parameter and consumed datum to get
future proofing on wire formats.  So I'd vote against any format code
beyond the text/binary switch that currently exists (which, by the
way, while useful, is one of the great sins of libpq that we have to
deal with basically forever).  While wire formatting is granular down
to the type level, applications should not have to deal with that.
They should Just Work.  So who decides what format code to stuff into
the protocol?  Where are the codes defined?

I'm very much in the camp that at some point, presumably during connection
startup, the protocol accepts a non-#defined-in-libpq token (database
version?) from the application that describes to the server what wire
formats can be used and the server sends one back.  There probably have
to be some additional facilities for non-core types but let's put that
aside for the moment.  Those two tokens allow the server to pick the
highest supported wire format (text and binary!) that everybody
understands.  The server's token is useful if we're being fancy and we
want libpq to translate an older server's wire format to a newer one
for the application.  This of course means moving some of the type
system into the client, which is something we might not want to do
since among other things it puts a heavy burden on non-libpq driver
authors (but then again, they can always stay on the v3 protocol,
which can benefit from being frozen in terms of wire formats).
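
As a rough sketch of the startup exchange described above, assuming the
token is a plain integer such as a server version number: the
application states the newest wire format it understands, and the
server answers with the newest one both sides support.  The struct, the
function, and the idea that a server has an "oldest format it can still
emit" are all assumptions made for illustration; nothing like this
exists in the v3 protocol.

```c
#include <assert.h>

/* Hypothetical: the range of wire format versions a server can emit,
 * expressed as version-style integers (e.g. 90100 for 9.1). */
typedef struct
{
    int oldest;   /* oldest wire format the server can still produce */
    int newest;   /* newest wire format the server knows about */
} WireFormatRange;

/* Pick the highest wire format both sides understand.
 * client_token is the newest format the application was written for.
 * Returns the agreed format, or -1 if the application is older than
 * anything the server can still emit. */
static int negotiate_wire_format(int client_token, WireFormatRange server)
{
    assert(server.oldest <= server.newest);

    if (client_token < server.oldest)
        return -1;

    return client_token < server.newest ? client_token : server.newest;
}
```

With a server range of {80400, 90200}, an application sending 90100
would get 90100 back, while one sending 90300 would be told 90200 and
would have to cope with (or have libpq translate) the older format.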

merlin

