Re: Optimize binary serialization format of arrays with fixed size elements - Mailing list pgsql-hackers

From Mikko Tiihonen
Subject Re: Optimize binary serialization format of arrays with fixed size elements
Date
Msg-id 4F1C83DA.6000906@nitorcreations.com
In response to Re: Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility  (Noah Misch <noah@leadboat.com>)
Responses Re: Optimize binary serialization format of arrays with fixed size elements
List pgsql-hackers
Previous title was: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

On 01/20/2012 04:45 AM, Noah Misch wrote:
> On Thu, Jan 19, 2012 at 02:00:20PM -0500, Robert Haas wrote:
>> On Thu, Jan 19, 2012 at 10:37 AM, Noah Misch<noah@leadboat.com>  wrote:
>>> I agree with Merlin; the frontend/backend protocol is logically distinct from
> >>> the binary send/recv formats of data types.  For one key point, the latter is
> >>> not exclusively core-defined; third-party extensions change their send/recv
> >>> formats on a different schedule.  They can add myext.binary_format_version
>>> GUCs of their own to cope in a similar way.
>>
>> I agree.  It occurs to me that we recently changed the default *text*
>> output format for bytea for reasons not dissimilar to those
>> contemplated here.  Presumably, that's a much more disruptive change,
>> and yet we've had minimal complaints because anyone who gets bitten
>> can easily set bytea_output='escape' and the problem goes away.  The
>> same thing seems like it would work here, only the number of people
>> needing to change the parameter will probably be even smaller, because
>> fewer people use binary than text.
>>
>> Having said that, if we're to follow the precedent set by
>> bytea_format, maybe we ought to just add
>> binary_array_format={huge,ittybitty} and be done with it, rather than
>> inventing a minor protocol version GUC for something that isn't really
>> a protocol version change at all.  We could introduce a
>> differently-named general mechanism, but I guess I'm not seeing the
>> point of that either.  Just because someone has a
>> backward-compatibility issue with one change of this type doesn't mean
>> they have a similar issue with all of them.  So I think adding a
>> special-purpose GUC is more logical and more parallel to what we've
>> done in the past, and it doesn't saddle us with having to be certain
>> that we've designed the mechanism generally enough to handle all the
>> cases that may come later.
>
> That makes sense.  An attraction of a single binary format version was avoiding
> the "Is this worth a GUC?" conversation for each change.  However, adding a GUC
> should be no more notable than bumping a binary format version.

I see the main difference between a GUC per feature and a minor version as
follows. With versioned changes, old clients keep working because a client has
to explicitly request a specific version to get the new behavior. With separate
GUC variables, each feature is enabled by default, and users have to either keep
up with new client versions or figure out how to explicitly disable the changes.

However, due to popular vote I removed the minor version proposal for now.


Here is a second version of the patch. The binary array encoding changes
stay the same, but all the surrounding code was rewritten.

Changes from the previous version, based on the comments received:
* removed the minor protocol version concept
* introduced a new GUC variable, array_output, modeled on the existing
   bytea_output, with values "full" (the old format) and
   "smallfixed" (the new default)
* added documentation for the new GUC variable
* used named constants for the values of the array flags variable
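
For readers without the attachment, here is a rough sketch of the kind of saving involved. It is not the patch's actual encoding. The "full" layout follows what array_send currently emits for a one-dimensional int4 array: a header of ndim, flags and element type OID, then size and lower bound per dimension, then a 4-byte length word before every element. The compact layout is an assumption made for illustration: it simply drops the per-element length words when the elements are fixed-size and there are no NULLs. The flag value FLAG_FIXED_NO_LENGTHS and both function names are hypothetical.

```python
import struct

INT4_OID = 23                # pg_type OID of int4
FLAG_HAS_NULL = 0x1          # existing meaning of the flags word
FLAG_FIXED_NO_LENGTHS = 0x2  # hypothetical flag, not taken from the patch


def encode_full(values):
    """Current binary layout for a non-empty 1-D int4 array without NULLs."""
    out = struct.pack("!iii", 1, 0, INT4_OID)   # ndim, flags, element type
    out += struct.pack("!ii", len(values), 1)   # dimension size, lower bound
    for v in values:
        out += struct.pack("!ii", 4, v)         # length word, then the value
    return out


def encode_compact(values):
    """Assumed compact layout: same header, no per-element length words."""
    out = struct.pack("!iii", 1, FLAG_FIXED_NO_LENGTHS, INT4_OID)
    out += struct.pack("!ii", len(values), 1)
    for v in values:
        out += struct.pack("!i", v)             # value only
    return out


if __name__ == "__main__":
    data = list(range(1000))
    print(len(encode_full(data)), len(encode_compact(data)))
```

Under these assumptions the header costs 20 bytes either way, and each int4 element shrinks from 8 bytes to 4, so the payload of a large array is roughly halved.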

-Mikko

Attachment
