Re: Optimize binary serialization format of arrays with fixed size elements - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: Optimize binary serialization format of arrays with fixed size elements |
Date | |
Msg-id | 20120123031657.GE15693@tornado.leadboat.com Whole thread Raw |
In response to | Re: Optimize binary serialization format of arrays with fixed size elements (Mikko Tiihonen <mikko.tiihonen@nitorcreations.com>) |
List | pgsql-hackers |
On Sun, Jan 22, 2012 at 11:47:06PM +0200, Mikko Tiihonen wrote: > On 01/20/2012 04:45 AM, Noah Misch wrote: >> On Thu, Jan 19, 2012 at 02:00:20PM -0500, Robert Haas wrote: >>> Having said that, if we're to follow the precedent set by >>> bytea_format, maybe we ought to just add >>> binary_array_format={huge,ittybitty} and be done with it, rather than >>> inventing a minor protocol version GUC for something that isn't really >>> a protocol version change at all. We could introduce a >>> differently-named general mechanism, but I guess I'm not seeing the >>> point of that either. Just because someone has a >>> backward-compatibility issue with one change of this type doesn't mean >>> they have a similar issue with all of them. So I think adding a >>> special-purpose GUC is more logical and more parallel to what we've >>> done in the past, and it doesn't saddle us with having to be certain >>> that we've designed the mechanism generally enough to handle all the >>> cases that may come later. >> >> That makes sense. An attraction of a single binary format version was avoiding >> the "Is this worth a GUC?" conversation for each change. However, adding a GUC >> should be no more notable than bumping a binary format version. > > I see the main difference between the GUC per feature vs minor version being that > in versioned changes old clients keep working because the have to explicitly > request a specific version. Whereas in separate GUC variables each feature will be > enabled by default and users have to either keep up with new client versions or > figure out how to explicitly disable the changes. No, we can decide that anew for each GUC. If you'd propose that array_output = full be the default, that works for me. > Here is a second version of the patch. The binary array encoding changes > stay the same but all code around was rewritten. > > Changes from previous versions based on received comments: > * removed the minor protocol version concept > * introduced a new GUC variable array_output copying the current > bytea_output type, with values "full" (old value) and > "smallfixed" (new default) How about the name "array_binary_output"? > * added documentation for the new GUC variable > * used constants for the array flags variable values > *** a/doc/src/sgml/config.sgml > --- b/doc/src/sgml/config.sgml > *************** COPY postgres_log FROM '/full/path/to/lo > *** 4888,4893 **** > --- 4888,4910 ---- > </listitem> > </varlistentry> > > + <varlistentry id="guc-array-output" xreflabel="array_output"> > + <term><varname>array_output</varname> (<type>enum</type>)</term> > + <indexterm> > + <primary><varname>array_output</> configuration parameter</primary> > + </indexterm> > + <listitem> > + <para> > + Sets the output format for binary encoding of values of > + type <type>array</type>. Valid values are > + <literal>smallfixed</literal> (the default) > + and <literal>full</literal> (the traditional PostgreSQL > + format). The <type>array</type> type always It's not "The array type" but "Array types", a class. > + accepts both formats on input, regardless of this setting. > + </para> > + </listitem> > + </varlistentry> The section "Array Input and Output Syntax" should reference this GUC. > *** a/src/backend/utils/misc/guc.c > --- b/src/backend/utils/misc/guc.c > *************** > *** 64,69 **** > --- 64,70 ---- > #include "storage/predicate.h" > #include "tcop/tcopprot.h" > #include "tsearch/ts_cache.h" > + #include "utils/array.h" > #include "utils/builtins.h" > #include "utils/bytea.h" > #include "utils/guc_tables.h" > *************** static const struct config_enum_entry by > *** 225,230 **** > --- 226,243 ---- > }; > > /* > + * Options for enum values defined in this module. > + * > + * NOTE! Option values may not contain double quotes! > + */ Don't replicate this comment. > + > + static const struct config_enum_entry array_output_options[] = { > + {"full", ARRAY_OUTPUT_FULL, false}, > + {"smallfixed", ARRAY_OUTPUT_SMALLFIXED, false}, > + {NULL, 0, false} > + }; > + > + /* > * We have different sets for client and server message level options because > * they sort slightly different (see "log" level) > */ > *************** static struct config_enum ConfigureNames > *** 3047,3052 **** > --- 3060,3075 ---- > NULL, NULL, NULL > }, > > + { > + {"array_output", PGC_USERSET, COMPAT_OPTIONS_PREVIOUS, You've put the GUC in COMPAT_OPTIONS_PREVIOUS, but you've put its documentation in the section for CLIENT_CONN_STATEMENT. I don't have a strong opinion on which one to use, but be consistent. > + gettext_noop("Sets the binary output format for arrays."), > + NULL > + }, > + &bytea_output, &array_output > + ARRAY_OUTPUT_SMALLFIXED, array_output_options, > + NULL, NULL, NULL > + }, > + > { > {"client_min_messages", PGC_USERSET, LOGGING_WHEN, > gettext_noop("Sets the message levels that are sent to the client."), Thanks, nm
pgsql-hackers by date: