Re: Binary support for pgoutput plugin - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: Binary support for pgoutput plugin
Date
Msg-id CB158AB2-DADE-43DE-B374-BD4519A03CF2@yesql.se
Whole thread Raw
In response to Re: Binary support for pgoutput plugin  (Dave Cramer <davecramer@gmail.com>)
Responses Re: Binary support for pgoutput plugin
List pgsql-hackers
> On 7 Jul 2020, at 02:16, Dave Cramer <davecramer@gmail.com> wrote:

> OK, rebased it down to 2 patches, attached.

I took a look at this patchset today.  The feature clearly seems like something
which we'd benefit from having, especially if it allows for the kind of
extensions that were discussed at the beginning of this thread.  In general I
think it's in pretty good shape, there are however a few comments:

The patch lacks any kind of test, which I think is required for it to be
considered for committing.  It also doesn't update the \dRs view in psql to
include the subbinary column which IMO it should.  I took the liberty to write
this as well as tests as I was playing with the patch, the attached 0003
contains this, while 0001 and 0002 are your patches included to ensure the
CFBot can do it's thing.  This was kind of thrown together to have something
while testing, so it definately need a once-over or two.

The comment here implies that unchanged is the default value for format, but
isn't this actually setting it to text formatted value?
+   /* default is unchanged */
+   tuple->format = palloc(natts * sizeof(char));
+   memset(tuple->format, 't', natts * sizeof(char));
Also, since the values member isn't memset() with a default, this seems a bit
misleading at best no?

For the rest of the backend we aren't including the defname in the errmsg like
what is done here.  Maybe we should, but I think that should be done
consistently if so, and we should stick to just "conflicting or redundant
options" for now.  At the very least, this needs a comma between "options" and
the defname and ideally the defname wrapped in \".
-                        errmsg("conflicting or redundant options")));
+                        errmsg("conflicting or redundant options %s already provided", defel->defname)));

These types of constructs are IMHO quite hard to read:
+         if(
+ #ifdef WORDS_BIGENDIAN
+             true
+ #else
+             false
+ #endif
+                 != bigendian)
How about spelling out the statement completely for both cases, or perhaps
encapsulating the logic in a macro? Something like the below perhaps?
+ #ifdef WORDS_BIGENDIAN
+         if (bigendian != true)
+ #else
+         if (bigendian != false)
+ #endif

This change needs to be removed before a commit, just highlighting that here to
avoid it going unnoticed.
-/* #define WAL_DEBUG */
+#define WAL_DEBUG

Reading this I'm left wondering if we shoulnd't introduce macros for the kinds,
since we now compare with 'u', 't' etc in quite a few places and add comments
explaining the types everywhere.  A descriptive name would make it easier to
grep for all occurrences, and avoid the need for the comment lines.  Thats not
necesarily for this patch though, but an observation from reading it.


I found a few smaller nitpicks as well, some of these might go away by a
pg_indent run but I've included them all here regardless:

This change, and the subsequent whitespace removal later in the same function,
seems a bit pointless:
-       /* read the relation id */
        relid = pq_getmsgint(in, 4);
-

Braces should go on the next line:
+       if (options->proto.logical.binary) {

This should be a C /* ...  */ comment, or perhaps just removed since the code
is quite self explanatory:
+   // default to false
+   *binary_basetypes = false;

Indentation here:
-                        errmsg("conflicting or redundant options")));
+                       errmsg("conflicting or redundant options %s already provided", defel->defname)));

..as well as here (there are a few like this one):
+                   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                   errmsg("incompatible datum size")));

Capitalization of "after" to make it a proper sentence:
+ * after we know that the subscriber is requesting binary check to make sure

Excessive whitespace and indentation in a few places, and not enough in some:
+                if (binary_given)
+                {
+                values[Anum_pg_subscription_subbinary - 1] =
...
+   if ( *binary_basetypes == true )
...
+       if (sizeof(int)  != int_size)
...
+       if( float4_byval !=
...
+         if (sizeof(long)  != long_size)
+                     ereport(ERROR,
...
+        if (tupleData->format[remoteattnum] =='u')
...
+   bool       binary_basetypes;

That's all for now.

cheers ./daniel


Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks
Next
From: Robert Haas
Date:
Subject: Re: Proposal: Automatic partition creation