Re: Logical Replication WIP - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: Logical Replication WIP
Date
Msg-id 18aaf2c9-febe-09bd-84d0-5e215810bb7f@2ndquadrant.com
Whole thread Raw
In response to Re: Logical Replication WIP  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: Logical Replication WIP  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On 09/09/16 06:33, Peter Eisentraut wrote:
> Review of 0003-Define-logical-replication-protocol-and-output-plugi.patch:
>
> (This is still based on the Aug 31 patch set, but at quick glance I
> didn't see any significant changes in the Sep 8 set.)
>

Yep.

> The start_replication option pg_version option is not documented and
> not used in any later patch.  We can probably do without it and just
> rely on the protocol version.
>

That's leftover from binary type data transfer which is not part of this 
initial approach as it adds a lot of complications to both protocol and 
apply side. So yes can do without.

> In pgoutput_startup(), you check opt->output_type.  But it is not set
> anywhere.  Actually, the startup callback is supposed to set it
> itself.

Leftover from pglogical which actually supports both output types.

> In init_rel_sync_cache(), the way hash_flags is set seems kind of
> weird.  I think that variable could be removed and the flags put
> directly into the hash_create() call.
>

Eh, yes no idea how that came to be.

> pgoutput_config.c seems over-engineered, e.g., converting cstring to
> Datum and back.  Just do normal DefElem list parsing in pgoutput.c.
> That's not pretty either, but at least it's a common coding pattern.
>

Yes now that we have only couple of options I agree.

> In the protocol documentation, explain the meaning of int64 as a
> commit timestamp.
>

You mean that it's milliseconds since postgres epoch?

> On the actual protocol messages:
>
> Why do strings have a length byte?  That is not how other strings in
> the protocol work.  As a minor side-effect, this would limit for
> example column names to 255 characters.

Because I originally sent them without the null termination but I guess 
they don't really need it anymore. (the 255 char limit is not really 
important in practice given the column length is limited to 64 
characters anyway)

>
> The message structure doesn't match the documentation in some ways.
> For example Attributes and TupleData are not separate messages but are
> contained in Relation and Insert/Update/Delete messages.  So the
> documentation needs to be structured a bit differently.
>
> In the Attributes message (or actually Relation message), we don't
> need the 'A' and 'C' bytes.
>

Hmm okay will look into it. I guess if we remove the 'A' then rest of 
the Attribute message neatly merges into the Relation message. The more 
interesting part will be the TupleData as it's common part of other 
messages.

> I'm not sure that pgoutput should concern itself with the client
> encoding.  The client encoding should already be set by the initial
> FE/BE protocol handshake.  I haven't checked that further yet, so it
> might already work, or it should be made to work that way, or I might
> be way off.

Yes, I think you are right, that was there mostly for same reason as the 
pg_version.

>
> Slight abuse of pqformat functions.  We're not composing messages
> using pq_beginmessage()/pq_endmessage(), and we're not using
> pq_getmsgend() when reading.  The "proper" way to do this is probably
> to define a custom set of PQcommMethods.  (low priority)
>

If we change that, I'd probably rather go with direct use of StringInfo 
functions.

--   Petr Jelinek                  http://www.2ndQuadrant.com/  PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Mark Dilger
Date:
Subject: inappropriate use of NameGetDatum macro
Next
From: Andres Freund
Date:
Subject: Re: Logical Replication WIP