Re: [HACKERS] Logical Replication and Character encoding - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: [HACKERS] Logical Replication and Character encoding
Msg-id 20170201.120540.183393194.horiguchi.kyotaro@lab.ntt.co.jp
In response to [HACKERS] Logical Replication and Character encoding  ("Shinoda, Noriyoshi" <noriyoshi.shinoda@hpe.com>)
Responses Re: [HACKERS] Logical Replication and Character encoding  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Re: [HACKERS] Logical Replication and Character encoding  (Euler Taveira <euler@timbira.com.br>)
Re: [HACKERS] Logical Replication and Character encoding  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
Hello,

At Tue, 31 Jan 2017 12:46:18 +0000, "Shinoda, Noriyoshi" <noriyoshi.shinoda@hpe.com> wrote in
<AT5PR84MB0084FAE5976D89CDE9733093EE4A0@AT5PR84MB0084.NAMPRD84.PROD.OUTLOOK.COM>
>  I tried a committed Logical Replication environment. I found
>  that replication between databases of different encodings did
>  not convert encodings in character type columns. Is this
>  behavior correct?

The output plugin for subscriptions is pgoutput. It currently
doesn't consider encoding, but conversion could easily be added
if the desired encoding were communicated to it.

The easiest (though somewhat fragile) way I can think of is:

- The subscriber connects with a client_encoding specification, and
  the output plugin pgoutput decides whether it accepts the encoding
  or not. If the subscriber doesn't specify one, pgoutput sends data
  without conversion.

The attached small patch does this and works with the following
CREATE SUBSCRIPTION.

CREATE SUBSCRIPTION sub1 CONNECTION 'host=/tmp port=5432 dbname=postgres client_encoding=EUC_JP' PUBLICATION pub1;
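
As an illustration of the rule the patch applies (convert only when the
two encodings differ, and compute the length after conversion), here is
a minimal standalone sketch. It is not PostgreSQL code: the names
SketchEncoding, latin1_to_utf8 and write_text_column are hypothetical,
and Latin-1 to UTF-8 stands in for the EUC_JP case only because that
conversion is simple enough to write by hand. The framing ('t', 4-byte
length, NUL-terminated data) follows what the diff shows for
logicalrep_write_tuple.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding ids for this sketch; not PostgreSQL's own values. */
typedef enum { SKETCH_LATIN1, SKETCH_UTF8 } SketchEncoding;

/*
 * Convert a NUL-terminated Latin-1 string to UTF-8.
 * Returns a malloc'd string (caller frees), or NULL on allocation failure.
 */
static char *
latin1_to_utf8(const char *src)
{
    size_t n = strlen(src);
    char  *dst = malloc(2 * n + 1);    /* each byte becomes at most 2 bytes */
    char  *p = dst;

    if (dst == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
    {
        unsigned char c = (unsigned char) src[i];

        if (c < 0x80)
            *p++ = (char) c;                      /* ASCII is unchanged */
        else
        {
            *p++ = (char) (0xC0 | (c >> 6));      /* lead byte */
            *p++ = (char) (0x80 | (c & 0x3F));    /* continuation byte */
        }
    }
    *p = '\0';
    return dst;
}

/*
 * Frame one text column: 't', 4-byte big-endian length, then the data
 * including its terminating NUL. The value is converted first when the
 * encodings differ. This sketch only knows Latin-1 -> UTF-8; any other
 * pair is sent unconverted.
 * Returns the number of bytes written, or 0 on failure.
 */
static size_t
write_text_column(unsigned char *buf, size_t bufsize, const char *value,
                  SketchEncoding server_enc, SketchEncoding client_enc)
{
    char       *converted = NULL;
    const char *out = value;
    uint32_t    len;
    size_t      total;

    if (server_enc == SKETCH_LATIN1 && client_enc == SKETCH_UTF8)
    {
        converted = latin1_to_utf8(value);
        if (converted == NULL)
            return 0;
        out = converted;
    }

    /* The length must be taken after conversion: the byte count can change. */
    len = (uint32_t) strlen(out) + 1;
    total = 1 + 4 + (size_t) len;
    if (total > bufsize)
    {
        free(converted);
        return 0;
    }

    buf[0] = 't';
    buf[1] = (unsigned char) (len >> 24);
    buf[2] = (unsigned char) (len >> 16);
    buf[3] = (unsigned char) (len >> 8);
    buf[4] = (unsigned char) len;
    memcpy(buf + 5, out, len);

    free(converted);
    return total;
}
```

With a Latin-1 server value of "caf\xE9", a UTF-8 client receives a
length field of 6 (five data bytes plus the NUL), while a Latin-1
client receives 5. Computing the length before conversion would
truncate the converted string.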


Alternatively, we could negotiate the encoding explicitly, for
example in CREATE_REPLICATION_SLOT:
'CREATE_REPLICATION_SLOT sub1 LOGICAL pgoutput ENCODING EUC_JP'

Or the output plugin could take options:
'CREATE_REPLICATION_SLOT sub1 LOGICAL pgoutput OPTIONS(encoding EUC_JP)'


Any opinions?

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 1f30de6..6a235d7 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -16,6 +16,7 @@
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_type.h"
 #include "libpq/pqformat.h"
+#include "mb/pg_wchar.h"
 #include "replication/logicalproto.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
@@ -442,6 +443,9 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple)
         pq_sendbyte(out, 't');  /* 'text' data follows */
         outputstr = OidOutputFunctionCall(typclass->typoutput, values[i]);
 
+        if (pg_get_client_encoding() != GetDatabaseEncoding())
+            outputstr = pg_server_to_client(outputstr, strlen(outputstr));
+
         len = strlen(outputstr) + 1;    /* null terminated */
         pq_sendint(out, len, 4);        /* length */
         appendBinaryStringInfo(out, outputstr, len); /* data */
