Thread: pgsql: Generic Messages for Logical Decoding

pgsql: Generic Messages for Logical Decoding

From
Simon Riggs
Date:
Generic Messages for Logical Decoding

API and mechanism to allow generic messages to be inserted into WAL that are
intended to be read by logical decoding plugins. This commit adds an optional
new callback to the logical decoding API.

Messages are either text or bytea. Messages can be transactional, or not, and
are identified by a prefix to allow multiple concurrent decoding plugins.

(Not to be confused with Generic WAL records, which are intended to allow crash
recovery of extensible objects.)

Author: Petr Jelinek and Andres Freund
Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs
Discussion: 5685F999.6010202@2ndquadrant.com

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/3fe3511d05127cc024b221040db2eeb352e7d716

Modified Files
--------------
contrib/test_decoding/Makefile                  |   2 +-
contrib/test_decoding/expected/ddl.out          |  21 ++--
contrib/test_decoding/expected/messages.out     |  79 ++++++++++++++++
contrib/test_decoding/sql/ddl.sql               |   3 +-
contrib/test_decoding/sql/messages.sql          |  25 +++++
contrib/test_decoding/test_decoding.c           |  18 ++++
doc/src/sgml/func.sgml                          |  45 +++++++++
doc/src/sgml/logicaldecoding.sgml               |  38 ++++++++
src/backend/access/rmgrdesc/Makefile            |   6 +-
src/backend/access/rmgrdesc/logicalmsgdesc.c    |  41 ++++++++
src/backend/access/transam/rmgr.c               |   1 +
src/backend/replication/logical/Makefile        |   2 +-
src/backend/replication/logical/decode.c        |  46 +++++++++
src/backend/replication/logical/logical.c       |  38 ++++++++
src/backend/replication/logical/logicalfuncs.c  |  27 ++++++
src/backend/replication/logical/message.c       |  87 +++++++++++++++++
src/backend/replication/logical/reorderbuffer.c | 121 ++++++++++++++++++++++++
src/backend/replication/logical/snapbuild.c     |  19 ++++
src/bin/pg_xlogdump/.gitignore                  |  21 +---
src/bin/pg_xlogdump/rmgrdesc.c                  |   1 +
src/include/access/rmgrlist.h                   |   1 +
src/include/catalog/pg_proc.h                   |   4 +
src/include/replication/logicalfuncs.h          |   2 +
src/include/replication/message.h               |  41 ++++++++
src/include/replication/output_plugin.h         |  13 +++
src/include/replication/reorderbuffer.h         |  22 +++++
src/include/replication/snapbuild.h             |   2 +
27 files changed, 693 insertions(+), 33 deletions(-)


Re: pgsql: Generic Messages for Logical Decoding

From
Michael Paquier
Date:
On Wed, Apr 6, 2016 at 6:08 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Generic Messages for Logical Decoding
>
> API and mechanism to allow generic messages to be inserted into WAL that are
> intended to be read by logical decoding plugins. This commit adds an optional
> new callback to the logical decoding API.
>
> Messages are either text or bytea. Messages can be transactional, or not, and
> are identified by a prefix to allow multiple concurrent decoding plugins.
>
> (Not to be confused with Generic WAL records, which are intended to allow crash
> recovery of extensible objects.)

jacana says boom:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2016-04-06%2012%3A02%3A05

*** c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/contrib/test_decoding/expected/messages.out
Wed Apr  6 08:02:30 2016
---
c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/contrib/test_decoding/./regression_output/results/messages.out
Wed Apr  6 08:40:24 2016
***************
*** 54,75 ****

  COMMIT;
  SELECT 'žluťoučký kůň' FROM pg_logical_emit_message(true, 'test',
'žluťoučký kůň');
!    ?column?
! ---------------
!  žluťoučký kůň
! (1 row)
!
  SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL, NULL, 'force-binary', '0', 'skip-empty-xacts', '1');
                                   data
! ----------------------------------------------------------------------
   message: transactional: 1 prefix: test, sz: 4 content:msg1
   message: transactional: 0 prefix: test, sz: 4 content:msg2
   message: transactional: 0 prefix: test, sz: 4 content:msg4
   message: transactional: 0 prefix: test, sz: 4 content:msg6
   message: transactional: 1 prefix: test, sz: 4 content:msg5
   message: transactional: 1 prefix: test, sz: 4 content:msg7
!  message: transactional: 1 prefix: test, sz: 19 content:žluťoučký kůň
! (7 rows)

  SELECT 'init' FROM pg_drop_replication_slot('regression_slot');
   ?column?
--- 54,70 ----

  COMMIT;
  SELECT 'žluťoučký kůň' FROM pg_logical_emit_message(true, 'test',
'žluťoučký kůň');
! ERROR:  character with byte sequence 0xc5 0xa5 in encoding "UTF8"
has no equivalent in encoding "WIN1252"
  SELECT data FROM pg_logical_slot_get_changes('regression_slot',
NULL, NULL, 'force-binary', '0', 'skip-empty-xacts', '1');
                              data
! ------------------------------------------------------------
   message: transactional: 1 prefix: test, sz: 4 content:msg1
   message: transactional: 0 prefix: test, sz: 4 content:msg2
   message: transactional: 0 prefix: test, sz: 4 content:msg4
   message: transactional: 0 prefix: test, sz: 4 content:msg6
   message: transactional: 1 prefix: test, sz: 4 content:msg5
   message: transactional: 1 prefix: test, sz: 4 content:msg7
! (6 rows)

  SELECT 'init' FROM pg_drop_replication_slot('regression_slot');
   ?column?

I thought we always avoided non-ASCII characters in tests, no?
--
Michael


Re: pgsql: Generic Messages for Logical Decoding

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> Generic Messages for Logical Decoding

The buildfarm thinks this was a remarkably poor choice of random data
payload:

SELECT 'žluťoučký kůň' FROM pg_logical_emit_message(true, 'test', 'žluťoučký kůň');
    ?column?
 ---------------
  žluťoučký kůň

Since this test is not, so far as I can see, at all interested in
character encoding questions, I suggest not using non-ASCII data in it.

            regards, tom lane


Re: pgsql: Generic Messages for Logical Decoding

From
Simon Riggs
Date:
On 6 April 2016 at 14:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Simon Riggs <simon@2ndQuadrant.com> writes:
> Generic Messages for Logical Decoding

The buildfarm thinks this was a remarkably poor choice of random data
payload:

SELECT 'žluťoučký kůň' FROM pg_logical_emit_message(true, 'test', 'žluťoučký kůň');
    ?column?
 ---------------
  žluťoučký kůň

Since this test is not, so far as I can see, at all interested in
character encoding questions, I suggest not using non-ASCII data in it.

Yes, working on it now.

This was my bad, since I requested it be added.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pgsql: Generic Messages for Logical Decoding

From
Fujii Masao
Date:
On Wed, Apr 6, 2016 at 6:08 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Generic Messages for Logical Decoding
>
> API and mechanism to allow generic messages to be inserted into WAL that are
> intended to be read by logical decoding plugins. This commit adds an optional
> new callback to the logical decoding API.

When I specify very long text message, pg_xlogdump failed to dump
correctly the WAL file which should contain that inserted text message.
Isn't this a bug?

You can reproduce the problem by the following steps.

=# SELECT pg_xlogfile_name(pg_switch_xlog());
     pg_xlogfile_name
--------------------------
 000000010000000000000003
(1 row)

=# SELECT pg_xlogfile_name(pg_current_xlog_location());
     pg_xlogfile_name
--------------------------
 000000010000000000000003
(1 row)

=# SELECT pg_logical_emit_message(true, 'test',
repeat('0123456789ABCDEFG', 1024*1024));
 pg_logical_emit_message
-------------------------
 0/510CD40
(1 row)

=# SELECT pg_xlogfile_name(pg_current_xlog_location());
     pg_xlogfile_name
--------------------------
 000000010000000000000005
(1 row)

=# insert into t values(0,0);
INSERT 0 1
postgres=# SELECT pg_xlogfile_name(pg_current_xlog_location());
     pg_xlogfile_name
--------------------------
 000000010000000000000005
(1 row)

The WAL record of pg_logical_emit_message() should be stored in
000000010000000000000004 and 000000010000000000000005.
The WAL record of last insertion should be stored in
000000010000000000000005.
But the results of pg_xlogdump were wrong as follows.

$ pg_xlogdump data/pg_xlog/000000010000000000000004
rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn:
0/04000028, prev 0/030146F8, desc: RUNNING_XACTS nextXid 863
latestCompletedXid 862 oldestRunningXid 863

$ pg_xlogdump data/pg_xlog/000000010000000000000005
pg_xlogdump: FATAL:  could not find a valid record after 0/5000000

Regards,

--
Fujii Masao


Re: pgsql: Generic Messages for Logical Decoding

From
Andres Freund
Date:

On April 6, 2016 5:00:54 PM GMT+02:00, Fujii Masao <masao.fujii@gmail.com> wrote:
>On Wed, Apr 6, 2016 at 6:08 PM, Simon Riggs <simon@2ndquadrant.com>
>wrote:
>> Generic Messages for Logical Decoding
>>
>> API and mechanism to allow generic messages to be inserted into WAL
>that are
>> intended to be read by logical decoding plugins. This commit adds an
>optional
>> new callback to the logical decoding API.
>
>When I specify very long text message, pg_xlogdump failed to dump
>correctly the WAL file which should contain that inserted text message.
>Isn't this a bug?
>
>You can reproduce the problem by the following steps.
>
>=# SELECT pg_xlogfile_name(pg_switch_xlog());
>     pg_xlogfile_name
>--------------------------
> 000000010000000000000003
>(1 row)
>
>=# SELECT pg_xlogfile_name(pg_current_xlog_location());
>     pg_xlogfile_name
>--------------------------
> 000000010000000000000003
>(1 row)
>
>=# SELECT pg_logical_emit_message(true, 'test',
>repeat('0123456789ABCDEFG', 1024*1024));
> pg_logical_emit_message
>-------------------------
> 0/510CD40
>(1 row)
>
>=# SELECT pg_xlogfile_name(pg_current_xlog_location());
>     pg_xlogfile_name
>--------------------------
> 000000010000000000000005
>(1 row)
>
>=# insert into t values(0,0);
>INSERT 0 1
>postgres=# SELECT pg_xlogfile_name(pg_current_xlog_location());
>     pg_xlogfile_name
>--------------------------
> 000000010000000000000005
>(1 row)
>
>The WAL record of pg_logical_emit_message() should be stored in
>000000010000000000000004 and 000000010000000000000005.
>The WAL record of last insertion should be stored in
>000000010000000000000005.
>But the results of pg_xlogdump were wrong as follows.
>
>$ pg_xlogdump data/pg_xlog/000000010000000000000004
>rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn:
>0/04000028, prev 0/030146F8, desc: RUNNING_XACTS nextXid 863
>latestCompletedXid 862 oldestRunningXid 863
>
>$ pg_xlogdump data/pg_xlog/000000010000000000000005
>pg_xlogdump: FATAL:  could not find a valid record after 0/5000000

If you specify a file it only looks at records in that file. Try with -s.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: pgsql: Generic Messages for Logical Decoding

From
Fujii Masao
Date:
On Thu, Apr 7, 2016 at 12:06 AM, Andres Freund <andres@anarazel.de> wrote:
>
>
> On April 6, 2016 5:00:54 PM GMT+02:00, Fujii Masao <masao.fujii@gmail.com> wrote:
>>On Wed, Apr 6, 2016 at 6:08 PM, Simon Riggs <simon@2ndquadrant.com>
>>wrote:
>>> Generic Messages for Logical Decoding
>>>
>>> API and mechanism to allow generic messages to be inserted into WAL
>>that are
>>> intended to be read by logical decoding plugins. This commit adds an
>>optional
>>> new callback to the logical decoding API.
>>
>>When I specify very long text message, pg_xlogdump failed to dump
>>correctly the WAL file which should contain that inserted text message.
>>Isn't this a bug?
>>
>>You can reproduce the problem by the following steps.
>>
>>=# SELECT pg_xlogfile_name(pg_switch_xlog());
>>     pg_xlogfile_name
>>--------------------------
>> 000000010000000000000003
>>(1 row)
>>
>>=# SELECT pg_xlogfile_name(pg_current_xlog_location());
>>     pg_xlogfile_name
>>--------------------------
>> 000000010000000000000003
>>(1 row)
>>
>>=# SELECT pg_logical_emit_message(true, 'test',
>>repeat('0123456789ABCDEFG', 1024*1024));
>> pg_logical_emit_message
>>-------------------------
>> 0/510CD40
>>(1 row)
>>
>>=# SELECT pg_xlogfile_name(pg_current_xlog_location());
>>     pg_xlogfile_name
>>--------------------------
>> 000000010000000000000005
>>(1 row)
>>
>>=# insert into t values(0,0);
>>INSERT 0 1
>>postgres=# SELECT pg_xlogfile_name(pg_current_xlog_location());
>>     pg_xlogfile_name
>>--------------------------
>> 000000010000000000000005
>>(1 row)
>>
>>The WAL record of pg_logical_emit_message() should be stored in
>>000000010000000000000004 and 000000010000000000000005.
>>The WAL record of last insertion should be stored in
>>000000010000000000000005.
>>But the results of pg_xlogdump were wrong as follows.
>>
>>$ pg_xlogdump data/pg_xlog/000000010000000000000004
>>rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn:
>>0/04000028, prev 0/030146F8, desc: RUNNING_XACTS nextXid 863
>>latestCompletedXid 862 oldestRunningXid 863
>>
>>$ pg_xlogdump data/pg_xlog/000000010000000000000005
>>pg_xlogdump: FATAL:  could not find a valid record after 0/5000000
>
> If you specify a file it only looks at records in that file. Try with -s.

In my example, the WAL record of INSERT that I executed last should be in
000000010000000000000005. But pg_xlogdump could not display that.
The output of pg_xlogdump was:

$ pg_xlogdump data/pg_xlog/000000010000000000000005
pg_xlogdump: FATAL:  could not find a valid record after 0/5000000

ISTM that if a WAL file starts with the latter half of LOGICAL MESSAGE
WAL data, pg_xlogdump treats it as invalid and gives up dumping the
remaining WAL data in the file.

Regards,

--
Fujii Masao


Re: pgsql: Generic Messages for Logical Decoding

From
Andres Freund
Date:
On 2016-04-07 12:26:28 +0900, Fujii Masao wrote:
> In my example, the WAL record of INSERT that I executed last should be in
> 000000010000000000000005. But pg_xlogdump could not display that.
> The output of pg_xlogdump was:
>
> $ pg_xlogdump data/pg_xlog/000000010000000000000005
> pg_xlogdump: FATAL:  could not find a valid record after 0/5000000
>
> ISTM that if a WAL file starts with the latter half of LOGICAL MESSAGE
> WAL data, pg_xlogdump treats it as invalid and gives up dumping the
> remaining WAL data in the file.

That'd obviously be something to investigate. IIRC there's a thread
nearby about something like this. But just to confirm, if you use -s
over multiple records it works?

Greetings,

Andres Freund


Re: pgsql: Generic Messages for Logical Decoding

From
Fujii Masao
Date:
On Thu, Apr 7, 2016 at 1:47 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-04-07 12:26:28 +0900, Fujii Masao wrote:
>> In my example, the WAL record of INSERT that I executed last should be in
>> 000000010000000000000005. But pg_xlogdump could not display that.
>> The output of pg_xlogdump was:
>>
>> $ pg_xlogdump data/pg_xlog/000000010000000000000005
>> pg_xlogdump: FATAL:  could not find a valid record after 0/5000000
>>
>> ISTM that if a WAL file starts with the latter half of LOGICAL MESSAGE
>> WAL data, pg_xlogdump treats it as invalid and gives up dumping the
>> remaining WAL data in the file.
>
> That'd obviously be something to investigate. IIRC there's a thread
> nearby about something like this.

Okay, will check. Thanks for the info!

> But just to confirm, if you use -s
> over multiple records it works?

Yeah, it worked expected.

Regards,

--
Fujii Masao