JSON decoding plugin - Mailing list pgsql-hackers
From:    Euler Taveira
Subject: JSON decoding plugin
Msg-id:  52A5BFAE.1040209@timbira.com.br
List:    pgsql-hackers
Hi,

A few months ago, it was proposed [1] that it would be interesting to have a JSON output plugin for logical decoding. Here it is.

Each transaction is a JSON object that can contain xid (optional), timestamp (optional), and a change array. Each element of change is a decoded command and can contain: kind (I/U/D), schema (optional), table, columnnames, columntypes (optional), columnvalues, and oldkeys (only for U/D). columnnames, columntypes, and columnvalues are arrays. oldkeys is an object that contains the following arrays: keynames, keytypes (optional), and keyvalues. The JSON objects are serialized one after another if you are decoding a series of transactions. Here is an output example:

{
    "xid": 702,
    "change": [
        {
            "kind": "insert",
            "schema": "public",
            "table": "foo",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["int4", "int4", "text"],
            "columnvalues": [1, 2, "test"]
        },
        {
            "kind": "update",
            "schema": "public",
            "table": "foo",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["int4", "int4", "text"],
            "columnvalues": [1, 2, "test2"],
            "oldkeys": {
                "keynames": ["a", "b"],
                "keytypes": ["int4", "int4"],
                "keyvalues": [1, 2]
            }
        }
    ]
}
{
    "xid": 703,
    "change": [
        {
            "kind": "update",
            "schema": "public",
            "table": "foo",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["int4", "int4", "text"],
            "columnvalues": [1, 3, "test2"],
            "oldkeys": {
                "keynames": ["a", "b"],
                "keytypes": ["int4", "int4"],
                "keyvalues": [1, 2]
            }
        }
    ]
}
{
    "xid": 704,
    "change": [
        {
            "kind": "delete",
            "schema": "public",
            "table": "foo",
            "oldkeys": {
                "keynames": ["a", "b"],
                "keytypes": ["int4", "int4"],
                "keyvalues": [1, 3]
            }
        }
    ]
}

Some data types were adapted to conform to the JSON spec. NaN and Infinity are not valid JSON symbols, so they are represented as null (as some JSON implementations do). Due to the simplicity of the JSON type system, I represent the vast majority of Postgres data types as strings (however, I admit that we could mimic the json datatype conversion rules). The oldkeys treatment follows what was defined by commit [2].
It uses: (i) the primary key (default behavior); (ii) a unique index (if REPLICA IDENTITY USING INDEX is defined for the table); (iii) the full tuple (if REPLICA IDENTITY FULL is defined for the table); (iv) nothing, which means an error (if REPLICA IDENTITY NOTHING is defined for the table).

TOAST columns get special treatment for UPDATEs. If a tuple that contains a TOAST field is updated, the TOAST field is included only if it was changed too; unchanged TOAST fields are omitted from the columns* arrays. This means less overhead while transmitting, processing, and applying changes.

By design, (i) the output plugin doesn't know about aborted transactions, and (ii) subtransactions are reordered into their toplevel transaction and only the committed pieces are passed to the plugin.

You can test it by firing the regression tests (e.g. 'make test') or by using the following steps:

postgresql.conf:

    wal_level = logical
    max_wal_senders = 2
    max_logical_slots = 2

start collecting WAL records:

    $ pg_recvlogical --slot=foo -d euler -f /dev/stdout --plugin=json_decoding_plugin --init

[execute some transactions]

start printing decoded transactions:

    $ pg_recvlogical --slot=foo -d euler -f /dev/stdout --start

stop collecting WAL records:

    $ pg_recvlogical --slot=foo -d euler -f /dev/stdout --stop

Comments?

[1] http://www.postgresql.org/message-id/20131015144848.GC8001@awork2.anarazel.de
[2] http://www.postgresql.org/message-id/E1VepyF-0005gU-Tv@gemulon.postgresql.org

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
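[Editor's illustration] To show how a downstream consumer might use the change format described above, here is a minimal sketch in Python that turns one decoded change object back into an SQL statement. The helper names (change_to_sql, quote_literal) and the simplistic quoting are assumptions for illustration only, not part of the plugin; a real consumer would need proper identifier and literal quoting.

```python
import json

def quote_literal(v):
    # Simplistic SQL literal quoting, for illustration only.
    if v is None:
        return "NULL"
    if isinstance(v, bool):
        return "TRUE" if v else "FALSE"
    if isinstance(v, (int, float)):
        return str(v)
    return "'" + str(v).replace("'", "''") + "'"

def change_to_sql(change):
    # Rebuild an SQL statement from one element of the "change" array.
    table = "{}.{}".format(change.get("schema", "public"), change["table"])
    kind = change["kind"]
    if kind == "insert":
        cols = ", ".join(change["columnnames"])
        vals = ", ".join(quote_literal(v) for v in change["columnvalues"])
        return "INSERT INTO {} ({}) VALUES ({});".format(table, cols, vals)
    # UPDATE and DELETE carry oldkeys to identify the affected row.
    old = change["oldkeys"]
    where = " AND ".join(
        "{} = {}".format(k, quote_literal(v))
        for k, v in zip(old["keynames"], old["keyvalues"]))
    if kind == "update":
        sets = ", ".join(
            "{} = {}".format(c, quote_literal(v))
            for c, v in zip(change["columnnames"], change["columnvalues"]))
        return "UPDATE {} SET {} WHERE {};".format(table, sets, where)
    if kind == "delete":
        return "DELETE FROM {} WHERE {};".format(table, where)
    raise ValueError("unknown change kind: " + kind)

# Feed it the xid 704 transaction from the example above.
tx = json.loads('''{ "xid": 704, "change": [
    { "kind": "delete", "schema": "public", "table": "foo",
      "oldkeys": { "keynames": ["a", "b"], "keyvalues": [1, 3] } } ] }''')
for ch in tx["change"]:
    print(change_to_sql(ch))
# -> DELETE FROM public.foo WHERE a = 1 AND b = 3;
```

Since each serialized transaction is a self-contained JSON object, a consumer only has to split the stream into objects and replay the change array in order.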