JSON decoding plugin - Mailing list pgsql-hackers
From:    Euler Taveira
Subject: JSON decoding plugin
Msg-id:  52A5BFAE.1040209@timbira.com.br
List:    pgsql-hackers
Hi,

A few months ago, it was proposed [1] that it would be interesting to have a JSON output plugin for logical decoding. Here it is.

Each transaction is a JSON object that can contain xid (optional), timestamp (optional), and a change array. Each element of change is a decoded command and can contain: kind (I/U/D), schema (optional), table, columnnames, columntypes (optional), columnvalues, and oldkeys (only for U/D). columnnames, columntypes, and columnvalues are arrays. oldkeys is an object that contains the following arrays: keynames, keytypes (optional), and keyvalues. The JSON objects are serialized one after another if you are decoding a series of transactions. Here is an output example:

{
    "xid": 702,
    "change": [
        {
            "kind": "insert",
            "schema": "public",
            "table": "foo",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["int4", "int4", "text"],
            "columnvalues": [1, 2, "test"]
        },
        {
            "kind": "update",
            "schema": "public",
            "table": "foo",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["int4", "int4", "text"],
            "columnvalues": [1, 2, "test2"],
            "oldkeys": {
                "keynames": ["a", "b"],
                "keytypes": ["int4", "int4"],
                "keyvalues": [1, 2]
            }
        }
    ]
}
{
    "xid": 703,
    "change": [
        {
            "kind": "update",
            "schema": "public",
            "table": "foo",
            "columnnames": ["a", "b", "c"],
            "columntypes": ["int4", "int4", "text"],
            "columnvalues": [1, 3, "test2"],
            "oldkeys": {
                "keynames": ["a", "b"],
                "keytypes": ["int4", "int4"],
                "keyvalues": [1, 2]
            }
        }
    ]
}
{
    "xid": 704,
    "change": [
        {
            "kind": "delete",
            "schema": "public",
            "table": "foo",
            "oldkeys": {
                "keynames": ["a", "b"],
                "keytypes": ["int4", "int4"],
                "keyvalues": [1, 3]
            }
        }
    ]
}

Some data types were adapted to conform to the JSON spec. NaN and Infinity are not valid JSON symbols, so they are represented as null (as some JSON implementations do). Due to the simplicity of the JSON type system, I represent the vast majority of Postgres data types as strings (however, I admit that we could mimic the json datatype conversion rules). The oldkeys treatment follows what was defined by commit [2].
It uses: (i) the primary key (default behavior); (ii) a unique index (if REPLICA IDENTITY USING INDEX is defined for the table); (iii) the full tuple (if REPLICA IDENTITY FULL is defined for the table); (iv) nothing, which means an error (if REPLICA IDENTITY NOTHING is defined for the table).

TOAST columns get special treatment for UPDATEs. If a tuple that contains a TOAST field is updated, the TOAST field is included only if it was changed too; unchanged TOAST fields are omitted from the columns* arrays. This means less overhead while transmitting, processing, and applying changes.

By design, (i) the output plugin doesn't know about aborted transactions, and (ii) subtransactions are reordered into their toplevel transaction and only the committed pieces are passed to the plugin.

You can test it by firing the regression tests (e.g. 'make test') or by using the following steps:

postgresql.conf:

    wal_level = logical
    max_wal_senders = 2
    max_logical_slots = 2

start collecting WAL records:

    $ pg_recvlogical --slot=foo -d euler -f /dev/stdout --plugin=json_decoding_plugin --init

[execute some transactions]

start printing decoded transactions:

    $ pg_recvlogical --slot=foo -d euler -f /dev/stdout --start

stop collecting WAL records:

    $ pg_recvlogical --slot=foo -d euler -f /dev/stdout --stop

Comments?

[1] http://www.postgresql.org/message-id/20131015144848.GC8001@awork2.anarazel.de
[2] http://www.postgresql.org/message-id/E1VepyF-0005gU-Tv@gemulon.postgresql.org

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
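[Editor's illustration] To show how a downstream consumer might use the change format described above, here is a minimal sketch in Python that turns one decoded change object back into an SQL statement. The helper names (change_to_sql, quote_literal) and the simplistic quoting are assumptions for illustration only, not part of the plugin; a real consumer would need proper identifier and literal quoting.

```python
import json

def quote_literal(v):
    # Simplistic SQL literal quoting, for illustration only.
    if v is None:
        return "NULL"
    if isinstance(v, bool):
        return "TRUE" if v else "FALSE"
    if isinstance(v, (int, float)):
        return str(v)
    return "'" + str(v).replace("'", "''") + "'"

def change_to_sql(change):
    # Rebuild an SQL statement from one element of the "change" array.
    table = "{}.{}".format(change.get("schema", "public"), change["table"])
    kind = change["kind"]
    if kind == "insert":
        cols = ", ".join(change["columnnames"])
        vals = ", ".join(quote_literal(v) for v in change["columnvalues"])
        return "INSERT INTO {} ({}) VALUES ({});".format(table, cols, vals)
    # UPDATE and DELETE carry oldkeys to identify the affected row.
    old = change["oldkeys"]
    where = " AND ".join(
        "{} = {}".format(k, quote_literal(v))
        for k, v in zip(old["keynames"], old["keyvalues"]))
    if kind == "update":
        sets = ", ".join(
            "{} = {}".format(c, quote_literal(v))
            for c, v in zip(change["columnnames"], change["columnvalues"]))
        return "UPDATE {} SET {} WHERE {};".format(table, sets, where)
    if kind == "delete":
        return "DELETE FROM {} WHERE {};".format(table, where)
    raise ValueError("unknown change kind: " + kind)

# Feed it the xid 704 transaction from the example above.
tx = json.loads('''{ "xid": 704, "change": [
    { "kind": "delete", "schema": "public", "table": "foo",
      "oldkeys": { "keynames": ["a", "b"], "keyvalues": [1, 3] } } ] }''')
for ch in tx["change"]:
    print(change_to_sql(ch))
# -> DELETE FROM public.foo WHERE a = 1 AND b = 3;
```

Since each serialized transaction is a self-contained JSON object, a consumer only has to split the stream into objects and replay the change array in order.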