Re: Re: pglogical_output - a general purpose logical decoding output plugin - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Re: pglogical_output - a general purpose logical decoding output plugin
Date
Msg-id CAMsr+YHZH_rApcTZ36G27agS6igyYdW_aXw9PPLbh2JCEhGq9A@mail.gmail.com
Whole thread Raw
In response to Re: pglogical_output - a general purpose logical decoding output plugin  (Tomasz Rybak <tomasz.rybak@post.pl>)
Responses Re: Re: pglogical_output - a general purpose logical decoding output plugin  (Tomasz Rybak <tomasz.rybak@post.pl>)
List pgsql-hackers
On 22 January 2016 at 06:13, Tomasz Rybak <tomasz.rybak@post.pl> wrote:
 
+ data stream. The output stream is designed to be compact and fast to decode,
+ and the plugin supports upstream filtering of data so that only the required
+ information is sent.

plugin supports upstream filtering of data through hooks so that ...


Ok.
 
+ subset of that database may be selected for replication, currently based on
+ table and on replication origin. Filtering by a WHERE clause can be supported
+ easily in future.

Is this filtering by table and replication origin implemented? I haven't
noticed it in source.

That's what the hooks are for.
 
+ other daemon is required. It's accumulated using

Stream of changes is accumulated...

Ok.
 
+ [the `CREATE_REPLICATION_SLOT ... LOGICAL ...` or `START_REPLICATION SLOT ... LOGICAL ...` commands](http://www.postgresql.org/docs/current/static/logicaldecoding-walsender.html) to start streaming changes. (It can also be used via
+ [SQL level functions](http://www.postgresql.org/docs/current/static/logicaldecoding-sql.html)
+ over a non-replication connection, but this is mainly for debugging purposes)

Replication slot can also be configured (causing output plugin to be loaded) via [SQL level functions]...

Covered in the next section. Or at least it is in the SGML docs conversion I'm still trying to finish off..
 
+ * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'` if it's setting up for the first time

* Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'` to setup replication if it's connecting for the first time

I disagree. It's entirely possible to do your slot creation/setup manually or via something else, re-use a slot first created by another node, etc. Slot creation is part of client setup, not so much connection.
 
+ Details are in the replication protocol docs.

Add link to file with protocol documentation.

Is done in SGML conversion.
 
+ If your application creates its own slots on first use and hasn't previously
+ connected to this database on this system you'll need to create a replication
+ slot. This keeps track of the client's replay state even while it's disconnected.

If your application hasn't previously connected to this database on this system
it'll need to create and configure replication slot which keeps track of the
client's replay state even while it's disconnected.

As above, I don't quite agree.
 
+ `pglogical`'s output plugin now sends a continuous series of `CopyData`

As this is separate plugin, use 'pglogical_output' plugin now sends...
(not only here but also in few other places).

Thanks. Done.

+ All hooks are called in their own memory context, which lasts for the duration

All hooks are called in separate hook memory context, which lasts for the duration...

I don't see the difference, but OK.

Simplified:

> All hooks are called in a memory context that lasts ...
 
+ + switched to a longer lived memory context like TopMemoryContext. Memory allocated
+ + in the hook context will be automatically when the decoding session shuts down.

...will be automatically freed when the decoding...

Fixed, thanks.
 
DDL for global object changes must be synchronized via some external means.

Just:
Global object changes must be synchronized via some external means.

Agree, done.
 
+ determine why an error occurs in a downstream, since you can examine a
+ json-ified representation of the xact. It's necessary to supply a minimal

since you can examine a transaction in json (and not binary) format. It's necessary

Ok, done.
 
+ Once you've peeked the stream and know the LSN you want to discard up to, you
+ can use `pg_logical_slot_peek_changes`, specifying an `upto_lsn`, to consume

Shouldn't it be pg_logical_slot_get_changes? get_changes consumes changes,
peek_changes leaves them in the stream. Especially as example below
points that we need to use get_changes.

Yes. It should. Whoops. Thanks.
 
DESIGN.md:

+ attnos don't necessarily correspond. The column names might, and their ordering
+ might even be the same, but any column drop or column type change will result

The column names and their ordering might even be the same...

I disagree, that has a different meaning. It's also not really user-facing docs so I'm not too worried about being quite as readable.

README.pglogical_output_plhooks:

+ Note that pglogical
+ must already be installed so that its headers can be found.

Note that pglogical_output must already...

Thanks.
 
+ Arguments are the oid of the affected relation, and the change type: 'I'nsert,
+ 'U'pdate or 'D'elete. There is no way to access the change data - columns changed,
+ new values, etc.

Is it true (no way to access change data)? You added passing change
to C hooks; from looking at code it looks like it's true, but I want to be sure.

While the change data is now passed to the C hook, there's no attempt to expose it via PL/PgSQL. So yeah, that's still true.

Thanks again for this. Sorry it's taken so long to get the SGML docs converted. Too many other things on. 


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: log_checkpoint's "0 transaction log file(s) added" is extremely misleading
Next
From: Tom Lane
Date:
Subject: Re: Extracting fields from 'infinity'::TIMESTAMP[TZ]