Re: pglogical_output - a general purpose logical decoding output plugin - Mailing list pgsql-hackers
From | Craig Ringer
---|---
Subject | Re: pglogical_output - a general purpose logical decoding output plugin
Date |
Msg-id | CAMsr+YFq=GtG9L-Boz9C07P56TSY3VYABd6Kin1H0RFmnwBx9A@mail.gmail.com
In response to | pglogical_output - a general purpose logical decoding output plugin (Craig Ringer <craig@2ndquadrant.com>)
List | pgsql-hackers
On 2 November 2015 at 20:17, Craig Ringer <craig@2ndquadrant.com> wrote:
> Hi all
>
> I'd like to submit pglogical_output for inclusion in the 9.6 series as
> a contrib.

A few points are likely to come up in anything but the most cursory examination of the patch.

The README alludes to protocol docs that aren't in the tree. A followup will add them shortly; they just need a few tweaks.

There are pg_regress tests, but they're limited. The output plugin uses the binary output mode, and pg_regress doesn't play well with that at all. Timestamps, XIDs, LSNs, etc. are embedded in the output, and pglogical itself emits LSNs and timestamps in commit messages. Some things, like the startup message, are likely to contain variable data in future too. So we can't easily do a "dumb" comparison against expected output.

That's why the bulk of the tests in test/ are in Python, using psycopg2. Python and psycopg2 were chosen partly because of the excellent work done by Oleksandr Shulgin at Zalando (https://github.com/zalando/psycopg2/tree/feature/replication-protocol, https://github.com/psycopg/psycopg2/pull/322), which means we can connect to the walsender and consume the replication protocol rather than relying only on the SQL interfaces. Both are supported, and only the SQL interface is used by default. It also means the tests can have logic to validate the protocol stream, examining it message by message to ensure it's exactly what's expected. Rather than a diff where two lines of binary gibberish don't match, you get a specific error.

Of course, I'm aware that the buildfarm animals aren't required to have Python, let alone a patched psycopg2, so we can't rely on these as smoketests. That's why the pg_regress tests are there too.

There's another extension inside it, in contrib/pglogical_output/examples/hooks. I'm not sure whether this should be split out into a separate contrib/ module, since it's very tightly coupled to pglogical_output.
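To illustrate the message-by-message validation idea, here's a rough Python sketch. The framing below is invented for illustration (a 1-byte message type plus a length-prefixed payload); the real pglogical_output protocol and the psycopg2-based tests differ, but the point is the same: skip the variable fields and fail with a specific, message-level error instead of a binary diff.

```python
import struct

def parse_messages(buf):
    # Hypothetical framing: 1-byte type ('B' begin, 'I' insert,
    # 'C' commit), then a 4-byte big-endian payload length, then
    # the payload. Not the real pglogical_output wire format.
    msgs = []
    pos = 0
    while pos < len(buf):
        mtype = chr(buf[pos])
        (length,) = struct.unpack_from(">I", buf, pos + 1)
        msgs.append((mtype, buf[pos + 5:pos + 5 + length]))
        pos += 5 + length
    return msgs

def expect_stream(buf, expected_types):
    """Validate message by message, ignoring variable payload fields
    (timestamps, XIDs, LSNs), and report a specific error on mismatch."""
    msgs = parse_messages(buf)
    for i, ((mtype, _), want) in enumerate(zip(msgs, expected_types)):
        if mtype != want:
            raise AssertionError(
                "message %d: expected %r, got %r" % (i, want, mtype))
    if len(msgs) != len(expected_types):
        raise AssertionError("expected %d messages, got %d"
                             % (len(expected_types), len(msgs)))
    return msgs
```

A test then asserts on the message sequence, e.g. `expect_stream(stream, ["B", "I", "C"])`, and a mismatch points at the exact offending message rather than a pair of unreadable diff lines.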
The examples/hooks extension's purpose is to expose the hooks from pglogical_output to SQL, so that they can be implemented in plpgsql or whatever instead of having to be C functions. It's not integrated into pglogical_output proper because I see it mainly as a test and prototyping facility. It's necessary in order for the unit tests to cover filtering and hooks, but most practical users will never want or need it. So I'd rather not integrate it into pglogical_output proper.

pglogical_output has headers, and it installs them into Pg's include/ tree at install time. This is not something prior contribs have done, so there's no policy for it as yet. The reason for doing so is that the output plugin exposes a hooks API so that it can be reused by different clients with different needs, rather than being tightly coupled to just one downstream user. For example, it makes no assumptions about what replication origin names mean - keeping with the design of replication origins, which provide mechanism without policy. That means the client needs to tell the output plugin how to filter transactions if it wants to do selective replication on a node-by-node basis. Similarly, there's no built-in support for selective replication on a per-table basis, just a hook you can implement, so clients can provide their own policy for deciding which tables to replicate.

When we're calling hooks for each and every row, we really want a C function pointer so we can avoid going through the fmgr each time, and so we can pass a `struct Relation` and bypass the need for catalog lookups. That sort of thing.

Table metadata is currently sent for each row. It really needs to be sent once per consecutive series of rows for the same table, with some care taken to invalidate and re-send it when the table structure changes mid-series. That's a pending change.
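The intended metadata behaviour can be sketched roughly as follows. This is a Python sketch of the planned optimisation, not the plugin's C code; the names and the relid/version scheme are invented, with the version number standing in for relcache invalidation on DDL.

```python
class MetadataCache:
    """Track which table's metadata was last emitted, so metadata is
    sent once per consecutive run of rows for the same table and
    re-sent when the table's structure changes mid-series."""
    def __init__(self):
        self.last_relid = None
        self.last_version = None  # bumps on DDL / cache invalidation

    def maybe_emit_metadata(self, relid, version, out):
        if (relid, version) != (self.last_relid, self.last_version):
            out.append(("metadata", relid, version))
            self.last_relid, self.last_version = relid, version

def decode_rows(rows):
    """rows: (relid, schema_version, data) tuples in decoding order."""
    out = []
    cache = MetadataCache()
    for relid, version, data in rows:
        cache.maybe_emit_metadata(relid, version, out)
        out.append(("row", relid, data))
    return out
```

Feeding in two rows for table 1, one for table 2, then table 1 again yields three metadata messages instead of four per-row copies: one per run, plus a fresh one whenever the relid or schema version changes.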
It's important for efficiency, but it's a pretty isolated change and doesn't make the plugin less useful otherwise, so I thought it could wait.

Sending the whole old tuple is not yet supported, per the FIXME in pglogical_write_update. It should really be a TODO, since to support this we need a way to keep track of a table's replica identity while also WAL-logging the whole old tuple; (ab)using REPLICA IDENTITY FULL to log the old tuple means we lose information about what the real identity key is. So this is more of a wanted future feature, and I'll change it to a TODO.

I'd like to delay some ERROR messages until after the startup parameters are sent. That way the client can see more info about the server's configuration, version, capabilities, etc., and possibly reconnect with acceptable settings. Because a logical decoding plugin isn't allowed to generate output during its startup callback, though, this could mean indefinitely delaying an error until the upstream does some work that results in a decoding callback. So for now errors on protocol mismatches, etc., are raised immediately.

Text encoding names are compared byte-wise. They should be looked up in the catalogs and compared properly; that's just not done yet.

I think those are the main points.

--
Craig Ringer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services