Thread: Streaming replication for psycopg2

Streaming replication for psycopg2

From
"Shulgin, Oleksandr"
Date:
Hello,

I've submitted a patch to psycopg2 to support streaming replication protocol (COPY_BOTH): https://github.com/psycopg/psycopg2/pull/322

It would be great if more people had a chance to take a look and provide feedback about it.  In particular, please see example usage at this github comment[1]. Some bikeshedding is really needed here. :-)

Re: Streaming replication for psycopg2

From
"Shulgin, Oleksandr"
Date:
On Tue, Jun 2, 2015 at 2:23 PM, Shulgin, Oleksandr <oleksandr.shulgin@zalando.de> wrote:
>
> Hello,
>
> I've submitted a patch to psycopg2 to support streaming replication protocol (COPY_BOTH): https://github.com/psycopg/psycopg2/pull/322
>
> It would be great if more people had a chance to take a look and provide feedback about it.  In particular, please see example usage at this github comment[1]. Some bikeshedding is really needed here. :-)

I've been suggested that instead of putting the sync/stop methods into the replication message class like this

class ReplicationMessage(str):
    #wal_end
    #data_start
    #send_time
    #def commit(self):
    #def stop(self):
    ...

it would make more sense to put them into the cursor, like this:

class ReplicationCursor(...):

    def sync_server(self, msg):
        ...

    def stop_replication(self):
        ...

The client code will be able then to do this:

class LogicalWriter(object):

    def __init__(self, cursor):
        self.cursor = cursor

    def write(self, msg): # receives instance of ReplicationMessage
        if should_stop_replication():
            self.cursor.stop_replication()
            return

        self.actually_store_the_message(msg)

        if stored_reliably() and want_to_report_now():
            self.cursor.sync_server(msg)

        # return value not examined by caller


That seems like a more sane interface to me.

--
Alex

Re: Streaming replication for psycopg2

From
"Shulgin, Oleksandr"
Date:
On Thu, Jun 4, 2015 at 5:49 PM, Shulgin, Oleksandr <oleksandr.shulgin@zalando.de> wrote:
On Tue, Jun 2, 2015 at 2:23 PM, Shulgin, Oleksandr <oleksandr.shulgin@zalando.de> wrote:
>
> Hello,
>
> I've submitted a patch to psycopg2 to support streaming replication protocol (COPY_BOTH): https://github.com/psycopg/psycopg2/pull/322
>
> It would be great if more people had a chance to take a look and provide feedback about it.  In particular, please see example usage at this github comment[1]. Some bikeshedding is really needed here. :-)

Hello again,

I have updated the pull request above to address the feedback I've gathered from using this construct internally and from other sources.  Now included, the lower level asynchronous interface that gives the user more control, for the price of doing select() and keeping track of keepalive timeouts manually.

It would be nice if someone could review this.  The final look can be found by using this link (docs included): https://github.com/psycopg/psycopg2/pull/322/files

Thanks.
--
Alex

Re: [HACKERS] Streaming replication for psycopg2

From
Craig Ringer
Date:
On 30 June 2015 at 22:42, Shulgin, Oleksandr
<oleksandr.shulgin@zalando.de> wrote:
> On Thu, Jun 4, 2015 at 5:49 PM, Shulgin, Oleksandr
> <oleksandr.shulgin@zalando.de> wrote:
>>
>> On Tue, Jun 2, 2015 at 2:23 PM, Shulgin, Oleksandr
>> <oleksandr.shulgin@zalando.de> wrote:
>> >
>> > I've submitted a patch to psycopg2 to support streaming replication
>> > protocol (COPY_BOTH): https://github.com/psycopg/psycopg2/pull/322
>
> Hello again,
>
> I have updated the pull request above to address the feedback I've gathered
> from using this construct internally and from other sources.  Now included,
> the lower level asynchronous interface that gives the user more control, for
> the price of doing select() and keeping track of keepalive timeouts
> manually.
>
> It would be nice if someone could review this.  The final look can be found
> by using this link (docs included):
> https://github.com/psycopg/psycopg2/pull/322/files

Per the pull request's history, I've given this a fairly detailed
review. With a few minor changes I think it's really good and will be
a valuable addition to psycopg2.

Before merging I think the connection should preferably be split into
a proper subclass, though it might not be worth the verbosity involved
when working with the CPython API.

I suspect that the row-by-row COPY support should be pushed down to be
shared with copy_expert, too. There's so much logic that'd be shared:
the push (callback) or pull oriented read modes, the support for
reading raw bytes or decoded unicode text for rows, the result object,
etc.

Otherwise I think this is great, and it was a real pleasure to read a
patch where the first half is good documentation and examples.

Since pyscopg2 is libpq based, so it can't be treated as an
independent re-implementation of the protocol for testing purposes.
For that it'd probably be necessary to add replication protocol
support to PgJDBC. But it's still really handy for prototyping logical
decoding clients, testing output plugins, adapting logical decoding
output to feed into other systems, etc, and I can see this as
something that could be really handy for optional extra tests for
PostgreSQL its self - validating the replication protocol, etc.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services