Re: Timeline following for logical slots - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Timeline following for logical slots
Msg-id CAMsr+YHBm3mUtXb2_RD=QsnUpdT0dR8K-+GTbBgpRdYuZFmXtw@mail.gmail.com
In response to Re: Timeline following for logical slots  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: Timeline following for logical slots
List pgsql-hackers
On 29 April 2016 at 15:40, Craig Ringer <craig@2ndquadrant.com> wrote:
 
I don't think pg_recvlogical can do anything about the need for that dummy write, since the client has no way to determine the exact LSN of the commit record of the xact of interest. It can't rely on pg_current_xlog_insert_location() or pg_current_xlog_location() since autovacuum or a checkpoint might've written xlog since. Logical streaming replication doesn't have a non-blocking mode where it returns immediately if it'd have to wait for more xlog so we can't just send off the most recent server LSN as the endpoint.

(Patch attached. Blah blah explanation blah):

With this patch pg_recvlogical takes a new --endpos LSN argument, and will exit if either:

* it receives an XLogData message with dataStart >= endpos; or
* it receives a keepalive with walEnd >= endpos

The latter allows it to work without needing a dummy transaction to make it see a data message after endpos. If there's nothing to read on the socket until a keepalive arrives, we know that the server has nothing to send us, and if walEnd has passed endpos we know nothing more can have committed before endpos.
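To make the two exit conditions concrete, here is a minimal sketch in Python. It is not the patch itself (which is C inside pg_recvlogical); the XLogData and Keepalive classes and the should_exit() helper are hypothetical names used only for illustration. The one thing taken from PostgreSQL is the textual LSN form, two hexadecimal numbers separated by a slash (high 32 bits / low 32 bits).

```python
from dataclasses import dataclass


def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/16B3748' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


@dataclass
class XLogData:
    """Hypothetical stand-in for an XLogData message."""
    data_start: int  # dataStart: WAL position where this message's data begins


@dataclass
class Keepalive:
    """Hypothetical stand-in for a walsender keepalive message."""
    wal_end: int  # walEnd: the server's current end of WAL


def should_exit(message, endpos: int) -> bool:
    """Return True if the receiver should stop without writing this message."""
    if isinstance(message, XLogData):
        # A data message starting at or after endpos is not written out.
        return message.data_start >= endpos
    if isinstance(message, Keepalive):
        # Nothing was pending on the socket and the server is past endpos.
        return message.wal_end >= endpos
    return False
```

Comparing LSNs as plain integers keeps both checks to a single comparison each, which is the "just test endpos and exit" shape described in this mail.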


The way I've written things, endpos is the point where we stop receiving and exit, so if a record with start LSN >= endpos is received we'll exit without writing it.

I thought about writing out the record before exiting if the record start LSN is exactly endpos. That'd be handy in cases where the client knows a commit's LSN and wants everything up to that commit. But it's easy enough in this case for the client to set endpos to the commit start LSN + 1, so it's not like the current behaviour stops you doing anything, and it means the code can just test endpos and exit. pg_current_xlog_insert_location() will return at least the LSN of the last commit + 1, so you'll get the expected behaviour for free there. It does mean we might wait for the next walsender keepalive or status update before we exit, though, so if someone feels strongly that endpos should be an inclusive bound I can do that. It's just a bit uglier in the code.
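For a client that knows a commit's start LSN and wants that commit included, the "+ 1" adjustment could be done along these lines. This is only a sketch: endpos_including() is a hypothetical helper name, and the code assumes the usual textual LSN form of two hexadecimal numbers separated by a slash.

```python
def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/16B3748' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back to the textual 'high/low' hex form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def endpos_including(commit_lsn: str) -> str:
    """Return an exclusive endpos that still lets the record at commit_lsn through.

    Because endpos is an exclusive bound, a record whose start LSN equals
    commit_lsn is written only if endpos is strictly greater than it.
    """
    return format_lsn(parse_lsn(commit_lsn) + 1)
```

For example, endpos_including("0/16B3748") yields "0/16B3749", and the carry into the high half works as expected: endpos_including("0/FFFFFFFF") yields "1/0".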

I can't add a "number of xacts" filter like the SQL interface has because pg_recvlogical has no idea which records represent a commit, so it's not possible without changing the protocol. I'm not convinced a "number of messages" filter is particularly useful. I could add a timeout, but it's easy enough to do that in a wrapper (like IPC::Run). So I'm sticking with just the LSN filter for now.
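A timeout wrapper of the kind mentioned above might look roughly like this in Python (the mail's own example is Perl's IPC::Run; Python is used here only for illustration). The helper names build_recvlogical_cmd() and run_with_timeout() are hypothetical, and --endpos is the option proposed by this patch, so it is only available on a build that includes it.

```python
import subprocess


def build_recvlogical_cmd(dbname: str, slot: str, endpos: str) -> list:
    """Build a pg_recvlogical command line that streams to stdout.

    Assumption: --endpos is the option added by the patch in this mail.
    """
    return [
        "pg_recvlogical",
        "-d", dbname,
        "--slot", slot,
        "--start",
        "--endpos", endpos,
        "-f", "-",  # write decoded output to stdout
    ]


def run_with_timeout(cmd: list, timeout_s: float):
    """Run cmd and return its stdout, or None if it exceeds timeout_s seconds."""
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills and reaps the child before re-raising.
        return None
    return done.stdout
```

A test could then call run_with_timeout(build_recvlogical_cmd("postgres", "test_slot", "0/16B3749"), 30) and treat None as a failure; the database name, slot name, and LSN here are placeholder values.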

Also, because pg_recvlogical isn't aware of transaction boundaries, setting endpos might result in a partial transaction being output. That can happen if endpos is after the end of the last xact wanted, and some other xact containing changes made before endpos commits after endpos but before the next status update/keepalive is sent. That xact won't be consumed from the server and will just be sent when the slot is next read from. This won't result in unpredictable output for testing, since there we control what other xacts run and will generally exit based on walsender status updates/keepalives.

Here's the patch. Docs included. Comments?

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
