Thread: Retrieve the snapshot's LSN

Retrieve the snapshot's LSN

From
Florent Guiliani
Date:
Hello everyone,

I would need to start a read repeatable transaction and retrieve the
corresponding LSN. I'm looking for pointers or Ideas on how to achieve
this.

Andres F. suggested me to extend pg_export_snapshot() [1] and call
GetLatestSnapshot() [2] while reliably retrieving the current LSN.
Should I call
GetXLogWriteRecPtr() [3] for that ? What lock(s) could I take to
synchronize the two calls?

Any other Idea ?

A snapshot is exported when creating a logical replication slot [4]
and the corresponding LSN is also returned [5]. This is what I need
except that I'd rather prefer to not create a replication slot each
time I need the snapshot.

During slot creation, the snapshot building and exporting code seems
highly coupled with the logical decoding stuff. It doesn't seems much
reusable to retrieve the snapshot's LSN outside of logical decoding.

Thank you for your help,

References:

[1] pg_export_snapshot()

https://github.com/postgres/postgres/blob/aa9eac45ea868e6ddabc4eb076d18be10ce84c6a/src/backend/utils/time/snapmgr.c#L1111

[2] GetLatestSnapshot()

https://github.com/postgres/postgres/blob/aa9eac45ea868e6ddabc4eb076d18be10ce84c6a/src/backend/utils/time/snapmgr.c#L259

[3] GetXLogWriteRecPtr()

https://github.com/postgres/postgres/blob/7b156c1e0746a46d083d7dbcd28afb303b3484ef/src/backend/access/transam/xlog.c#L10616

[4] Exported snapshot in logical replication slot creation

https://github.com/postgres/postgres/blob/aa9eac45ea868e6ddabc4eb076d18be10ce84c6a/src/backend/replication/walsender.c#L815
   /* build initial snapshot, might take a while */   DecodingContextFindStartpoint(ctx);
   /*    * Export a plain (not of the snapbuild.c type) snapshot to the user    * that can be imported into another
session.   */    snapshot_name = SnapBuildExportSnapshot(ctx->snapshot_builder);
 


[5] Consistent point LSN in logical replication slot creation:

https://github.com/postgres/postgres/blob/aa9eac45ea868e6ddabc4eb076d18be10ce84c6a/src/backend/replication/walsender.c#L831

snprintf(xpos, sizeof(xpos), "%X/%X",                                           (uint32)
(MyReplicationSlot->data.confirmed_flush >> 32),                                           (uint32)
MyReplicationSlot->data.confirmed_flush);

...cut...

/* second field: LSN at which we became consistent */
pq_sendstring(&buf, "consistent_point"); /* col name */

...cut....

/* consistent wal location */
pq_sendint(&buf, strlen(xpos), 4); /* col2 len */
pq_sendbytes(&buf, xpos, strlen(xpos));

--
Florent



Re: Retrieve the snapshot's LSN

From
Robert Haas
Date:
On Wed, Jul 15, 2015 at 12:51 PM, Florent Guiliani <florent@guiliani.fr> wrote:
> During slot creation, the snapshot building and exporting code seems
> highly coupled with the logical decoding stuff. It doesn't seems much
> reusable to retrieve the snapshot's LSN outside of logical decoding.

I don't think "the snapshot's LSN" has a well-defined meaning in
general.  The obvious meaning would be "the LSN such that all commits
prior to that LSN are visible and all later commits are invisible",
but such an LSN need not exist.  Suppose A writes a commit record at
LSN 0/10000, and then B writes a commit record at 0/10100, and then B
calls ProcArrayEndTransaction().  At this point, B is visible and A is
not visible, even though A's commit record precedes that of B.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Retrieve the snapshot's LSN

From
Andres Freund
Date:
On 2015-07-16 12:40:07 -0400, Robert Haas wrote:
> On Wed, Jul 15, 2015 at 12:51 PM, Florent Guiliani <florent@guiliani.fr> wrote:
> > During slot creation, the snapshot building and exporting code seems
> > highly coupled with the logical decoding stuff. It doesn't seems much
> > reusable to retrieve the snapshot's LSN outside of logical decoding.
>
> I don't think "the snapshot's LSN" has a well-defined meaning in
> general.  The obvious meaning would be "the LSN such that all commits
> prior to that LSN are visible and all later commits are invisible",
> but such an LSN need not exist.  Suppose A writes a commit record at
> LSN 0/10000, and then B writes a commit record at 0/10100, and then B
> calls ProcArrayEndTransaction().  At this point, B is visible and A is
> not visible, even though A's commit record precedes that of B.

Well, in combination with logical decoding it kinda has one: It should
allow you to take a dump of the database with a certain snapshot and
replay all transactions with a commit lsn bigger than the "snapshot's
lsn" and end up with a continually consistent database.

Visibility for HS actually works precisely in commit LSN order, even if
that is possibly different than on the primary...



Re: Retrieve the snapshot's LSN

From
Robert Haas
Date:
On Thu, Jul 16, 2015 at 12:54 PM, Andres Freund <andres@anarazel.de> wrote:
> Well, in combination with logical decoding it kinda has one: It should
> allow you to take a dump of the database with a certain snapshot and
> replay all transactions with a commit lsn bigger than the "snapshot's
> lsn" and end up with a continually consistent database.
>
> Visibility for HS actually works precisely in commit LSN order, even if
> that is possibly different than on the primary...

That makes sense, and hopefully answers Florent's question about why
this is only exposed through the slot mechanism.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Retrieve the snapshot's LSN

From
Andres Freund
Date:
On 2015-07-16 13:08:48 -0400, Robert Haas wrote:
> On Thu, Jul 16, 2015 at 12:54 PM, Andres Freund <andres@anarazel.de> wrote:
> > Well, in combination with logical decoding it kinda has one: It should
> > allow you to take a dump of the database with a certain snapshot and
> > replay all transactions with a commit lsn bigger than the "snapshot's
> > lsn" and end up with a continually consistent database.
> >
> > Visibility for HS actually works precisely in commit LSN order, even if
> > that is possibly different than on the primary...
>
> That makes sense, and hopefully answers Florent's question about why
> this is only exposed through the slot mechanism.

Trying to swap-in the pub conversion, I think Florent wanted to be able to
re-sync a standby from an existing slot. Which kinda makes sense to
me. We could do something like
SELECT * FROM pg_export_snapshot_for_slot(...);

which would return the snapshot name and the LSN.

There'd need to be some finicky locking to get that, but it should b epossible.

Andres



Re: Retrieve the snapshot's LSN

From
Florent Guiliani
Date:
On Thu, Jul 16, 2015 at 6:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I don't think "the snapshot's LSN" has a well-defined meaning in
> general.  The obvious meaning would be "the LSN such that all commits
> prior to that LSN are visible and all later commits are invisible",

I like this definition.

> but such an LSN need not exist.  Suppose A writes a commit record at
> LSN 0/10000, and then B writes a commit record at 0/10100, and then B
> calls ProcArrayEndTransaction().  At this point, B is visible and A is
> not visible, even though A's commit record precedes that of B.

Maybe that's what Andres referred as "doable with some finicky locking".

There is some race conditions to build a snapshot with an associated
consistent LSN. If I understand your example, A is supposed to call
ProcArrayEndTransaction() anytime soon. Could we wait/lock until it
happens?

--
Florent



Re: Retrieve the snapshot's LSN

From
Robert Haas
Date:
On Fri, Jul 17, 2015 at 4:16 AM, Florent Guiliani <florent@guiliani.fr> wrote:
>> but such an LSN need not exist.  Suppose A writes a commit record at
>> LSN 0/10000, and then B writes a commit record at 0/10100, and then B
>> calls ProcArrayEndTransaction().  At this point, B is visible and A is
>> not visible, even though A's commit record precedes that of B.
>
> Maybe that's what Andres referred as "doable with some finicky locking".
>
> There is some race conditions to build a snapshot with an associated
> consistent LSN. If I understand your example, A is supposed to call
> ProcArrayEndTransaction() anytime soon.

Right.

> Could we wait/lock until it
> happens?

In theory, yes.  I'm not sure what the code would would look like, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Retrieve the snapshot's LSN

From
Florent Guiliani
Date:
On Thu, Jul 16, 2015 at 7:13 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-07-16 13:08:48 -0400, Robert Haas wrote:
>> On Thu, Jul 16, 2015 at 12:54 PM, Andres Freund <andres@anarazel.de> wrote:
>> > Well, in combination with logical decoding it kinda has one: It should
>> > allow you to take a dump of the database with a certain snapshot and
>> > replay all transactions with a commit lsn bigger than the "snapshot's
>> > lsn" and end up with a continually consistent database.
>> >
>> > Visibility for HS actually works precisely in commit LSN order, even if
>> > that is possibly different than on the primary...
>>
>> That makes sense, and hopefully answers Florent's question about why
>> this is only exposed through the slot mechanism.
>
> Trying to swap-in the pub conversion, I think Florent wanted to be able to
> re-sync a standby from an existing slot. Which kinda makes sense to
> me. We could do something like
> SELECT * FROM pg_export_snapshot_for_slot(...);
>
> which would return the snapshot name and the LSN.
>
> There'd need to be some finicky locking to get that, but it should b epossible.

A pg_export_snapshot_for_slot(...) would work very well.

Let me explain the use case. You have many downstream systems that are
replicated with logical decoding. Using a dedicated replication slot
for each target is not practical. A single logical replication slot is
configured. It generates a stream of LSN-stamped transactions in
commit order. Those transactions are published to all downstream
nodes.

The snapshot exported during the slot creation can be used to generate
a complete dump that the replicated systems will load before applying
the transaction stream.

How do you individually reload/recover one of the downstream node? You
can use the initial dump and reapply all transactions emitted since
the slot's inception. It will quickly become impractical. What you
need is to generate a newer dump and only apply the transactions from
that point.

Problem: How do you synchronize this newer dump with the LSN-stamped
stream of transactions? Being able to tell what LSN correspond to the
consistent dump would solve it.

I've started a quick&dirty solution:
https://github.com/flyerman/postgres/commit/a13432d5e596a8b13ff911637afd764f53af2ab3

where I copied CreateReplicationSlot():

https://github.com/flyerman/postgres/blob/a13432d5e596a8b13ff911637afd764f53af2ab3/src/backend/replication/walsender.c#L764

into ExportLogicalDecodingSnapshot() and removed everything that isn't
needed for the snapshot creation. I still need to plug it into the
replication protocol grammar to test it.

It's not very good solution. Among others bad things, it will exercise
the output plugin for nothing.

--
Florent



Re: Retrieve the snapshot's LSN

From
Robert Haas
Date:
On Fri, Jul 17, 2015 at 8:31 AM, Florent Guiliani <florent@guiliani.fr> wrote:
> A pg_export_snapshot_for_slot(...) would work very well.
>
> Let me explain the use case. You have many downstream systems that are
> replicated with logical decoding. Using a dedicated replication slot
> for each target is not practical. A single logical replication slot is
> configured. It generates a stream of LSN-stamped transactions in
> commit order. Those transactions are published to all downstream
> nodes.
>
> The snapshot exported during the slot creation can be used to generate
> a complete dump that the replicated systems will load before applying
> the transaction stream.
>
> How do you individually reload/recover one of the downstream node? You
> can use the initial dump and reapply all transactions emitted since
> the slot's inception. It will quickly become impractical. What you
> need is to generate a newer dump and only apply the transactions from
> that point.

I'd like to point out that I told Andres repeatedly during the
development of logical decoding that this exact thing was going to be
a problem.  He told me fixing it was way too complicated, but I hope
he'll relent, because I still think this is important.

(Not trying to be a jerk here.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Retrieve the snapshot's LSN

From
Florent Guiliani
Date:
On Fri, Jul 17, 2015 at 6:53 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 17, 2015 at 8:31 AM, Florent Guiliani <florent@guiliani.fr> wrote:
>> A pg_export_snapshot_for_slot(...) would work very well.
>>
>> Let me explain the use case. You have many downstream systems that are
>> replicated with logical decoding. Using a dedicated replication slot
>> for each target is not practical. A single logical replication slot is
>> configured. It generates a stream of LSN-stamped transactions in
>> commit order. Those transactions are published to all downstream
>> nodes.
>>
>> The snapshot exported during the slot creation can be used to generate
>> a complete dump that the replicated systems will load before applying
>> the transaction stream.
>>
>> How do you individually reload/recover one of the downstream node? You
>> can use the initial dump and reapply all transactions emitted since
>> the slot's inception. It will quickly become impractical. What you
>> need is to generate a newer dump and only apply the transactions from
>> that point.
>
> I'd like to point out that I told Andres repeatedly during the
> development of logical decoding that this exact thing was going to be
> a problem.  He told me fixing it was way too complicated, but I hope
> he'll relent, because I still think this is important.
>
> (Not trying to be a jerk here.)


As a quick solution, I added a replication command named
LOGICAL_DECODING_SNAPSHOT that does everything CREATE_REPLICATION_SLOT
does for logical slots except that the slot is automatically dropped (
see attached patch or
https://github.com/flyerman/postgres/compare/9.5...flyerman:export-logical-snapshot
).

Feel free to review and comment my patch. I'd like to provide a better
solution but I'd need some guidance on how to get the locking right
between the snapshot creation and the LSN retrieval.

--
Florent

Attachment