Thread: Logical decoding slots can go backwards when used from SQL, docs are wrong
Hi all
I think I found a couple of logical decoding issues while writing tests for failover slots.
Despite the docs' claim that a logical slot will replay data "exactly once", a slot's confirmed_lsn can go backwards and the SQL functions can replay the same data more than once.We don't mark a slot as dirty if only its confirmed_lsn is advanced, so it isn't flushed to disk. For failover slots this means it also doesn't get replicated via WAL. After a master crash, or for failover slots after a promote event, the confirmed_lsn will go backwards. Users of the SQL interface must keep track of the safely locally flushed slot position themselves and throw the repeated data away. Unlike with the walsender protocol it has no way to ask the server to skip that data.
Worse, because we don't dirty the slot even a *clean shutdown* causes slot confirmed_lsn to go backwards. That's a bug IMO. We should force a flush of all slots at the shutdown checkpoint, whether dirty or not, to address it.
Barring objections I intend to submit a fix to:
- Document that slots can replay data more than once
- Force flush of all slots on a shutdown checkpoint
Also, pg_logical_slot_get_changes and its _peek_ variant should have a param specifying the starting LSN to read and return. If this is lower than the restart_lsn but non-null it should ERROR; if it's greater than or equal it should use this position instead of starting at the confirmed_lsn.
Time permitting I'd like to add a pg_logical_slot_confirm function, so you can aternate _peek_changes and _confirm, making it possible to get walsender-interface-like behaviour via the SQL interface. Right now you can't reliably use the SQL interface because _get_changes can succeed locally and advance the slot but the client can fail to receive all the changes due to network issues etc. Sure, the SQL interface is meant mainly for testing, but especially for !postgresql downstreams I strongly suspect people will want to use it for "real" work rather than have to modify each client driver to support replication protocol extensions.
Opinions?
--
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Simon Riggs
Date:
On 11 March 2016 at 08:19, Craig Ringer <craig@2ndquadrant.com> wrote:
--
Hi allI think I found a couple of logical decoding issues while writing tests for failover slots.Despite the docs' claim that a logical slot will replay data "exactly once", a slot's confirmed_lsn can go backwards and the SQL functions can replay the same data more than once.We don't mark a slot as dirty if only its confirmed_lsn is advanced, so it isn't flushed to disk. For failover slots this means it also doesn't get replicated via WAL. After a master crash, or for failover slots after a promote event, the confirmed_lsn will go backwards. Users of the SQL interface must keep track of the safely locally flushed slot position themselves and throw the repeated data away. Unlike with the walsender protocol it has no way to ask the server to skip that data.
Yeh, I read that and thought it was wrong. "At least once, in order" is what we want, how it works and how it should be described.
Worse, because we don't dirty the slot even a *clean shutdown* causes slot confirmed_lsn to go backwards. That's a bug IMO. We should force a flush of all slots at the shutdown checkpoint, whether dirty or not, to address it.
Given the earlier understanding, above, I'm not sure I see why that must happen. But I agree it should happen.
Barring objections I intend to submit a fix to:- Document that slots can replay data more than once- Force flush of all slots on a shutdown checkpoint
+1
Also, pg_logical_slot_get_changes and its _peek_ variant should have a param specifying the starting LSN to read and return. If this is lower than the restart_lsn but non-null it should ERROR; if it's greater than or equal it should use this position instead of starting at the confirmed_lsn.
Maybe. I don't really like changing APIs. Can we add new funcs? Make sure a NULL LSN can be passed as well.
Is the return type of pg_logical_slot_peek_changes() incorrect in the docs?
Time permitting I'd like to add a pg_logical_slot_confirm function, so you can aternate _peek_changes and _confirm, making it possible to get walsender-interface-like behaviour via the SQL interface.
I thought thats what replorigins do.
Right now you can't reliably use the SQL interface because _get_changes can succeed locally and advance the slot but the client can fail to receive all the changes due to network issues etc. Sure, the SQL interface is meant mainly for testing, but especially for !postgresql downstreams I strongly suspect people will want to use it for "real" work rather than have to modify each client driver to support replication protocol extensions.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
On 11 March 2016 at 17:00, Simon Riggs <simon@2ndquadrant.com> wrote:
Also, pg_logical_slot_get_changes and its _peek_ variant should have a param specifying the starting LSN to read and return. If this is lower than the restart_lsn but non-null it should ERROR; if it's greater than or equal it should use this position instead of starting at the confirmed_lsn.Maybe. I don't really like changing APIs. Can we add new funcs? Make sure a NULL LSN can be passed as well.
It'd be compatible with the old SQL API and we've already made one such change in 9.5, to add the parameter to immediately reserve WAL for physical slots. You just CREATE OR REPLACE FUNCTION it in system_views.sql with a DEFAULT so the old signature still works.
That works for C-level callers too, unless they try to get clever and DirectFunctionCall3 it. Even then it might, I don't recall off the top of my head, but if not we can defend against that too by testing PG_NARGS in the function its self. This wasn't done for the reserve_wal change and I'm not convinced it's needed even if the fmgr doesn't take care of the default parameters - it's not like that's a function that makes much sense to DirectFunctionCall anyway.
Is the return type of pg_logical_slot_peek_changes() incorrect in the docs?
I don't think so, why?
Time permitting I'd like to add a pg_logical_slot_confirm function, so you can aternate _peek_changes and _confirm, making it possible to get walsender-interface-like behaviour via the SQL interface.I thought thats what replorigins do.
Replication origins provide an easy and efficient way for a PostgreSQL downstream (client) to track its replay state and position in a reliable way. If the client is PostgreSQL then you can use replication origins to track the last flushed LSN, yes. If you're using the walsender interface you must do so. They are not used by the server-side of logical decoding on either walsender or SQL interfaces.
If you're using the SQL interface it has its own replay position tracking. Just before pg_logical_slot_get_changes returns it updates the confirm position of the slot, as if you'd sent a replay confirmation from the client on the walsender protocol. The problem is that this happens before we know the client has received and flushed all the data we just sent it. The client might not get all (or any) of it if there are network issues, for example.
Because the replay position can go backwards the client can and must track the last-safely-replayed position its self, as established earlier, so it can skip over data it already saw. The problem is that when using the SQL interface the client can't do the reverse - it can't tell the server "send me that data again, please, I only got half of it last time because the network died". The server has advanced the confirmed position and there's no way to ask for the data again over the SQL interface.
That's why I want to add a param to let it do so, replaying from prior to the confirm location or skipping past data that'd otherwise be replayed. If you pass NULL (the default, and the only option in 9.5 or older) for that param then you get data from the server's last recorded confirm location, which is the server's best-effort tracking of your replay position from server-side.
That's also why a pg_logical_slot_confirm function could be handy. It lets you do a _peek_changes then a _confirm when you know you've flushed them, prior to the next peek.
I think the docs for pg_replication_slots are wrong about the confirm location btw, it says data from before the confirm location can't be replayed, but it can over the walsender protocol. There's just no way to ask for it over the SQL interface.
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Alvaro Herrera
Date:
Craig Ringer wrote: > Hi all > > I think I found a couple of logical decoding issues while writing tests for > failover slots. > > Despite the docs' claim that a logical slot will replay data "exactly > once", a slot's confirmed_lsn can go backwards and the SQL functions can > replay the same data more than once.We don't mark a slot as dirty if only > its confirmed_lsn is advanced, so it isn't flushed to disk. For failover > slots this means it also doesn't get replicated via WAL. After a master > crash, or for failover slots after a promote event, the confirmed_lsn will > go backwards. Users of the SQL interface must keep track of the safely > locally flushed slot position themselves and throw the repeated data away. > Unlike with the walsender protocol it has no way to ask the server to skip > that data. > > Worse, because we don't dirty the slot even a *clean shutdown* causes slot > confirmed_lsn to go backwards. That's a bug IMO. We should force a flush of > all slots at the shutdown checkpoint, whether dirty or not, to address it. Why don't we mark the slot dirty when confirmed_lsn advances? If we fix that, doesn't it fix the other problems too? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
On 11 March 2016 at 20:15, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Craig Ringer wrote:
> Hi all
>
> I think I found a couple of logical decoding issues while writing tests for
> failover slots.
>
> Despite the docs' claim that a logical slot will replay data "exactly
> once", a slot's confirmed_lsn can go backwards and the SQL functions can
> replay the same data more than once.We don't mark a slot as dirty if only
> its confirmed_lsn is advanced, so it isn't flushed to disk. For failover
> slots this means it also doesn't get replicated via WAL. After a master
> crash, or for failover slots after a promote event, the confirmed_lsn will
> go backwards. Users of the SQL interface must keep track of the safely
> locally flushed slot position themselves and throw the repeated data away.
> Unlike with the walsender protocol it has no way to ask the server to skip
> that data.
>
> Worse, because we don't dirty the slot even a *clean shutdown* causes slot
> confirmed_lsn to go backwards. That's a bug IMO. We should force a flush of
> all slots at the shutdown checkpoint, whether dirty or not, to address it.
Why don't we mark the slot dirty when confirmed_lsn advances? If we fix
that, doesn't it fix the other problems too?
Yes, it does.
That'll cause slots to be written out at checkpoints when they otherwise wouldn't have to be, but I'd rather be doing a little more work in this case. Compared to the disk activity from WAL decoding etc the effect should be undetectable anyway.
Andres? Any objection to dirtying a slot when the confirmed lsn advances, so we write it out at the next checkpoint?
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Petr Jelinek
Date:
On 14/03/16 08:08, Craig Ringer wrote: > On 11 March 2016 at 20:15, Alvaro Herrera <alvherre@2ndquadrant.com > <mailto:alvherre@2ndquadrant.com>> wrote: > > Craig Ringer wrote: > > Hi all > > > > I think I found a couple of logical decoding issues while writing tests for > > failover slots. > > > > Despite the docs' claim that a logical slot will replay data "exactly > > once", a slot's confirmed_lsn can go backwards and the SQL functions can > > replay the same data more than once.We don't mark a slot as dirty if only > > its confirmed_lsn is advanced, so it isn't flushed to disk. For failover > > slots this means it also doesn't get replicated via WAL. After a master > > crash, or for failover slots after a promote event, the confirmed_lsn will > > go backwards. Users of the SQL interface must keep track of the safely > > locally flushed slot position themselves and throw the repeated data away. > > Unlike with the walsender protocol it has no way to ask the server to skip > > that data. > > > > Worse, because we don't dirty the slot even a *clean shutdown* causes slot > > confirmed_lsn to go backwards. That's a bug IMO. We should force a flush of > > all slots at the shutdown checkpoint, whether dirty or not, to address it. > > Why don't we mark the slot dirty when confirmed_lsn advances? If we fix > that, doesn't it fix the other problems too? > > > Yes, it does. > It will not change the fact that slot can go backwards however even in clean shutdown of the server as in walsender the confirmed_lsn only changes after feedback message so if client crashes it won't get updated (for obvious reasons). You btw can emulate asking for the specific LSN in SQL interface by first calling the pg_logical_slot_get_changes function with upto_lsn set to whatever lsn you expect to start at, but it's ugly. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
On 14 March 2016 at 17:16, Petr Jelinek <petr@2ndquadrant.com> wrote:
It will not change the fact that slot can go backwards
Sure. I don't consider that a problem in general though. It's similar to the way we lose cached sequence chunks on crash - a performance optimisation with a user visible effect. It needs to be documented correctly, but isn't IMO a problem.
Failing to save the full slot state on clean shutdown is though, IMO. It wouldn't matter if it was only used for sync rep, but since it's also used to manage the resume point for SQL-level logical decoding calls we should make a reasonable effort to preserve it ... and allow the user to handle the situation correctly when we fail to preserve it.
You btw can emulate asking for the specific LSN in SQL interface by first calling the pg_logical_slot_get_changes function with upto_lsn set to whatever lsn you expect to start at, but it's ugly.
Ugh.
I didn't realise pg_logical_slot_get_changes could backtrack by setting an upto_lsn in the past. That doesn't seem like something we should really be doing - if it's a limit, and we're already past that limit, we should just be returning the empty set.
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Petr Jelinek
Date:
On 14/03/16 10:48, Craig Ringer wrote: > > You btw can emulate asking for the specific LSN in SQL interface by > first calling the pg_logical_slot_get_changes function with upto_lsn > set to whatever lsn you expect to start at, but it's ugly. > > > Ugh. > > I didn't realise pg_logical_slot_get_changes could backtrack by setting > an upto_lsn in the past. That doesn't seem like something we should > really be doing - if it's a limit, and we're already past that limit, we > should just be returning the empty set. > Not past, future from the point of old confirmed_lsn at least. The point was that the next call will start from whatever lsn you specified as upto_lsn. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
Attached is a patch to mark a logical replication slot as dirty if its confirmed lsn is changed.
Aesthetically I'm not sure if it's better to do it per this patch and call ReplicationSlotMarkDirty after we release the spinlock, add a new ReplicationSlotMarkDirtyLocked() that skips the spinlock acquisition, or add a bool is_locked arg to ReplicationSlotMarkDirty and update all call sites. All will have the same result.
I've confirmed this works as expected as part of the failover slots test suite but it'd be pretty seriously cumbersome to extract. If anyone feels strongly about it I can add a test that shows that
- start master
- create slot
- do some stuff
- replay from slot
- fast-stop master
- start master
- replay from slot
doesn't see the same data a second time, but if we repeat it using immediate stop it will see the same data when replaying again.
Also attached is another patch to amend the docs to reflect the fact that a slot can actually replay the same change more than once. I avoided the strong temptation to update other parts of the docs nearby.
Both these are fixes that should IMO be applied to 9.6.
Attachment
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
The first patch was incorrectly created on top of failover slots not HEAD. Attached patch applies on HEAD.
Attachment
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
"David G. Johnston"
Date:
On Thursday, March 17, 2016, Craig Ringer <craig@2ndquadrant.com> wrote:
The first patch was incorrectly created on top of failover slots not HEAD. Attached patch applies on HEAD.
Lots of logical decoding work ongoing but this one shows as active in the September cf and the comments by Craig indicate potential relevance to 9.6. Is this still live as-is or has it been subsumed by other threads?
David J.
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
On 5 June 2016 at 09:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Thursday, March 17, 2016, Craig Ringer <craig@2ndquadrant.com> wrote:The first patch was incorrectly created on top of failover slots not HEAD. Attached patch applies on HEAD.Lots of logical decoding work ongoing but this one shows as active in the September cf and the comments by Craig indicate potential relevance to 9.6. Is this still live as-is or has it been subsumed by other threads?
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Petr Jelinek
Date:
On 20/08/16 16:01, Craig Ringer wrote: > On 5 June 2016 at 09:54, David G. Johnston <david.g.johnston@gmail.com > <mailto:david.g.johnston@gmail.com>> wrote: > > On Thursday, March 17, 2016, Craig Ringer <craig@2ndquadrant.com > <mailto:craig@2ndquadrant.com>> wrote: > > The first patch was incorrectly created on top of failover slots > not HEAD. Attached patch applies on HEAD. > > > Lots of logical decoding work ongoing but this one shows as active > in the September cf and the comments by Craig indicate potential > relevance to 9.6. Is this still live as-is or has it been subsumed > by other threads? > > > > Yes. Both those patches are still pending and should be considered for > commit in the next CF. (Of course, I think they should just be > committed, but I would, wouldn't I?) > > I think the doc one should definitely go in and possibly be back-patched all the way to 9.4. I didn't look at the other one. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Simon Riggs
Date:
On 20 August 2016 at 15:04, Petr Jelinek <petr@2ndquadrant.com> wrote: > On 20/08/16 16:01, Craig Ringer wrote: >> >> On 5 June 2016 at 09:54, David G. Johnston <david.g.johnston@gmail.com >> <mailto:david.g.johnston@gmail.com>> wrote: >> >> On Thursday, March 17, 2016, Craig Ringer <craig@2ndquadrant.com >> <mailto:craig@2ndquadrant.com>> wrote: >> >> The first patch was incorrectly created on top of failover slots >> not HEAD. Attached patch applies on HEAD. >> >> >> Lots of logical decoding work ongoing but this one shows as active >> in the September cf and the comments by Craig indicate potential >> relevance to 9.6. Is this still live as-is or has it been subsumed >> by other threads? >> >> >> >> Yes. Both those patches are still pending and should be considered for >> commit in the next CF. (Of course, I think they should just be >> committed, but I would, wouldn't I?) >> >> > > I think the doc one should definitely go in and possibly be back-patched all > the way to 9.4. I didn't look at the other one. I agree the doc patch should go in, though I suggest reword it slightly, like attached patch. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
On 1 September 2016 at 21:19, Simon Riggs <simon@2ndquadrant.com> wrote: > I agree the doc patch should go in, though I suggest reword it > slightly, like attached patch. Thanks. The rewording looks good to me and I think that Doc-correction-each-change-once.v2.patch is ready for commit. Meanwhile, thinking about the patch to dirty slots on confirmed_flush_lsn advance some more, I don't think it's ideal to do it in LogicalConfirmReceivedLocation(). I'd rather not add more complexity there, it's already complex enough. Doing it there will also lead to unnecessary slot write-out being done to slots at normal checkpoints even if the slot hasn't otherwise changed, possibly in response to lsn advances sent in response to keepalives due to activity on other unrelated databases. Slots are always fsync()ed when written out so we don't want to do it more than we have to. We really only need to write out slots where only the confirmed_flush_lsn has changed at a shutdown checkpoint since it's not really a big worry if it goes backwards on crash, and otherwise it can't. Even then it only _really_ matters when the SQL interface is used. Losing the confirmed_flush_lsn is very annoying when using pg_recvlogical too, and was the original motivation for this patch. But I'm thinking of instead teaching pg_recvlogical to write out a status file with its last confirmed point on exit and to be able to take that as an argument when (re)connecting. Poor-man's replication origins, effectively. So here's a simpler patch that just dirties the slot when it's replayed something from it on the SQL interface, so it's flushed out next checkpoint or on shutdown. That's the main user visible defect that should be fixed and it's trivial to do here. It means we'll still forget the confirmed_flush_lsn on clean shutdown if it was advanced using the walsender protocol, but *shrug*. That's just a little inconvenient. I can patch pg_recvlogical separately. The alternative is probably to add a second, "softer" dirty tracking method that only causes a write at a clean shutdown or forced checkpoint - and maybe doesn't bother fsync()ing. That's a bit more invasive but would work for walsender use as well as the SQL interface. I don't think it's worth the bother, since in the end callers have to be prepared for repeated data on crash anyway. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Petr Jelinek
Date:
On 02/09/16 09:58, Craig Ringer wrote: > On 1 September 2016 at 21:19, Simon Riggs <simon@2ndquadrant.com> wrote: > >> I agree the doc patch should go in, though I suggest reword it >> slightly, like attached patch. > > Thanks. The rewording looks good to me and I think that > Doc-correction-each-change-once.v2.patch is ready for commit. > > Meanwhile, thinking about the patch to dirty slots on > confirmed_flush_lsn advance some more, I don't think it's ideal to do > it in LogicalConfirmReceivedLocation(). I'd rather not add more > complexity there, it's already complex enough. Doing it there will > also lead to unnecessary slot write-out being done to slots at normal > checkpoints even if the slot hasn't otherwise changed, possibly in > response to lsn advances sent in response to keepalives due to > activity on other unrelated databases. Slots are always fsync()ed when > written out so we don't want to do it more than we have to. > > We really only need to write out slots where only the > confirmed_flush_lsn has changed at a shutdown checkpoint since it's > not really a big worry if it goes backwards on crash, and otherwise it > can't. Even then it only _really_ matters when the SQL interface is > used. Losing the confirmed_flush_lsn is very annoying when using > pg_recvlogical too, and was the original motivation for this patch. > But I'm thinking of instead teaching pg_recvlogical to write out a > status file with its last confirmed point on exit and to be able to > take that as an argument when (re)connecting. Poor-man's replication > origins, effectively. > > So here's a simpler patch that just dirties the slot when it's > replayed something from it on the SQL interface, so it's flushed out > next checkpoint or on shutdown. That's the main user visible defect > that should be fixed and it's trivial to do here. It means we'll still > forget the confirmed_flush_lsn on clean shutdown if it was advanced > using the walsender protocol, but *shrug*. That's just a little > inconvenient. I can patch pg_recvlogical separately. Okay that sounds reasonable, the SQL interface is already somewhat different than walsender as it does not really "stream" so makes sense to improve the behavior there. As a side note, I would really like to have cursor-like SQL interface which behaves more like walsender one but that's very different patch. > > The alternative is probably to add a second, "softer" dirty tracking > method that only causes a write at a clean shutdown or forced > checkpoint - and maybe doesn't bother fsync()ing. That's a bit more > invasive but would work for walsender use as well as the SQL > interface. I don't think it's worth the bother, since in the end > callers have to be prepared for repeated data on crash anyway. > Correct me if I am wrong but I think the only situation where it would matter is on server that restarts often or crashes often (as the logical decoding then has to do the work many times) but I don't think it's worth optimizing for that kind of scenario. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
On 2 September 2016 at 16:50, Petr Jelinek <petr@2ndquadrant.com> wrote: >> So here's a simpler patch that just dirties the slot when it's >> replayed something from it on the SQL interface [...] > > Okay that sounds reasonable, the SQL interface is already somewhat different > than walsender as it does not really "stream" so makes sense to improve the > behavior there. Yep. For the walsender interface it's reasonable to expect the downstream to keep track of its progress - even if right now pg_recvlogical doesn't do so very well. It's not practical for the SQL interface. > As a side note, I would really like to have cursor-like SQL > interface which behaves more like walsender one but that's very different > patch. Me too. Specifically, I'd really like to be able to "peek, confirm, peek, confirm" and also be able to specify my own start lsn for peek. But as you say, that's a different patch. >> The alternative is probably to add a second, "softer" dirty tracking >> method that only causes a write at a clean shutdown or forced >> checkpoint - and maybe doesn't bother fsync()ing. That's a bit more >> invasive but would work for walsender use as well as the SQL >> interface. I don't think it's worth the bother, since in the end >> callers have to be prepared for repeated data on crash anyway. >> > > Correct me if I am wrong but I think the only situation where it would > matter is on server that restarts often or crashes often (as the logical > decoding then has to do the work many times) but I don't think it's worth > optimizing for that kind of scenario. Right. Though it's not so much doing the work many times that's the concern - this change doesn't affect restart_lsn at all. It's that it sends already-seen-and-confirmed changes to the output plugin and therefore the client. Not that big a deal. As far as I'm concerned, it's the job of users of the walsender interface to keep track of the position they've consumed up to and specify it to subsequent START_REPLICATION calls. You can't do that on the SQL interface, which is why this change is needed there. Maybe the docs should be more explicit about the reason to specify start lsn to START_REPLICATION not just 0/0 though, and also that we will disregard it if it's less than the stored confirmed_flush_lsn. Again, separate patch. So the main change becomes the one-liner in my prior mail. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Andres Freund
Date:
On 2016-09-02 10:50:17 +0200, Petr Jelinek wrote: > Okay that sounds reasonable, the SQL interface is already somewhat different > than walsender as it does not really "stream" so makes sense to improve the > behavior there. As a side note, I would really like to have cursor-like SQL > interface which behaves more like walsender one but that's very different > patch. FWIW, with the changes in https://www.postgresql.org/message-id/20160827214829.zo2dfb5jaikii5nw@alap3.anarazel.de thats starting to be realistic for a function based interface. By switching things over to ValuePerCall mode you could actually use a cursor, without pre-materializing all changes. Greetings, Andres Freund
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
On 2 September 2016 at 17:49, Craig Ringer <craig@2ndquadrant.com> wrote: > So the main change becomes the one-liner in my prior mail. Per feedback from Simon, updated with a new test in src/test/recovery . If you revert the change to src/backend/replication/logical/logicalfuncs.c then the test will start failing. I'd quite like to backpatch the fix since it's simple and safe, but I don't feel that it's hugely important to do so. This is an annoyance not a serious data risking bug. My only concern here is that we still lose position on crash. So SQL-interface callers should still be able to cope with it. But they won't see it happen if they're only testing with normal shutdowns, host reboots, etc. In practice users aren't likely to notice this anyway, though, since most people don't restart the server all the time. I think it's better than what we have. This issue could be eliminated completely by calling ReplicationSlotSave() after ReplicationSlotMarkDirty(). This would force an immediate flush after a non-peeked SQL interface decoding call. It means more fsync()ing, though, and the SQL interface can only be used by polling so that could be a significant performance hit. We'd want to only flush when the position changed, but even then it'll mean a sync every time anything gets returned. The better alternative is to add a variant on pg_logical_slot_get_changes(...) etc that accepts a start LSN. But it's not convenient or easy for SQL callers to extract the last commit LSN received from the last call to pass it to the next one, so this isn't simple, and is probably best tackled as part of the SQL interface API change Petr and Andres discussed in this thread. I'm inclined to just dirty the slot for now, then provide a better solution by adding the peek/confirm interface discussed upthread down the track. Andres, Petr, got an opinion here? -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
On 5 September 2016 at 10:41, Craig Ringer <craig@2ndquadrant.com> wrote: > On 2 September 2016 at 17:49, Craig Ringer <craig@2ndquadrant.com> wrote: > >> So the main change becomes the one-liner in my prior mail. > > Per feedback from Simon, updated with a new test in src/test/recovery . ... attached this time. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Petr Jelinek
Date:
On 05/09/16 04:41, Craig Ringer wrote: > On 2 September 2016 at 17:49, Craig Ringer <craig@2ndquadrant.com> wrote: > >> So the main change becomes the one-liner in my prior mail. > > Per feedback from Simon, updated with a new test in src/test/recovery . > > If you revert the change to > src/backend/replication/logical/logicalfuncs.c then the test will > start failing. > > I'd quite like to backpatch the fix since it's simple and safe, but I > don't feel that it's hugely important to do so. This is an annoyance > not a serious data risking bug. > > My only concern here is that we still lose position on crash. So > SQL-interface callers should still be able to cope with it. But they > won't see it happen if they're only testing with normal shutdowns, > host reboots, etc. In practice users aren't likely to notice this > anyway, though, since most people don't restart the server all the > time. I think it's better than what we have. > > This issue could be eliminated completely by calling > ReplicationSlotSave() after ReplicationSlotMarkDirty(). This would > force an immediate flush after a non-peeked SQL interface decoding > call. It means more fsync()ing, though, and the SQL interface can only > be used by polling so that could be a significant performance hit. > We'd want to only flush when the position changed, but even then it'll > mean a sync every time anything gets returned. > > The better alternative is to add a variant on > pg_logical_slot_get_changes(...) etc that accepts a start LSN. But > it's not convenient or easy for SQL callers to extract the last commit > LSN received from the last call to pass it to the next one, so this > isn't simple, and is probably best tackled as part of the SQL > interface API change Petr and Andres discussed in this thread. > It isn't? I thought lsn is first column of the output of that function? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Simon Riggs
Date:
On 5 September 2016 at 03:41, Craig Ringer <craig@2ndquadrant.com> wrote: > On 2 September 2016 at 17:49, Craig Ringer <craig@2ndquadrant.com> wrote: > >> So the main change becomes the one-liner in my prior mail. > > Per feedback from Simon, updated with a new test in src/test/recovery . > > If you revert the change to > src/backend/replication/logical/logicalfuncs.c then the test will > start failing. > > I'd quite like to backpatch the fix since it's simple and safe, but I > don't feel that it's hugely important to do so. This is an annoyance > not a serious data risking bug. I've committed to HEAD only. We can discuss backpatching elsewhere also. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Logical decoding slots can go backwards when used from SQL, docs are wrong
From
Craig Ringer
Date:
On 5 September 2016 at 16:33, Petr Jelinek <petr@2ndquadrant.com> wrote: >> The better alternative is to add a variant on >> pg_logical_slot_get_changes(...) etc that accepts a start LSN. But >> it's not convenient or easy for SQL callers to extract the last commit >> LSN received from the last call to pass it to the next one, so this >> isn't simple, and is probably best tackled as part of the SQL >> interface API change Petr and Andres discussed in this thread. >> > > It isn't? I thought lsn is first column of the output of that function? It is. While it's a pain to use that from SQL (psql, etc) as well as doing something with the results, there's no point since anything working in simple SQL will get terminated when the server restarts anyway. So really my point above is moot. That's doesn't reflect on the slot dirty patch, it just means there's no need to do anything extra when we add the cursor-like interface later in order to fully solve this. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services