Thread: [HACKERS] [PATCH] Off-by-one error in logical slot resource retention
Hi all I've found a minor off-by-one error in the resource retention logic for logical slots, where we treat confirmed_flush as meaning "flushed up to and including this LSN". Seems reasonable, but the rest of the code treats it as "flushed up to but excluding this LSN". In particular, we treat confirmed_flush as the inclusive startpoint for a new logical decoding session if no startpoint is provided by the client and will replay a change whose commit record begins at exactly confirmed_flush to the client. This issue was identified while debugging an issue where duplicate rows were replicated after unclean shutdown by logical replication on a table with no PK. I'd prefer to make confirmed_flush mean "confirmed flushed up to and including" everywhere, but the knock-on effects are too ugly. In particular we'd then be changing the meaning of START_REPLICATION ... LOGICAL ... 's argument LSN to be "start replay of commits after, but not including, this LSN". We can't just adjust it by 1, otherwise suddenly we'd get different results if we passed the confirmed_flush value explicitly vs passing 0/0 and letting it be picked up implicitly. Two further minor patches follow, one to fix a harmless but confusing quirk in logical walsender init, and one that adds explanatory comments to relevant parts of the code (and a couple of spots in the docs) to avoid future occurrences of the above issues. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 9 March 2017 at 11:17, Craig Ringer <craig@2ndquadrant.com> wrote: > Hi all > > I've found a minor off-by-one error in the resource retention logic > for logical slots, where we treat confirmed_flush as meaning "flushed > up to and including this LSN". Seems reasonable, but the rest of the > code treats it as "flushed up to but excluding this LSN". In > particular, we treat confirmed_flush as the inclusive startpoint for a > new logical decoding session if no startpoint is provided by the > client and will replay a change whose commit record begins at exactly > confirmed_flush to the client. This should be backpatched to 9.4. The others do not need backpatching. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] Off-by-one error in logical slot resource retention
From
Aleksander Alekseev
Date:
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested Hi Craig, I'm afraid patches 0002 and 0003 don't apply anymore. Could you please resolve the conflicts? The new status of this patch is: Waiting on Author
Re: [HACKERS] [PATCH] Off-by-one error in logical slot resource retention
From
Daniel Gustafsson
Date:
> On 01 Sep 2017, at 14:28, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote: > > The following review has been posted through the commitfest application: > make installcheck-world: not tested > Implements feature: not tested > Spec compliant: not tested > Documentation: not tested > > Hi Craig, > > I'm afraid patches 0002 and 0003 don't apply anymore. Could you please resolve the conflicts? > > The new status of this patch is: Waiting on Author Have you had a chance to look at rebasing this patch so we can get it further in the process? cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Off-by-one error in logical slot resource retention
From
Daniel Gustafsson
Date:
> On 15 Sep 2017, at 13:26, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 01 Sep 2017, at 14:28, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote: >> >> The following review has been posted through the commitfest application: >> make installcheck-world: not tested >> Implements feature: not tested >> Spec compliant: not tested >> Documentation: not tested >> >> Hi Craig, >> >> I'm afraid patches 0002 and 0003 don't apply anymore. Could you please resolve the conflicts? >> >> The new status of this patch is: Waiting on Author > > Have you had a chance to look at rebasing this patch so we can get it further > in the process? Since this patch has been in Waiting for Author state for the duration of the commitfest without moving, I’m marking it Returned with Feedback. If there is still interest in pursuing this patch, please re-submit it to the next commitfest with the comments addressed. cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 2 October 2017 at 05:27, Daniel Gustafsson <daniel@yesql.se> wrote:
> On 15 Sep 2017, at 13:26, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 01 Sep 2017, at 14:28, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:
>>
>> The following review has been posted through the commitfest application:
>> make installcheck-world: not tested
>> Implements feature: not tested
>> Spec compliant: not tested
>> Documentation: not tested
>>
>> Hi Craig,
>>
>> I'm afraid patches 0002 and 0003 don't apply anymore. Could you please resolve the conflicts?
>>
>> The new status of this patch is: Waiting on Author
>
> Have you had a chance to look at rebasing this patch so we can get it further
> in the process?
Since this patch has been in Waiting for Author state for the duration of the
commitfest without moving, I’m marking it Returned with Feedback. If there is
still interest in pursuing this patch, please re-submit it to the next
commitfest with the comments addressed.
Thanks. I'll revisit it next CF.