Re: Potential data loss due to race condition during logical replication slot creation - Mailing list pgsql-bugs
From | Masahiko Sawada |
---|---|
Subject | Re: Potential data loss due to race condition during logical replication slot creation |
Date | |
Msg-id | CAD21AoBJcRiVBT5_ys13wA9c-8zbbqQJsMiVELX3t+HREwJwbA@mail.gmail.com Whole thread Raw |
In response to | Re: Potential data loss due to race condition during logical replication slot creation (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
RE: Potential data loss due to race condition during logical replication slot creation
("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Re: Potential data loss due to race condition during logical replication slot creation (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-bugs |
On Thu, Feb 8, 2024 at 2:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Sat, Feb 3, 2024 at 12:12 PM Callahan, Drew <callaan@amazon.com> wrote: > > > > > I think we can prepare an isolation test of the above steps and > > > include it in contrib/test_decoding/specs. > > > > Thanks for the suggestion. Included an isolation test for the > > exact repro and a couple other tests for fully before and fully after. > > These two tests don't add a whole lot of value since this > > is already heavily tested elsewhere, so I'm good with removing. > > However, thought it made sense to include from a perspective of > > totality given the test name. > > > > > Another simple way to fix this is that we always set > > > need_full_snapshot to true when slot creation. But it would then make > > > the snapshot builder include non-catalog-change transactions too while > > > finding the initial startpoint. Which is not necessary. > > > > Since we're not accumulating changes anyway and we never distribute snapshots > > while inconsistent it wouldn't cause any memory pressure in the ReorderBuffer. > > This seems pretty contained and wouldn't require the use of a global variable > > and memory context callbacks which is a bit nicer from a readability standpoint. > > Sorry I missed this comment. > > I think it's a good point that this fix doesn't lead to any on-disk > compatibility while not affecting existing users much, especially for > back-patching. If we want to choose this approach for bank branches, > we need to carefully consider the side impacts. > > > It's a little inefficient with memory and CPU, especially in the presence of > > long running transactions, but likely would not be noticeable to most users. > > This matches my analysis. > > Here is the summary of several proposals we've discussed: > > a) Have CreateInitDecodingContext() always pass need_full_snapshot = > true to AllocateSnapshotBuilder(). > > This is the simplest approach and doesn't break the on-disk > SnapBuildOnDisck compatibility. A downside is that it's a little > inefficient with memory and CPU since it includes non-catalog change > transactions too. Also, passing need_full_snapshot to > CreateInitDecodingContext() will no longer have any meaning. > > b) Have snapbuild.c being able to handle multiple SnapBuildOnDisk versions. > > This bumps SNAPBUILD_VERSION and therefore breaks the on-disk > SnapBuildOnDisk compatibility, but snapbuild.c can handle multiple > versions. But since different branches are using different versions of > SnapBuildOnDisk, it would probably be hard to have the SnapBuildOnDisk > version that is consistent across all versions. > > c) Add a global variable, say in_create, to snapbuild.c > > it would also require adding a setter function for in_create. There > are several ideas where to set/reset the flag. One idea is that we > reset the flag in AllocateSnapshotBuilder() and set the flag anywhere > before starting to find the start point, for example at the beginning > of DecodingContextFindStartpoint(). It probably won't require a memory > context callback to make sure to clear the flag. This idea doesn't > have a downside from users and extensions perspects. But I'm slightly > hesitant to add a global variable. > > What do you think? and any other ideas? I've drafted the idea (c) for discussion (for master and v16 for now). I also liked the idea (a) but I'm concerned a bit about future impact. > > As a separate topic, I think that this backpatching complexity comes > from the fact that we're serializing a whole SnapBuild to the disk > although we don't need to serialize all of its fields. In order to > make future back-patching easy, it might be worth considering > decoupling the fields that need to be serialized from SnapBuild > struct. If we implement this idea, we won't need to bump SNAPBUILD_VERSION for HEAD. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
pgsql-bugs by date: