Re: walmethods.c/h are doing some strange things - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: walmethods.c/h are doing some strange things |
Date | |
Msg-id | CA+TgmoYZ2Epi=OYkxQ=LEV=yBvi483PW8CfGJ2m9j8qQJALc6w@mail.gmail.com Whole thread Raw |
In response to | Re: walmethods.c/h are doing some strange things (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
List | pgsql-hackers |
On Thu, Sep 15, 2022 at 9:39 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > At Fri, 2 Sep 2022 11:52:38 -0400, Robert Haas <robertmhaas@gmail.com> wrote in > > that type that can ever exist, and the pointer to that object is > > stored in a global variable managed by walmethods.c. So whereas in > > other cases we give you the object and then a way to get the > > corresponding set of callbacks, here we only give you the callbacks, > > and we therefore have to impose the artificial restriction that there > > can only ever be one object. > > Makes sense to me. OK, I have committed the patches. Before doing that, I fixed a couple of bugs in the first one, and did a little bit of rebasing of the second one. > I agree to the view. That part seems to be a part of > open_for_write()'s body functions. But, I'm not sure how we untangle > them at a glance, too. In the first place, I'm not sure why we need > to do that despite the file going to be overwritten from the > beginning, though.. I suspect that we're going to have to untangle things bit by bit. One place where I think we might be able to improve things is with the handling of compression suffixes (e.g. .gz, .lz4) and temp suffixes (e.g. .partial, .tmp). At present, responsibility for adding these suffixes to pathnames is spread across receivelog.c and walmethods.c in a way that, to me, looks pretty random. It's not exactly clear to me what the best design is here right now, but I think either (A) receivelog.c should take full responsibility for computing the exact filename and walmethods.c should just blindly write the data into the exact filename it's given, or else (B) receivelog.c should take no responsibility for pathname construction and the fact that there is pathname munging happening should be hidden inside walmethods.c. Right now, walmethods.c is doing pathname munging, but the munging is visible from receivelog.c, so the responsibility is spread across the two files rather than being the sole responsibility of either. What's also a little bit aggravating about this is that it doesn't feel like we're accomplishing all that much code reuse here. walmethods.c and receivelog.c are shared only between pg_basebackup and pg_receivewal, but the tar method is used only by pg_basebackup. The directory mode is shared, but actually the two programs need a bunch of different things. pg_recievelog needs a bunch of logic to deal with the possibility that it is receiving data into an existing directory and that this directory might at any time start to be used to feed a standby, or used for PITR. That's not an issue for pg_basebackup: if it fails, the whole directory will be removed, so there is no need to worry about fsyncs or padding or overwriting existing files. On the other hand, pg_basebackup needs a bunch of logic to create a .done file for each WAL segment which is not required in the case of pg_receivewal. It feels like we've got as much conditional logic as we do common logic... -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: