On Thu, Mar 6, 2025 at 11:02 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Thu, Mar 06, 2025 at 11:35:58AM -0800, Masahiko Sawada wrote:
> > On Thu, Mar 6, 2025 at 12:11 AM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > === 1
> > >
> > > -parse_snapshot_filename(const char *filename)
> > > +parse_snapshot_filename(char *filename)
> > >
> > > Why?
> >
> > Sorry, that's a leftover that I attempted. We should use 'const'.
>
> No problem at all! Okay, re-added in the attached.
>
> > >
> > > === 2
> > >
> > > - if (sscanf(filename, "%X-%X", &hi, &lo) != 2)
> > > + if (sscanf(filename, "%X-%X.snap", &hi, &lo) != 2)
> > >
> > > We could replace (sscanf(filename, "%X-%X.snap", &hi, &lo) != 2) with
> > > (sscanf(filename, "%X-%X.foo", &hi, &lo) != 2) and the regression tests would
> > > still pass.
> > >
> > > So, I think it's better to remove the .snap here as it could give the "wrong"
> > > impression that it's "useful".
> > >
> > > The attached removes the .snap and adds a comment like:
> > >
> > > "
> > > * Note: We deliberately don't use "%X-%X.snap" because sscanf only counts
> > > * converted values (%X), not literal text matches.
> > > "
> >
> > Yes, but we include the suffix when doing sscanf() in many other
> > places. I don't see the specific benefit of doing it in another way
> > only in this case.
>
> The idea was to not give the "wrong" impression that it's "useful" to add the
> suffix. That said I agree, it would be better to also update the other places
> where the suffix is being used. I think I'll start a dedicated thread for
> that (to at least put some comments around those places).
>
> As far our current patch then I propose to add a comment though, like in the
> attached?
>
> > > === 3
> > >
> > > + /*
> > > + * Bring back the LSN to the snapshot file format and compare
> > > + * it to the given name to see if the extracted LSN is sane.
> > > + */
> > > + sprintf(tmpfname, "%X-%X.snap", hi, lo);
> > > + if (strcmp(tmpfname, filename) != 0)
> > >
> > > The idea was also to ensure that there are no extra characters between the LSN
> > > values and the .snap extension:
> >
> > Right. How about the following comment?
> >
> > Bring back the extracted LSN to the snapshot file format and compare
> > it to the given filename. This check strictly checks if the given filename
> > follows the format of the snapshot filename.
>
> Yeah sounds good, done that way in the attached.
>
> > FYI I've analyzed the idea of extracting LSN from the filename and
> > bringing it back to the format to validate the given filename. IIUC we
> > strictly validate the given filename. For example, we accept
> > '0-ABC.snap' but not '0-0ABC.snap',
>
> Yeah, and so '0A-ABC.snap' for example.
>
> > which is probably fine.
>
> I think so, as long as the snapshot files are generated with %X (and not %08X
> for example) in SnapBuildSerialize():
>
> "
> sprintf(path, "%s/%X-%X.snap",
> PG_LOGICAL_SNAPSHOTS_DIR,
> LSN_FORMAT_ARGS(lsn));
> "
>
Thank you for updating the patch. I've made some cosmetic changes and
attached a new version patch. Could you review it?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com