Hi,
On Thu, Mar 06, 2025 at 11:35:58AM -0800, Masahiko Sawada wrote:
> On Thu, Mar 6, 2025 at 12:11 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > === 1
> >
> > -parse_snapshot_filename(const char *filename)
> > +parse_snapshot_filename(char *filename)
> >
> > Why?
>
> Sorry, that's a leftover that I attempted. We should use 'const'.
No problem at all! Okay, re-added in the attached.
> >
> > === 2
> >
> > - if (sscanf(filename, "%X-%X", &hi, &lo) != 2)
> > + if (sscanf(filename, "%X-%X.snap", &hi, &lo) != 2)
> >
> > We could replace (sscanf(filename, "%X-%X.snap", &hi, &lo) != 2) with
> > (sscanf(filename, "%X-%X.foo", &hi, &lo) != 2) and the regression tests would
> > still pass.
> >
> > So, I think it's better to remove the .snap here as it could give the "wrong"
> > impression that it's "useful".
> >
> > The attached removes the .snap and adds a comment like:
> >
> > "
> > * Note: We deliberately don't use "%X-%X.snap" because sscanf only counts
> > * converted values (%X), not literal text matches.
> > "
>
> Yes, but we include the suffix when doing sscanf() in many other
> places. I don't see the specific benefit of doing it in another way
> only in this case.
The idea was to not give the "wrong" impression that it's "useful" to add the
suffix. That said I agree, it would be better to also update the other places
where the suffix is being used. I think I'll start a dedicated thread for
that (to at least put some comments around those places).
As far our current patch then I propose to add a comment though, like in the
attached?
> > === 3
> >
> > + /*
> > + * Bring back the LSN to the snapshot file format and compare
> > + * it to the given name to see if the extracted LSN is sane.
> > + */
> > + sprintf(tmpfname, "%X-%X.snap", hi, lo);
> > + if (strcmp(tmpfname, filename) != 0)
> >
> > The idea was also to ensure that there are no extra characters between the LSN
> > values and the .snap extension:
>
> Right. How about the following comment?
>
> Bring back the extracted LSN to the snapshot file format and compare
> it to the given filename. This check strictly checks if the given filename
> follows the format of the snapshot filename.
Yeah sounds good, done that way in the attached.
> FYI I've analyzed the idea of extracting LSN from the filename and
> bringing it back to the format to validate the given filename. IIUC we
> strictly validate the given filename. For example, we accept
> '0-ABC.snap' but not '0-0ABC.snap',
Yeah, and so '0A-ABC.snap' for example.
> which is probably fine.
I think so, as long as the snapshot files are generated with %X (and not %08X
for example) in SnapBuildSerialize():
"
sprintf(path, "%s/%X-%X.snap",
PG_LOGICAL_SNAPSHOTS_DIR,
LSN_FORMAT_ARGS(lsn));
"
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com