On Tue, Mar 4, 2025 at 7:05 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Tue, Mar 04, 2025 at 10:45:54AM +0000, Bertrand Drouvot wrote:
> > Hi,
> >
> > On Tue, Mar 04, 2025 at 07:38:14AM +0000, Bertrand Drouvot wrote:
> > > PFA 0001 a new version to also parse the ".snap".
> >
> > PFA v3 (a slightly different version) to "correctly" use the new report_error
> > introduced in v2.
>
> It looks like that CheckPointSnapBuild() also rely on sscanf() to check
> for the snapshot file extension. As seen, up-thread we can't rely on sscanf()
> to do so.
>
> Attached a c file to show the "issue":
>
> $ gcc -o sscanf_file_ext sscanf_file_ext.c
> $ ./sscanf_file_ext 0-40796E18.snap
> File: "0-40796E18.snap"
> Parse result: 2 (2 means success)
>
> $ ./sscanf_file_ext 0-40796E18.foo
> File: "0-40796E18.foo"
> Parse result: 2 (2 means success)
>
> So it looks like that, in reality, in CheckPointSnapBuild() we report a parsing
> message and "continue" even if the file name extension is not ".snap".
>
> That's not a big deal because, in pratice, it still removes the .$pid.tmp files
> (given they have the "hex" part well formated). But, it does not look like it was
> the original intent, so maybe we should also re-think CheckPointSnapBuild() and
> open a dedicated thread? (If so, I'm happy to do so).
I believe that the situation is a bit different in these cases; in
CheckPointSnapBuild () we iterate over all files in
pg_logical/snapshots directory and extract the LSN from the filenames
whereas in pg_logicalinspect case, user can pass arbitrary filename.
Therefore, as for the former cases, it might make sense to check the
file extension as well if we want to prevent a problem where someone
injects files into pg_logical/snapshots directory. But it might be too
much and I guess that there are other places where we assume that
there are no malformed files injected under $PGDATA.
>
> Note that there might be other places that may need the same kind of attention:
> pg_archivecleanup.c?
In pg_archivecleanup.c, we use sscanf() two places: for partial WAL
files and for backup history files. IIUC we check the format of the
filename using IsPartialXLogFileName() and IsBackupHistoryFileName()
before using sscanf() so I think it's okay.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com