Re: BUG #18828: Crash when pg_get_logical_snapshot_meta() passed empty string - Mailing list pgsql-bugs

From Bertrand Drouvot
Subject Re: BUG #18828: Crash when pg_get_logical_snapshot_meta() passed empty string
Date
Msg-id Z8qZ8PeWec6XdhxZ@ip-10-97-1-34.eu-west-3.compute.internal
Whole thread Raw
In response to Re: BUG #18828: Crash when pg_get_logical_snapshot_meta() passed empty string  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: BUG #18828: Crash when pg_get_logical_snapshot_meta() passed empty string
List pgsql-bugs
Hi,

On Thu, Mar 06, 2025 at 11:35:58AM -0800, Masahiko Sawada wrote:
> On Thu, Mar 6, 2025 at 12:11 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > === 1
> >
> > -parse_snapshot_filename(const char *filename)
> > +parse_snapshot_filename(char *filename)
> >
> > Why?
> 
> Sorry, that's a leftover that I attempted. We should use 'const'.

No problem at all! Okay, re-added in the attached.

> >
> > === 2
> >
> > -   if (sscanf(filename, "%X-%X", &hi, &lo) != 2)
> > +   if (sscanf(filename, "%X-%X.snap", &hi, &lo) != 2)
> >
> > We could replace (sscanf(filename, "%X-%X.snap", &hi, &lo) != 2) with
> > (sscanf(filename, "%X-%X.foo", &hi, &lo) != 2) and the regression tests would
> > still pass.
> >
> > So, I think it's better to remove the .snap here as it could give the "wrong"
> > impression that it's "useful".
> >
> > The attached removes the .snap and adds a comment like:
> >
> > "
> >  * Note: We deliberately don't use "%X-%X.snap" because sscanf only counts
> >  * converted values (%X), not literal text matches.
> > "
> 
> Yes, but we include the suffix when doing sscanf() in many other
> places. I don't see the specific benefit of doing it in another way
> only in this case.

The idea was to not give the "wrong" impression that it's "useful" to add the
suffix. That said I agree, it would be better to also update the other places
where the suffix is being used. I think I'll start a dedicated thread for
that (to at least put some comments around those places).

As far our current patch then I propose to add a comment though, like in the
attached?

> > === 3
> >
> > +   /*
> > +    * Bring back the LSN to the snapshot file format and compare
> > +    * it to  the given name to see if the extracted LSN is sane.
> > +    */
> > +   sprintf(tmpfname, "%X-%X.snap", hi, lo);
> > +   if (strcmp(tmpfname, filename) != 0)
> >
> > The idea was also to ensure that there are no extra characters between the LSN
> > values and the .snap extension:
> 
> Right. How about the following comment?
> 
> Bring back the extracted LSN to the snapshot file format and compare
> it to the given filename. This check strictly checks if the given filename
> follows the format of the snapshot filename.

Yeah sounds good, done that way in the attached.

> FYI I've analyzed the idea of extracting LSN from the filename and
> bringing it back to the format to validate the given filename. IIUC we
> strictly validate the given filename. For example, we accept
> '0-ABC.snap' but not '0-0ABC.snap',

Yeah, and so '0A-ABC.snap' for example.

> which is probably fine.

I think so, as long as the snapshot files are generated with %X (and not %08X
for example) in SnapBuildSerialize():

"
    sprintf(path, "%s/%X-%X.snap",
            PG_LOGICAL_SNAPSHOTS_DIR,
            LSN_FORMAT_ARGS(lsn));
"

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-bugs by date:

Previous
From: Christopher Inokuchi
Date:
Subject: Window Functions with identical PARTITION BY and ORDER BY clauses evaluated separately
Next
From: Álvaro Herrera
Date:
Subject: Re: BUG #18832: Segfault in GrantLockLocal