Re: Decision by Monday: PQescapeString() vs. encoding violation - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Decision by Monday: PQescapeString() vs. encoding violation
Date
Msg-id f9e447e5-d94f-45f0-8ccb-2570cd4419c9@dunslane.net
Whole thread Raw
Responses Re: Decision by Monday: PQescapeString() vs. encoding violation
List pgsql-hackers
On 2025-02-14 Fr 8:27 PM, Noah Misch wrote:
> The security team has a question below about how best to proceed with a recent
> behavior change.
>
> Commit 5dc1e42b4fa6a4434afa7d7cdcf0291351a7b873 for this week's CVE-2025-1094
> changed how PQescapeString()[1] reacts to input that is not valid in the
> client encoding.  Before that commit, the function would ignore encoding
> problems except at the end of the string.  Now, it replaces the bad sequence
> up to the length implied by the first byte.  For example, if UTF8 input has
> 0xc2 followed by an ASCII byte, the function removes both bytes.
>
> Jeff Davis reported to the security team that
> http://www.unicode.org/reports/tr36/#Ill-Formed_Subsequences forbids something
> like this, saying a UTF-8 converter "must not consume the [second byte] if it
> continues".  (That's my summary, not his.  He might reply to add color here.)
> While PQescapeString() is not a UTF-8 converter, that standard still felt to
> multiple security team members like a decent argument for removing only the
> invalid 0xc2, not the byte following it.  UTF8 is the most important encoding,
> and other encodings tolerate this.  Either way, the server will report an
> encoding error.  The difference doesn't have functional consequences if one
> simply puts the function result in a query.  The difference could matter for
> debugging or if applications are postprocessing the PQescapeString() result in
> some way.  Postprocessing is not supported, but we'd still like to do the best
> thing for applications that may already be doing it.
>
> Security team members disagreed on whether next week's releases are the last
> reasonable chance to change this, or whether changing it in e.g. May would be
> reasonable.  If applications make changes to cope with the new behavior, that
> could be an argument against further change.
>
> Question for all: would you switch to the "remove fewer bytes" behavior in
> next week's releases, switch later, or switch never?  Why so?  Please answer
> in the next 24hr if possible, given the little time until we wrap next week's
> releases on Monday.  I regret the late notice.
>
> I'm attaching a WIP patch from Andres Freund.  We may use it to adopt the
> "remove fewer bytes" behavior, if that's the decision.
>
> Thanks,
> nm
>
>
> [1] The commit changed other functions, but PQescapeString() is most
> interesting for this discussion.  Others have ways to report errors, or they
> have reason to believe the input is already checked.  New code should be using
> the others and checking the error indicator.



Removing fewer bytes seems like the right thing to do, now or later. The 
WIP patch itself is trivial. If it weren't I'd be opposed to doing it at 
the last minute. Even so I'm a bit nervous - last minute changes have 
bitten us before.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




pgsql-hackers by date:

Previous
From: Christoph Berg
Date:
Subject: Re: pg17.3 PQescapeIdentifier() ignores len
Next
From: Andres Freund
Date:
Subject: Re: pg17.3 PQescapeIdentifier() ignores len