Thread: pg17.3 PQescapeIdentifier() ignores len

pg17.3 PQescapeIdentifier() ignores len

From
Justin Pryzby
Date:
I found errors in our sql log after upgrading to 17.3.

error_severity | ERROR
message        | schema "rptcache.44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214" does not exist
query          | copy
"rptcache.44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214"."44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214"from
stdin

The copy command is from pygresql's inserttable(), which does:

    do {
        t = strchr(s, '.');
        if (!t)
            t = s + strlen(s);
        table = PQescapeIdentifier(self->cnx, s, (size_t)(t - s));
        fprintf(stderr, "table %s len %ld => %s\n", s, t-s, table);
        if (bufpt < bufmax)
            bufpt += snprintf(bufpt, (size_t)(bufmax - bufpt), "%s", table);
        PQfreemem(table);
        s = t;
        if (*s && bufpt < bufmax)
            *bufpt++ = *s++;
    } while (*s);

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.

python3 -c "import pg; db=pg.DB('postgres'); db.inserttable('child.a000000000000', [1])")
table child.a000000000000 len 5 => "child.a000000000000"
table a000000000000 len 13 => "a000000000000"

-- 
Justin



Re: pg17.3 PQescapeIdentifier() ignores len

From
Ranier Vilela
Date:
Em qui., 13 de fev. de 2025 às 13:51, Justin Pryzby <pryzby@telsasoft.com> escreveu:
I found errors in our sql log after upgrading to 17.3.

error_severity | ERROR
message        | schema "rptcache.44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214" does not exist
query          | copy "rptcache.44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214"."44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214" from stdin

The copy command is from pygresql's inserttable(), which does:

    do {
        t = strchr(s, '.');
        if (!t)
            t = s + strlen(s);
        table = PQescapeIdentifier(self->cnx, s, (size_t)(t - s));
        fprintf(stderr, "table %s len %ld => %s\n", s, t-s, table);
        if (bufpt < bufmax)
            bufpt += snprintf(bufpt, (size_t)(bufmax - bufpt), "%s", table);
        PQfreemem(table);
        s = t;
        if (*s && bufpt < bufmax)
            *bufpt++ = *s++;
    } while (*s);

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.
Interesting, Coverity has some new reports regarding PQescapeIdentifier.

CID 1591290: (#1 of 1): Out-of-bounds access (OVERRUN)
2. alloc_strlen: Allocating insufficient memory for the terminating null of the string. [Note: The source code implementation of the function has been overridden by a builtin model.]

Until now, I was in disbelief.

best regards,
Ranier Vilela

Re: pg17.3 PQescapeIdentifier() ignores len

From
Tom Lane
Date:
Justin Pryzby <pryzby@telsasoft.com> writes:
> The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.

Ugh, yes.  Need something like the attached.

FTR, 5dc1e42b4 et al were quite subtle patches done under extreme time
pressure.  I wonder if they have any other issues.  More eyes on those
patches would be welcome, now that they are public.

            regards, tom lane

diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index e97ad02542..120d4d032e 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -4224,7 +4224,7 @@ PQescapeInternal(PGconn *conn, const char *str, size_t len, bool as_ident)
     char       *rp;
     int            num_quotes = 0; /* single or double, depending on as_ident */
     int            num_backslashes = 0;
-    size_t        input_len = strlen(str);
+    size_t        input_len = strnlen(str, len);
     size_t        result_size;
     char        quote_char = as_ident ? '"' : '\'';
     bool        validated_mb = false;
@@ -4274,7 +4274,7 @@ PQescapeInternal(PGconn *conn, const char *str, size_t len, bool as_ident)
             if (!validated_mb)
             {
                 if (pg_encoding_verifymbstr(conn->client_encoding, s, remaining)
-                    != strlen(s))
+                    != remaining)
                 {
                     libpq_append_conn_error(conn, "invalid multibyte character");
                     return NULL;

Re: pg17.3 PQescapeIdentifier() ignores len

From
Tom Lane
Date:
Ranier Vilela <ranier.vf@gmail.com> writes:
> Interesting, Coverity has some new reports regarding PQescapeIdentifier.

> CID 1591290: (#1 of 1): Out-of-bounds access (OVERRUN)
> 2. alloc_strlen: Allocating insufficient memory for the terminating null of
> the string. [Note: The source code implementation of the function has been
> overridden by a builtin model.]

That's not new, we've been seeing those for awhile.  I've been
ignoring them on the grounds that (a) if the code actually had such a
problem, valgrind testing would have found it, and (b) the message is
saying in so many words that they're ignoring our code in favor of
somebody's apparently-inaccurate model of said code.

            regards, tom lane



Re: pg17.3 PQescapeIdentifier() ignores len

From
Ranier Vilela
Date:
Em qui., 13 de fev. de 2025 às 16:05, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Ranier Vilela <ranier.vf@gmail.com> writes:
> Interesting, Coverity has some new reports regarding PQescapeIdentifier.

> CID 1591290: (#1 of 1): Out-of-bounds access (OVERRUN)
> 2. alloc_strlen: Allocating insufficient memory for the terminating null of
> the string. [Note: The source code implementation of the function has been
> overridden by a builtin model.]

That's not new, we've been seeing those for awhile.  I've been
ignoring them on the grounds that (a) if the code actually had such a
problem, valgrind testing would have found it, and (b) the message is
saying in so many words that they're ignoring our code in favor of
somebody's apparently-inaccurate model of said code.
Thanks Tom, extra care is needed when analyzing these reports.

best regards,
Ranier Vilela

Re: pg17.3 PQescapeIdentifier() ignores len

From
Nathan Bossart
Date:
On Thu, Feb 13, 2025 at 02:00:09PM -0500, Tom Lane wrote:
> Justin Pryzby <pryzby@telsasoft.com> writes:
>> The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.
> 
> Ugh, yes.  Need something like the attached.

Your patch looks right to me.

-- 
nathan



Re: pg17.3 PQescapeIdentifier() ignores len

From
Ranier Vilela
Date:
Em qui., 13 de fev. de 2025 às 16:00, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Justin Pryzby <pryzby@telsasoft.com> writes:
> The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.

Ugh, yes.  Need something like the attached.

FTR, 5dc1e42b4 et al were quite subtle patches done under extreme time
pressure.  I wonder if they have any other issues.  More eyes on those
patches would be welcome, now that they are public.
Passes on standard tests at Windows 64 bits, msvc 2022 64 bits.

best regards,
Ranier Vilela

Re: pg17.3 PQescapeIdentifier() ignores len

From
Andres Freund
Date:
Hi,

On 2025-02-13 14:00:09 -0500, Tom Lane wrote:
> Justin Pryzby <pryzby@telsasoft.com> writes:
> > The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.
> 
> Ugh, yes.  Need something like the attached.

I just pushed this fix, together with an expansion of test_escape.c. With the
expanded test both uses of strlen() are detected.


> FTR, 5dc1e42b4 et al were quite subtle patches done under extreme time
> pressure.  I wonder if they have any other issues. More eyes on those
> patches would be welcome, now that they are public.

Indeed.

Greetings,

Andres Freund



Re: pg17.3 PQescapeIdentifier() ignores len

From
Christoph Berg
Date:
Re: Andres Freund
> > > The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.
> > 
> > Ugh, yes.  Need something like the attached.
> 
> I just pushed this fix, together with an expansion of test_escape.c. With the
> expanded test both uses of strlen() are detected.

FTR, this is also caught by pygresql's regression tests:

test_inserttable_with_dotted_table_name
(tests.test_classic_connection.TestInserttable.test_inserttable_with_dotted_table_name)... ERROR
 

https://ci.debian.net/packages/p/pygresql/testing/amd64/57838998/
https://qa.debian.org/excuses.php?package=postgresql-17

What's missing in the PG regression tests to see that problem?

Christoph



Re: pg17.3 PQescapeIdentifier() ignores len

From
Andres Freund
Date:
Hi,

On 2025-02-15 13:33:54 +0100, Christoph Berg wrote:
> Re: Andres Freund
> > > > The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.
> > > 
> > > Ugh, yes.  Need something like the attached.
> > 
> > I just pushed this fix, together with an expansion of test_escape.c. With the
> > expanded test both uses of strlen() are detected.
> 
> FTR, this is also caught by pygresql's regression tests:
> 
> test_inserttable_with_dotted_table_name
(tests.test_classic_connection.TestInserttable.test_inserttable_with_dotted_table_name)... ERROR
 
> 
> https://ci.debian.net/packages/p/pygresql/testing/amd64/57838998/
> https://qa.debian.org/excuses.php?package=postgresql-17
> 
> What's missing in the PG regression tests to see that problem?

Well, the expanded tests added as part of the fix would catch it, but I agree,
it's a problem this wasn't caught beforehand.

I don't think that common uses of PQescapeIdentifier/Literal are likely to
catch the problem, so it's perhaps not too surprising it wasn't caught. Which,
I guess, shows that we really need more explicit edge-case coverage of at
least the most crucial APIs (we barely have any).  There's pretty much no way
that pg_regress or TAP test style tests are going to catch a problem like
this.

Greetings,

Andres Freund



Re: pg17.3 PQescapeIdentifier() ignores len

From
Christoph Berg
Date:
Re: Andres Freund
> > What's missing in the PG regression tests to see that problem?
> 
> Well, the expanded tests added as part of the fix would catch it, but I agree,
> it's a problem this wasn't caught beforehand.

Oh sorry, I was actually skimming the git log to see if there is a
test, but then failed to realize there is one. Thanks!

> I don't think that common uses of PQescapeIdentifier/Literal are likely to
> catch the problem, so it's perhaps not too surprising it wasn't caught. Which,
> I guess, shows that we really need more explicit edge-case coverage of at
> least the most crucial APIs (we barely have any).  There's pretty much no way
> that pg_regress or TAP test style tests are going to catch a problem like
> this.

What I can do is to trigger regression tests on all packages on
apt.postgresql.org after the minor releases have been built and then
raise any flags before the release goes out.

Except that pygresql isn't yet a package on apt.pg.o... will fix that
now. This time, the problem was caught by Debian's CI machinery.

Christoph



Re: pg17.3 PQescapeIdentifier() ignores len

From
Andres Freund
Date:
Hi,

On 2025-02-15 17:55:12 +0100, Christoph Berg wrote:
> Re: Andres Freund
> > I don't think that common uses of PQescapeIdentifier/Literal are likely to
> > catch the problem, so it's perhaps not too surprising it wasn't caught. Which,
> > I guess, shows that we really need more explicit edge-case coverage of at
> > least the most crucial APIs (we barely have any).  There's pretty much no way
> > that pg_regress or TAP test style tests are going to catch a problem like
> > this.
> 
> What I can do is to trigger regression tests on all packages on
> apt.postgresql.org after the minor releases have been built and then
> raise any flags before the release goes out.

I think that'd be *really* helpful. Of course that does require somebody
watching and raising an alarm...

Do you have ongoing package builds for sid or such?


> Except that pygresql isn't yet a package on apt.pg.o... will fix that
> now. This time, the problem was caught by Debian's CI machinery.

Are there regular outomated rebuilds that could tell us of such a problem
between the release being stamped and actually made?

Greetings,

Andres Freund



Re: pg17.3 PQescapeIdentifier() ignores len

From
Christoph Berg
Date:
Re: Andres Freund
> I think that'd be *really* helpful. Of course that does require somebody
> watching and raising an alarm...
> 
> Do you have ongoing package builds for sid or such?

What I am doing anyway is to trigger the regression test of each
package once a month (randomly distributed over the month so I'm not
getting swamped with failures) and then try to keep this page as
"green" as possible:

https://jengus.postgresql.org/view/Testsuite/

> Are there regular outomated rebuilds that could tell us of such a problem
> between the release being stamped and actually made?

The current proposal would be to simply build the minor releases from
the Monday tarballs, do the extended testing on Tuesday/Wednesday by
triggering all tests from that Jenkins page, and hopefully complain
before Thursday.

There are daily builds of all PG server branches, but
1) there are too many temporary failures so I don't have enough time to chase them
2) these packages are not used as basis for regression testing the other packages

https://jengus.postgresql.org/view/Snapshot/

1) might improve once the networking (hello arm64) and IO (hello
s390x) issues of some of the build machines are fixed.

2) might be something we should give a try. Either it works, or it
turns out there are too many weird minor-version-skew problems that
aren't actual bugs.

It would still not catch any last-minute regression from security
patches, though.

Christoph