Re: [GENERAL] psql weird behaviour with charset encodings - Mailing list pgsql-hackers

From Noah Misch
Subject Re: [GENERAL] psql weird behaviour with charset encodings
Date
Msg-id 20150618004756.GB415598@tornado.leadboat.com
Whole thread Raw
In response to Re: [GENERAL] psql weird behaviour with charset encodings  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [GENERAL] psql weird behaviour with charset encodings  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Wed, Jun 03, 2015 at 05:25:45PM +0900, Michael Paquier wrote:
> On Tue, Jun 2, 2015 at 4:19 PM, Michael Paquier  <michael.paquier@gmail.com> wrote:
> > On Sun, May 24, 2015 at 2:43 AM, Noah Misch <noah@leadboat.com> wrote:
> > > It would be good to purge the code of precisions on "s" conversion specifiers,
> > > then Assert(!pointflag) in fmtstr() to catch new introductions.  I won't plan
> > > to do it myself, but it would be a nice little defensive change.
> >
> > This sounds like a good protection idea, but as it impacts existing
> > backend code relying in sprintf's port version we should only do the
> > assertion in HEAD in my opinion, and mention it in the release notes of the
> > next major version at the time a patch in this area is applied. I guess

Adding the assertion would be master-only.  We don't necessarily release-note
C API changes.

> > that we had better backpatch the places using .*s though on back-branches.

I would tend to back-patch only the ones that cause interesting bugs.  For
example, we should not reach the read.c elog() calls anyway, so it's not a big
deal if the GNU libc bug makes them a bit less helpful in back branches.
(Thanks for the list of code sites; it was more complete than anything I had.)
So far, only tar.c looks harmed enough to back-patch.

> Attached is a patch purging a bunch of places from using %.*s, this will
> make the code more solid when facing non-ASCII strings. I let pg_upgrade
> and pg_basebackup code paths alone as it reduces the code lisibility by
> moving out of this separator. We may want to fix them though if file path
> names have non-ASCII characters, but it seems less critical.

To add the assertion, we must of course fix all uses.

Having seen the patch I requested, I don't like the result as much as I
expected to like it.  The patched code is definitely harder to read and write:

> @@ -1534,7 +1541,10 @@ parseNodeString(void)
>          return_value = _readDeclareCursorStmt();
>      else
>      {
> -        elog(ERROR, "badly formatted node string \"%.32s\"...", token);
> +        char buf[33];
> +        memcpy(buf, token, 32);
> +        buf[33] = '\0';
> +        elog(ERROR, "badly formatted node string \"%s\"...", buf);
>          return_value = NULL;    /* keep compiler quiet */
>      }

(Apropos, that terminator belongs in buf[32], not buf[33].)

Perhaps we're better off setting aside the whole idea, or forcing use of
snprintf.c on configurations having the bug?

Thanks,
nm



pgsql-hackers by date:

Previous
From: Brendan Jurd
Date:
Subject: Re: [PATCH] Function to get size of asynchronous notification queue
Next
From: Michael Paquier
Date:
Subject: Re: [GENERAL] psql weird behaviour with charset encodings