Re: [BUGS] Bug in byteaout code in all PostgreSQL versions - Mailing list pgsql-patches

From Joe Conway
Subject Re: [BUGS] Bug in byteaout code in all PostgreSQL versions
Msg-id 3FCA59BC.9030409@joeconway.com
In response to Re: [BUGS] Bug in byteaout code in all PostgreSQL versions  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-patches
Bruce Momjian wrote:

> Joe Conway has an updated version of this he will be applying shortly.
> Thanks.
>
> Joe, please make sure you CC this person once your patch is applied.

I just applied the attached patch to CVS HEAD, and equivalent ones to
the 7.3 and 7.4 stable branches. I have also attached the output of
Sergey's test program with the patch applied. Thanks for the nice,
self-contained test case!

Joe

Index: doc/src/sgml/datatype.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/datatype.sgml,v
retrieving revision 1.131
diff -c -r1.131 datatype.sgml
*** doc/src/sgml/datatype.sgml    16 Nov 2003 20:29:16 -0000    1.131
--- doc/src/sgml/datatype.sgml    29 Nov 2003 05:28:38 -0000
***************
*** 1076,1084 ****
      strings are distinguished from characters strings by two
      characteristics: First, binary strings specifically allow storing
      octets of value zero and other <quote>non-printable</quote>
!     octets.  Second, operations on binary strings process the actual
!     bytes, whereas the encoding and processing of character strings
!     depends on locale settings.
     </para>

     <para>
--- 1076,1085 ----
      strings are distinguished from characters strings by two
      characteristics: First, binary strings specifically allow storing
      octets of value zero and other <quote>non-printable</quote>
!     octets (defined as octets outside the range 32 to 126).
!     Second, operations on binary strings process the actual bytes,
!     whereas the encoding and processing of character strings depends
!     on locale settings.
     </para>

     <para>
***************
*** 1131,1144 ****
         <entry><literal>\\</literal></entry>
        </row>

       </tbody>
      </tgroup>
     </table>

     <para>
!     Note that the result in each of the examples in <xref linkend="datatype-binary-sqlesc"> was exactly one
!     octet in length, even though the output representation of the zero
!     octet and backslash are more than one character.
     </para>

     <para>
--- 1132,1156 ----
         <entry><literal>\\</literal></entry>
        </row>

+       <row>
+        <entry>0 to 31 and 127 to 255</entry>
+        <entry><quote>non-printable</quote> octets</entry>
+        <entry><literal>'\\<replaceable>xxx'</></literal> (octal value)</entry>
+        <entry><literal>SELECT '\\001'::bytea;</literal></entry>
+        <entry><literal>\001</literal></entry>
+       </row>
+
       </tbody>
      </tgroup>
     </table>

     <para>
!     The requirement to escape <quote>non-printable</quote> octets actually
!     varies depending on locale settings. In some instances you can get away
!     with leaving them unescaped. Note that the result in each of the examples
!     in <xref linkend="datatype-binary-sqlesc"> was exactly one octet in
!     length, even though the output representation of the zero octet and
!     backslash are more than one character.
     </para>

     <para>
***************
*** 1206,1212 ****
        <row>
         <entry>32 to 126</entry>
         <entry><quote>printable</quote> octets</entry>
!        <entry>ASCII representation</entry>
         <entry><literal>SELECT '\\176'::bytea;</literal></entry>
         <entry><literal>~</literal></entry>
        </row>
--- 1218,1224 ----
        <row>
         <entry>32 to 126</entry>
         <entry><quote>printable</quote> octets</entry>
!        <entry>client character set representation</entry>
         <entry><literal>SELECT '\\176'::bytea;</literal></entry>
         <entry><literal>~</literal></entry>
        </row>
Index: src/backend/utils/adt/varlena.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/adt/varlena.c,v
retrieving revision 1.106
diff -c -r1.106 varlena.c
*** src/backend/utils/adt/varlena.c    25 Sep 2003 06:58:05 -0000    1.106
--- src/backend/utils/adt/varlena.c    29 Nov 2003 05:28:40 -0000
***************
*** 186,195 ****
      {
          if (*vp == '\\')
              len += 2;
!         else if (isprint((unsigned char) *vp))
!             len++;
!         else
              len += 4;
      }
      rp = result = (char *) palloc(len);
      vp = VARDATA(vlena);
--- 186,195 ----
      {
          if (*vp == '\\')
              len += 2;
!         else if ((unsigned char) *vp < 0x20 || (unsigned char) *vp > 0x7e)
              len += 4;
+         else
+             len++;
      }
      rp = result = (char *) palloc(len);
      vp = VARDATA(vlena);
***************
*** 200,208 ****
              *rp++ = '\\';
              *rp++ = '\\';
          }
!         else if (isprint((unsigned char) *vp))
!             *rp++ = *vp;
!         else
          {
              val = *vp;
              rp[0] = '\\';
--- 200,206 ----
              *rp++ = '\\';
              *rp++ = '\\';
          }
!         else if ((unsigned char) *vp < 0x20 || (unsigned char) *vp > 0x7e)
          {
              val = *vp;
              rp[0] = '\\';
***************
*** 213,218 ****
--- 211,218 ----
              rp[1] = DIG(val & 03);
              rp += 4;
          }
+         else
+             *rp++ = *vp;
      }
      *rp = '\0';
      PG_RETURN_CSTRING(result);
Index: src/interfaces/libpq/fe-exec.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/interfaces/libpq/fe-exec.c,v
retrieving revision 1.153
diff -c -r1.153 fe-exec.c
*** src/interfaces/libpq/fe-exec.c    31 Oct 2003 17:43:10 -0000    1.153
--- src/interfaces/libpq/fe-exec.c    29 Nov 2003 05:28:45 -0000
***************
*** 2261,2267 ****
   *        '\0' == ASCII  0 == \\000
   *        '\'' == ASCII 39 == \'
   *        '\\' == ASCII 92 == \\\\
!  *        anything >= 0x80 ---> \\ooo (where ooo is an octal expression)
   */
  unsigned char *
  PQescapeBytea(const unsigned char *bintext, size_t binlen, size_t *bytealen)
--- 2261,2268 ----
   *        '\0' == ASCII  0 == \\000
   *        '\'' == ASCII 39 == \'
   *        '\\' == ASCII 92 == \\\\
!  *        anything < 0x20, or > 0x7e ---> \\ooo
!  *                                      (where ooo is an octal expression)
   */
  unsigned char *
  PQescapeBytea(const unsigned char *bintext, size_t binlen, size_t *bytealen)
***************
*** 2280,2286 ****
      vp = bintext;
      for (i = binlen; i > 0; i--, vp++)
      {
!         if (*vp == 0 || *vp >= 0x80)
              len += 5;            /* '5' is for '\\ooo' */
          else if (*vp == '\'')
              len += 2;
--- 2281,2287 ----
      vp = bintext;
      for (i = binlen; i > 0; i--, vp++)
      {
!         if (*vp < 0x20 || *vp > 0x7e)
              len += 5;            /* '5' is for '\\ooo' */
          else if (*vp == '\'')
              len += 2;
***************
*** 2299,2305 ****

      for (i = binlen; i > 0; i--, vp++)
      {
!         if (*vp == 0 || *vp >= 0x80)
          {
              (void) sprintf(rp, "\\\\%03o", *vp);
              rp += 5;
--- 2300,2306 ----

      for (i = binlen; i > 0; i--, vp++)
      {
!         if (*vp < 0x20 || *vp > 0x7e)
          {
              (void) sprintf(rp, "\\\\%03o", *vp);
              rp += 5;
[root@dev misc]# psql test -c "select name, setting from pg_settings where name like 'lc%'"
    name     |   setting
-------------+--------------
 lc_collate  | ru_RU.KOI8-R
 lc_ctype    | ru_RU.KOI8-R
 lc_messages | ru_RU.KOI8-R
 lc_monetary | ru_RU.KOI8-R
 lc_numeric  | ru_RU.KOI8-R
 lc_time     | ru_RU.KOI8-R
(6 rows)

[root@dev misc]# psql -l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+-----------
 template0 | postgres | SQL_ASCII
 template1 | postgres | SQL_ASCII
 test      | postgres | KOI8
(3 rows)

[root@dev misc]# ./bytea-test test test.data

Send to server:
\\000\\001\\002\\003\\004\\005\\006\\007\\010\\016\\017\\020\\021\\022\\023\\024\\025\\026\\027\\030\\031\\032\\033\\034\\035\\036\\037!"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\\177\\200\\201\\202\\203\\204\\205\\206\\207\\210\\211\\212\\213\\214\\215\\216\\217\\220\\221\\222\\223\\224\\225\\226\\227\\230\\231\\232\\233\\234\\235\\236\\237\\240\\241\\242\\243\\244\\245\\246\\247\\250\\251\\252\\253\\254\\255\\256\\257\\260\\261\\262\\263\\264\\265\\266\\267\\270\\271\\272\\273\\274\\275\\276\\277\\300\\301\\302\\303\\304\\305\\306\\307\\310\\311\\312\\313\\314\\315\\316\\317\\320\\321\\322\\323\\324\\325\\326\\327\\330\\331\\332\\333\\334\\335\\336\\337\\340\\341\\342\\343\\344\\345\\346\\347\\350\\351\\352\\353\\354\\355\\356\\357\\360\\361\\362\\363\\364\\365\\366\\367\\370\\371\\372\\373\\374\\375\\376\\377

Recieve from server:
\000\001\002\003\004\005\006\007\010\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\177\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\257\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\322\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\365\366\367\370\371\372\373\374\375\376\377

Test successfully done.
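
For readers who want to see the output rule in isolation, here is a minimal standalone sketch of the escaping that the patched byteaout hunk performs. It is not the PostgreSQL source: the function name escape_bytea_output is hypothetical, malloc stands in for palloc, and the octal digits are computed inline rather than with the DIG macro. The point it illustrates is that the range test (below 0x20 or above 0x7e) does not consult the locale, whereas isprint() can report high-bit octets as printable under a locale such as ru_RU.KOI8-R.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL): escape binary data using
 * the same rule as the patched byteaout hunk above.
 *   backslash            -> two backslashes
 *   < 0x20 or > 0x7e     -> backslash plus three octal digits
 *   anything else        -> copied unchanged
 * Returns a malloc'd NUL-terminated string, or NULL if allocation fails.
 */
static char *
escape_bytea_output(const unsigned char *data, size_t datalen)
{
    size_t      len = 1;        /* room for the trailing NUL */
    size_t      i;
    char       *result;
    char       *rp;

    /* First pass: compute the escaped length. */
    for (i = 0; i < datalen; i++)
    {
        if (data[i] == '\\')
            len += 2;
        else if (data[i] < 0x20 || data[i] > 0x7e)
            len += 4;
        else
            len++;
    }

    result = malloc(len);
    if (result == NULL)
        return NULL;
    rp = result;

    /* Second pass: write the escaped form. */
    for (i = 0; i < datalen; i++)
    {
        unsigned char c = data[i];

        if (c == '\\')
        {
            *rp++ = '\\';
            *rp++ = '\\';
        }
        else if (c < 0x20 || c > 0x7e)
        {
            rp[0] = '\\';
            rp[1] = (char) ('0' + ((c >> 6) & 03));
            rp[2] = (char) ('0' + ((c >> 3) & 07));
            rp[3] = (char) ('0' + (c & 07));
            rp += 4;
        }
        else
            *rp++ = (char) c;
    }
    *rp = '\0';
    return result;
}
```

Because the comparison is done on unsigned char values, the result is the same whatever LC_CTYPE is set to, which is what makes the "Send" and "Receive" strings in the test output above line up octet for octet.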
