Thread: bytea datatype documentation patch

bytea datatype documentation patch

From
Joe Conway
Date:
Here's a second bytea documentation patch. This one significantly
expands the "Binary Data" section added by Bruce recently.

-- Joe

Re: bytea datatype documentation patch

From
Bruce Momjian
Date:
Patch applied.  Thanks.

---------------------------------------------------------------------------


> Here's a second bytea documentation patch. This one significantly
> expands the "Binary Data" section added by Bruce recently.
>
> -- Joe

[ text/html is unsupported, treating like TEXT/PLAIN ]

> *** sgml.orig/datatype.sgml    Mon Nov 19 01:05:00 2001
> --- sgml/datatype.sgml    Tue Nov 20 01:04:54 2001
> ***************
> *** 965,987 ****
>
>     </sect1>
>
> !   <sect1 id="datatype-binary">
> !    <title>Binary Data</title>
>
>      <para>
> !     The <type>bytea</type> data type allows storage of binary data,
> !     specifically allowing storage of NULLs which are entered as
> !     <literal>'\\000'</>. The first backslash is interpreted by the
> !     single quotes, and the second is recognized by <type>bytea</> and
> !     precedes a three digit octal value. For a similar reason, a
> !     backslash must be entered into a field as <literal>'\\\\'</> or
> !     <literal>'\\134'</>. You may also have to escape line feeds and
> !     carriage return if your interface automatically translates these. It
> !     can store values of any length. <type>Bytea</> is a non-standard
> !     data type.
>      </para>
>     </sect1>
>
>     <sect1 id="datatype-datetime">
>      <title>Date/Time Types</title>
>
> --- 965,1260 ----
>
>     </sect1>
>
> !  <sect1 id="datatype-binary">
> !   <title>Binary Strings</title>
> !    <para>
> !     The <type>bytea</type> data type allows storage of binary strings.
> !    </para>
> !
> !    <table tocentry="1">
> !     <title>Binary String Types</title>
> !     <tgroup cols="3">
> !      <thead>
> !       <row>
> !        <entry>Type Name</entry>
> !        <entry>Storage</entry>
> !        <entry>Description</entry>
> !       </row>
> !      </thead>
> !      <tbody>
> !       <row>
> !        <entry>bytea</entry>
> !        <entry>4 bytes plus the actual string</entry>
> !        <entry>Variable (not specifically limited)
> !               length binary string</entry>
> !       </row>
> !      </tbody>
> !     </tgroup>
> !    </table>
> !
> !    <para>
> !     A binary string is a sequence of octets that does not have either a
> !     character set or collation associated with it.  Bytea specifically
> !     allows storage of NULLs and other 'non-printable' <acronym>ASCII
> !     </acronym> characters.
> !    </para>
> !
> !    <para>
> !     Certain <acronym>ASCII</acronym> characters MUST be escaped (but all
> !     characters MAY be escaped) when used as part of a string literal in an
> !     <acronym>SQL</acronym> statement. In general, to escape a character, it
> !     is converted into the three digit octal number equal to the decimal
> !     <acronym>ASCII</acronym> value, and preceeded by two backslashes. The
> !     single quote (') and backslash (\) characters have special alternate
> !     escape sequences. Details are in
> !     <xref linkend="datatype-binary-sqlesc">.
> !    </para>
> !
> !    <table id="datatype-binary-sqlesc">
> !     <title><acronym>SQL</acronym> Literal Escaped <acronym>ASCII</acronym>
> !            Characters</title>
> !     <tgroup cols="5">
> !      <thead>
> !       <row>
> !        <entry>Decimal <acronym>ASCII</acronym> Value</entry>
> !        <entry>Description</entry>
> !        <entry>Input Escaped Representation</entry>
> !        <entry>Example</entry>
> !        <entry>Printed Result</entry>
> !       </row>
> !      </thead>
> !
> !      <tbody>
> !       <row>
> !        <entry> <literal> 0 </literal> </entry>
> !        <entry> null byte </entry>
> !        <entry> <literal> '\\000' </literal> </entry>
> !        <entry> <literal> select '\\000'::bytea; </literal> </entry>
> !        <entry> <literal> \000 </literal></entry>
> !       </row>
> !
> !       <row>
> !        <entry> <literal> 39 </literal> </entry>
> !        <entry> single quote </entry>
> !        <entry> <literal> '\\'' or '\\047' </literal> </entry>
> !        <entry> <literal> select '\''::bytea; </literal></entry>
> !        <entry> <literal> ' </literal></entry>
> !       </row>
> !
> !       <row>
> !        <entry> <literal>92</literal> </entry>
> !        <entry> backslash </entry>
> !        <entry> <literal> '\\\\' or '\\134' </literal> </entry>
> !        <entry> <literal> select '\\\\'::bytea; </literal></entry>
> !        <entry> <literal> \\ </literal></entry>
> !       </row>
> !
> !      </tbody>
> !     </tgroup>
> !    </table>
> !
> !    <para>
> !     Note that the result in each of the examples above was exactly one
> !     byte in length, even though the output representation of the null byte
> !     and backslash are more than one character. Bytea output characters
> !     are also escaped. In general, each "non-printable" character is
> !     converted into the three digit octal number equal to its decimal
> !     <acronym>ASCII</acronym> value, and preceeded by one backslash. Most
> !     "printable" characters are represented by their standard
> !     <acronym>ASCII</acronym> representation. The backslash (\) character
> !     has a special alternate output representation. Details are in
> !     <xref linkend="datatype-binary-resesc">.
> !    </para>
> !
> !    <table id="datatype-binary-resesc">
> !     <title><acronym>SQL</acronym> Output Escaped <acronym>ASCII</acronym>
> !            Characters</title>
> !     <tgroup cols="5">
> !      <thead>
> !       <row>
> !        <entry>Decimal <acronym>ASCII</acronym> Value</entry>
> !        <entry>Description</entry>
> !        <entry>Output Escaped Representation</entry>
> !        <entry>Example</entry>
> !        <entry>Printed Result</entry>
> !       </row>
> !      </thead>
> !
> !      <tbody>
> !
> !       <row>
> !        <entry> <literal> 39 </literal> </entry>
> !        <entry> single quote </entry>
> !        <entry> <literal> ' </literal> </entry>
> !        <entry> <literal> select '\\047'::bytea; </literal></entry>
> !        <entry> <literal> ' </literal></entry>
> !       </row>
> !
> !       <row>
> !        <entry> <literal> 92 </literal> </entry>
> !        <entry> backslash </entry>
> !        <entry> <literal> \\ </literal> </entry>
> !        <entry> <literal> select '\\134'::bytea; </literal></entry>
> !        <entry> <literal> \\ </literal></entry>
> !       </row>
> !
> !       <row>
> !        <entry> <literal> 0 to 31 and 127 to 255 </literal> </entry>
> !        <entry> non-printable characters </entry>
> !        <entry> <literal> \### (octal value) </literal> </entry>
> !        <entry> <literal> select '\\001'::bytea; </literal> </entry>
> !        <entry> <literal> \001 </literal></entry>
> !       </row>
> !
> !       <row>
> !        <entry> <literal> 32 to 126 </literal> </entry>
> !        <entry> printable characters </entry>
> !        <entry> ASCII representation </entry>
> !        <entry> <literal>  select '\\176'::bytea; </literal> </entry>
> !        <entry> <literal> ~ </literal></entry>
> !       </row>
> !
> !      </tbody>
> !     </tgroup>
> !    </table>
> !
> !    <para>
> !     <acronym>SQL</acronym> string literals (input strings) must be
> !     preceeded with two backslashes due to the fact that they must pass
> !     through two parsers in the PostgreSQL backend. The first backslash
> !     is interpreted as an escape character by the string literal parser,
> !     and therefore is consumed, leaving the characters that follow it.
> !     The second backslash is recognized by <type>bytea</> input function
> !     as the prefix of a three digit octal value. For example, a string
> !     literal passed to the backend as <literal>'\\001'</literal> becomes
> !     <literal>'\001'</literal> after passing through the string literal
> !     parser. The <literal>'\001'</literal> is then sent to the bytea
> !     input function, where it is converted to a single byte with a decimal
> !     <acronym>ASCII</acronym> value of 1.
> !    </para>
> !
> !    <para>
> !     For a similar reason, a backslash must be input as
> !     <literal>'\\\\'</literal> (or <literal>'\\134'</literal>). The first
> !     and third backslashes are interpreted as escape characters by the
> !     string literal parser, and therefore are consumed, leaving the
> !     second and forth backslashes untouched. The second and forth
> !     backslashes are recognized by <type>bytea</> input function as a single
> !     backslash. For example, a string literal passed to the backend as
> !     <literal>'\\\\'</literal> becomes <literal>'\\'</literal> after passing
> !     through the string literal parser. The <literal>'\\'</literal> is then
> !     sent to the bytea input function, where it is converted to a single
> !     byte with a decimal <acronym>ASCII</acronym> value of 92.
> !    </para>
> !
> !    <para>
> !     A single quote is a bit different in that it must be input as
> !     <literal>'\''</literal> (or <literal>'\\134'</literal>), NOT as
> !     <literal>'\\''</literal>. This is because, while the literal parser
> !     interprets the single quote as a special character, and will consume
> !     the single backslash, the bytea input function does NOT recognize
> !     a single quote as a special character. Therefore a string
> !     literal passed to the backend as <literal>'\''</literal> becomes
> !     <literal>'''</literal> after passing through the string literal
> !     parser. The <literal>'''</literal> is then sent to the bytea
> !     input function, where it is retains its single byte decimal
> !     <acronym>ASCII</acronym> value of 39.
> !    </para>
>
>      <para>
> !     Depending on the front end to PostgreSQL you use, you may have
> !     additional work to do in terms of escaping and unescaping bytea
> !     strings. For example, you may also have to escape line feeds and
> !     carriage return if your interface automatically translates these.
> !     Or you may have to double up on backslashes if the parser for your
> !     language or choice also treats them as an escape character.
>      </para>
> +
> +    <sect2 id="datatype-binary-compat">
> +     <title>Compatibility</title>
> +     <para>
> +      Bytea provides most of the functionality of the SQL99 binary string
> +      type per SQL99 section 4.3. A comparison of PostgreSQL bytea and SQL99
> +      Binary Strings is presented in
> +      <xref linkend="datatype-binary-compat-comp">.
> +     </para>
> +
> +     <table id="datatype-binary-compat-comp">
> +      <title>Comparison of SQL99 Binary String and BYTEA types</title>
> +      <tgroup cols="2">
> +       <thead>
> +        <row>
> +         <entry>SQL99</entry>
> +         <entry>BYTEA</entry>
> +        </row>
> +       </thead>
> +
> +       <tbody>
> +        <row>
> +         <entry> Name of data type BINARY LARGE OBJECT or BLOB </entry>
> +         <entry> Name of data type BYTEA </entry>
> +        </row>
> +
> +        <row>
> +         <entry> Sequence of octets that does not have either a character set
> +                 or collation associated with it. </entry>
> +         <entry> same </entry>
> +        </row>
> +
> +        <row>
> +         <entry> Described by a binary data type descriptor containing the
> +                 name of the data type and the maximum length
> +                 in octets</entry>
> +         <entry> Described by a binary data type descriptor containing the
> +                 name of the data type with no specific maximum length
> +                 </entry>
> +        </row>
> +
> +        <row>
> +         <entry> All binary strings are mutually comparable in accordance
> +                 with the rules of comparison predicates.</entry>
> +         <entry> same</entry>
> +        </row>
> +
> +        <row>
> +         <entry> Binary string values can only be compared for equality.
> +                 </entry>
> +         <entry> Binary string values can be compared for equality, greater
> +                 than, greater than or equal, less than, less than or equal
> +                 </entry>
> +        </row>
> +
> +        <row>
> +         <entry> Operators operating on and returning binary strings
> +                 include concatenation, substring, overlay, and trim</entry>
> +         <entry> Operators operating on and returning binary strings
> +                 include concatenation, substring, and trim. The <literal>
> +                 'leading'</literal> and <literal>'trailing'</literal>
> +                 arguments for trim are not yet implemented.
> +                 </entry>
> +        </row>
> +
> +        <row>
> +         <entry> Other operators involving binary strings
> +                 include length, position, and the like predicate</entry>
> +         <entry> same</entry>
> +        </row>
> +
> +        <row>
> +         <entry> A binary string literal is comprised of an even number of
> +                 hexidecimal digits, in single quotes, preceeded by "X",
> +                 e.g. X'1a43fe'</entry>
> +         <entry> A binary string literal is comprised of ASCII characters
> +                 escaped according to the rules shown in
> +                 <xref linkend="datatype-binary-sqlesc"> </entry>
> +        </row>
> +       </tbody>
> +      </tgroup>
> +     </table>
> +    </sect2>
>     </sect1>
>
> +
>     <sect1 id="datatype-datetime">
>      <title>Date/Time Types</title>
>

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: bytea datatype documentation patch

From
Peter Eisentraut
Date:
Joe Conway writes:

> Here's a second bytea documentation patch. This one significantly
> expands the "Binary Data" section added by Bruce recently.

This is good material, just a couple of comments.

I feel that basing the whole discussion on what "ASCII" characters can be
stored, are nonprintable, must be escaped, etc. is not appropriate.
Bytea doesn't store characters, it stores bytes (or octets, as you wish).
The correspondence between bytes and characters is old-fashioned and
pretty much does not exist in SQL except for a few places such as
OCTET_LENGTH.

Conversely, the discussion about whether text should be able to cope with
zero bytes is nonsensical because text stores characters and not bytes.

So I would base this discussion on the premise "bytea stores binary data"
(insert examples).

Some stylistic issues:

bytea => <type>bytea</type>

NULLs => zero bytes/bytes of value zero ("NULL" is too overloaded)

'non-printable' => <quote>nonprintable</quote>

MUST => <emphasis>must</emphasis>

--
Peter Eisentraut   peter_e@gmx.net


Re: bytea datatype documentation patch

From
Joe Conway
Date:
Peter Eisentraut wrote:

> So I would base this discussion on the premise "bytea stores binary data"
> (insert examples).
>
> Some stylistic issues:
>
> bytea => <type>bytea</type>
>
> NULLs => zero bytes/bytes of value zero ("NULL" is too overloaded)
>
> 'non-printable' => <quote>nonprintable</quote>
>
> MUST => <emphasis>must</emphasis>
>

Here's a patch against *CVS tip* to address Peter's comments. Please let
me know what you think!

Thanks,

Joe

*** sgml.orig/datatype.sgml    Tue Nov 20 07:42:44 2001
--- sgml/datatype.sgml    Tue Nov 20 18:07:22 2001
***************
*** 984,990 ****
       <tbody>
        <row>
         <entry>bytea</entry>
!        <entry>4 bytes plus the actual string</entry>
         <entry>Variable (not specifically limited)
                length binary string</entry>
        </row>
--- 984,990 ----
       <tbody>
        <row>
         <entry>bytea</entry>
!        <entry>4 bytes plus the actual binary string</entry>
         <entry>Variable (not specifically limited)
                length binary string</entry>
        </row>
***************
*** 994,1022 ****

     <para>
      A binary string is a sequence of octets that does not have either a
!     character set or collation associated with it.  Bytea specifically
!     allows storage of NULLs and other 'non-printable' <acronym>ASCII
!     </acronym> characters.
     </para>

     <para>
!     Certain <acronym>ASCII</acronym> characters MUST be escaped (but all
!     characters MAY be escaped) when used as part of a string literal in an
!     <acronym>SQL</acronym> statement. In general, to escape a character, it
!     is converted into the three digit octal number equal to the decimal
!     <acronym>ASCII</acronym> value, and preceeded by two backslashes. The
!     single quote (') and backslash (\) characters have special alternate
!     escape sequences. Details are in
      <xref linkend="datatype-binary-sqlesc">.
     </para>

     <table id="datatype-binary-sqlesc">
!     <title><acronym>SQL</acronym> Literal Escaped <acronym>ASCII</acronym>
!            Characters</title>
      <tgroup cols="5">
       <thead>
        <row>
!        <entry>Decimal <acronym>ASCII</acronym> Value</entry>
         <entry>Description</entry>
         <entry>Input Escaped Representation</entry>
         <entry>Example</entry>
--- 994,1021 ----

     <para>
      A binary string is a sequence of octets that does not have either a
!     character set or collation associated with it.  <type>Bytea</type>
!     specifically allows storing octets of zero value and other
!     <quote>non-printable</quote> octets.
     </para>

     <para>
!     Octets of certain values <emphasis>must</emphasis> be escaped (but all
!     octet values <emphasis>may</emphasis> be escaped) when used as part of
!     a string literal in an <acronym>SQL</acronym> statement. In general,
!     to escape an octet, it is converted into the three digit octal number
!     equivalent of its decimal octet value, and preceeded by two
!     backslashes. Octets with the decimal values 39 (single quote), and 92
!     (backslash), have special alternate escape sequences. Details are in
      <xref linkend="datatype-binary-sqlesc">.
     </para>

     <table id="datatype-binary-sqlesc">
!     <title><acronym>SQL</acronym> Literal Escaped Octets</title>
      <tgroup cols="5">
       <thead>
        <row>
!        <entry>Decimal Octet Value</entry>
         <entry>Description</entry>
         <entry>Input Escaped Representation</entry>
         <entry>Example</entry>
***************
*** 1027,1033 ****
       <tbody>
        <row>
         <entry> <literal> 0 </literal> </entry>
!        <entry> null byte </entry>
         <entry> <literal> '\\000' </literal> </entry>
         <entry> <literal> select '\\000'::bytea; </literal> </entry>
         <entry> <literal> \000 </literal></entry>
--- 1026,1032 ----
       <tbody>
        <row>
         <entry> <literal> 0 </literal> </entry>
!        <entry> zero octet </entry>
         <entry> <literal> '\\000' </literal> </entry>
         <entry> <literal> select '\\000'::bytea; </literal> </entry>
         <entry> <literal> \000 </literal></entry>
***************
*** 1055,1078 ****

     <para>
      Note that the result in each of the examples above was exactly one
!     byte in length, even though the output representation of the null byte
!     and backslash are more than one character. Bytea output characters
!     are also escaped. In general, each "non-printable" character is
!     converted into the three digit octal number equal to its decimal
!     <acronym>ASCII</acronym> value, and preceeded by one backslash. Most
!     "printable" characters are represented by their standard
!     <acronym>ASCII</acronym> representation. The backslash (\) character
!     has a special alternate output representation. Details are in
!     <xref linkend="datatype-binary-resesc">.
     </para>

     <table id="datatype-binary-resesc">
!     <title><acronym>SQL</acronym> Output Escaped <acronym>ASCII</acronym>
!            Characters</title>
      <tgroup cols="5">
       <thead>
        <row>
!        <entry>Decimal <acronym>ASCII</acronym> Value</entry>
         <entry>Description</entry>
         <entry>Output Escaped Representation</entry>
         <entry>Example</entry>
--- 1054,1076 ----

     <para>
      Note that the result in each of the examples above was exactly one
!     octet in length, even though the output representation of the zero
!     octet and backslash are more than one character. <type>Bytea</type>
!     output octets are also escaped. In general, each
!     <quote>non-printable</quote> octet decimal value is converted into
!     its equivalent three digit octal value, and preceeded by one backslash.
!     Most <quote>printable</quote> octets are represented by their standard
!     representation in the client character set. The octet with decimal
!     value 92 (backslash) has a special alternate output representation.
!     Details are in <xref linkend="datatype-binary-resesc">.
     </para>

     <table id="datatype-binary-resesc">
!     <title><acronym>SQL</acronym> Output Escaped Octets</title>
      <tgroup cols="5">
       <thead>
        <row>
!        <entry>Decimal Octet Value</entry>
         <entry>Description</entry>
         <entry>Output Escaped Representation</entry>
         <entry>Example</entry>
***************
*** 1100,1106 ****

        <row>
         <entry> <literal> 0 to 31 and 127 to 255 </literal> </entry>
!        <entry> non-printable characters </entry>
         <entry> <literal> \### (octal value) </literal> </entry>
         <entry> <literal> select '\\001'::bytea; </literal> </entry>
         <entry> <literal> \001 </literal></entry>
--- 1098,1104 ----

        <row>
         <entry> <literal> 0 to 31 and 127 to 255 </literal> </entry>
!        <entry> <quote>non-printable</quote> octets </entry>
         <entry> <literal> \### (octal value) </literal> </entry>
         <entry> <literal> select '\\001'::bytea; </literal> </entry>
         <entry> <literal> \001 </literal></entry>
***************
*** 1108,1115 ****

        <row>
         <entry> <literal> 32 to 126 </literal> </entry>
!        <entry> printable characters </entry>
!        <entry> ASCII representation </entry>
         <entry> <literal>  select '\\176'::bytea; </literal> </entry>
         <entry> <literal> ~ </literal></entry>
        </row>
--- 1106,1113 ----

        <row>
         <entry> <literal> 32 to 126 </literal> </entry>
!        <entry> <quote>printable</quote> octets </entry>
!        <entry> client character set representation </entry>
         <entry> <literal>  select '\\176'::bytea; </literal> </entry>
         <entry> <literal> ~ </literal></entry>
        </row>
***************
*** 1123,1198 ****
      preceeded with two backslashes due to the fact that they must pass
      through two parsers in the PostgreSQL backend. The first backslash
      is interpreted as an escape character by the string literal parser,
!     and therefore is consumed, leaving the characters that follow it.
!     The second backslash is recognized by <type>bytea</> input function
      as the prefix of a three digit octal value. For example, a string
      literal passed to the backend as <literal>'\\001'</literal> becomes
      <literal>'\001'</literal> after passing through the string literal
!     parser. The <literal>'\001'</literal> is then sent to the bytea
!     input function, where it is converted to a single byte with a decimal
!     <acronym>ASCII</acronym> value of 1.
     </para>

     <para>
      For a similar reason, a backslash must be input as
      <literal>'\\\\'</literal> (or <literal>'\\134'</literal>). The first
!     and third backslashes are interpreted as escape characters by the
      string literal parser, and therefore are consumed, leaving the
      second and forth backslashes untouched. The second and forth
!     backslashes are recognized by <type>bytea</> input function as a single
!     backslash. For example, a string literal passed to the backend as
!     <literal>'\\\\'</literal> becomes <literal>'\\'</literal> after passing
!     through the string literal parser. The <literal>'\\'</literal> is then
!     sent to the bytea input function, where it is converted to a single
!     byte with a decimal <acronym>ASCII</acronym> value of 92.
     </para>

     <para>
      A single quote is a bit different in that it must be input as
!     <literal>'\''</literal> (or <literal>'\\134'</literal>), NOT as
!     <literal>'\\''</literal>. This is because, while the literal parser
!     interprets the single quote as a special character, and will consume
!     the single backslash, the bytea input function does NOT recognize
!     a single quote as a special character. Therefore a string
      literal passed to the backend as <literal>'\''</literal> becomes
      <literal>'''</literal> after passing through the string literal
!     parser. The <literal>'''</literal> is then sent to the bytea
!     input function, where it is retains its single byte decimal
!     <acronym>ASCII</acronym> value of 39.
     </para>

     <para>
      Depending on the front end to PostgreSQL you use, you may have
!     additional work to do in terms of escaping and unescaping bytea
!     strings. For example, you may also have to escape line feeds and
!     carriage return if your interface automatically translates these.
!     Or you may have to double up on backslashes if the parser for your
!     language or choice also treats them as an escape character.
     </para>

     <sect2 id="datatype-binary-compat">
      <title>Compatibility</title>
      <para>
!      Bytea provides most of the functionality of the SQL99 binary string
!      type per SQL99 section 4.3. A comparison of PostgreSQL bytea and SQL99
!      Binary Strings is presented in
       <xref linkend="datatype-binary-compat-comp">.
      </para>

      <table id="datatype-binary-compat-comp">
!      <title>Comparison of SQL99 Binary String and BYTEA types</title>
       <tgroup cols="2">
        <thead>
         <row>
          <entry>SQL99</entry>
!         <entry>BYTEA</entry>
         </row>
        </thead>

        <tbody>
         <row>
!         <entry> Name of data type BINARY LARGE OBJECT or BLOB </entry>
!         <entry> Name of data type BYTEA </entry>
         </row>

         <row>
--- 1121,1201 ----
      preceeded with two backslashes due to the fact that they must pass
      through two parsers in the PostgreSQL backend. The first backslash
      is interpreted as an escape character by the string literal parser,
!     and therefore is consumed, leaving the octets that follow.
!     The second backslash is recognized by <type>bytea</type> input function
      as the prefix of a three digit octal value. For example, a string
      literal passed to the backend as <literal>'\\001'</literal> becomes
      <literal>'\001'</literal> after passing through the string literal
!     parser. The <literal>'\001'</literal> is then sent to the
!     <type>bytea</type> input function, where it is converted to a single
!     octet with a decimal value of 1.
     </para>

     <para>
      For a similar reason, a backslash must be input as
      <literal>'\\\\'</literal> (or <literal>'\\134'</literal>). The first
!     and third backslashes are interpreted as escape octets by the
      string literal parser, and therefore are consumed, leaving the
      second and forth backslashes untouched. The second and forth
!     backslashes are recognized by the <type>bytea</type> input function
!     as a single backslash. For example, a string literal passed to the
!     backend as <literal>'\\\\'</literal> becomes <literal>'\\'</literal>
!     after passing through the string literal parser. The
!     <literal>'\\'</literal> is then sent to the <type>bytea</type> input
!     function, where it is converted to a single octet with a decimal
!     value of 92.
     </para>

     <para>
      A single quote is a bit different in that it must be input as
!     <literal>'\''</literal> (or <literal>'\\134'</literal>),
!     <emphasis>not</emphasis> as <literal>'\\''</literal>. This is because,
!     while the literal parser interprets the single quote as a special
!     character, and will consume the single backslash, the
!     <type>bytea</type> input function does <emphasis>not</emphasis>
!     recognize a single quote as a special octet. Therefore a string
      literal passed to the backend as <literal>'\''</literal> becomes
      <literal>'''</literal> after passing through the string literal
!     parser. The <literal>'''</literal> is then sent to the
!     <type>bytea</type> input function, where it is retains its single
!     octet decimal value of 39.
     </para>

     <para>
      Depending on the front end to PostgreSQL you use, you may have
!     additional work to do in terms of escaping and unescaping
!     <type>bytea</type> strings. For example, you may also have to escape
!     line feeds and carriage returns if your interface automatically
!     translates these. Or you may have to double up on backslashes if
!     the parser for your language or choice also treats them as an
!     escape octet.
     </para>

     <sect2 id="datatype-binary-compat">
      <title>Compatibility</title>
      <para>
!      <type>Bytea</type> provides most of the functionality of the binary
!      string type per SQL99 section 4.3. A comparison of SQL99 Binary
!      Strings and PostgreSQL <type>bytea</type> is presented in
       <xref linkend="datatype-binary-compat-comp">.
      </para>

      <table id="datatype-binary-compat-comp">
!      <title>Comparison of SQL99 Binary String and PostgreSQL
!      <type>BYTEA</type> types</title>
       <tgroup cols="2">
        <thead>
         <row>
          <entry>SQL99</entry>
!         <entry><type>BYTEA</type></entry>
         </row>
        </thead>

        <tbody>
         <row>
!         <entry> Name of data type <type>BINARY LARGE OBJECT</type>
!                 or <type>BLOB</type> </entry>
!         <entry> Name of data type <type>BYTEA</type> </entry>
         </row>

         <row>
***************
*** 1242,1250 ****

         <row>
          <entry> A binary string literal is comprised of an even number of
!                 hexidecimal digits, in single quotes, preceeded by "X",
!                 e.g. X'1a43fe'</entry>
!         <entry> A binary string literal is comprised of ASCII characters
                  escaped according to the rules shown in
                  <xref linkend="datatype-binary-sqlesc"> </entry>
         </row>
--- 1245,1253 ----

         <row>
          <entry> A binary string literal is comprised of an even number of
!                 hexidecimal digits, in single quotes, preceeded by <quote>X</quote>,
!                 e.g. <literal>X'1a43fe'</literal></entry>
!         <entry> A binary string literal is comprised of octets
                  escaped according to the rules shown in
                  <xref linkend="datatype-binary-sqlesc"> </entry>
         </row>

Re: bytea datatype documentation patch

From
Bruce Momjian
Date:
Patch applied.  Thanks.

---------------------------------------------------------------------------


> Peter Eisentraut wrote:
>
> > So I would base this discussion on the premise "bytea stores binary data"
> > (insert examples).
> >
> > Some stylistic issues:
> >
> > bytea => <type>bytea</type>
> >
> > NULLs => zero bytes/bytes of value zero ("NULL" is too overloaded)
> >
> > 'non-printable' => <quote>nonprintable</quote>
> >
> > MUST => <emphasis>must</emphasis>
> >
>
> Here's a patch against *CVS tip* to address Peter's comments. Please let
> me know what you think!
>
> Thanks,

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026