Thread: Re: [HACKERS] PQescapeIdentifier

Re: [HACKERS] PQescapeIdentifier

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> TODO item done for 8.2:
>
> * Add PQescapeIdentifier() to libpq
>
> Someone probably needs to check this :)

Updated patch applied.  Thanks.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/libpq.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/libpq.sgml,v
retrieving revision 1.211
diff -c -c -r1.211 libpq.sgml
*** doc/src/sgml/libpq.sgml    23 May 2006 22:13:19 -0000    1.211
--- doc/src/sgml/libpq.sgml    26 Jun 2006 23:54:12 -0000
***************
*** 2279,2284 ****
--- 2279,2347 ----
  </para>
  </sect2>

+ <sect2 id="libpq-exec-escape-identifier">
+   <title>Escaping Identifier for Inclusion in SQL Commands</title>
+
+    <indexterm zone="libpq-exec-escape-identifier"><primary>PQescapeIdentifier</></>
+    <indexterm zone="libpq-exec-escape-identifier"><primary>escaping strings</></>
+
+ <para>
+ <function>PQescapeIdentifier</function> escapes a string for use
+ as an identifier name within an SQL command.  For example; table names,
+ column names, view names and user names are all identifiers.
+ Double quotes (") must be escaped to prevent them from being interpreted
+ specially by the SQL parser. <function>PQescapeIdentifier</> performs this
+ operation.
+ </para>
+
+ <tip>
+ <para>
+ It is especially important to do proper escaping when handling strings that
+ were received from an untrustworthy source.  Otherwise there is a security
+ risk: you are vulnerable to <quote>SQL injection</> attacks wherein unwanted
+ SQL commands are fed to your database.
+ </para>
+ </tip>
+
+ <para>
+ Note that it is still necessary to do escaping of identifiers when
+ using functions that support parameterized queries such as <function>PQexecParams</> or
+ its sibling routines. Only literal values are automatically escaped
+ using these functions, not identifiers.
+
+ <synopsis>
+ size_t PQescapeIdentifier (char *to, const char *from, size_t length);
+ </synopsis>
+ </para>
+
+ <para>
+ The parameter <parameter>from</> points to the first character of the string
+ that is to be escaped, and the <parameter>length</> parameter gives the
+ number of characters in this string.  A terminating zero byte is not
+ required, and should not be counted in <parameter>length</>.  (If
+ a terminating zero byte is found before <parameter>length</> bytes are
+ processed, <function>PQescapeIdentifier</> stops at the zero; the behavior
+ is thus rather like <function>strncpy</>.)
+ <parameter>to</> shall point to a
+ buffer that is able to hold at least one more character than twice
+ the value of <parameter>length</>, otherwise the behavior is
+ undefined.  A call to <function>PQescapeIdentifier</> writes an escaped
+ version of the <parameter>from</> string to the <parameter>to</>
+ buffer, replacing special characters so that they cannot cause any
+ harm, and adding a terminating zero byte.  The double quotes that
+ may surround <productname>PostgreSQL</> identifiers are not
+ included in the result string; they should be provided in the SQL
+ command that the result is inserted into.
+ </para>
+ <para>
+ <function>PQescapeIdentifier</> returns the number of characters written
+ to <parameter>to</>, not including the terminating zero byte.
+ </para>
+ <para>
+ Behavior is undefined if the <parameter>to</> and <parameter>from</>
+ strings overlap.
+ </para>
+ </sect2>

   <sect2 id="libpq-exec-escape-bytea">
    <title>Escaping Binary Strings for Inclusion in SQL Commands</title>
Index: src/interfaces/libpq/exports.txt
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/exports.txt,v
retrieving revision 1.11
diff -c -c -r1.11 exports.txt
*** src/interfaces/libpq/exports.txt    28 May 2006 22:42:05 -0000    1.11
--- src/interfaces/libpq/exports.txt    26 Jun 2006 23:54:20 -0000
***************
*** 130,132 ****
--- 130,134 ----
  PQencryptPassword         128
  PQisthreadsafe            129
  enlargePQExpBuffer        130
+ PQescapeIdentifier        131
+
Index: src/interfaces/libpq/fe-exec.c
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-exec.c,v
retrieving revision 1.186
diff -c -c -r1.186 fe-exec.c
*** src/interfaces/libpq/fe-exec.c    28 May 2006 21:13:54 -0000    1.186
--- src/interfaces/libpq/fe-exec.c    26 Jun 2006 23:54:21 -0000
***************
*** 2516,2521 ****
--- 2516,2557 ----
  }

  /*
+  * Escaping arbitrary strings to get valid SQL identifier strings.
+  *
+  * Replaces " with "".
+  *
+  * length is the length of the source string.  (Note: if a terminating NUL
+  * is encountered sooner, PQescapeIdentifier stops short of "length"; the behavior
+  * is thus rather like strncpy.)
+  *
+  * For safety the buffer at "to" must be at least 2*length + 1 bytes long.
+  * A terminating NUL character is added to the output string, whether the
+  * input is NUL-terminated or not.
+  *
+  * Returns the actual length of the output (not counting the terminating NUL).
+  */
+ size_t
+ PQescapeIdentifier(char *to, const char *from, size_t length)
+ {
+     const char *source = from;
+     char       *target = to;
+     size_t        remaining = length;
+
+     while (remaining > 0 && *source != '\0')
+     {
+         if (*source  == '"')
+             *target++ = *source;
+         *target++ = *source++;
+         remaining--;
+     }
+
+     /* Write the terminating NUL character. */
+     *target = '\0';
+
+     return target - to;
+ }
+
+ /*
   *        PQescapeBytea    - converts from binary string to the
   *        minimal encoding necessary to include the string in an SQL
   *        INSERT statement with a bytea type column as the target.
Index: src/interfaces/libpq/libpq-fe.h
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/libpq-fe.h,v
retrieving revision 1.129
diff -c -c -r1.129 libpq-fe.h
*** src/interfaces/libpq/libpq-fe.h    23 May 2006 22:13:19 -0000    1.129
--- src/interfaces/libpq/libpq-fe.h    26 Jun 2006 23:54:21 -0000
***************
*** 436,441 ****
--- 436,443 ----
                    size_t *to_length);
  extern unsigned char *PQunescapeBytea(const unsigned char *strtext,
                  size_t *retbuflen);
+ extern size_t PQescapeIdentifier(char *to, const char *from, size_t length);
+
  /* These forms are deprecated! */
  extern size_t PQescapeString(char *to, const char *from, size_t length);
  extern unsigned char *PQescapeBytea(const unsigned char *from, size_t from_length,

Re: [HACKERS] PQescapeIdentifier

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
>> * Add PQescapeIdentifier() to libpq

> Updated patch applied.  Thanks.

Have either of you inquired into the encoding-safety of this code?
It certainly looks like no consideration was given for that.

            regards, tom lane

Re: [HACKERS] PQescapeIdentifier

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> >> * Add PQescapeIdentifier() to libpq
>
> > Updated patch applied.  Thanks.
>
> Have either of you inquired into the encoding-safety of this code?
> It certainly looks like no consideration was given for that.

I thought of that but I assume we were not accepting user-supplied
identifiers for this --- that this was only for application use.  Am I
wrong?

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] PQescapeIdentifier

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> Have either of you inquired into the encoding-safety of this code?
>> It certainly looks like no consideration was given for that.

> I thought of that but I assume we were not accepting user-supplied
> identifiers for this --- that this was only for application use.  Am I
> wrong?

By definition, an escaping routine is not supposed to trust the data it
is handed.  We *will* be seeing a CVE report if this function has got
any escaping vulnerability.

If you insist on a practical example, I can certainly imagine someone
thinking it'd be cool to allow searches on a user-selected column, and
implementing that by passing the user-given column name straight into
the query with only PQescapeIdentifier for safety.

            regards, tom lane

Re: [HACKERS] PQescapeIdentifier

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> Have either of you inquired into the encoding-safety of this code?
> >> It certainly looks like no consideration was given for that.
>
> > I thought of that but I assume we were not accepting user-supplied
> > identifiers for this --- that this was only for application use.  Am I
> > wrong?
>
> By definition, an escaping routine is not supposed to trust the data it
> is handed.  We *will* be seeing a CVE report if this function has got
> any escaping vulnerability.
>
> If you insist on a practical example, I can certainly imagine someone
> thinking it'd be cool to allow searches on a user-selected column, and
> implementing that by passing the user-given column name straight into
> the query with only PQescapeIdentifier for safety.

OK, does someone want to fix it, or should I revert it?

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] PQescapeIdentifier

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> Yes, phpPgAdmin sure would.  I imagine this would be a nightmare to
> address properly, so perhaps we should remove the function :(

Well, it's fixable, cf PQescapeStringConn, but we should fix it *before*
it gets into the field not after.

            regards, tom lane

Re: [HACKERS] PQescapeIdentifier

From
Christopher Kings-Lynne
Date:
>> I thought of that but I assume we were not accepting user-supplied
>> identifiers for this --- that this was only for application use.  Am I
>> wrong?

Well, yes the plan was to accept user-supplied identifiers...

> If you insist on a practical example, I can certainly imagine someone
> thinking it'd be cool to allow searches on a user-selected column, and
> implementing that by passing the user-given column name straight into
> the query with only PQescapeIdentifier for safety.

Yes, phpPgAdmin sure would.  I imagine this would be a nightmare to
address properly, so perhaps we should remove the function :(


Re: [HACKERS] PQescapeIdentifier

From
Christopher Kings-Lynne
Date:
Hang on a second.  Has someone considered the encoding issues this might
suffer from, same as PQescapeString? I remember we discussed it briefly
and I mentioned it's outta my league to prove one way or the other...

Bruce Momjian wrote:
> Christopher Kings-Lynne wrote:
>> TODO item done for 8.2:
>>
>> * Add PQescapeIdentifier() to libpq
>>
>> Someone probably needs to check this :)
>
> Updated patch applied.  Thanks.
>
>
>
> ------------------------------------------------------------------------
>
> Index: doc/src/sgml/libpq.sgml
> ===================================================================
> RCS file: /cvsroot/pgsql/doc/src/sgml/libpq.sgml,v
> retrieving revision 1.211
> diff -c -c -r1.211 libpq.sgml
> *** doc/src/sgml/libpq.sgml    23 May 2006 22:13:19 -0000    1.211
> --- doc/src/sgml/libpq.sgml    26 Jun 2006 23:54:12 -0000
> ***************
> *** 2279,2284 ****
> --- 2279,2347 ----
>   </para>
>   </sect2>
>
> + <sect2 id="libpq-exec-escape-identifier">
> +   <title>Escaping Identifier for Inclusion in SQL Commands</title>
> +
> +    <indexterm zone="libpq-exec-escape-identifier"><primary>PQescapeIdentifier</></>
> +    <indexterm zone="libpq-exec-escape-identifier"><primary>escaping strings</></>
> +
> + <para>
> + <function>PQescapeIdentifier</function> escapes a string for use
> + as an identifier name within an SQL command.  For example; table names,
> + column names, view names and user names are all identifiers.
> + Double quotes (") must be escaped to prevent them from being interpreted
> + specially by the SQL parser. <function>PQescapeIdentifier</> performs this
> + operation.
> + </para>
> +
> + <tip>
> + <para>
> + It is especially important to do proper escaping when handling strings that
> + were received from an untrustworthy source.  Otherwise there is a security
> + risk: you are vulnerable to <quote>SQL injection</> attacks wherein unwanted
> + SQL commands are fed to your database.
> + </para>
> + </tip>
> +
> + <para>
> + Note that it is still necessary to do escaping of identifiers when
> + using functions that support parameterized queries such as <function>PQexecParams</> or
> + its sibling routines. Only literal values are automatically escaped
> + using these functions, not identifiers.
> +
> + <synopsis>
> + size_t PQescapeIdentifier (char *to, const char *from, size_t length);
> + </synopsis>
> + </para>
> +
> + <para>
> + The parameter <parameter>from</> points to the first character of the string
> + that is to be escaped, and the <parameter>length</> parameter gives the
> + number of characters in this string.  A terminating zero byte is not
> + required, and should not be counted in <parameter>length</>.  (If
> + a terminating zero byte is found before <parameter>length</> bytes are
> + processed, <function>PQescapeIdentifier</> stops at the zero; the behavior
> + is thus rather like <function>strncpy</>.)
> + <parameter>to</> shall point to a
> + buffer that is able to hold at least one more character than twice
> + the value of <parameter>length</>, otherwise the behavior is
> + undefined.  A call to <function>PQescapeIdentifier</> writes an escaped
> + version of the <parameter>from</> string to the <parameter>to</>
> + buffer, replacing special characters so that they cannot cause any
> + harm, and adding a terminating zero byte.  The double quotes that
> + may surround <productname>PostgreSQL</> identifiers are not
> + included in the result string; they should be provided in the SQL
> + command that the result is inserted into.
> + </para>
> + <para>
> + <function>PQescapeIdentifier</> returns the number of characters written
> + to <parameter>to</>, not including the terminating zero byte.
> + </para>
> + <para>
> + Behavior is undefined if the <parameter>to</> and <parameter>from</>
> + strings overlap.
> + </para>
> + </sect2>
>
>    <sect2 id="libpq-exec-escape-bytea">
>     <title>Escaping Binary Strings for Inclusion in SQL Commands</title>
> Index: src/interfaces/libpq/exports.txt
> ===================================================================
> RCS file: /cvsroot/pgsql/src/interfaces/libpq/exports.txt,v
> retrieving revision 1.11
> diff -c -c -r1.11 exports.txt
> *** src/interfaces/libpq/exports.txt    28 May 2006 22:42:05 -0000    1.11
> --- src/interfaces/libpq/exports.txt    26 Jun 2006 23:54:20 -0000
> ***************
> *** 130,132 ****
> --- 130,134 ----
>   PQencryptPassword         128
>   PQisthreadsafe            129
>   enlargePQExpBuffer        130
> + PQescapeIdentifier        131
> +
> Index: src/interfaces/libpq/fe-exec.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-exec.c,v
> retrieving revision 1.186
> diff -c -c -r1.186 fe-exec.c
> *** src/interfaces/libpq/fe-exec.c    28 May 2006 21:13:54 -0000    1.186
> --- src/interfaces/libpq/fe-exec.c    26 Jun 2006 23:54:21 -0000
> ***************
> *** 2516,2521 ****
> --- 2516,2557 ----
>   }
>
>   /*
> +  * Escaping arbitrary strings to get valid SQL identifier strings.
> +  *
> +  * Replaces " with "".
> +  *
> +  * length is the length of the source string.  (Note: if a terminating NUL
> +  * is encountered sooner, PQescapeIdentifier stops short of "length"; the behavior
> +  * is thus rather like strncpy.)
> +  *
> +  * For safety the buffer at "to" must be at least 2*length + 1 bytes long.
> +  * A terminating NUL character is added to the output string, whether the
> +  * input is NUL-terminated or not.
> +  *
> +  * Returns the actual length of the output (not counting the terminating NUL).
> +  */
> + size_t
> + PQescapeIdentifier(char *to, const char *from, size_t length)
> + {
> +     const char *source = from;
> +     char       *target = to;
> +     size_t        remaining = length;
> +
> +     while (remaining > 0 && *source != '\0')
> +     {
> +         if (*source  == '"')
> +             *target++ = *source;
> +         *target++ = *source++;
> +         remaining--;
> +     }
> +
> +     /* Write the terminating NUL character. */
> +     *target = '\0';
> +
> +     return target - to;
> + }
> +
> + /*
>    *        PQescapeBytea    - converts from binary string to the
>    *        minimal encoding necessary to include the string in an SQL
>    *        INSERT statement with a bytea type column as the target.
> Index: src/interfaces/libpq/libpq-fe.h
> ===================================================================
> RCS file: /cvsroot/pgsql/src/interfaces/libpq/libpq-fe.h,v
> retrieving revision 1.129
> diff -c -c -r1.129 libpq-fe.h
> *** src/interfaces/libpq/libpq-fe.h    23 May 2006 22:13:19 -0000    1.129
> --- src/interfaces/libpq/libpq-fe.h    26 Jun 2006 23:54:21 -0000
> ***************
> *** 436,441 ****
> --- 436,443 ----
>                     size_t *to_length);
>   extern unsigned char *PQunescapeBytea(const unsigned char *strtext,
>                   size_t *retbuflen);
> + extern size_t PQescapeIdentifier(char *to, const char *from, size_t length);
> +
>   /* These forms are deprecated! */
>   extern size_t PQescapeString(char *to, const char *from, size_t length);
>   extern unsigned char *PQescapeBytea(const unsigned char *from, size_t from_length,


Re: [HACKERS] PQescapeIdentifier

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> Hang on a second.  Has someone considered the encoding issues this might
> suffer from, same as PQescapeString?

That was the point I raised when I saw the commit.

My advice is we shouldn't have PQescapeIdentifier at all.
PQescapeIdentifierConn would be the thing to define,
parallel to PQescapeStringConn.

            regards, tom lane

Re: [HACKERS] PQescapeIdentifier

From
Bruce Momjian
Date:
Tom Lane wrote:
> Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> > Hang on a second.  Has someone considered the encoding issues this might
> > suffer from, same as PQescapeString?
>
> That was the point I raised when I saw the commit.
>
> My advice is we shouldn't have PQescapeIdentifier at all.
> PQescapeIdentifierConn would be the thing to define,
> parallel to PQescapeStringConn.

Patch reverted, TODO updated to:

        o Add PQescapeIdentifierConn()

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +