optimize md5_text - Mailing list pgsql-patches

From Neil Conway
Subject optimize md5_text
Msg-id 421C2D53.1020207@samurai.com
Responses Re: optimize md5_text
List pgsql-patches
This patch optimizes the md5_text() function (which is used to implement
the md5() SQL-level function). The old code did the following:

1. de-toast the datum
2. convert it to a cstring via textout()
3. get the length of the cstring via strlen()

Since we are treating the datum's contents as a blob of binary data, the
latter two steps are unnecessary. Once the data has been detoasted, we
can just use it as-is, and derive its length from the varlena metadata.
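
To illustrate the idea outside the backend, here is a minimal standalone
sketch. It assumes a simplified layout in which a detoasted value is a
4-byte total-size header followed by the payload. The mock_varlena type,
MOCK_VARHDRSZ, and both helper functions are hypothetical stand-ins for
this example only; the real code uses VARSIZE(), VARHDRSZ and VARDATA()
as shown in the attached diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the varlena header size (not PostgreSQL's). */
#define MOCK_VARHDRSZ ((size_t) sizeof(uint32_t))

/* Hypothetical simplified value: total size first, then raw payload bytes. */
typedef struct mock_varlena
{
	uint32_t	total_size;		/* header + payload, in bytes */
	char		data[];			/* payload, not NUL-terminated */
} mock_varlena;

/* Build a mock value from n raw bytes; returns NULL if malloc fails. */
static mock_varlena *
mock_make(const void *bytes, size_t n)
{
	mock_varlena *v = malloc(MOCK_VARHDRSZ + n);

	if (v == NULL)
		return NULL;
	v->total_size = (uint32_t) (MOCK_VARHDRSZ + n);
	memcpy(v->data, bytes, n);
	return v;
}

/* Payload length comes from the header alone; the payload is never scanned. */
static size_t
mock_payload_len(const mock_varlena *v)
{
	assert(v->total_size >= MOCK_VARHDRSZ);
	return v->total_size - MOCK_VARHDRSZ;
}
```

Calling mock_payload_len() on a value built from the five bytes of
"hello" yields 5 without copying the payload or walking it with strlen(),
which is the work the patch avoids.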

This patch improves some run-of-the-mill md5() computations by just
under 10% in my limited tests, and passes the regression tests.

I also noticed that md5_text() wasn't checking the return value of
md5_hash(); encountering OOM at precisely the right moment could result
in returning a random md5 hash. This patch corrects that. A better fix
would be to make md5_hash() only return on success (and/or allocate via
palloc()), but since it's used in the frontend as well I don't see an
easy way to do that.
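
The failure mode can be sketched in isolation as follows. This is only
an illustration under stated assumptions: toy_hash() is a hypothetical
stand-in for md5_hash() that computes a trivial checksum rather than MD5,
and its simulate_oom flag exists purely to force the allocation-failure
path. The point is that the output buffer is left untouched on failure,
so a caller that ignores the return value would hand back whatever bytes
the buffer already held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for md5_hash(): returns false when its scratch
 * buffer cannot be allocated and writes hexsum only on success.
 * The checksum is a toy, not MD5. hexsum must hold at least 9 bytes.
 */
static bool
toy_hash(const void *buf, size_t len, char *hexsum, bool simulate_oom)
{
	unsigned char *scratch;
	unsigned int sum = 0;
	size_t		i;

	assert(hexsum != NULL);

	scratch = simulate_oom ? NULL : malloc(len + 1);
	if (scratch == NULL)
		return false;			/* hexsum is left unmodified */

	memcpy(scratch, buf, len);
	for (i = 0; i < len; i++)
		sum = sum * 31u + scratch[i];
	free(scratch);

	snprintf(hexsum, 9, "%08x", sum);
	return true;
}

/* Caller that checks the result; returns 0 on success, -1 on failure. */
static int
checked_hash(const void *buf, size_t len, char *hexsum, bool simulate_oom)
{
	if (!toy_hash(buf, len, hexsum, simulate_oom))
	{
		fprintf(stderr, "out of memory\n");
		return -1;
	}
	return 0;
}
```

In the backend the failure branch raises an error via ereport() instead
of returning a status code, as the attached diff shows.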

Barring any objections, I'll apply this to HEAD tomorrow.

-Neil
Index: src/backend/libpq/md5.c
===================================================================
RCS file: /var/lib/cvs/pgsql/src/backend/libpq/md5.c,v
retrieving revision 1.27
diff -c -r1.27 md5.c
*** src/backend/libpq/md5.c    31 Dec 2004 21:59:50 -0000    1.27
--- src/backend/libpq/md5.c    23 Feb 2005 07:11:43 -0000
***************
*** 289,296 ****
   *                          characters.  you thus need to provide an array
   *                          of 33 characters, including the trailing '\0'.
   *
!  *    RETURNS          0 on failure (out of memory for internal buffers) or
!  *                  non-zero on success.
   *
   *    STANDARDS      MD5 is described in RFC 1321.
   *
--- 289,296 ----
   *                          characters.  you thus need to provide an array
   *                          of 33 characters, including the trailing '\0'.
   *
!  *    RETURNS          false on failure (out of memory for internal buffers) or
!  *                  true on success.
   *
   *    STANDARDS      MD5 is described in RFC 1321.
   *
Index: src/backend/utils/adt/varlena.c
===================================================================
RCS file: /var/lib/cvs/pgsql/src/backend/utils/adt/varlena.c,v
retrieving revision 1.118
diff -c -r1.118 varlena.c
*** src/backend/utils/adt/varlena.c    31 Dec 2004 22:01:22 -0000    1.118
--- src/backend/utils/adt/varlena.c    23 Feb 2005 07:06:44 -0000
***************
*** 2310,2325 ****
  Datum
  md5_text(PG_FUNCTION_ARGS)
  {
!     char       *buff = PG_TEXT_GET_STR(PG_GETARG_TEXT_P(0));
!     size_t        len = strlen(buff);
      char       *hexsum;
      text       *result_text;

      /* leave room for the terminating '\0' */
      hexsum = (char *) palloc(MD5_HASH_LEN + 1);

      /* get the hash result */
!     md5_hash((void *) buff, len, hexsum);

      /* convert to text and return it */
      result_text = PG_STR_GET_TEXT(hexsum);
--- 2310,2331 ----
  Datum
  md5_text(PG_FUNCTION_ARGS)
  {
!     text       *in_text = PG_GETARG_TEXT_P(0);
!     size_t        len;
      char       *hexsum;
      text       *result_text;

+     /* Calculate the length of the buffer using varlena metadata */
+     len = VARSIZE(in_text) - VARHDRSZ;
+
      /* leave room for the terminating '\0' */
      hexsum = (char *) palloc(MD5_HASH_LEN + 1);

      /* get the hash result */
!     if (md5_hash(VARDATA(in_text), len, hexsum) == false)
!         ereport(ERROR,
!                 (errcode(ERRCODE_OUT_OF_MEMORY),
!                  errmsg("out of memory")));

      /* convert to text and return it */
      result_text = PG_STR_GET_TEXT(hexsum);
