This patch optimizes the md5_text() function (which is used to implement
the md5() SQL-level function). The old code did the following:
1. de-toast the datum
2. convert it to a cstring via textout()
3. get the length of the cstring via strlen()
Since we are treating the datum context as a blob of binary data, the
latter two steps are unnecessary. Once the data has been detoasted, we
can just use it as-is, and derive its length from the varlena metadata.
This patch improves some run-of-the-mill md5() computations by just
under 10% in my limited tests, and passes the regression tests.
I also noticed that md5_text() wasn't checking the return value of
md5_hash(); encountering OOM at precisely the right moment could result
in returning a random md5 hash. This patch corrects that. A better fix
would be to make md5_hash() only return on success (and/or allocate via
palloc()), but since it's used in the frontend as well I don't see an
easy way to do that.
Barring any objections, I'll apply this to HEAD tomorrow.
-Neil
Index: src/backend/libpq/md5.c
===================================================================
RCS file: /var/lib/cvs/pgsql/src/backend/libpq/md5.c,v
retrieving revision 1.27
diff -c -r1.27 md5.c
*** src/backend/libpq/md5.c 31 Dec 2004 21:59:50 -0000 1.27
--- src/backend/libpq/md5.c 23 Feb 2005 07:11:43 -0000
***************
*** 289,296 ****
* characters. you thus need to provide an array
* of 33 characters, including the trailing '\0'.
*
! * RETURNS 0 on failure (out of memory for internal buffers) or
! * non-zero on success.
*
* STANDARDS MD5 is described in RFC 1321.
*
--- 289,296 ----
* characters. you thus need to provide an array
* of 33 characters, including the trailing '\0'.
*
! * RETURNS false on failure (out of memory for internal buffers) or
! * true on success.
*
* STANDARDS MD5 is described in RFC 1321.
*
Index: src/backend/utils/adt/varlena.c
===================================================================
RCS file: /var/lib/cvs/pgsql/src/backend/utils/adt/varlena.c,v
retrieving revision 1.118
diff -c -r1.118 varlena.c
*** src/backend/utils/adt/varlena.c 31 Dec 2004 22:01:22 -0000 1.118
--- src/backend/utils/adt/varlena.c 23 Feb 2005 07:06:44 -0000
***************
*** 2310,2325 ****
Datum
md5_text(PG_FUNCTION_ARGS)
{
! char *buff = PG_TEXT_GET_STR(PG_GETARG_TEXT_P(0));
! size_t len = strlen(buff);
char *hexsum;
text *result_text;
/* leave room for the terminating '\0' */
hexsum = (char *) palloc(MD5_HASH_LEN + 1);
/* get the hash result */
! md5_hash((void *) buff, len, hexsum);
/* convert to text and return it */
result_text = PG_STR_GET_TEXT(hexsum);
--- 2310,2331 ----
Datum
md5_text(PG_FUNCTION_ARGS)
{
! text *in_text = PG_GETARG_TEXT_P(0);
! size_t len;
char *hexsum;
text *result_text;
+ /* Calculate the length of the buffer using varlena metadata */
+ len = VARSIZE(in_text) - VARHDRSZ;
+
/* leave room for the terminating '\0' */
hexsum = (char *) palloc(MD5_HASH_LEN + 1);
/* get the hash result */
! if (md5_hash(VARDATA(in_text), len, hexsum) == false)
! ereport(ERROR,
! (errcode(ERRCODE_OUT_OF_MEMORY),
! errmsg("out of memory")));
/* convert to text and return it */
result_text = PG_STR_GET_TEXT(hexsum);