Re: Sort time - Mailing list pgsql-performance

From Tom Lane
Subject Re: Sort time
Date
Msg-id 8037.1037574320@sss.pgh.pa.us
In response to Re: Sort time  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
I've applied the attached patch to current sources (7.4devel).  It
eliminates palloc/pfree overhead in varstr_cmp() for short strings
(up to 1K as committed).  I find that this reduces the sort time for
700,000 rows by about 10% on my HPUX box; might be better on machines
with better-optimized strcoll().

            regards, tom lane

*** src/backend/utils/adt/varlena.c.orig    Wed Sep  4 17:30:48 2002
--- src/backend/utils/adt/varlena.c    Sun Nov 17 17:21:43 2002
***************
*** 736,771 ****
  varstr_cmp(char *arg1, int len1, char *arg2, int len2)
  {
      int            result;
-     char       *a1p,
-                *a2p;

      /*
       * Unfortunately, there is no strncoll(), so in the non-C locale case
       * we have to do some memory copying.  This turns out to be
       * significantly slower, so we optimize the case where LC_COLLATE is
!      * C.
       */
      if (!lc_collate_is_c())
      {
!         a1p = (char *) palloc(len1 + 1);
!         a2p = (char *) palloc(len2 + 1);

          memcpy(a1p, arg1, len1);
!         *(a1p + len1) = '\0';
          memcpy(a2p, arg2, len2);
!         *(a2p + len2) = '\0';

          result = strcoll(a1p, a2p);

!         pfree(a1p);
!         pfree(a2p);
      }
      else
      {
!         a1p = arg1;
!         a2p = arg2;
!
!         result = strncmp(a1p, a2p, Min(len1, len2));
          if ((result == 0) && (len1 != len2))
              result = (len1 < len2) ? -1 : 1;
      }
--- 736,782 ----
  varstr_cmp(char *arg1, int len1, char *arg2, int len2)
  {
      int            result;

      /*
       * Unfortunately, there is no strncoll(), so in the non-C locale case
       * we have to do some memory copying.  This turns out to be
       * significantly slower, so we optimize the case where LC_COLLATE is
!      * C.  We also try to optimize relatively-short strings by avoiding
!      * palloc/pfree overhead.
       */
+ #define STACKBUFLEN        1024
+
      if (!lc_collate_is_c())
      {
!         char    a1buf[STACKBUFLEN];
!         char    a2buf[STACKBUFLEN];
!         char   *a1p,
!                *a2p;
!
!         if (len1 >= STACKBUFLEN)
!             a1p = (char *) palloc(len1 + 1);
!         else
!             a1p = a1buf;
!         if (len2 >= STACKBUFLEN)
!             a2p = (char *) palloc(len2 + 1);
!         else
!             a2p = a2buf;

          memcpy(a1p, arg1, len1);
!         a1p[len1] = '\0';
          memcpy(a2p, arg2, len2);
!         a2p[len2] = '\0';

          result = strcoll(a1p, a2p);

!         if (len1 >= STACKBUFLEN)
!             pfree(a1p);
!         if (len2 >= STACKBUFLEN)
!             pfree(a2p);
      }
      else
      {
!         result = strncmp(arg1, arg2, Min(len1, len2));
          if ((result == 0) && (len1 != len2))
              result = (len1 < len2) ? -1 : 1;
      }
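
For readers without a PostgreSQL source tree handy, the non-C-locale branch
of the patch can be sketched as a standalone function. This is only an
illustration under stated assumptions, not the committed code: malloc/free
stand in for palloc/pfree, and the names counted_strcoll() and
copy_terminated() are hypothetical. Only STACKBUFLEN and the overall shape
come from the patch.

```c
#include <stdlib.h>
#include <string.h>

#define STACKBUFLEN 1024        /* same threshold as in the patch */

/*
 * Hypothetical helper: copy a counted string into stackbuf when it fits
 * together with its terminator (len < STACKBUFLEN); otherwise fall back
 * to a heap allocation.  malloc() stands in for palloc() here.
 */
static char *
copy_terminated(const char *src, size_t len, char *stackbuf)
{
    char *p = (len >= STACKBUFLEN) ? malloc(len + 1) : stackbuf;

    if (p == NULL)
        abort();                /* palloc() would raise an error instead */
    memcpy(p, src, len);
    p[len] = '\0';
    return p;
}

/*
 * Compare two counted (not NUL-terminated) strings with strcoll().
 * Since there is no strncoll(), each input must be copied and terminated
 * first; short inputs use stack buffers and never touch the allocator.
 */
int
counted_strcoll(const char *arg1, size_t len1,
                const char *arg2, size_t len2)
{
    char  a1buf[STACKBUFLEN];
    char  a2buf[STACKBUFLEN];
    char *a1p = copy_terminated(arg1, len1, a1buf);
    char *a2p = copy_terminated(arg2, len2, a2buf);
    int   result = strcoll(a1p, a2p);

    /* Free only what came from the heap (the patch re-tests the length). */
    if (a1p != a1buf)
        free(a1p);
    if (a2p != a2buf)
        free(a2p);
    return result;
}
```

Unless the program calls setlocale(), strcoll() runs in the "C" locale and
orders strings the same way strcmp() does, so counted_strcoll("abcXYZ", 3,
"abcdef", 3) compares "abc" with "abc" and returns 0. The two stack buffers
cost 2K of stack per call, which is the price paid for skipping the
allocator on short strings.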
