Re: 64-bit pgbench V2 - Mailing list pgsql-hackers

From Greg Smith
Subject Re: 64-bit pgbench V2
Date
Msg-id 4C3B7383.5010200@2ndquadrant.com
In response to Re: 64-bit pgbench V2  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: 64-bit pgbench V2
List pgsql-hackers
Tom Lane wrote:
> Please choose a way that doesn't introduce new portability assumptions.
> The backend gets along fine without strtoll, and I don't see why pgbench
> should have to require it.
>

Funny you should mention this...it turns out there is some code already
there.  I just didn't notice it before because only the unsigned 64-bit
strtoull is used, not the signed one I was looking for, and it's only
called in one place I hadn't previously checked.
src/interfaces/ecpg/ecpglib/data.c does this:

*((unsigned long long int *) (var + offset * act_tuple)) =
strtoull(pval, &scan_length, 10);

The appropriate autoconf magic was in the code all along for both
versions, so my bad for not noticing it until now.  It even
transparently remaps the BSD-ism of calling it strtoq.

I suspect that this alone isn't sufficient to make the code I'm trying
to wedge into pgbench always work on the platforms I consider
must-haves, because of the weird names like _strtoi64 that Windows uses:
http://msdn.microsoft.com/en-us/library/h80404d3(v=VS.80).aspx  In fact,
I wouldn't be surprised to discover the ECPG code above doesn't do the
right thing if compiled with a 64-bit MSVC version.  I don't expect
that's a popular combination to explicitly test in a way that hits the
code path where this line is.
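
For readers who haven't met the MSVC naming, here is a minimal sketch of
the kind of remapping at issue.  It keys off the compiler's _MSC_VER
macro rather than a configure result, which is an assumption of this
sketch and not how the patch does it, and parse_int64_portable_sketch is
a hypothetical name that exists nowhere in PostgreSQL.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a base-10 signed 64-bit value using
 * whichever library routine the compiler is expected to provide.
 */
static long long
parse_int64_portable_sketch(const char *s, char **end)
{
    assert(s != NULL);
#if defined(_MSC_VER)
    /* MSVC spelling of the 64-bit signed parser */
    return _strtoi64(s, end, 10);
#else
    /* C99 / POSIX spelling */
    return strtoll(s, end, 10);
#endif
}
```

A caller passes the string plus a char * to receive the end position,
exactly as it would with strtol.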

The untested (I need to set up a Windows build environment to really
confirm this works) next patch attempt I've attached does what I think
is the right general sort of thing here.  It extends the autoconf
remapping that was already being done to include the second variation on
how the needed function can be named in an MSVC build.  This might
improve the ECPG compatibility issue I theorize could be there on that
platform.  Given the autoconf stuff and use of the unsigned version was
already a dependency, I'd rather improve that code (so it's more obvious
when it is broken) than do the refactoring work suggested to re-use the
server's internal 64-bit parsing method instead.  I could split this
into two patches instead--"add 64-bit strtoull/strtoll support for MSVC"
on the presumption it's actually broken now (possibly wrong on my part)
and "make pgbench use 64-bit values"--but it's not so complicated as a
single patch.

I expect there is almost zero overlap between "needs pgbench setshell to
return values wider than 32 bits" and "not on a platform with a working
64-bit strtoull variation".  What I did to hedge against that was add a
little check to pgbench that lets you confirm whether setshell lines are
limited to 32 bits or not, depending on whether the appropriate function
was found.  It falls back to the existing strtol in that case, and I've
put a note about when that happens (and matching documentation to look
for it) into the debug output of the program.
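
To make the range question concrete, here is a minimal sketch of
validating a command's output as a 64-bit integer.  It is not the
patch's code: parse_shell_result_sketch and FALLBACK_WOULD_TRUNCATE are
hypothetical names, and the errno check is an addition the patch does
not make.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Nonzero when strtol's return type is narrower than 64 bits, i.e. when
 * falling back from strtoll to strtol would actually lose range.
 */
#define FALLBACK_WOULD_TRUNCATE (sizeof(long) < 8)

/* Hypothetical helper: accept "<integer><optional whitespace>" only. */
static bool
parse_shell_result_sketch(const char *res, long long *value)
{
    char       *endptr;

    assert(res != NULL && value != NULL);

    errno = 0;
    *value = strtoll(res, &endptr, 10);
    if (errno == ERANGE)
        return false;           /* out of range for long long */
    if (endptr == res)
        return false;           /* no digits were consumed */

    /* allow trailing whitespace such as the newline from the command */
    while (*endptr != '\0' && isspace((unsigned char) *endptr))
        endptr++;
    return *endptr == '\0';
}
```

Without the ERANGE test, an out-of-range result is silently clamped to
the largest or smallest representable value, which is the failure mode
a 32-bit fallback would hit first.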

I'll continue with testing work here, but what's attached is now the
first form I think this could potentially be committed in once it's
known to be free of obvious bugs (testing at this database scale takes
forever).  I can revisit avoiding the library function if Tom or someone
else really opposes this new approach.  Given that most of the autoconf
bits are already there and the limited number of platforms where this is
a problem, I think there's little gain from doing that work though.

Style/functional suggestions appreciated.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us

diff --git a/configure b/configure
index f6b891e..a5371ba 100755
--- a/configure
+++ b/configure
@@ -21624,7 +21624,8 @@ fi



-for ac_func in strtoll strtoq
+
+for ac_func in strtoll strtoq _strtoi64
 do
 as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 { $as_echo "$as_me:$LINENO: checking for $ac_func" >&5
@@ -21726,7 +21727,8 @@ done



-for ac_func in strtoull strtouq
+
+for ac_func in strtoull strtouq _strtoui64
 do
 as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 { $as_echo "$as_me:$LINENO: checking for $ac_func" >&5
diff --git a/configure.in b/configure.in
index 0a529fa..cca6453 100644
--- a/configure.in
+++ b/configure.in
@@ -1385,8 +1385,8 @@ if test x"$pgac_cv_var_int_optreset" = x"yes"; then
   AC_DEFINE(HAVE_INT_OPTRESET, 1, [Define to 1 if you have the global variable 'int optreset'.])
 fi

-AC_CHECK_FUNCS([strtoll strtoq], [break])
-AC_CHECK_FUNCS([strtoull strtouq], [break])
+AC_CHECK_FUNCS([strtoll strtoq _strtoi64], [break])
+AC_CHECK_FUNCS([strtoull strtouq _strtoui64], [break])

 # Check for one of atexit() or on_exit()
 AC_CHECK_FUNCS(atexit, [],
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index c830dee..541510b 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -56,6 +56,15 @@
 #include <sys/resource.h>        /* for getrlimit */
 #endif

+/*
+ * If this platform doesn't have a 64-bit strtoll, fall back to
+ * using the 32-bit version.
+ */
+#ifndef HAVE_STRTOLL
+#define strtoll strtol
+#define LIMITED_STRTOLL
+#endif
+
 #ifndef INT64_MAX
 #define INT64_MAX    INT64CONST(0x7FFFFFFFFFFFFFFF)
 #endif
@@ -310,14 +319,14 @@ usage(const char *progname)
 }

 /* random number generator: uniform distribution from min to max inclusive */
-static int
-getrand(int min, int max)
+static int64
+getrand(int64 min, int64 max)
 {
     /*
      * Odd coding is so that min and max have approximately the same chance of
      * being selected as do numbers between them.
      */
-    return min + (int) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE + 1.0));
+    return min + (int64) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE + 1.0));
 }

 /* call PQexec() and exit() on failure */
@@ -627,7 +636,7 @@ runShellCommand(CState *st, char *variable, char **argv, int argc)
     FILE       *fp;
     char        res[64];
     char       *endptr;
-    int            retval;
+    int64            retval;

     /*
      * Join arguments with whilespace separaters. Arguments starting with
@@ -700,7 +709,7 @@ runShellCommand(CState *st, char *variable, char **argv, int argc)
     }

     /* Check whether the result is an integer and assign it to the variable */
-    retval = (int) strtol(res, &endptr, 10);
+    retval = strtoll(res, &endptr, 10);
     while (*endptr != '\0' && isspace((unsigned char) *endptr))
         endptr++;
     if (*res == '\0' || *endptr != '\0')
@@ -708,7 +717,7 @@ runShellCommand(CState *st, char *variable, char **argv, int argc)
         fprintf(stderr, "%s: must return an integer ('%s' returned)\n", argv[0], res);
         return false;
     }
-    snprintf(res, sizeof(res), "%d", retval);
+    snprintf(res, sizeof(res), INT64_FORMAT, retval);
     if (!putVariable(st, "setshell", variable, res))
         return false;

@@ -956,8 +965,9 @@ top:
         if (pg_strcasecmp(argv[0], "setrandom") == 0)
         {
             char       *var;
-            int            min,
-                        max;
+            int64        min,
+                        max,
+                        rand;
             char        res[64];

             if (*argv[2] == ':')
@@ -997,15 +1007,16 @@ top:

             if (max < min || max > MAX_RANDOM_VALUE)
             {
-                fprintf(stderr, "%s: invalid maximum number %d\n", argv[0], max);
+                fprintf(stderr, "%s: invalid maximum number " INT64_FORMAT "\n", argv[0], max);
                 st->ecnt++;
                 return true;
             }
+            rand = getrand(min, max);

 #ifdef DEBUG
-            printf("min: %d max: %d random: %d\n", min, max, getrand(min, max));
+            printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, rand);
 #endif
-            snprintf(res, sizeof(res), "%d", getrand(min, max));
+            snprintf(res, sizeof(res), INT64_FORMAT, rand);

             if (!putVariable(st, argv[0], argv[1], res))
             {
@@ -1121,6 +1132,10 @@ top:
         }
         else if (pg_strcasecmp(argv[0], "setshell") == 0)
         {
+#ifdef LIMITED_STRTOLL
+            if (debug)
+                fprintf(stderr, "Range of \\setshell limited to 32 bits\n");
+#endif
             bool        ret = runShellCommand(st, argv[1], argv + 2, argc - 2);

             if (timer_exceeded) /* timeout */
@@ -1188,7 +1203,7 @@ init(void)
         "drop table if exists pgbench_tellers",
         "create table pgbench_tellers(tid int not null,bid int,tbalance int,filler char(84)) with (fillfactor=%d)",
         "drop table if exists pgbench_accounts",
-        "create table pgbench_accounts(aid int not null,bid int,abalance int,filler char(84)) with (fillfactor=%d)",
+        "create table pgbench_accounts(aid bigint not null,bid int,abalance int,filler char(80)) with (fillfactor=%d)",
         "drop table if exists pgbench_history",
         "create table pgbench_history(tid int,bid int,aid int,delta int,mtime timestamp,filler char(22))"
     };
@@ -1201,7 +1216,7 @@ init(void)
     PGconn       *con;
     PGresult   *res;
     char        sql[256];
-    int            i;
+    int64        i;

     if ((con = doConnect()) == NULL)
         exit(1);
@@ -1229,13 +1244,13 @@ init(void)

     for (i = 0; i < nbranches * scale; i++)
     {
-        snprintf(sql, 256, "insert into pgbench_branches(bid,bbalance) values(%d,0)", i + 1);
+        snprintf(sql, 256, "insert into pgbench_branches(bid,bbalance) values(" INT64_FORMAT ",0)", i + 1);
         executeStatement(con, sql);
     }

     for (i = 0; i < ntellers * scale; i++)
     {
-        snprintf(sql, 256, "insert into pgbench_tellers(tid,bid,tbalance) values (%d,%d,0)",
+        snprintf(sql, 256, "insert into pgbench_tellers(tid,bid,tbalance) values (" INT64_FORMAT "," INT64_FORMAT ",0)",
                  i + 1, i / ntellers + 1);
         executeStatement(con, sql);
     }
@@ -1260,9 +1275,9 @@ init(void)

     for (i = 0; i < naccounts * scale; i++)
     {
-        int            j = i + 1;
+        int64            j = i + 1;

-        snprintf(sql, 256, "%d\t%d\t%d\t\n", j, i / naccounts + 1, 0);
+        snprintf(sql, 256, INT64_FORMAT "\t" INT64_FORMAT "\t%d\t\n", j, i / naccounts + 1, 0);
         if (PQputline(con, sql))
         {
             fprintf(stderr, "PQputline failed\n");
@@ -1270,7 +1285,7 @@ init(void)
         }

         if (j % 10000 == 0)
-            fprintf(stderr, "%d tuples done.\n", j);
+            fprintf(stderr, INT64_FORMAT " tuples done.\n", j);
     }
     if (PQputline(con, "\\.\n"))
     {
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 2581190..272ba6c 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -553,6 +553,12 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
      </para>

      <para>
+      The size of the return value may be limited to the range of a signed
+      32 bit value (just over two billion) on some older platforms.  When
+      debugging output is enabled, a warning will appear if this is the case.
+     </para>
+
+     <para>
       Example:
       <programlisting>
 \setshell variable_to_be_assigned command literal_argument :variable ::literal_starting_with_colon
diff --git a/src/include/c.h b/src/include/c.h
index e3b4b0b..7174e1b 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -826,12 +826,24 @@ extern int    fdatasync(int fildes);
 #define HAVE_STRTOLL 1
 #endif

+/* If _strtoi64() exists, rename it to the more standard strtoll() */
+#if defined(HAVE_LONG_LONG_INT_64) && !defined(HAVE_STRTOLL) && defined(HAVE__STRTOI64)
+#define strtoll _strtoi64
+#define HAVE_STRTOLL 1
+#endif
+
 /* If strtouq() exists, rename it to the more standard strtoull() */
 #if defined(HAVE_LONG_LONG_INT_64) && !defined(HAVE_STRTOULL) && defined(HAVE_STRTOUQ)
 #define strtoull strtouq
 #define HAVE_STRTOULL 1
 #endif

+/* If _strtoui64() exists, rename it to the more standard strtoull() */
+#if defined(HAVE_LONG_LONG_INT_64) && !defined(HAVE_STRTOULL) && defined(HAVE__STRTOUI64)
+#define strtoull _strtoui64
+#define HAVE_STRTOULL 1
+#endif
+
 /*
  * We assume if we have these two functions, we have their friends too, and
  * can use the wide-character functions.
