Thread: [PATCH] pgbench: add multiconnect option

[PATCH] pgbench: add multiconnect option

From
David Christensen
Date:
-hackers,

This patch adds the concept of "multiconnect" to pgbench (better
terminology welcome).  The basic idea here is to allow connections made
with pgbench to use different auth values or connect to multiple
databases. We implement this using a user-provided PGSERVICEFILE and
choosing a PGSERVICE from this based on a number of strategies.
(Currently the only supported strategies are round robin or random.)
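
A minimal sketch of the two strategies, written as self-contained C rather than as the patch's own code. The names `ServicePicker`, `pick_service` and `toy_random` are hypothetical and exist only for illustration; the patch itself keeps this state in pgbench.c globals and calls the system random().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the patch uses globals inside pgbench.c instead. */
typedef enum { PICK_ROUND_ROBIN, PICK_RANDOM } PickStrategy;

typedef struct
{
    const char **names;      /* service names, e.g. read from the service file */
    size_t       count;      /* number of entries in names */
    size_t       next;       /* round-robin cursor */
    unsigned int rng_state;  /* state for the toy generator below */
} ServicePicker;

/* Small linear congruential generator, only so the sketch is self-contained. */
static unsigned int
toy_random(unsigned int *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Return the service name to use for the next connection. */
static const char *
pick_service(ServicePicker *p, PickStrategy strategy)
{
    const char *chosen;

    assert(p != NULL && p->count > 0);

    if (strategy == PICK_ROUND_ROBIN)
    {
        chosen = p->names[p->next];
        p->next = (p->next + 1) % p->count;  /* wrap around at the end */
    }
    else
        chosen = p->names[toy_random(&p->rng_state) % p->count];

    return chosen;
}
```

With three names and the round-robin strategy, four successive calls return the first, second, third, and then the first name again.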

There is definite room for improvement here; at the very least, teaching
`pgbench -i` about all of the distinct DBs referenced in this service
file would ensure that initialization works as expected in all places.
For now, we are punting initialization to the user in this version of
the patch if using more than one database in the given service file.

Best,

David
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0c60077e1f..94616c13c2 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -161,6 +161,11 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
         <envar>PGDATABASE</envar> is used. If that is not set, the
         user name specified for the connection is used.
        </para>
+       <para>
+        If <literal>multiconnect</literal> mode is enabled, a defined
+        <literal>dbname</literal> in the chosen service will override this
+        value.
+       </para>
       </listitem>
      </varlistentry>
 
@@ -840,6 +845,39 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
 
     <variablelist>
 
+     <varlistentry>
+      <term><option>-m</option> <replaceable>servicefile</replaceable></term>
+      <term><option>--multiconnect=</option><replaceable>servicefile</replaceable></term>
+      <listitem>
+       <para>
+        Turns on <literal>multiconnect</literal> mode and uses the given
+        <literal>pg_service</literal>-style file to derive connection
+        information from.  Any connection parameters set in this file
+        override those provided on the command line.
+       </para>
+       <para>
+        Since this behavior will make a connection using
+        the <envar>PGSERVICEFILE</envar> mechanism, it is possible to
+        connect to databases other than the one provided in the original
+        command invocation.  This option assumes that the user has previously
+        run the necessary initialization steps against all databases that
+        would be accessed via this service file.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><option>-g roundrobin|random</option></term>
+      <term><option>--multiconnect-strategy=roundrobin|random</option></term>
+      <listitem>
+       <para>
+        Selects the strategy by which <literal>multiconnect</literal> mode
+        uses the connections defined in the indicated service file.  The
+        default value is <literal>roundrobin</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-h</option> <replaceable>hostname</replaceable></term>
       <term><option>--host=</option><replaceable>hostname</replaceable></term>
@@ -847,6 +885,11 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
        <para>
         The database server's host name
        </para>
+       <para>
+        If <literal>multiconnect</literal> mode is enabled, a defined
+        <literal>host</literal> in the chosen service will override this
+        value.
+       </para>
       </listitem>
      </varlistentry>
 
@@ -857,6 +900,11 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
        <para>
         The database server's port number
        </para>
+       <para>
+        If <literal>multiconnect</literal> mode is enabled, a defined
+        <literal>port</literal> in the chosen service will override this
+        value.
+       </para>
       </listitem>
      </varlistentry>
 
@@ -867,6 +915,11 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
        <para>
         The user name to connect as
        </para>
+       <para>
+        If <literal>multiconnect</literal> mode is enabled, a defined
+        <literal>user</literal> in the chosen service will override this
+        value.
+       </para>
       </listitem>
      </varlistentry>
 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4aeccd93af..2834c9ef3c 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -69,6 +69,7 @@
 #include "pgbench.h"
 #include "port/pg_bitutils.h"
 #include "portability/instr_time.h"
+#include "lib/stringinfo.h"
 
 #ifndef M_PI
 #define M_PI 3.14159265358979323846
@@ -275,6 +276,8 @@ int            nthreads = 1;        /* number of threads */
 bool        is_connect;            /* establish connection for each transaction */
 bool        report_per_command; /* report per-command latencies */
 int            main_pid;            /* main process id used in log filename */
+int         num_service_names = 0; /* number of service names found in the indicated service file */
+int         cur_service_index = 0; /* index of the next service name; used for round-robin */
 
 const char *pghost = NULL;
 const char *pgport = NULL;
@@ -282,6 +285,7 @@ const char *username = NULL;
 const char *dbName = NULL;
 char       *logfile_prefix = NULL;
 const char *progname;
+const char **service_names = NULL;
 
 #define WSEP '@'                /* weight separator */
 
@@ -549,6 +553,14 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
+typedef enum MultiConnectStrategy
+{
+    MC_ROUND_ROBIN,
+    MC_RANDOM
+} MultiConnectStrategy;
+
+static MultiConnectStrategy multiconnect_strategy = MC_ROUND_ROBIN;
+
 /*
  * struct Command represents one command in a script.
  *
@@ -663,7 +675,7 @@ static void clear_socket_set(socket_set *sa);
 static void add_socket_to_set(socket_set *sa, int fd, int idx);
 static int    wait_on_socket_set(socket_set *sa, int64 usecs);
 static bool socket_has_input(socket_set *sa, int fd, int idx);
-
+static const char **availableServiceEntries(const char *serviceFile);
 
 /* callback functions for our flex lexer */
 static const PsqlScanCallbacks pgbench_callbacks = {
@@ -727,6 +739,10 @@ usage(void)
            "  -j, --jobs=NUM           number of threads (default: 1)\n"
            "  -l, --log                write transaction times to log file\n"
            "  -L, --latency-limit=NUM  count transactions lasting more than NUM ms as late\n"
+           "  -m, --multiconnect=FILE  use multiple auth defined in the given service file\n"
+           "  -g, --multiconnect-strategy=roundrobin|random\n"
+           "                           use the given strategy for choosing the service to connect as\n"
+           "                           (default: roundrobin)\n"
            "  -M, --protocol=simple|extended|prepared\n"
            "                           protocol for submitting queries (default: simple)\n"
            "  -n, --no-vacuum          do not run VACUUM before tests\n"
@@ -1351,6 +1367,33 @@ doConnect(void)
     bool        new_pass;
     static char *password = NULL;
 
+    /*
+     * In multiconnect mode, choose the next service name per the selected strategy
+     */
+    if (num_service_names) {
+        const char *service;
+
+        if (multiconnect_strategy == MC_ROUND_ROBIN)
+        {
+            service = service_names[cur_service_index++];
+
+            if (cur_service_index >= num_service_names)
+                cur_service_index = 0;
+        }
+        else if (multiconnect_strategy == MC_RANDOM)
+        {
+            /*
+             * We need a random index in [0, num_service_names); since this is
+             * infrequently called and just needs a uniform integer distribution,
+             * we are using the system random() instead of one of the more complex
+             * functions available in this file.
+             */
+            service = service_names[ ((unsigned long)random()) % num_service_names ];
+        }
+        pg_log_info("using service: %s", service);
+        setenv("PGSERVICE", service, true);
+    }
+
     /*
      * Start the connection.  Loop until we have a password if requested by
      * backend.
@@ -5724,6 +5767,74 @@ set_random_seed(const char *seed)
     return true;
 }
 
+static const char**
+availableServiceEntries(const char *serviceFile)
+{
+    int            linenr = 0,
+                num_svc = 0,
+                max_svc = 10;
+    FILE       *f;
+    char       *line, **services;
+    StringInfoData linebuf;
+
+    f = fopen(serviceFile, "r");
+    if (f == NULL)
+    {
+        return NULL;
+    }
+
+    initStringInfo(&linebuf);
+
+    services = (char **)pg_malloc0(max_svc * sizeof(char *));
+    
+    while (pg_get_line_buf(f, &linebuf))
+    {
+        linenr++;
+
+        /* ignore whitespace at end of line, especially the newline */
+        while (linebuf.len > 0 &&
+               isspace((unsigned char) linebuf.data[linebuf.len - 1]))
+            linebuf.data[--linebuf.len] = '\0';
+
+        line = linebuf.data;
+
+        /* ignore leading whitespace too */
+        while (*line && isspace((unsigned char) line[0]))
+            line++;
+
+        /* ignore comments and empty lines */
+        if (line[0] == '\0' || line[0] == '#')
+            continue;
+
+        /* Check for groupname section */
+        if (line[0] == '[')
+        {
+            char *endp;
+
+            line++;
+
+            endp = strchr(line, ']');
+            if (endp && (endp - line) > 0) {
+                /* save the service name found between the brackets */
+                services[num_svc] = pnstrdup(line, (endp - line));
+
+                /* possibly expand memory */
+                if (++num_svc >= max_svc) {
+                    max_svc += 10;
+                    services = pg_realloc(services, max_svc * sizeof(char *));
+                }
+
+                /* null out the next possible entry */
+                services[num_svc] = NULL;
+            }
+        }
+    }
+
+    fclose(f);
+    pfree(linebuf.data);
+    return (const char**)services;
+}
+
 int
 main(int argc, char **argv)
 {
@@ -5742,6 +5853,8 @@ main(int argc, char **argv)
         {"jobs", required_argument, NULL, 'j'},
         {"log", no_argument, NULL, 'l'},
         {"latency-limit", required_argument, NULL, 'L'},
+        {"multiconnect", required_argument, NULL, 'm'},
+        {"multiconnect-strategy", required_argument, NULL, 'g'},
         {"no-vacuum", no_argument, NULL, 'n'},
         {"port", required_argument, NULL, 'p'},
         {"progress", required_argument, NULL, 'P'},
@@ -5835,7 +5948,7 @@ main(int argc, char **argv)
         exit(1);
     }
 
-    while ((c = getopt_long(argc, argv, "iI:h:nvp:dqb:SNc:j:Crs:t:T:U:lf:D:F:M:P:R:L:", long_options, &optindex)) != -1)
+    while ((c = getopt_long(argc, argv, "iI:h:nvp:dqb:SNc:j:Crs:t:T:U:lf:D:F:M:m:g:P:R:L:", long_options, &optindex)) != -1)
 
     {
         char       *script;
 
@@ -6008,6 +6121,55 @@ main(int argc, char **argv)
                     exit(1);
                 }
                 break;
+            case 'm':
+                {
+                    char **p;
+
+                    service_names = availableServiceEntries(optarg);
+                    p = (char**)service_names;
+
+                    if (!service_names)
+                    {
+                        pg_log_fatal("Couldn't find any services in file '%s'", optarg);
+                        exit(1);
+                    }
+
+                    while (*(p++))
+                        num_service_names++;
+
+                    /*
+                     * If we found non-zero services in our file then we can set
+                     * the PGSERVICEFILE variable to point to the file we parsed,
+                     * otherwise there is no point.
+                     */
+
+                    if (num_service_names) {
+                        setenv("PGSERVICEFILE", optarg, true);
+
+                        /*
+                         * Warn if number of services exceeds the number of
+                         * clients expected.
+                         */
+
+                        if (num_service_names > nclients)
+                            pg_log_warning("Found %d services defined, but -c is set to %d; did you mean to increase -c?",
+                                           num_service_names,
+                                           nclients
+                                );
+                    }
+                }
+                break;
+            case 'g':
+                if (strcmp(optarg, "roundrobin") == 0)
+                    multiconnect_strategy = MC_ROUND_ROBIN;
+                else if (strcmp(optarg, "random") == 0)
+                    multiconnect_strategy = MC_RANDOM;
+                else
+                {
+                    pg_log_fatal("Unrecognized multiconnect strategy: %s", optarg);
+                    exit(1);
+                }
+                break;
             case 'M':
                 benchmarking_option_set = true;
                 for (querymode = 0; querymode < NUM_QUERYMODE; querymode++)

Re: [PATCH] pgbench: add multiconnect option

From
Fabien COELHO
Date:
Hello David,

> This patch adds the concept of "multiconnect" to pgbench (better
> terminology welcome).

Good. I was thinking of adding such capability, possibly for handling 
connection errors and reconnecting…

> The basic idea here is to allow connections made with pgbench to use 
> different auth values or connect to multiple databases. We implement 
> this using a user-provided PGSERVICEFILE and choosing a PGSERVICE from 
> this based on a number of strategies. (Currently the only supported 
> strategies are round robin or random.)

I was thinking of allowing a list of conninfo strings with 
repeated options, eg --conninfo "foo" --conninfo "bla"…

Your approach using PGSERVICEFILE also makes sense!

Maybe the feature could be simplified, the code reduced, and more 
benefits provided, by mixing both ideas.

In particular, pgbench parses the file but then it will be read also by 
libpq, yuk yuk.

Also, I do not like that PGSERVICE is overridden by pgbench, while other 
options are passed with the parameters approach in doConnect. It would 
make more sense to add a "service" field to the parameters for 
consistency, if this approach were to be pursued.

On reflection, I'd suggest using the --conninfo (or some other name) 
approach, eg "pgbench --conninfo='service=s1' --conninfo='service=s2'", and 
users just have to set the PGSERVICEFILE env variable themselves, which I 
think is better than pgbench overriding env variables behind their back.

This allows having a service file with more connections and just telling 
pgbench which ones to use, which is the expected way to use this feature. 
It also drops the file parsing.
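
A minimal sketch of how repeated options could be collected, assuming the option is spelled `--conninfo` as suggested above. The helper `collect_conninfos` and the `MAX_CONNINFO` cap are hypothetical and are not taken from any posted patch.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

#define MAX_CONNINFO 16     /* arbitrary cap for this sketch */

/*
 * Collect every --conninfo=STRING occurrence into out[], preserving order.
 * Returns the number collected, or -1 on an unknown option or overflow.
 */
static int
collect_conninfos(int argc, char **argv, const char **out)
{
    static const struct option long_options[] = {
        {"conninfo", required_argument, NULL, 1},
        {NULL, 0, NULL, 0}
    };
    int n = 0;
    int c;

    assert(argv != NULL && out != NULL);

    optind = 1;             /* start scanning at the first argument */
    opterr = 0;             /* keep getopt quiet; we report errors via -1 */

    while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
    {
        if (c != 1 || n >= MAX_CONNINFO)
            return -1;
        out[n++] = optarg;  /* points into argv, so no copy is needed */
    }
    return n;
}
```

Each collected string can then be handed to libpq unchanged, so a value such as `service=s1` is resolved by libpq against whatever PGSERVICEFILE the user has set.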

I can only see benefit to this simplified approach.
What do you think?

About the patch:

There are warnings about trailing whitespace when applying the patch, and 
there are some tabbing issues in the file.

I would not consume the "-g" option unless there is some logical link with 
the feature. I'd be okay with "-m" if it is still needed. I would suggest 
using it for the choice of strategy?

stringinfo: We already have PQExpBuffer imported, could we use that 
instead? Having two sets of structs/functions which do the same thing in 
the same source file does not look like a good idea. If we do not parse 
the file, nothing is needed, which is a relief.

Attached is my work-in-progress start at adding conninfo, that would need 
to be improved with strategies.

-- 
Fabien.
Attachment

Re: [PATCH] pgbench: add multiconnect option

From
Fabien COELHO
Date:
Bonjour Michaël,

>> Good. I was thinking of adding such capability, possibly for handling
>> connection errors and reconnecting…
>
> round-robin and random make sense.  I am wondering how round-robin
> would work with -C, though?  Would you just reuse the same connection
> string as the one chosen at the starting point.

Well, not necessarily, but this is debatable.

>> I was thinking of allowing a list of conninfo strings with
>> repeated options, eg --conninfo "foo" --conninfo "bla"…
>
> That was my first thought when reading the subject of this thread:
> create a list of connection strings and pass one of them to
> doConnect() to grab the properties looked for.  That's a bit confusing
> though as pgbench does not support directly connection strings,

They are supported because libpq silently assumes that "dbname" can be a 
full connection string.
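
A rough sketch of that rule, under the assumption that libpq's check amounts to "starts with a URI prefix or contains an equals sign". The helper `looks_like_conninfo` is hypothetical and is not a libpq function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if a "dbname" value would be treated as a full connection
 * string rather than a plain database name (assumed rule, see above).
 */
static bool
looks_like_conninfo(const char *dbname)
{
    assert(dbname != NULL);

    return strncmp(dbname, "postgresql://", 13) == 0 ||
           strncmp(dbname, "postgres://", 11) == 0 ||
           strchr(dbname, '=') != NULL;
}
```

Under this rule `service=db1` and `postgresql://host/db` count as connection strings, while a bare `mydb` is taken as a database name.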

> and we should be careful to keep fallback_application_name intact.

Hmmm. See attached patch, ISTM that it does the right thing.

>> Your approach using PGSERVICEFILE also makes sense!
>
> I am not sure that's actually needed here, as it is possible to pass
> down a service name within a connection string.  I think that you'd
> better leave libpq do all the work related to a service file, if
> specified.  pgbench does not need to know any of that.

Yes, this is an inconvenience of this approach: part of the libpq machinery
is more or less replicated in pgbench, which is quite annoying, and less 
powerful.

Attached my work-in-progress version, with a few open issues (eg probably 
not thread safe), but comments about the provided feature are welcome.

I borrowed the "strategy" option, renamed policy, from the initial patch. 
Pgbench just accepts several connection strings as parameters, eg:

   pgbench ... "service=db1" "service=db2" "service=db3"

The next stage is to map scripts to connection types and connections
to connection types, so that pgbench could run W transactions against a 
primary and R transactions against a hot standby, for instance. I have 
some design for that, but nothing is implemented.

There is also the combination with the error handling patch to consider: 
if a connection fails, a connection to a replica could be issued instead.

-- 
Fabien.
Attachment

Re: [PATCH] pgbench: add multiconnect option

From
David Christensen
Date:
> >> Good. I was thinking of adding such capability, possibly for handling
> >> connection errors and reconnecting…
> >
> > round-robin and random make sense.  I am wondering how round-robin
> > would work with -C, though?  Would you just reuse the same connection
> > string as the one chosen at the starting point.
>
> Well, not necessarily, but this is debatable.

My expectation for such a behavior would be that it would reconnect to
a random connstring each time, otherwise what's the point of using
this with -C?  If we needed to forbid some option combinations that is
also an option.

> >> I was thinking of allowing a list of conninfo strings with
> >> repeated options, eg --conninfo "foo" --conninfo "bla"…
> >
> > That was my first thought when reading the subject of this thread:
> > create a list of connection strings and pass one of them to
> > doConnect() to grab the properties looked for.  That's a bit confusing
> > though as pgbench does not support directly connection strings,
>
> They are supported because libpq silently assumes that "dbname" can be a
> full connection string.
>
> > and we should be careful to keep fallback_application_name intact.
>
> Hmmm. See attached patch, ISTM that it does the right thing.

I guess the multiple --conninfo approach is fine; I personally liked
having the list come from a file, as you could benchmark different
groups/clusters based on a file, which is much easier than constructing
multiple pgbench invocations.  I can see an argument for
both approaches.  The PGSERVICEFILE was an idea I'd had to store
easily indexed groups of connection information in a way that I didn't
need to know all the details, could easily parse, and could later pass
in the ENV so libpq could just pull out the information.

> >> Your approach using PGSERVICEFILE also makes sense!
> >
> > I am not sure that's actually needed here, as it is possible to pass
> > down a service name within a connection string.  I think that you'd
> > better leave libpq do all the work related to a service file, if
> > specified.  pgbench does not need to know any of that.
>
> Yes, this is an inconvenience of this approach: part of the libpq machinery
> is more or less replicated in pgbench, which is quite annoying, and less
> powerful.

There is some small fraction reproduced here just to pull out the
named sections; no other parsing should be done though.
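
A minimal sketch of that section-name extraction as a standalone function. The name `parse_service_header` is hypothetical; the patch does the equivalent work inline in availableServiceEntries().

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * If "line" is a "[name]" section header, copy name into out and return
 * true.  Comments, key=value lines, empty names, and names that do not fit
 * into out are all rejected.
 */
static bool
parse_service_header(const char *line, char *out, size_t outlen)
{
    const char *end;
    size_t      len;

    assert(line != NULL && out != NULL && outlen > 0);

    /* skip leading whitespace */
    while (isspace((unsigned char) *line))
        line++;

    if (*line != '[')
        return false;
    line++;

    end = strchr(line, ']');
    if (end == NULL || end == line)
        return false;

    len = (size_t) (end - line);
    if (len >= outlen)
        return false;

    memcpy(out, line, len);
    out[len] = '\0';
    return true;
}
```

Everything else in the file (the key=value lines under each header) is left for libpq to interpret.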

> Attached my work-in-progress version, with a few open issues (eg probably
> not thread safe), but comments about the provided feature are welcome.
>
> I borrowed the "strategy" option, renamed policy, from the initial patch.
> Pgbench just accepts several connection strings as parameters, eg:
>
>    pgbench ... "service=db1" "service=db2" "service=db3"
>
> The next stage is to map scripts to connection types and connections
> to connection types, so that pgbench could run W transactions against a
> primary and R transactions against a hot standby, for instance. I have
> some design for that, but nothing is implemented.
>
> There is also the combination with the error handling patch to consider:
> if a connection fails, a connection to a replica could be issued instead.

I'll see if I can take a look at your latest patch.  I was also
wondering about how we should handle `pgbench -i` with multiple
connection strings; currently it would only initialize with the first
DSN it gets, but it probably makes sense to run initialize against all
of the databases (or at least attempt to).  Maybe this is one argument
for the multiple --conninfo handling, since you could explicitly pass
the databases you want.  (Not that it is hard to just loop over
connection info and `pgbench -i` with ENV, or any other number of ways
to accomplish the same thing.)

Best,

David



Re: [PATCH] pgbench: add multiconnect option

From
Fabien COELHO
Date:
Hello David,

>>> round-robin and random make sense.  I am wondering how round-robin
>>> would work with -C, though?  Would you just reuse the same connection
>>> string as the one chosen at the starting point.
>>
>> Well, not necessarily, but this is debatable.
>
> My expectation for such a behavior would be that it would reconnect to
> a random connstring each time, otherwise what's the point of using
> this with -C?  If we needed to forbid some option combinations that is
> also an option.

Yep. ISTM that it should follow the connection policy/strategy, whatever 
it is.

>>>> I was thinking of allowing a list of conninfo strings with
>>>> repeated options, eg --conninfo "foo" --conninfo "bla"…
>>>
>>> That was my first thought when reading the subject of this thread:
>>> create a list of connection strings and pass one of them to
>>> doConnect() to grab the properties looked for.  That's a bit confusing
>>> though as pgbench does not support directly connection strings,
>>
>> They are supported because libpq silently assumes that "dbname" can be a
>> full connection string.
>>
>>> and we should be careful to keep fallback_application_name intact.
>>
>> Hmmm. See attached patch, ISTM that it does the right thing.
>
> I guess the multiple --conninfo approach is fine; I personally liked
> having the list come from a file, as you could benchmark different
> groups/clusters based on a file, which is much easier than constructing
> multiple pgbench invocations.  I can see an argument for
> both approaches.  The PGSERVICEFILE was an idea I'd had to store
> easily indexed groups of connection information in a way that I didn't
> need to know all the details, could easily parse, and could later pass
> in the ENV so libpq could just pull out the information.

The attached version does work with the service file if the user provides 
"service=whatever" on the command line. The main difference is that it 
sticks to the libpq policy to use an explicit connection string or list of 
connection strings.

Also, note that the patch I sent dropped the --conninfo option. 
Connections are simply the last arguments to pgbench.

> I'll see if I can take a look at your latest patch.

Thanks!

> I was also wondering about how we should handle `pgbench -i` with 
> multiple connection strings; currently it would only initialize with the 
> first DSN it gets, but it probably makes sense to run initialize against 
> all of the databases (or at least attempt to).

I'll tend to disagree on this one. Pgbench's whole expectation is to run 
against "one" system, which might be composed of several nodes because of 
replication. I do not think that it is desirable to jump to "several 
fully independent databases".

> Maybe this is one argument for the multiple --conninfo handling, since 
> you could explicitly pass the databases you want.  (Not that it is hard 
> to just loop over connection info and `pgbench -i` with ENV, or any 
> other number of ways to accomplish the same thing.)

Yep.

-- 
Fabien.

Re: [PATCH] pgbench: add multiconnect option

From
Greg Stark
Date:
Hi guys,

It looks like David sent a patch and Fabien sent a followup patch. But
there hasn't been a whole lot of discussion or further patches.

It sounds like there are some basic questions about what the right
interface should be. Are there specific questions that would be
helpful for moving forward?



Re: [PATCH] pgbench: add multiconnect option

From
Fabien COELHO
Date:
Hello Greg,

> It looks like David sent a patch and Fabien sent a followup patch. But
> there hasn't been a whole lot of discussion or further patches.
>
> It sounds like there are some basic questions about what the right
> interface should be. Are there specific questions that would be
> helpful for moving forward?

Review the designs and patches and tell us what you think?

Personally, I think that allowing multiple connections is a good thing, 
especially if the code impact is reduced, which is the case with the 
version I sent.

Then for me the next step would be to have a reconnection on errors so as 
to implement a client-side failover policy that could help test the 
performance impact of a server failover. I have done that internally but it 
requires the "Pgbench Serialization and deadlock errors" patch to land, as 
it would just be another error that can be handled.

-- 
Fabien.



Re: [PATCH] pgbench: add multiconnect option

From
"Imseih (AWS), Sami"
Date:
The current version of the patch does not apply, so I could not test it.

Here are some comments I have.

Pgbench is a simple benchmark tool by design, and I wonder if adding 
a multiconnect feature will cause pgbench to be used incorrectly.
A real world use-case will be helpful for this thread.

For the current patch, should the report also cover per-database statistics (tps/latency/etc.)?

Regards,

Sami Imseih
Amazon Web Services



Re: [PATCH] pgbench: add multiconnect option

From
Fabien COELHO
Date:
Hi Sami,

> Pgbench is a simple benchmark tool by design, and I wonder if adding
> a multiconnect feature will cause pgbench to be used incorrectly.

Maybe, but I do not see how it would be worse than what pgbench already 
allows.

> A real world use-case will be helpful for this thread.

Basically, more versatile testing for setups with more than one host.

For instance, it would allow directly testing a multi-master setup, such 
as Bucardo, SymmetricDS or CockroachDB.

It would be a first step on the path to allow interesting features such 
as:

  - testing a failover setup: on connection error, a client could connect 
to another host.

  - testing a primary/standby setup, with write transactions sent to the 
primary and read transactions sent to the standbys.
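
A minimal sketch of the client-side failover idea, assuming a connector callback so that the example runs without a server. `connect_fn`, `connect_with_failover` and the stub `fake_connect_primary_down` are all hypothetical names; nothing like this is in the posted patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical connector: returns nonzero on success. */
typedef int (*connect_fn) (const char *conninfo);

/*
 * Try each candidate starting at index "start", wrapping around, and return
 * the index of the first one that connects, or -1 if all of them fail.
 */
static int
connect_with_failover(const char **conninfos, size_t count, size_t start,
                      connect_fn try_connect)
{
    assert(conninfos != NULL && try_connect != NULL);

    for (size_t i = 0; i < count; i++)
    {
        size_t idx = (start + i) % count;

        if (try_connect(conninfos[idx]))
            return (int) idx;
    }
    return -1;
}

/* Demo stub: pretends that only "service=primary" is unreachable. */
static int
fake_connect_primary_down(const char *conninfo)
{
    return strcmp(conninfo, "service=primary") != 0;
}
```

With the candidates `service=primary` and `service=standby1` and the stub above, the call falls through to index 1; with only the primary listed it returns -1.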

Basically I have no doubt that it can be useful.

> For the current patch, Should the report also cover per-database 
> statistics (tps/latency/etc.) ?

That could be a "per-connection" option. If there is a reasonable use case 
I think that it would be an easy enough feature to implement.

Attached a rebased version.

-- 
Fabien.
Attachment

Re: [PATCH] pgbench: add multiconnect option

From
David Christensen
Date:
On Sat, Mar 19, 2022 at 11:43 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> Hi Sami,
>
> > Pgbench is a simple benchmark tool by design, and I wonder if adding
> > a multiconnect feature will cause pgbench to be used incorrectly.
>
> Maybe, but I do not see how it would be worse than what pgbench already
> allows.

I agree that pgbench is simple; perhaps really too simple when it comes to being able to measure much more than basic query flows.  What pgbench does have in its favor is being distributed with the core distribution.

I think there is definitely space for a more complicated benchmarking tool that exercises more scenarios and more realistic query patterns and scenarios.  Whether that is distributed with the core is another question.

David
 

Re: [PATCH] pgbench: add multiconnect option

From
Fabien COELHO
Date:
>>> Pgbench is a simple benchmark tool by design, and I wonder if adding
>>> a multiconnect feature will cause pgbench to be used incorrectly.
>>
>> Maybe, but I do not see how it would be worse than what pgbench already
>> allows.
>>
>
> I agree that pgbench is simple; perhaps really too simple when it comes to
> being able to measure much more than basic query flows.  What pgbench does
> have in its favor is being distributed with the core distribution.
>
> I think there is definitely space for a more complicated benchmarking tool
> that exercises more scenarios and more realistic query patterns and
> scenarios.  Whether that is distributed with the core is another question.

As far as this feature is concerned, the source code impact of the patch 
is very small, so I do not think it is worth barring this feature on 
that ground.

-- 
Fabien.



Re: [PATCH] pgbench: add multiconnect option

From
Greg Stark
Date:
According to the cfbot this patch needs a rebase



Re: [PATCH] pgbench: add multiconnect option

From
Fabien COELHO
Date:
> According to the cfbot this patch needs a rebase

Indeed. v4 attached.

-- 
Fabien.
Attachment

Re: [PATCH] pgbench: add multiconnect option

From
Ian Lawrence Barwick
Date:
2022年4月2日(土) 22:35 Fabien COELHO <coelho@cri.ensmp.fr>:
>
>
> > According to the cfbot this patch needs a rebase
>
> Indeed. v4 attached.

Hi

cfbot reports the patch no longer applies.  As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.

Thanks

Ian Barwick



Re: [PATCH] pgbench: add multiconnect option

From
Fabien COELHO
Date:
Hello Ian,

> cfbot reports the patch no longer applies.  As CommitFest 2022-11 is
> currently underway, this would be an excellent time to update the patch.

Attached a v5 which is just a rebase.

-- 
Fabien.
Attachment

Re: [PATCH] pgbench: add multiconnect option

From
Jelte Fennema
Date:
This patch seems to have quite some use case overlap with my patch which adds load balancing to libpq itself: https://www.postgresql.org/message-id/flat/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.com

My patch is only able to add "random" load balancing though, not "round-robin". So this patch still definitely seems useful, even when mine gets merged.

I'm not sure that the support for the "working" connection is necessary from a feature perspective though (usability/discoverability is another question). It's already possible to achieve the same behaviour by simply providing multiple host names in the connection string. You can even tell libpq to connect to a primary or secondary by using the target_session_attrs option.
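
A small sketch of such a connection string, built in C. The keywords `host`, `dbname` and `target_session_attrs` are real libpq parameters; the helper `build_multihost_conninfo` and the host names used in the example are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a conninfo string listing several hosts (comma-separated) plus a
 * target_session_attrs value.  Returns 0 on success, -1 if buf is too small.
 */
static int
build_multihost_conninfo(char *buf, size_t buflen, const char *hosts,
                         const char *dbname, const char *attrs)
{
    int n;

    assert(buf != NULL && hosts != NULL && dbname != NULL && attrs != NULL);

    n = snprintf(buf, buflen, "host=%s dbname=%s target_session_attrs=%s",
                 hosts, dbname, attrs);

    return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

For example, hosts `pg1,pg2`, database `bench` and attribute `read-write` give `host=pg1,pg2 dbname=bench target_session_attrs=read-write`, which libpq tries host by host until one accepts read-write sessions.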

On Fri, 6 Jan 2023 at 11:33, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> Hello Ian,
>
> > cfbot reports the patch no longer applies.  As CommitFest 2022-11 is
> > currently underway, this would be an excellent time to update the patch.
>
> Attached a v5 which is just a rebase.
>
> --
> Fabien.

Re: [PATCH] pgbench: add multiconnect option

From
Fabien COELHO
Date:
Hello Jelte,

> This patch seems to have quite some use case overlap with my patch which
> adds load balancing to libpq itself:
>
> https://www.postgresql.org/message-id/flat/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.com

Thanks for the pointer.

The end purpose of the patch is to allow pgbench to follow a failover at 
some point, at the client level, AFAICR.

> My patch is only able to add "random" load balancing though, not
> "round-robin". So this patch still definitely seems useful, even when mine
> gets merged.

Yep. I'm not sure the end purpose is the same, but possibly the pgbench 
patch could take advantage of the libpq extension.

> I'm not sure that the support for the "working" connection is necessary
> from a feature perspective though (usability/discoverability is another
> question). It's already possible to achieve the same behaviour by simply
> providing multiple host names in the connection string. You can even tell
> libpq to connect to a primary or secondary by using the
> target_session_attrs option.

-- 
Fabien.



Re: [PATCH] pgbench: add multiconnect option

From
vignesh C
Date:
On Tue, 8 Nov 2022 at 02:16, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
>
> Hello Ian,
>
> > cfbot reports the patch no longer applies.  As CommitFest 2022-11 is
> > currently underway, this would be an excellent time to update the patch.
>
> Attached a v5 which is just a rebase.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
3c6fc58209f24b959ee18f5d19ef96403d08f15c ===
=== applying patch ./pgbench-multi-connect-conninfo-5.patch
(Stripping trailing CRs from patch; use --binary to disable.)
patching file doc/src/sgml/ref/pgbench.sgml
Hunk #3 FAILED at 921.
1 out of 3 hunks FAILED -- saving rejects to file
doc/src/sgml/ref/pgbench.sgml.rej

[1] - http://cfbot.cputube.org/patch_41_3227.log

Regards,
Vignesh



Re: [PATCH] pgbench: add multiconnect option

From
vignesh C
Date:
On Wed, 11 Jan 2023 at 22:17, vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 8 Nov 2022 at 02:16, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> >
> >
> > Hello Ian,
> >
> > > cfbot reports the patch no longer applies.  As CommitFest 2022-11 is
> > > currently underway, this would be an excellent time to update the patch.
> >
> > Attached a v5 which is just a rebase.
>
> The patch does not apply on top of HEAD as in [1], please post a rebased patch:
> === Applying patches on top of PostgreSQL commit ID
> 3c6fc58209f24b959ee18f5d19ef96403d08f15c ===
> === applying patch ./pgbench-multi-connect-conninfo-5.patch
> (Stripping trailing CRs from patch; use --binary to disable.)
> patching file doc/src/sgml/ref/pgbench.sgml
> Hunk #3 FAILED at 921.
> 1 out of 3 hunks FAILED -- saving rejects to file
> doc/src/sgml/ref/pgbench.sgml.rej

There have been no updates on this thread for some time, so this has
been marked as Returned with Feedback. Feel free to reopen it
in the next commitfest if you plan to continue working on this.

Regards,
Vignesh