Re: [PATCH] pgbench: add multiconnect option - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: [PATCH] pgbench: add multiconnect option
Date
Msg-id alpine.DEB.2.22.394.2108281055020.3654177@pseudo
In response to Re: [PATCH] pgbench: add multiconnect option  (David Christensen <david.christensen@crunchydata.com>)
Responses Re: [PATCH] pgbench: add multiconnect option
List pgsql-hackers
Hello David,

>>> round-robin and random make sense.  I am wondering how round-robin
>>> would work with -C, though?  Would you just reuse the same connection
>>> string as the one chosen at the starting point.
>>
>> Well, not necessarily, but this is debatable.
>
> My expectation for such a behavior would be that it would reconnect to
> a random connstring each time, otherwise what's the point of using
> this with -C?  If we needed to forbid some option combinations that is
> also an option.

Yep. ISTM that it should follow the connection policy/strategy, whatever 
it is.
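
To make this concrete, here is a minimal sketch of what "following the 
policy" on each reconnection could look like. The names (ConnPolicy, 
ClientConnState, next_conninfo) are hypothetical and are not taken from 
the patch, and rand() merely stands in for pgbench's own random generator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical policy names; the patch may spell these differently. */
typedef enum
{
    POLICY_ROUND_ROBIN,
    POLICY_RANDOM
} ConnPolicy;

/* Hypothetical per-client state. */
typedef struct
{
    int last;   /* index of the last conninfo used, -1 before the first connect */
} ClientConnState;

/*
 * Pick the index of the conninfo to use for the next (re)connection.
 * Under -C this would be called before every transaction, so the policy
 * applies on each reconnect rather than only at startup.
 */
static int
next_conninfo(ClientConnState *st, ConnPolicy policy, int nconninfos)
{
    assert(nconninfos > 0);

    if (policy == POLICY_RANDOM)
        st->last = rand() % nconninfos;     /* stand-in for pgbench's PRNG */
    else
        st->last = (st->last + 1) % nconninfos;

    return st->last;
}
```

With three connection strings and the round-robin policy, a client 
starting at last = -1 would connect to indexes 0, 1, 2, 0, and so on.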

>>>> I was thinking of allowing a list of conninfo strings with
>>>> repeated options, eg --conninfo "foo" --conninfo "bla"…
>>>
>>> That was my first thought when reading the subject of this thread:
>>> create a list of connection strings and pass one of them to
>>> doConnect() to grab the properties looked for.  That's a bit confusing
>>> though as pgbench does not support directly connection strings,
>>
>> They are supported because libpq silently assumes that "dbname" can be a
>> full connection string.
>>
>>> and we should be careful to keep fallback_application_name intact.
>>
>> Hmmm. See attached patch, ISTM that it does the right thing.
>
> I guess the multiple --conninfo approach is fine; I personally liked
> having the list come from a file, as you could benchmark different
> groups/clusters based on a file, much easier than constructing
> multiple pgbench invocations depending.  I can see an argument for
> both approaches.  The PGSERVICEFILE was an idea I'd had to store
> easily indexed groups of connection information in a way that I didn't
> need to know all the details, could easily parse, and could later pass
> in the ENV so libpq could just pull out the information.

The attached version does work with the service file if the user provides 
"service=whatever" on the command line. The main difference is that it 
sticks to the libpq policy to use an explicit connection string or list of 
connection strings.
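
For reference, the libpq rule being relied on is roughly the following: 
when dbname expansion is enabled, a dbname value that contains an "=" 
sign or starts with a connection URI prefix is treated as a connection 
string. The function below is only an illustration of that documented 
rule; looks_like_conninfo is a hypothetical name, not a libpq or pgbench 
function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Illustrative check (hypothetical helper): would libpq treat this
 * "dbname" value as a connection string rather than a plain database
 * name?
 */
static bool
looks_like_conninfo(const char *dbname)
{
    static const char uri_long[] = "postgresql://";
    static const char uri_short[] = "postgres://";

    assert(dbname != NULL);

    /* Connection URI prefix. */
    if (strncmp(dbname, uri_long, strlen(uri_long)) == 0 ||
        strncmp(dbname, uri_short, strlen(uri_short)) == 0)
        return true;

    /* keyword=value form, which includes "service=whatever". */
    return strchr(dbname, '=') != NULL;
}
```

So "service=whatever" goes through the connection-string path, while a 
bare name such as "bench" is still taken as a database name.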

Also, note that the patch I sent dropped the --conninfo option. 
Connections are simply the last arguments to pgbench.

> I'll see if I can take a look at your latest patch.

Thanks!

> I was also wondering about how we should handle `pgbench -i` with 
> multiple connection strings; currently it would only initialize with the 
> first DSN it gets, but it probably makes sense to run initialize against 
> all of the databases (or at least attempt to).

I tend to disagree on this one. Pgbench's whole expectation is to run 
against "one" system, which might be composed of several nodes because of 
replication. I do not think that it is desirable to jump to "several 
fully independent databases".

> Maybe this is one argument for the multiple --conninfo handling, since 
> you could explicitly pass the databases you want.  (Not that it is hard 
> to just loop over connection info and `pgbench -i` with ENV, or any 
> other number of ways to accomplish the same thing.)

Yep.

-- 
Fabien.
