Re: Support load balancing in libpq - Mailing list pgsql-hackers
From | Michael Banck |
---|---|
Subject | Re: Support load balancing in libpq |
Date | |
Msg-id | 63b85894.170a0220.304c7.19a6@mx.google.com Whole thread Raw |
In response to | Re: Support load balancing in libpq (Jelte Fennema <Jelte.Fennema@microsoft.com>) |
Responses |
Re: [EXTERNAL] Re: Support load balancing in libpq
|
List | pgsql-hackers |
Hi, On Tue, Nov 29, 2022 at 02:57:08PM +0000, Jelte Fennema wrote: > Attached is a new version with the tests cleaned up a bit (more > comments mostly). > > @Michael, did you have a chance to look at the last version? Because I > feel that the patch is pretty much ready for a committer to look at, > at this point. I had another look; it still applies and tests pass. I still find the distribution over three hosts a bit skewed (even when OpenSSL is enabled, which wasn't the case when I first tested it), but couldn't find a general fault and it worked well enough in my testing. I wonder whether making the parameter a boolean will paint us into a corner, and whether maybe additional modes could be envisioned in the future, but I can't think of some right now (you can pretty neatly restrict load-balancing to standbys by setting target_session_attrs=standby in addition). Maybe a way to change the behaviour if a dns hostname is backed by multiple entries? Some further (mostly nitpicking) comments on the patch: > From 6e20bb223012b666161521b5e7249c066467a5f3 Mon Sep 17 00:00:00 2001 > From: Jelte Fennema <github-tech@jeltef.nl> > Date: Mon, 12 Sep 2022 09:44:06 +0200 > Subject: [PATCH v5] Support load balancing in libpq > > Load balancing connections across multiple read replicas is a pretty > common way of scaling out read queries. There are two main ways of doing > so, both with their own advantages and disadvantages: > 1. Load balancing at the client level > 2. Load balancing by connecting to an intermediary load balancer > > Both JBDC (Java) and Npgsql (C#) already support client level load > balancing (option #1). This patch implements client level load balancing > for libpq as well. To stay consistent with the JDBC and Npgsql part of > the ecosystem, a similar implementation and name for the option are > used. I still think all of the above has no business in the commit message, though maybe the first paragraph can stay as introduction. > It contains two levels of load balancing: > 1. The given hosts are randomly shuffled, before resolving them > one-by-one. > 2. Once a host its addresses get resolved, those addresses are shuffled, > before trying to connect to them one-by-one. That's fine. What should be in the commit message is at least a mention of what the new connection parameter is called and possibly what is done to accomplish it. But the committer will pick this up if needed I guess. > diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml > index f9558dec3b..6ce7a0c9cc 100644 > --- a/doc/src/sgml/libpq.sgml > +++ b/doc/src/sgml/libpq.sgml > @@ -1316,6 +1316,54 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname > </listitem> > </varlistentry> > > + <varlistentry id="libpq-load-balance-hosts" xreflabel="load_balance_hosts"> > + <term><literal>load_balance_hosts</literal></term> > + <listitem> > + <para> > + Controls whether the client load balances connections across hosts and > + adresses. The default value is 0, meaning off, this means that hosts are > + tried in order they are provided and addresses are tried in the order > + they are received from DNS or a hosts file. If this value is set to 1, > + meaning on, the hosts and addresses that they resolve to are tried in > + random order. Subsequent queries once connected will still be sent to > + the same server. Setting this to 1, is mostly useful when opening > + multiple connections at the same time, possibly from different machines. > + This way connections can be load balanced across multiple Postgres > + servers. > + </para> > + <para> > + When providing multiple hosts, these hosts are resolved in random order. > + Then if that host resolves to multiple addresses, these addresses are > + connected to in random order. Only once all addresses for a single host > + have been tried, the addresses for the next random host will be > + resolved. This behaviour can lead to non-uniform address selection in > + certain cases. Such as when not all hosts resolve to the same number of > + addresses, or when multiple hosts resolve to the same address. So if you > + want uniform load balancing, this is something to keep in mind. However, > + non-uniform load balancing also has usecases, e.g. providing the > + hostname of a larger server multiple times in the host string so it gets > + more requests. > + </para> > + <para> > + When using this setting it's recommended to also configure a reasonable > + value for <xref linkend="libpq-connect-connect-timeout"/>. Because then, > + if one of the nodes that are used for load balancing is not responding, > + a new node will be tried. > + </para> > + </listitem> > + </varlistentry> I think this whole section is generally fine, but needs some copyediting. > + <varlistentry id="libpq-random-seed" xreflabel="random_seed"> > + <term><literal>random_seed</literal></term> > + <listitem> > + <para> > + Sets the random seed that is used by <xref linkend="libpq-load-balance-hosts"/> > + to randomize the host order. This option is mostly useful when running > + tests that require a stable random order. > + </para> > + </listitem> > + </varlistentry> I wonder whether this needs to be documented if it is mostly a development/testing parameter? > diff --git a/src/include/libpq/pqcomm.h b/src/include/libpq/pqcomm.h > index fcf68df39b..39e93b1392 100644 > --- a/src/include/libpq/pqcomm.h > +++ b/src/include/libpq/pqcomm.h > @@ -27,6 +27,12 @@ typedef struct > socklen_t salen; > } SockAddr; > > +typedef struct > +{ > + int family; > + SockAddr addr; > +} AddrInfo; That last line looks weirdly indented compared to SockAddr; in the struct above. > /* Configure the UNIX socket location for the well known port. */ > > #define UNIXSOCK_PATH(path, port, sockdir) \ > diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c > index f88d672c6c..b4d3613713 100644 > --- a/src/interfaces/libpq/fe-connect.c > +++ b/src/interfaces/libpq/fe-connect.c > @@ -241,6 +241,14 @@ static const internalPQconninfoOption PQconninfoOptions[] = { > "Fallback-Application-Name", "", 64, > offsetof(struct pg_conn, fbappname)}, > > + {"load_balance_hosts", NULL, NULL, NULL, > + "Load-Balance", "", 1, /* should be just '0' or '1' */ > + offsetof(struct pg_conn, loadbalance)}, > + > + {"random_seed", NULL, NULL, NULL, > + "Random-Seed", "", 10, /* strlen(INT32_MAX) == 10 */ > + offsetof(struct pg_conn, randomseed)}, > + > {"keepalives", NULL, NULL, NULL, > "TCP-Keepalives", "", 1, /* should be just '0' or '1' */ > offsetof(struct pg_conn, keepalives)}, > @@ -379,6 +387,7 @@ static bool fillPGconn(PGconn *conn, PQconninfoOption *connOptions); > static void freePGconn(PGconn *conn); > static void closePGconn(PGconn *conn); > static void release_conn_addrinfo(PGconn *conn); > +static bool store_conn_addrinfo(PGconn *conn, struct addrinfo *addrlist); > static void sendTerminateConn(PGconn *conn); > static PQconninfoOption *conninfo_init(PQExpBuffer errorMessage); > static PQconninfoOption *parse_connection_string(const char *connstr, > @@ -424,6 +433,9 @@ static void pgpassfileWarning(PGconn *conn); > static void default_threadlock(int acquire); > static bool sslVerifyProtocolVersion(const char *version); > static bool sslVerifyProtocolRange(const char *min, const char *max); > +static int loadBalance(PGconn *conn); > +static bool parse_int_param(const char *value, int *result, PGconn *conn, > + const char *context); > > > /* global variable because fe-auth.c needs to access it */ > @@ -1007,6 +1019,46 @@ parse_comma_separated_list(char **startptr, bool *more) > return p; > } > > +/* > + * Initializes the prng_state field of the connection. We want something > + * unpredictable, so if possible, use high-quality random bits for the > + * seed. Otherwise, fall back to a seed based on timestamp and PID. > + */ > +static bool > +libpq_prng_init(PGconn *conn) > +{ > + if (unlikely(conn->randomseed)) > + { > + int rseed; > + > + if (!parse_int_param(conn->randomseed, &rseed, conn, "random_seed")) > + { > + return false; > + }; I think it's project policy to drop the braces for single statements in if blocks. > + pg_prng_seed(&conn->prng_state, rseed); > + } > + else if (unlikely(!pg_prng_strong_seed(&conn->prng_state))) > + { > + uint64 rseed; > + time_t now = time(NULL); > + > + /* > + * Since PIDs and timestamps tend to change more frequently in their > + * least significant bits, shift the timestamp left to allow a larger > + * total number of seeds in a given time period. Since that would > + * leave only 20 bits of the timestamp that cycle every ~1 second, > + * also mix in some higher bits. > + */ > + rseed = ((uint64) getpid()) ^ > + ((uint64) now << 12) ^ > + ((uint64) now >> 20); > + > + pg_prng_seed(&conn->prng_state, rseed); > + } > + return true; > +} > + > + > /* Additional newline. > @@ -1164,6 +1217,36 @@ connectOptions2(PGconn *conn) > } > } > > + if (loadbalancehosts < 0) > + { > + appendPQExpBufferStr(&conn->errorMessage, > + libpq_gettext("loadbalance parameter must be an integer\n")); > + return false; > + } > + > + if (loadbalancehosts) > + { > + if (!libpq_prng_init(conn)) > + { > + return false; > + } > + > + /* > + * Shuffle connhost with a Durstenfeld/Knuth version of the > + * Fisher-Yates shuffle. Source: > + * https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm > + */ > + for (i = conn->nconnhost - 1; i > 0; i--) > + { > + int j = pg_prng_uint64_range(&conn->prng_state, 0, i); > + pg_conn_host temp = conn->connhost[j]; > + > + conn->connhost[j] = conn->connhost[i]; > + conn->connhost[i] = temp; > + } > + } > + > + > /* Additional newline. > @@ -1726,6 +1809,27 @@ connectFailureMessage(PGconn *conn, int errorno) > libpq_append_conn_error(conn, "\tIs the server running on that host and accepting TCP/IP connections?"); > } > > +/* > + * Should we load balance across hosts? Returns 1 if yes, 0 if no, and -1 if > + * conn->loadbalance is set to a value which is not parseable as an integer. > + */ > +static int > +loadBalance(PGconn *conn) > +{ > + char *ep; > + int val; > + > + if (conn->loadbalance == NULL) > + { > + return 0; > + } Another case of additional braces. > + val = strtol(conn->loadbalance, &ep, 10); > + if (*ep) > + return -1; > + return val != 0 ? 1 : 0; > +} > + > + > /* Additional newline. > @@ -4041,6 +4154,63 @@ freePGconn(PGconn *conn) > free(conn); > } > > + > +/* Additional newline. > + * Copies over the AddrInfos from addrlist to the PGconn. > + */ > +static bool > +store_conn_addrinfo(PGconn *conn, struct addrinfo *addrlist) > +{ > + struct addrinfo *ai = addrlist; > + > + conn->whichaddr = 0; > + > + conn->naddr = 0; > + while (ai) > + { > + ai = ai->ai_next; > + conn->naddr++; > + } > + > + conn->addr = calloc(conn->naddr, sizeof(AddrInfo)); > + if (conn->addr == NULL) > + { > + return false; > + } Additional braces. > @@ -4048,11 +4218,10 @@ freePGconn(PGconn *conn) > static void > release_conn_addrinfo(PGconn *conn) > { > - if (conn->addrlist) > + if (conn->addr) > { > - pg_freeaddrinfo_all(conn->addrlist_family, conn->addrlist); > - conn->addrlist = NULL; > - conn->addr_cur = NULL; /* for safety */ > + free(conn->addr); > + conn->addr = NULL; > } > } > > diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h > index 512762f999..76ee988038 100644 > --- a/src/interfaces/libpq/libpq-int.h > +++ b/src/interfaces/libpq/libpq-int.h > @@ -82,6 +82,8 @@ typedef struct > #endif > #endif /* USE_OPENSSL */ > > +#include "common/pg_prng.h" > + > /* > * POSTGRES backend dependent Constants. > */ > @@ -373,6 +375,8 @@ struct pg_conn > char *pgpassfile; /* path to a file containing password(s) */ > char *channel_binding; /* channel binding mode > * (require,prefer,disable) */ > + char *loadbalance; /* load balance over hosts */ > + char *randomseed; /* seed for randomization of load balancing */ > char *keepalives; /* use TCP keepalives? */ A bit unclear why you put the variables at this point in the list, but the list doesn't seem to be ordered strictly anyway; still, maybe they would fit best at the bottom below target_session_attrs? > char *keepalives_idle; /* time between TCP keepalives */ > char *keepalives_interval; /* time between TCP keepalive > @@ -461,8 +465,10 @@ struct pg_conn > PGTargetServerType target_server_type; /* desired session properties */ > bool try_next_addr; /* time to advance to next address/host? */ > bool try_next_host; /* time to advance to next connhost[]? */ > - struct addrinfo *addrlist; /* list of addresses for current connhost */ > - struct addrinfo *addr_cur; /* the one currently being tried */ > + int naddr; /* number of addrs returned by getaddrinfo */ > + int whichaddr; /* the addr currently being tried */ Address(es) is always spelt out in the comments, those two addr(s) should also I think. > diff --git a/src/interfaces/libpq/t/003_loadbalance.pl b/src/interfaces/libpq/t/003_loadbalance.pl > new file mode 100644 > index 0000000000..07eddbe9cc > --- /dev/null > +++ b/src/interfaces/libpq/t/003_loadbalance.pl > @@ -0,0 +1,167 @@ > +# Copyright (c) 2022, PostgreSQL Global Development Group Copyright bump needed. Cheers, Michael
pgsql-hackers by date: