Re: [EXTERNAL] Support load balancing in libpq - Mailing list pgsql-hackers
From | Daniel Gustafsson |
---|---|
Subject | Re: [EXTERNAL] Support load balancing in libpq |
Date | |
Msg-id | BCEC6618-4C6E-41CE-9BB7-EC2A6126EE37@yesql.se Whole thread Raw |
In response to | Re: [EXTERNAL] Re: Support load balancing in libpq (Jelte Fennema <postgres@jeltef.nl>) |
Responses |
Re: [EXTERNAL] Support load balancing in libpq
|
List | pgsql-hackers |
In general I think this feature makes sense (which has been echoed many times in the thread), and the implementation strikes a good balance of robustness and simplicity. Reading this I think it's very close to being committable, but I have a few comments on the patch series: + sent to the same server. There are currently two modes: The documentation lists the modes disabled and random, but I wonder if it's worth expanding the docs to mention that "disabled" is pretty much a round robin load balancing scheme? It reads a bit odd to present load balancing without a mention of round robin balancing given how common it is. - conn->addrlist_family = hint.ai_family = AF_UNSPEC; + hint.ai_family = AF_UNSPEC; This removes all uses of conn->addrlist_family and that struct member can be removed. + to, for example, load balance over stanby servers only. Once successfully s/stanby/standby/ + Postgres servers. s/Postgres/<productname>PostgreSQL</productname>/ + more addresses than others. So if you want uniform load balancing, + this is something to keep in mind. However, non-uniform load + balancing can also be used to your advantage, e.g. by providing the The documentation typically use a less personal form, I would suggest something along the lines of: "If uniform load balancing is required then an external load balancing tool must be used. Non-uniform load balancing can also be used to skew the results, e.g. by providing the.." + if (!libpq_prng_init(conn)) + return false; This needs to set a returned error message with libpq_append_conn_error (feel free to come up with a better wording of course): libpq_append_conn_error(conn, "unable to initiate random number generator"); -#ifndef WIN32 +/* MinGW has sys/time.h, but MSVC doesn't */ +#ifndef _MSC_VER #include <sys/time.h> This seems unrelated to the patch in question, and should be a separate commit IMO. + LOAD_BALANCE_RANDOM, /* Read-write server */ I assume this comment is a copy/paste and should have been reflecting random order? +++ b/src/interfaces/libpq/t/003_loadbalance_host_list.pl Nitpick, but we should probably rename this load_balance to match the parameter being tested. +++ b/src/interfaces/libpq/t/004_loadbalance_dns.pl On the subject of tests, I'm a bit torn. I appreciate that the patch includes a thorough test, but I'm not convinced we should add that to the tree. A test which require root permission level manual system changes stand a very low chance of ever being executed, and as such will equate to dead code that may easily be broken or subtly broken. I am also not a fan of the random_seed parameter. A parameter only useful for testing, and which enables testing by circumventing the mechanism to test (making randomness predictable), seems like expensive bagage to carry around. From experience we also know this risks ending up in production configs for all the wrong reasons. Given the implementation of this feature, the actual connection phase isn't any different from just passing multiple hostnames and operating in the round-robin fashion. What we want to ensure is that the randomization isn't destroying the connection array. Let's see what we can do while still having tests that can be executed in the buildfarm. A few ideas: * Add basic tests for the load_balance_host connection param only accepting sane values * Alter the connect_ok() tests in 003_loadbalance_host_list.pl to not require random_seed but instead using randomization. Thinking out loud, how about something along these lines? - Passing a list of unreachable hostnames with a single good hostname should result in a connection. - Passing a list of one good hostname should result in a connection - Passing a list on n good hostname (where all are equal) should result in a connection - Passing a list of n unreachable hostnames should result in log entries for n broken resolves in some order, and running that test n' times shouldn't - statistically - result in the same order for a large enough n'. * Remove random_seed and 004_loadbalance_dns.pl Thoughts? -- Daniel Gustafsson
pgsql-hackers by date: