Thread: PATCH: pgbench - option to build using ppoll() for larger connectioncounts
[trying again for 2018-01]
This patch enables building pgbench to use ppoll() instead of select()
to allow for more than (FD_SETSIZE - 10) connections. As implemented,
when using ppoll(), the only connection limitation is system resources.
The patch has been implemented to introduce a minimal of #ifdef/#ifndef
clutter in the code.
Without this patch, one is limited to '(FD_SETSIZE - 10)’ number of connections.
Example of something that fails without this patch but works with the patch:
Without the patch:
$ pgbench -j 3000 -c 1500
invalid number of clients: "1500"
With the patch:
$ pgbench -j 3000 -c 1500
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 2000
query mode: simple
number of clients: 1500
number of threads: 1500
number of transactions per client: 10
number of transactions actually processed: 15000/15000
latency average = 631.730 ms
tps = 2374.430587 (including connections establishing)
tps = 4206.524986 (excluding connections establishing)
doug
--
Doug Rady
Amazon Aurora, RDS PostgreSQL
radydoug@amazon.com
Attachment
Re: PATCH: pgbench - option to build using ppoll() for largerconnection counts
On Tue, Nov 28, 2017 at 10:38 AM, Rady, Doug <radydoug@amazon.com> wrote: > This patch enables building pgbench to use ppoll() instead of select() > to allow for more than (FD_SETSIZE - 10) connections. As implemented, > when using ppoll(), the only connection limitation is system resources. IIUC, ppoll() is to poll() as pselect() is to select(). Since the existing code uses select(), why not convert to poll() rather than ppoll()? +#ifdef USE_PPOLL +#define SCKTWTMTHD "ppoll" +#undef MAXCLIENTS +#define POLL_EVENTS (POLLIN|POLLPRI|POLLRDHUP) +#define POLL_FAIL (POLLERR|POLLHUP|POLLNVAL|POLLRDHUP) +#define PFD_RESETEV(s) { do { if ((s)->pfdp) { (s)->pfdp->events = POLL_EVENTS; (s)->pfdp->revents = 0; } } while(0); } +#define PFD_ZERO(s) { do { if ((s)->pfdp) { (s)->pfdp->fd = -1; (s)->pfdp->events = POLL_EVENTS; (s)->pfdp->revents = 0; } } while(0); } +#define PFD_SETFD(s) { do { (s)->pfdp->fd = PQsocket((s)->con); } while(0); } +#define PFD_STRUCT_POLLFD(p) struct pollfd (p); +#define PFD_THREAD_FREE(t) { do { if ((t)->pfds) { pg_free((t)->pfds); (t)->pfds = NULL; } } while (0); } +#define PFD_THREAD_INIT(t,s,n) { do { int _i; (t)->pfds = (struct pollfd *) pg_malloc0(sizeof(struct pollfd) * (n)); for (_i = 0; _i < (n); _i++) { (s)[_i].pfdp = &(t)->pfds[_i]; (s)[_i].pfdp->fd = -1; } } while (0); } +#else +#define SCKTWTMTHD "select" +#define PFD_RESETEV(s) +#define PFD_ZERO(s) +#define PFD_SETFD(s) +#define PFD_STRUCT_POLLFD(p) +#define PFD_THREAD_FREE(t) +#define PFD_THREAD_INIT(t,s,n) +#endif I find this an impenetrable mess. I am not going to commit a macro with a ten-letter name that contains neither punctuation marks nor vowels, nor one that is a single 200+ character that embeds a malloc and a loop call but no comments. While I agree with the general goal of avoiding introducing lots of #ifdefs, and while I think that the introduction of finishCon() seems like a possibly-useful step in that direction, I don't think that having a bunch of long, complex, comment-free macros like this actually improves code clarity. Better to have some embedded #ifdefs than this. + if (debug) + fprintf(stderr, "client %d connecting ...\n", + st->id); + This is unrelated to the purpose of the patch. Please don't mix in unrelated changes. + if (debug) + fprintf(stderr, "client %d connecting\n", + state[i].id); Ditto. IMHO, this looks like a broadly reasonable thing to do. I had no idea that FD_SETSIZE was ever set to a value much less than the number of FDs the system could handle -- but if it is, then this seems like a reasonable fix, once we agree on exactly how the details should look. I don't remember off-hand whether you've done testing to see how poll()/ppoll() performs compares to select(), but that might also be something to check. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: PATCH: pgbench - option to build using ppoll() for largerconnection counts
Hello, > This patch enables building pgbench to use ppoll() instead of select() > to allow for more than (FD_SETSIZE - 10) connections. As implemented, > when using ppoll(), the only connection limitation is system resources. I'm fine with allowing more clients through ppoll, as large multi/many/whatever-core hardware becomes more common and is expected to grow. However, I'm at odds with the how and the ppoll/select compatibility layer offered by the patch, and the implied maintenance cost just to understand what is happening while reading the code cluttered with possibly empty macros and ifdef stuff. There are many ifdef, many strange/astute macros, significant dissymetry in the code, which reduce maintainability noticeably. I think that it should abstract cleanly the two implementations into a set of functions with the same signatures, even if some arguments are unsused, and use these functions without ifdef stuff later. Maybe some macros too, ok, but not too much. "SCKTWTMTHD"... WTH? I do not want to know what it means. I'm sure a better name (if something without vowels can be a name of anything in English...) can be found. Names must be chosen with great care. MAXCLIENTS: could be set to -1 when there is no limit, and tested to avoid one ifdef. I must admit that I do not see the benefit of "{ do { ... } while(0); }" compared to "{ ... }". Maybe it has to do with compiler warnings. > The patch has been implemented to introduce a minimal of #ifdef/#ifndef > clutter in the code. I think that it could be much more minimal than that. I would aim at having just one. #ifdef USE_PPOLL specific functions definitions some macros #else /* SELECT */ idem for select #endif Ok, it may not be the best solution everywhere, but it should be significantly reduce compared to the current state. ISTM that ifdef which breaks the code structure should be avoided, eg #ifdef ... if (...) #else if (...) #endif { // condition was true... It is unclear to me why the input_mask is declared in the threadRun function (so is thread specific) but pfds is in each thread struct (so is also thread specific, but differently). The same logic should be used for both, which ever is best. Not sure that "pfds" is the right name. If the two variables means the same thing, they should have the same name, although possibly different types. > $ pgbench -j 3000 -c 1500 Probably you intended 1500 threads for 3000 clients. -- Fabien.
Re: PATCH: pgbench - option to build using ppoll() for largerconnection counts
On Wed, Nov 29, 2017 at 3:40 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > ever is best. Not sure that "pfds" is the right name. If the two variables > means the same thing, they should have the same name, although possibly > different types. Although I agree with a good bit of what you say here, I don't agree with that. If the member used by ppoll() (or just poll()) has a different name than the one used for select(), it's much easier to, say, grep for everyplace where the field you care about is used. If you use the same name for different things, that doesn't work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: PATCH: pgbench - option to build using ppoll() for largerconnection counts
Hello Robert, >> ever is best. Not sure that "pfds" is the right name. If the two variables >> means the same thing, they should have the same name, although possibly >> different types. > > Although I agree with a good bit of what you say here, I don't agree > with that. If the member used by ppoll() (or just poll()) has a > different name than the one used for select(), it's much easier to, > say, grep for everyplace where the field you care about is used. If > you use the same name for different things, that doesn't work. I'm not sure it is incompatible. My point is consistent with my other advice which is to hide the stuff in functions with identical (or compatible) signatures, so that the only place where it would differ would be in the functions, where greping would work. #ifdef USE_POLL // pool specific stuff #define SOME_TYPE v1_type (or typedef) void do_stuff(v1_type * stuff) { ... } ... #else /* default to SELECT */ // select specific stuff #define SOME_TYPE v2_type (idem) void do_stuff(v2_type *stuff) { ... } ... #endif Then later the code is not specific to poll or select, eg: SOME_TYPE mask; do_stuff(mask); do_other_stuff(...); if (is_ready(mask, ...)) { ... } -- Fabien.
Re: PATCH: pgbench - option to build using ppoll() for largerconnection counts
On Wed, Nov 29, 2017 at 8:10 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > My point is consistent with my other advice which is to hide the stuff in > functions with identical (or compatible) signatures, so that the only place > where it would differ would be in the functions, where greping would work. > > #ifdef USE_POLL > // pool specific stuff > #define SOME_TYPE v1_type (or typedef) > void do_stuff(v1_type * stuff) { ... } > ... > #else /* default to SELECT */ > // select specific stuff > #define SOME_TYPE v2_type (idem) > void do_stuff(v2_type * stuff) { ... } > ... > #endif > > Then later the code is not specific to poll or select, eg: > > SOME_TYPE mask; > do_stuff(mask); > do_other_stuff(...); > if (is_ready(mask, ...)) { ... } Yeah, that sort of style would be OK with me. But I wouldn't like: struct blah { #ifdef FOO int doohicky; #else char *doohicky; }; ...because now any place in the code where you see "doohicky" you don't immediately know whether it's an int or a char * -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: PATCH: pgbench - option to build using ppoll() for largerconnection counts
> [...] Yeah, that sort of style would be OK with me. But I wouldn't > like: > > struct blah { > #ifdef FOO > int doohicky; > #else > char *doohicky; > }; Indeed. Me neither. -- Fabien.
Re: PATCH: pgbench - option to build using ppoll() for largerconnection counts
Doug, * Rady, Doug (radydoug@amazon.com) wrote: > This patch enables building pgbench to use ppoll() instead of select() > to allow for more than (FD_SETSIZE - 10) connections. As implemented, > when using ppoll(), the only connection limitation is system resources. Looks like this patch had some good feedback back when it was posted, but we haven't seen any update to it based on that feedback and we're now nearly a week into the January commitfest. In order to keep the commitfest moving and to give you the best opportunity at having the patch be reviewed and committed during this cycle, I'd suggest trying to find some time soon to incorporate the recommendations provided and post an updated patch for review. Thanks! Stephen