Thread: [PATCH] add --throttle to pgbench (submission 3)
Add --throttle to pgbench

Each client is throttled to the specified rate, which can be expressed in tps or in time (s, ms, us). Throttling is achieved by scheduling transactions along a Poisson distribution.

This is an update of the previous proposal which fixes a typo in the sgml documentation.

The use case of the option is to be able to generate a continuous gentle load for functional tests, eg in a practice session with students or for testing features on a laptop.

--
Fabien.
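[Editorial illustration] Scheduling "along a Poisson distribution" can be read as drawing exponentially distributed gaps between transaction start times, so that the average rate matches the target while individual gaps vary. The sketch below is written under that assumption and is not the patch's C code; the function name `poisson_schedule` and its parameters are hypothetical.

```python
import random


def poisson_schedule(rate_tps, n, seed=None):
    """Return n increasing start times (in seconds) for one throttled client.

    Assumption: gaps between consecutive transactions are exponentially
    distributed with mean 1/rate_tps, which makes the arrivals a Poisson
    process. This is an illustration, not pgbench's implementation.
    """
    if rate_tps <= 0:
        raise ValueError("rate_tps must be positive")
    rng = random.Random(seed)
    now = 0.0
    starts = []
    for _ in range(n):
        # expovariate(lambd) draws a gap whose mean is 1/lambd seconds.
        now += rng.expovariate(rate_tps)
        starts.append(now)
    return starts
```

A client following such a schedule would sleep until each start time before issuing its next transaction; over a long run the mean gap approaches 1/rate_tps.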
On 5/1/13 4:57 AM, Fabien COELHO wrote:
> The use case of the option is to be able to generate a continuous gentle
> load for functional tests, eg in a practice session with students or for
> testing features on a laptop.

If you add this to https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll review it next month.

I have a lot of use cases for a pgbench that doesn't just run at 100% all the time. I had tried to simulate something with simple sleep calls, but I realized it was going to take a stronger math basis to do the job well.

The situations where I expect this to be useful all require collecting latency data and then both plotting it and doing some statistical analysis. pgbench-tools computes worst-case and 90th percentile latency for example, along with the graph over time. There's a useful concept that some of the official TPC tests have: how high can you get the throughput while still keeping the latency within certain parameters. Right now we have no way to simulate that. What we see with write-heavy pgbench is that latency goes crazy (>60 second commits sometimes) if all you do is hit the server with maximum throughput. That's interesting, but it's not necessarily relevant in many cases.

--
Greg Smith   2ndQuadrant US   greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
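[Editorial illustration] The analysis described above (worst-case and 90th percentile latency, then the highest throughput that keeps latency within a bound) could look roughly like the following. This is a minimal sketch, not pgbench-tools code: the function names, the nearest-rank percentile definition, and the shape of the `runs` mapping are all assumptions made for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def highest_rate_within(runs, pct, limit_ms):
    """runs maps a target rate (tps) to the latencies (ms) measured at it.

    Return the highest rate whose pct-th percentile latency stays at or
    below limit_ms, or None if no run qualifies.
    """
    acceptable = [rate for rate, latencies in runs.items()
                  if percentile(latencies, pct) <= limit_ms]
    return max(acceptable) if acceptable else None
```

Worst-case latency for a run is simply `max(latencies)`. With a throttled pgbench, each entry in `runs` would come from one run at a fixed target rate.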
Hello Greg,

> If you add this to
> https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll review it
> next month.

Ok. Thanks. I just did that.

> I have a lot of use cases for a pgbench that doesn't just run at 100%
> all the time. I had tried to simulate something with simple sleep
> calls, but I realized it was going to take a stronger math basis to do
> the job well.
>
> The situations where I expect this to be useful all require collecting
> latency data and then both plotting it and doing some statistical analysis.
> pgbench-tools computes worst-case and 90th percentile latency for example,
> along with the graph over time. There's a useful concept that some of the
> official TPC tests have: how high can you get the throughput while still
> keeping the latency within certain parameters. Right now we have no way to
> simulate that. What we see with write-heavy pgbench is that latency goes
> crazy (>60 second commits sometimes) if all you do is hit the server with
> maximum throughput. That's interesting, but it's not necessarily relevant in
> many cases.

Indeed. It is a good thing that my proposed feature can help in more situations than my particular need.

--
Fabien.
On 05/02/2013 12:56 AM, Greg Smith wrote:
> On 5/1/13 4:57 AM, Fabien COELHO wrote:
>> The use case of the option is to be able to generate a continuous gentle
>> load for functional tests, eg in a practice session with students or for
>> testing features on a laptop.
>
> If you add this to
> https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll
> review it next month. I have a lot of use cases for a pgbench that
> doesn't just run at 100% all the time.

As do I - in particular, if time permits I'll merge this patch into my working copy of pgbench so I can find the steady-state transaction rate where BDR replication's lag is stable and doesn't increase continually.

Right now I don't really have any way of doing that, only measuring how long it takes to catch up once the test run completes.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>>> The use case of the option is to be able to generate a continuous gentle
>>> load for functional tests, eg in a practice session with students or for
>>> testing features on a laptop.
>>
>> If you add this to
>> https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll
>> review it next month. I have a lot of use cases for a pgbench that
>> doesn't just run at 100% all the time.
> As do I - in particular, if time permits I'll merge this patch into my
> working copy of pgbench so I can find the steady-state transaction rate
> where BDR replication's lag is stable and doesn't increase continually.
> Right now I don't really have any way of doing that, only measuring how
> long it takes to catch up once the test run completes.

You can try to use and improve the --progress option in another patch submission which shows how things are going.

--
Fabien.
On 05/28/2013 04:13 PM, Fabien COELHO wrote:
>
> You can try to use and improve the --progress option in another patch
> submission which shows how things are going.

That'll certainly be useful, but won't solve this issue. The thing is that with asynchronous replication you need to know how long it takes until all nodes are back in sync, with no replication lag.

I can probably do it with a custom pgbench script, but I'm tempted to add support for timing that part separately with a "wait command" to run at the end of the benchmark.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>> You can try to use and improve the --progress option in another patch
>> submission which shows how things are going.
> That'll certainly be useful, but won't solve this issue. The thing is
> that with asynchronous replication you need to know how long it takes
> until all nodes are back in sync, with no replication lag.
> I can probably do it with a custom pgbench script, but I'm tempted to
> add support for timing that part separately with a "wait command" to run
> at the end of the benchmark.

ISTM that a separate process not related to pgbench should try to monitor the master-slave async lag, as it is interesting information anyway...

However I'm not sure that pg_stat_replication currently has the necessary information on either side to measure the lag (in time transactions, but how do I know when a transaction was committed? or number of transactions?).

--
Fabien.
On 05/28/2013 07:52 PM, Fabien COELHO wrote:
>
> However I'm not sure that pg_stat_replication currently has the
> necessary information on either side to measure the lag (in time
> transactions, but how do I know when a transaction was committed? or
> number of transactions?).

The BDR codebase now has a handy function to report when a transaction was committed, pg_get_transaction_committime(xid).

It also adds pg_xlog_wait_remote_apply and pg_xlog_wait_remote_receive that can be used with pg_current_xlog_location() to wait until one or all replicas have caught up, or with LSNs from pg_stat_replication to (say) wait until all replicas have caught up with the most up-to-date one.

I don't think these depend on anything BDR-specific, though Andres or Álvaro would be able to say for sure.

Take a look in:

    git://git.postgresql.org/git/users/andresfreund/postgres.git

on the 'bdr' branch. Be aware that it is rebased regularly, though the '0.4' tag applied earlier today will remain constant and contains the functions of interest.

I hope this helps.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
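[Editorial illustration] Independently of the BDR functions, the LSNs mentioned above can be compared on the client side to express lag in bytes (not in time or transactions, which is the gap this thread is discussing). The sketch below assumes LSNs arrive as text of the form `hi/lo` (two hexadecimal numbers) and that they map onto a flat 64-bit byte position; older servers used a different internal layout, so treat the arithmetic as approximate there. All function names are hypothetical.

```python
def lsn_to_int(lsn):
    """Convert an LSN printed as 'hi/lo' (hex/hex) to a byte position.

    Assumption: a flat 64-bit layout, with 'hi' as the upper 32 bits.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)


def lag_bytes(master_lsn, replica_lsn):
    """Bytes of WAL the replica still has to process (never negative)."""
    return max(0, lsn_to_int(master_lsn) - lsn_to_int(replica_lsn))


def all_caught_up(target_lsn, replica_lsns):
    """True once every replica position has reached target_lsn."""
    target = lsn_to_int(target_lsn)
    return all(lsn_to_int(lsn) >= target for lsn in replica_lsns)
```

A monitoring script could record the master position at the end of a benchmark run, then poll the replica positions and time how long it takes until `all_caught_up` returns true.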
On 05/30/2013 03:10 PM, Craig Ringer wrote:
> On 05/28/2013 07:52 PM, Fabien COELHO wrote:
>> However I'm not sure that pg_stat_replication currently has the
>> necessary information on either side to measure the lag (in time
>> transactions, but how do I know when a transaction was committed? or
>> number of transactions?).
> The BDR codebase now has a handy function to report when a transaction
> was committed, pg_get_transaction_committime(xid).
>
> It also adds pg_xlog_wait_remote_apply and pg_xlog_wait_remote_receive
> that can be used with pg_current_xlog_location() to wait until one or
> all replicas have caught up, or with LSNs from pg_stat_replication to
> (say) wait until all replicas have caught up with the most up-to-date one.
>
> I don't think these depend on anything BDR-specific

They do, however, require changes to Pg core. These aren't functions you can just borrow and add to an extension, they require additional changes to core to collect the data they use.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On 2013-05-30 15:54:01 +0800, Craig Ringer wrote:
> On 05/30/2013 03:10 PM, Craig Ringer wrote:
> > On 05/28/2013 07:52 PM, Fabien COELHO wrote:
> >> However I'm not sure that pg_stat_replication currently has the
> >> necessary information on either side to measure the lag (in time
> >> transactions, but how do I know when a transaction was committed? or
> >> number of transactions?).
> > The BDR codebase now has a handy function to report when a transaction
> > was committed, pg_get_transaction_committime(xid).
> >
> > It also adds pg_xlog_wait_remote_apply and pg_xlog_wait_remote_receive
> > that can be used with pg_current_xlog_location() to wait until one or
> > all replicas have caught up, or with LSNs from pg_stat_replication to
> > (say) wait until all replicas have caught up with the most up-to-date one.
> >
> > I don't think these depend on anything BDR-specific
> They do, however, require changes to Pg core. These aren't functions you
> can just borrow and add to an extension, they require additional changes
> to core to collect the data they use.

pg_xlog_wait_remote_receive() doesn't require changes afaics and should be easily packable as an extension. We might want to make it use the sync commit infrastructure at some point instead of essentially busy waiting, but...

'committs' - the mapping of xids to timestamps - certainly does though.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>> However I'm not sure that pg_stat_replication currently has the
>> necessary information on either side to measure the lag (in time
>> transactions, but how do I know when a transaction was committed? or
>> number of transactions?).
>
> The BDR codebase now has a handy function to report when a transaction
> was committed, pg_get_transaction_committime(xid).

This looks handy for monitoring a replication setup. It should really be in core...

Any plans? Or are there other ways to get this kind of information in core?

--
Fabien.
On 05/31/2013 03:41 PM, Fabien COELHO wrote:
>
>>> However I'm not sure that pg_stat_replication currently has the
>>> necessary information on either side to measure the lag (in time
>>> transactions, but how do I know when a transaction was committed? or
>>> number of transactions?).
>>
>> The BDR codebase now has a handy function to report when a transaction
>> was committed, pg_get_transaction_committime(xid).
>
> This looks handy for monitoring a replication setup.
> It should really be in core...
>
> Any plans? Or are there other ways to get this kind of information in
> core?

Yes, it's my understanding that the idea is to eventually get all the BDR functionality merged, piece by piece, including the commit time tracking feature.

pg_get_transaction_committime isn't trivial to just add to core because it requires a commit time to be recorded with commit records in the transaction logs, among other changes.

I don't know if Andres or any of the others involved are planning on trying to get this particular feature merged in 9.4, but I wouldn't be too surprised since (AFAIK) it's fairly self-contained and would be useful for monitoring streaming replication setups as well.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On 2013-06-09 17:50:13 +0800, Craig Ringer wrote:
> On 05/31/2013 03:41 PM, Fabien COELHO wrote:
> >
> >>> However I'm not sure that pg_stat_replication currently has the
> >>> necessary information on either side to measure the lag (in time
> >>> transactions, but how do I know when a transaction was committed? or
> >>> number of transactions?).
> >>
> >> The BDR codebase now has a handy function to report when a transaction
> >> was committed, pg_get_transaction_committime(xid).
> >
> > This looks handy for monitoring a replication setup.
> > It should really be in core...
> >
> > Any plans? Or are there other ways to get this kind of information in
> > core?
> pg_get_transaction_committime isn't trivial to just add to core because
> it requires a commit time to be recorded with commit records in the
> transaction logs, among other changes.

The commit records actually already have that information available (cf. xl_xact_commit(_compact) in xact.h), the problem is having a data structure which collects all that.

That's why the committs patch (written by Alvaro) added an SLRU mapping xids to timestamps. And yes, we want to submit that sometime.

The pg_xlog_wait_remote_apply(), pg_xlog_wait_remote_receive() functions however don't need any additional infrastructure, so I think those are easier and less controversial to add.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On 05/01/13 04:57, Fabien COELHO wrote:
>
> Add --throttle to pgbench
>
> Each client is throttled to the specified rate, which can be expressed in
> tps or in time (s, ms, us). Throttling is achieved by scheduling
> transactions along a Poisson distribution.
>
> This is an update of the previous proposal which fixes a typo in the sgml
> documentation.
>
> The use case of the option is to be able to generate a continuous gentle
> load for functional tests, eg in a practice session with students or for
> testing features on a laptop.

Why does this need two option formats (-H and --throttle)?

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
>> The use case of the option is to be able to generate a continuous gentle
>> load for functional tests, eg in a practice session with students or for
>> testing features on a laptop.
>
> Why does this need two option formats (-H and --throttle)?

On the latest version it is --rate and -R.

Because you may want to put something very readable and understandable in a script and like long options, or have to type it interactively every day in a terminal and like short ones. Most UNIX commands include both kinds.

--
Fabien.
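[Editorial illustration] The "one option, two spellings" convention defended above is what most option parsers support directly. pgbench itself is written in C, so the Python sketch below only illustrates the convention: the program name, help text, and function name are invented for the example, and only the `-R` / `--rate` spellings come from the thread.

```python
import argparse


def build_parser():
    """Build a toy parser where -R and --rate set the same value."""
    parser = argparse.ArgumentParser(prog="bench-sketch")
    parser.add_argument(
        "-R", "--rate",
        type=float,
        default=None,
        metavar="TPS",
        help="target transactions per second (unthrottled if omitted)",
    )
    return parser
```

In a script, `bench-sketch --rate=50` documents itself; at a terminal, `bench-sketch -R 50` is quicker to type. Both end up in the same `rate` attribute.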
On 06/19/13 14:34, Fabien COELHO wrote:
>
>>> The use case of the option is to be able to generate a continuous gentle
>>> load for functional tests, eg in a practice session with students or for
>>> testing features on a laptop.
>>
>> Why does this need two option formats (-H and --throttle)?
>
> On the latest version it is --rate and -R.
>
> Because you may want to put something very readable and understandable in
> a script and like long options, or have to type it interactively every day
> in a terminal and like short ones. Most UNIX commands include both kinds.
>

Would it make sense then to add long versions for all the other standard options too?

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
>> Because you may want to put something very readable and understandable in
>> a script and like long options, or have to type it interactively every day
>> in a terminal and like short ones. Most UNIX commands include both kinds.
>
> Would it make sense then to add long versions for all the other standard
> options too?

Yep. It is really a stylistic (pedantic?) matter. See for pgbench:

https://commitfest.postgresql.org/action/patch_view?id=1106

--
Fabien.