Thread: pgbench --latency-limit option
While playing with 9.5's pgbench, I ran into some strange behavior.

$ pgbench -p 11002 --rate 2 --latency-limit 1 -c 10 -T 10 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 10 s
number of transactions actually processed: 16
number of transactions skipped: 0 (0.000 %)
number of transactions above the 1.0 ms latency limit: 16 (100.000 %)
latency average: 5.518 ms
latency stddev: 1.834 ms
rate limit schedule lag: avg 0.694 (max 1.823) ms
tps = 1.599917 (including connections establishing)
tps = 1.600319 (excluding connections establishing)

From the pgbench manual:

<term><option>--latency-limit=</option><replaceable>limit</></term>
<listitem>
 <para>
  Transaction which last more than <replaceable>limit</> milliseconds
  are counted and reported separately, as <firstterm>late</>.
 </para>
 <para>
  When throttling is used (<option>--rate=...</>), transactions that
  lag behind schedule by more than <replaceable>limit</> ms, and thus
  have no hope of meeting the latency limit, are not sent to the server
  at all. They are counted and reported separately as
  <firstterm>skipped</>.

In my understanding, this says: any transaction that takes longer than the
duration specified by --latency-limit (in this case 1.0 ms) will not be
sent to the server.

In the case above, all 16 transactions were above the 1.0 ms latency limit:

number of transactions above the 1.0 ms latency limit: 16 (100.000 %)

So in this case I think the number of skipped transactions should be
16 (100%), rather than:

number of transactions skipped: 0 (0.000 %)

Am I missing something?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
On Tue, Dec 22, 2015 at 9:28 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
> While playing with 9.5's pgbench, I ran into some strange behavior.
>
> $ pgbench -p 11002 --rate 2 --latency-limit 1 -c 10 -T 10 test
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 1
> query mode: simple
> number of clients: 10
> number of threads: 1
> duration: 10 s
> number of transactions actually processed: 16
> number of transactions skipped: 0 (0.000 %)
> number of transactions above the 1.0 ms latency limit: 16 (100.000 %)
> latency average: 5.518 ms
> latency stddev: 1.834 ms
> rate limit schedule lag: avg 0.694 (max 1.823) ms
> tps = 1.599917 (including connections establishing)
> tps = 1.600319 (excluding connections establishing)
>
> From the pgbench manual:
>
> <term><option>--latency-limit=</option><replaceable>limit</></term>
> <listitem>
>  <para>
>   Transaction which last more than <replaceable>limit</> milliseconds
>   are counted and reported separately, as <firstterm>late</>.
>  </para>
>  <para>
>   When throttling is used (<option>--rate=...</>), transactions that
>   lag behind schedule by more than <replaceable>limit</> ms, and thus
>   have no hope of meeting the latency limit, are not sent to the server
>   at all. They are counted and reported separately as
>   <firstterm>skipped</>.
>
> In my understanding, this says: any transaction that takes longer than the
> duration specified by --latency-limit (in this case 1.0 ms) will not be
> sent to the server.

I don't think that's what it says.  There seem to be three points here:

1. If the transaction is sent to the server, we'll check whether it runs
for longer than the amount of time specified by the limit; if so, it will
be reported separately.  This is true with or without --rate.

2. If --rate is used, we'll calculate the latency for each statement
based on the time it was due to be sent, rather than the time it actually
got sent.  (This is further clarified in the documentation for --rate.)

3. If --rate is used and the server is so far behind that --latency-limit
cannot possibly be met, then we'll skip sending the query at all.

In your example, you've got 10 connections and are trying to run at 2
tps, so to avoid having to start skipping things you need transaction
response times to be <~ 5 ms.  The actual response time is ~5.5ms, so if
you ran the test for longer I think you would see some skips.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
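[Editor's note: to make the three points above concrete, here is a simplified sketch of the per-transaction decision under --rate and --latency-limit. It is not the actual pgbench source; the identifiers are illustrative, and the sample values in main() are rounded from the run shown earlier in the thread.]

#include <stdint.h>
#include <stdio.h>

/*
 * Simplified sketch of how a throttled pgbench transaction ends up in the
 * "skipped" or "late" bucket.  NOT the actual pgbench code; identifiers
 * are illustrative.  All times are in microseconds.
 */
typedef enum { TXN_SKIPPED, TXN_LATE, TXN_ON_TIME } txn_outcome;

static txn_outcome
classify_transaction(int64_t scheduled_us, int64_t start_us,
                     int64_t finish_us, int64_t latency_limit_us)
{
    /*
     * Point 3: if we are already more than the limit behind schedule
     * before sending anything, the limit cannot possibly be met, so the
     * transaction is never sent and is counted as "skipped".
     */
    if (start_us - scheduled_us > latency_limit_us)
        return TXN_SKIPPED;

    /*
     * Points 1 and 2: the transaction is sent; its latency is measured
     * from the *scheduled* start, and it is counted as "late" if that
     * exceeds the limit, even though it was fully processed.
     */
    if (finish_us - scheduled_us > latency_limit_us)
        return TXN_LATE;

    return TXN_ON_TIME;
}

int
main(void)
{
    /*
     * Rough numbers from the run in this thread: 1 ms limit, ~0.7 ms
     * schedule lag when the transaction starts, ~5.5 ms total latency
     * measured from the scheduled start.
     */
    txn_outcome o = classify_transaction(0, 700, 5518, 1000);

    printf("%s\n", o == TXN_SKIPPED ? "skipped"
                 : o == TXN_LATE    ? "late (but processed)"
                                    : "on time");
    return 0;
}

With those numbers every transaction comes out "late" while none comes out "skipped", which matches the counts in the report above.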
Hello Robert & Tatsuo,

Some paraphrasing and additional comments.

>> $ pgbench -p 11002 --rate 2 --latency-limit 1 -c 10 -T 10 test

You are targeting 2 tps over 10 connections, so that is about one
transaction every 5 seconds per connection, i.e. a target of about 20
transactions over the 10-second run. You want transaction latency below
1 *ms*.

>> number of transactions actually processed: 16

The stochastic process scheduled 16 transactions during the 10 seconds,
i.e. 1.6 tps. In the long run it should be close to 2 tps, if the
stochastic process does its job as expected.

>> number of transactions skipped: 0 (0.000 %)

All transactions were started (i.e. no transaction was skipped).

>> number of transactions above the 1.0 ms latency limit: 16 (100.000 %)

But none responded within 1 ms; they were all late.

>> latency average: 5.518 ms
>> latency stddev: 1.834 ms

Indeed, unlikely to be under 1 ms.

>> In my understanding, this says: any transaction that takes longer than the
>> duration specified by --latency-limit (in this case 1.0 ms) will not be
>> sent to the server.

We cannot know that a transaction will take longer than the limit before
running it.

> I don't think that's what it says.  There seem to be three points here:
>
> 1. If the transaction is sent to the server, we'll check whether it runs
> for longer than the amount of time specified by the limit; if so, it will
> be reported separately.  This is true with or without --rate.

Yes.

> 2. If --rate is used, we'll calculate the latency for each statement
> based on the time it was due to be sent, rather than the time it actually
> got sent.  (This is further clarified in the documentation for --rate.)

Yes. The idea is that the client (say a web server) wanted to send a
transaction at time t, but due to the load or whatever it may not have
been able to send it at that time, so it sends it later.

> 3. If --rate is used and the server is so far behind that --latency-limit
> cannot possibly be met, then we'll skip sending the query at all.

Yes. By the time the client finally gets to send the transaction, the
current time is already beyond schedule + limit, so there is no way to get
an answer in time; this simulates a client timeout, where the client gives
up waiting for an answer.

> In your example, you've got 10 connections and are trying to run at 2
> tps, so to avoid having to start skipping things you need transaction
> response times to be <~ 5 ms.  The actual response time is ~5.5ms, so if
> you ran the test for longer I think you would see some skips.

Probably no skips though, because the response time needed is below 5
*seconds*, not ms: 2 tps on 10 connections, 1 transaction every 5 seconds
for each connection.

--
Fabien.
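[Editor's note: as a back-of-the-envelope check on the arithmetic above, here is a tiny sketch deriving the per-client interval and the expected transaction count from the options used in the thread. The variable names are illustrative; this is not pgbench code.]

#include <stdio.h>

int
main(void)
{
    double rate_tps   = 2.0;    /* --rate */
    int    nclients   = 10;     /* -c     */
    double duration_s = 10.0;   /* -T     */

    /* Average interval between transactions, per client: 10 / 2 = 5 s. */
    double per_client_interval_s = nclients / rate_tps;

    /* Expected number of transactions over the whole run: 2 * 10 = 20. */
    double expected_txns = rate_tps * duration_s;

    /*
     * For a skip to occur, a client would have to be more than the 1 ms
     * latency limit behind its next scheduled start.  With ~5.5 ms
     * transactions spaced ~5 s apart per client, that is very unlikely,
     * hence "0 skipped" even though every transaction was "late".
     */
    printf("per-client interval: %.1f s\n", per_client_interval_s);
    printf("expected transactions in %.0f s: %.0f\n",
           duration_s, expected_txns);
    return 0;
}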
On Wed, Dec 23, 2015 at 9:52 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> In your example, you've got 10 connections and are trying to run at 2
>> tps, so to avoid having to start skipping things you need transaction
>> response times to be <~ 5 ms.  The actual response time is ~5.5ms, so if
>> you ran the test for longer I think you would see some skips.
>
> Probably no skips though, because the response time needed is below 5
> *seconds*, not ms: 2 tps on 10 connections, 1 transaction every 5 seconds
> for each connection.

Oops.  Right.  But why did this test only run 16 transactions in total
instead of 20?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
>> Probably no skips though, because the response time needed is below 5
>> *seconds*, not ms: 2 tps on 10 connections, 1 transaction every 5 seconds
>> for each connection.
>
> Oops.  Right.  But why did this test only run 16 transactions in total
> instead of 20?

Because the schedule is based on a stochastic process, transactions are
not sent regularly (that would induce patterns and is not representative
of real-life load) but randomly.

The long-term average is expected to converge to 2 tps, but on a short run
it may differ significantly.

--
Fabien.
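[Editor's note: to illustrate how much a single short run can deviate, here is a rough simulation of a Poisson-distributed schedule, built the textbook way from exponential inter-arrival times with mean 1/rate. It is not pgbench's scheduler; the names, seed, and use of rand() are purely illustrative.]

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Draw one exponentially distributed inter-arrival time, in seconds. */
static double
exp_interval(double rate_tps)
{
    /* Uniform in (0, 1], so log() never sees zero. */
    double u = (rand() + 1.0) / ((double) RAND_MAX + 1.0);
    return -log(u) / rate_tps;
}

int
main(void)
{
    const double rate_tps = 2.0;   /* --rate 2 */
    const double duration = 10.0;  /* -T 10    */
    const int    runs = 10;

    srand(12345);                  /* fixed seed: repeatable illustration */

    for (int r = 0; r < runs; r++)
    {
        double t = 0.0;
        int    count = 0;

        /* Count how many scheduled transactions fall inside the window. */
        while ((t += exp_interval(rate_tps)) <= duration)
            count++;

        printf("run %2d: %d transactions scheduled in %.0f s\n",
               r + 1, count, duration);
    }
    return 0;
}

The expected count is rate * duration = 20, but the standard deviation of a Poisson count is sqrt(20), roughly 4.5, so observing 16 transactions in a single 10-second run is unremarkable.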
On Wed, Dec 23, 2015 at 11:23 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>>> Probably no skips though, because the response time needed is below 5
>>> *seconds*, not ms: 2 tps on 10 connections, 1 transaction every 5
>>> seconds for each connection.
>>
>> Oops.  Right.  But why did this test only run 16 transactions in total
>> instead of 20?
>
> Because the schedule is based on a stochastic process, transactions are
> not sent regularly (that would induce patterns and is not representative
> of real-life load) but randomly.
>
> The long-term average is expected to converge to 2 tps, but on a short
> run it may differ significantly.

Hmm.  Is that documented somewhere?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
>> [...]
>> Because the schedule is based on a stochastic process, transactions are
>> not sent regularly (that would induce patterns and is not representative
>> of real-life load) but randomly.
>>
>> The long-term average is expected to converge to 2 tps, but on a short
>> run it may differ significantly.
>
> Hmm.  Is that documented somewhere?

Sure, see the --rate option in the pgbench documentation, which states:

"The rate is targeted by starting transactions along a Poisson-distributed
schedule time line."

The impact on the observed tps over a short run is only implied, though.

--
Fabien.