Thread: PATCH: pgbench - aggregation of info written into log

PATCH: pgbench - aggregation of info written into log

From
"Tomas Vondra"
Date:
Hi,

this patch adds support for aggregation of info written into the log.
Instead of info about each transaction, a summary for time intervals (with
custom length) is written into the log. All you need to do is add "-A
seconds", e.g.

  $ pgbench -T 3600 -A 10 -l db

which will produce log with 10-second summaries, containing interval start
timestamp, number of transactions, sum of latencies, sum of 2nd power of
latencies, min and max latency (it's done this way to allow handling of
multiple logs produced by "-j" option).

This patch is a bit less polished (and more complex) than the other
pgbench patch I've sent a while back, and I'm not sure how to handle the
Windows branch. That needs to be fixed during the commit fest.

kind regards
Tomas
Attachment

Re: PATCH: pgbench - aggregation of info written into log

From
"Tomas Vondra"
Date:
On 24 Srpen 2012, 23:25, Tomas Vondra wrote:
> Hi,
>
> this patch adds support for aggregation of info written into the log.
> Instead of info about each transaction, a summary for time intervals (with
> custom length) is written into the log. All you need to do is add "-A
> seconds", e.g.
>
>   $ pgbench -T 3600 -A 10 -l db
>
> which will produce log with 10-second summaries, containing interval start
> timestamp, number of transactions, sum of latencies, sum of 2nd power of
> latencies, min and max latency (it's done this way to allow handling of
> multiple logs produced by "-j" option).

I've forgot to mention that just like the other patch, this is meant for
cases when you really don't need the raw per-transaction info, but
per-second summary is just enough. This is helpful especially when the
test produces a lot of transaction, thus huge log files etc.

So you can either log all the transactions and then post-process the huge
files, or completely skip that and just log the summaries.

Tomas




Re: PATCH: pgbench - aggregation of info written into log

From
Robert Haas
Date:
On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> This patch is a bit less polished (and more complex) than the other
> pgbench patch I've sent a while back, and I'm not sure how to handle the
> Windows branch. That needs to be fixed during the commit fest.

What's the problem with the Windows branch?

Could you un-cuddle your brances to follow the PostgreSQL style, omit
braces around single-statement blocks, and remove the spurious
whitespace changes?

Instead of having both use_log_agg and naggseconds, I think you can
get by with just having a single variable that is zero if aggregation
is not in use and is otherwise the aggregation period.  On a related
note, you can't specify -A without an associated value, so there is no
point in documenting a "default".  As with your other patch, I suggest
using a long option name like --latency-aggregate-interval and then
making the name of the variable in the code match the option name,
with s/-/_/g, for clarity.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PATCH: pgbench - aggregation of info written into log

From
"Tomas Vondra"
Date:
On 30 Srpen 2012, 18:02, Robert Haas wrote:
> On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> This patch is a bit less polished (and more complex) than the other
>> pgbench patch I've sent a while back, and I'm not sure how to handle the
>> Windows branch. That needs to be fixed during the commit fest.
>
> What's the problem with the Windows branch?

Well, there are comments about how timestamp does not work on Windows
(filling 0), and I'm not sure how that affects the patch (e.g. determining
the aggregation interval). I have no Windows workstation available so I
can't actually try that.

> Could you un-cuddle your brances to follow the PostgreSQL style, omit
> braces around single-statement blocks, and remove the spurious
> whitespace changes?

OK, will do.

> Instead of having both use_log_agg and naggseconds, I think you can
> get by with just having a single variable that is zero if aggregation
> is not in use and is otherwise the aggregation period.  On a related
> note, you can't specify -A without an associated value, so there is no
> point in documenting a "default".  As with your other patch, I suggest
> using a long option name like --latency-aggregate-interval and then
> making the name of the variable in the code match the option name,
> with s/-/_/g, for clarity.

Why --latency-aggregate-interval? It has nothing to do with latencies,
it's merely number of transactions per interval.

Tomas




Re: PATCH: pgbench - aggregation of info written into log

From
"Tomas Vondra"
Date:
On 30 Srpen 2012, 23:47, Robert Haas wrote:
> On Thu, Aug 30, 2012 at 3:25 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> On 30 Srpen 2012, 18:02, Robert Haas wrote:
>>> On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>>> This patch is a bit less polished (and more complex) than the other
>>>> pgbench patch I've sent a while back, and I'm not sure how to handle
>>>> the
>>>> Windows branch. That needs to be fixed during the commit fest.
>>>
>>> What's the problem with the Windows branch?
>>
>> Well, there are comments about how timestamp does not work on Windows
>> (filling 0), and I'm not sure how that affects the patch (e.g.
>> determining
>> the aggregation interval). I have no Windows workstation available so I
>> can't actually try that.
>
> Hmm.  That seems like it might be something we need to fix first...
>
>>> Could you un-cuddle your brances to follow the PostgreSQL style, omit
>>> braces around single-statement blocks, and remove the spurious
>>> whitespace changes?
>>
>> OK, will do.
>>
>>> Instead of having both use_log_agg and naggseconds, I think you can
>>> get by with just having a single variable that is zero if aggregation
>>> is not in use and is otherwise the aggregation period.  On a related
>>> note, you can't specify -A without an associated value, so there is no
>>> point in documenting a "default".  As with your other patch, I suggest
>>> using a long option name like --latency-aggregate-interval and then
>>> making the name of the variable in the code match the option name,
>>> with s/-/_/g, for clarity.
>>
>> Why --latency-aggregate-interval? It has nothing to do with latencies,
>> it's merely number of transactions per interval.
>
> Oh, I thought it was modifying the behavior of -l.

It does, but AFAIK the "-l" means logging. I suppose
"--aggregate-interval" would be a good option name, I don't see a reason
to put there the additional word when there are other aggregated values
(e.g. num of transactions).

T.




Re: PATCH: pgbench - aggregation of info written into log

From
Robert Haas
Date:
On Thu, Aug 30, 2012 at 3:25 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 30 Srpen 2012, 18:02, Robert Haas wrote:
>> On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>> This patch is a bit less polished (and more complex) than the other
>>> pgbench patch I've sent a while back, and I'm not sure how to handle the
>>> Windows branch. That needs to be fixed during the commit fest.
>>
>> What's the problem with the Windows branch?
>
> Well, there are comments about how timestamp does not work on Windows
> (filling 0), and I'm not sure how that affects the patch (e.g. determining
> the aggregation interval). I have no Windows workstation available so I
> can't actually try that.

Hmm.  That seems like it might be something we need to fix first...

>> Could you un-cuddle your brances to follow the PostgreSQL style, omit
>> braces around single-statement blocks, and remove the spurious
>> whitespace changes?
>
> OK, will do.
>
>> Instead of having both use_log_agg and naggseconds, I think you can
>> get by with just having a single variable that is zero if aggregation
>> is not in use and is otherwise the aggregation period.  On a related
>> note, you can't specify -A without an associated value, so there is no
>> point in documenting a "default".  As with your other patch, I suggest
>> using a long option name like --latency-aggregate-interval and then
>> making the name of the variable in the code match the option name,
>> with s/-/_/g, for clarity.
>
> Why --latency-aggregate-interval? It has nothing to do with latencies,
> it's merely number of transactions per interval.

Oh, I thought it was modifying the behavior of -l.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PATCH: pgbench - aggregation of info written into log

From
Robert Haas
Date:
On Thu, Aug 30, 2012 at 5:55 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> It does, but AFAIK the "-l" means logging. I suppose
> "--aggregate-interval" would be a good option name, I don't see a reason
> to put there the additional word when there are other aggregated values
> (e.g. num of transactions).

Oh, I was thinking that the l was for latency, but maybe not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PATCH: pgbench - aggregation of info written into log

From
Tomas Vondra
Date:
Dne 30.08.2012 18:02, Robert Haas napsal:
> On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> This patch is a bit less polished (and more complex) than the other
>> pgbench patch I've sent a while back, and I'm not sure how to handle
>> the
>> Windows branch. That needs to be fixed during the commit fest.
>
> What's the problem with the Windows branch?
>
> Could you un-cuddle your brances to follow the PostgreSQL style, omit
> braces around single-statement blocks, and remove the spurious
> whitespace changes?

Done, or at least I don't see other formatting issues. Let me know if I
missed something.

> Instead of having both use_log_agg and naggseconds, I think you can
> get by with just having a single variable that is zero if aggregation
> is not in use and is otherwise the aggregation period.  On a related
> note, you can't specify -A without an associated value, so there is
> no
> point in documenting a "default".  As with your other patch, I
> suggest
> using a long option name like --latency-aggregate-interval and then
> making the name of the variable in the code match the option name,
> with s/-/_/g, for clarity.

Fixed. I've kept use_log_agg only and I've removed the default. Also
I've added one more check (that the total duration is a multiple of
the aggregation interval).

And just as with the sampling patch, I believe the "-l" should not be
enabled by default, but required. But if more people ask to enable it
whenever the aggregation or sampling is used, I'm fine with it.

Tomas
Attachment

Re: PATCH: pgbench - aggregation of info written into log

From
Jeff Janes
Date:
On Sun, Sep 2, 2012 at 3:46 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>
> Fixed. I've kept use_log_agg only and I've removed the default. Also
> I've added one more check (that the total duration is a multiple of
> the aggregation interval).

Hi Tomas,

In the docs, -l is an option, not an application:
"<application>-l</application>"

Also, the docs still refer to <option>-A</option>, rather than to
<option>--aggregate-interval</option>, in a few places.

I think a section in the docs that points out that transactions run by
specifying mulitple -f will be mingled together into an interval would
be a good idea, as that could easily be overlooked (as I did earlier).

The docs do not mention anywhere what the units for the latencies are.

I think the format of the logfile should somehow make it unmistakably
different from the regular -l logfile, for example by prefixing every
line with "Aggregate: ".   Or maybe there is some other solution, but
I think that having two log formats, both of which consist of nothing
but 6 integers, but yet mean very different things, is a recipe for
confusion.

Is it unavoidable that the interval be an integral number of seconds?
I found the units in the original code confusing, so maybe there is
some reason it needs to be like that that I don't understand yet.
I'll look into it some more.

Thanks,

Jeff



Re: PATCH: pgbench - aggregation of info written into log

From
Tomas Vondra
Date:
On 3.9.2012 01:28, Jeff Janes wrote:
> On Sun, Sep 2, 2012 at 3:46 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>
>> Fixed. I've kept use_log_agg only and I've removed the default. Also
>> I've added one more check (that the total duration is a multiple of
>> the aggregation interval).
>
> Hi Tomas,
>
> In the docs, -l is an option, not an application:
> "<application>-l</application>"
>
> Also, the docs still refer to <option>-A</option>, rather than to
> <option>--aggregate-interval</option>, in a few places.

Fixed.

> I think a section in the docs that points out that transactions run by
> specifying mulitple -f will be mingled together into an interval would
> be a good idea, as that could easily be overlooked (as I did earlier).

Fixed, added a paragraph mentioning this.

> The docs do not mention anywhere what the units for the latencies are.
>
> I think the format of the logfile should somehow make it unmistakably
> different from the regular -l logfile, for example by prefixing every
> line with "Aggregate: ".   Or maybe there is some other solution, but
> I think that having two log formats, both of which consist of nothing
> but 6 integers, but yet mean very different things, is a recipe for
> confusion.

Not sure how to do that, I'd say it's a responsibility of the user.

> Is it unavoidable that the interval be an integral number of seconds?
> I found the units in the original code confusing, so maybe there is
> some reason it needs to be like that that I don't understand yet.
> I'll look into it some more.

Not really, but I don't see a real use case for such intervals, so I'm
keeping the integers for now.

Also, I've realized the last interval may not be logged at all - I'll
take a look into this in the next version of the patch.

Tomas

Attachment

Re: PATCH: pgbench - aggregation of info written into log

From
Alvaro Herrera
Date:
Tomas Vondra wrote:

> Also, I've realized the last interval may not be logged at all - I'll
> take a look into this in the next version of the patch.

I didn't see any later version of this patch posted anywhere.  I guess
it'll have to wait until the next commitfest.  Please fix the remaining
issues and resubmit.

Thanks to Robert Haas, Jeff Janes and Pavel Stehule for the reviews.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services