Thread: Analyzer for postgresql.log

Analyzer for postgresql.log

From
Antonio Fiol Bonnín
Date:
Attached is a little AWK script that you may find of some use.

It takes as standard input a file in the format of the PostgreSQL log
(debug level 2; tested with PostgreSQL 7.0.3).

On standard output you get, for every executed query, its execution time
in milliseconds and a mangled form of the query.

Something like

13;select * from my_table where field=''
15;select * from other_table where id= and test>

All numbers are removed from the queries, as are the contents between
single quotes.
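For illustration, the two substitutions can be reproduced in Python on a made-up query (the sample query below is not from the thread; the patterns are the ones the script uses):

```python
import re

# Reproduce the script's mangling on a made-up query: numbers (including
# "m"-prefixed and negative ones) are stripped first, then single-quoted
# literals are emptied, so identical query shapes compare equal.
query = "select * from my_table where field='abc' and id=123 and x>-4.5"
query = re.sub(r"[m-]?([0-9]+\.?|\.[0-9])[0-9]*", "", query)
query = re.sub(r"'[^']*'", "''", query)
print(query)  # → select * from my_table where field='' and id= and x>
```

Note that the number pass runs before the quote pass, as in the script.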

This allows me to sort by request and then either count them or add up
the times. That way I know what I need to optimize.
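The counting and adding-up can be sketched in Python on a few made-up "time;query" output lines (the shell one-liners the author actually uses appear at the end of the thread):

```python
from collections import Counter, defaultdict

# Aggregate made-up lines in the script's "time;query" output format:
# count how often each mangled query appears and sum its times.
rows = [
    "13;select * from t where id=",
    "15;select * from u",
    "2;select * from t where id=",
]
counts, totals = Counter(), defaultdict(int)
for row in rows:
    ms, q = row.split(";", 1)
    counts[q] += 1
    totals[q] += int(ms)
print(counts["select * from t where id="], totals["select * from t where id="])  # → 2 15
```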

The script is far from perfect. It just works for me ;-)

Antonio Fiol
#!/usr/bin/awk -f

# Copyright 2001 Antonio Fiol - W3ping S.A.
# E-mail: fiol@w3ping.com

#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.

#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.

#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA



# Return end - start in milliseconds.
# Timestamps look like YYMMDD.HH:MM:SS.mmm and are split on "." and ":".
function datediff ( start, end,    sf, ef, diff ) {
    split (start, sf, /\.|:/);
    split (end,   ef,   /\.|:/);
    diff= (ef[5]-sf[5]);
    diff+=(ef[4]-sf[4])*1000;
    diff+=(ef[3]-sf[3])*1000*60;
    diff+=(ef[2]-sf[2])*1000*60*60;
    diff+=(ef[1]-sf[1])*1000*60*60*24;
    # Month changes are not handled
    return diff;
}

BEGIN {
    OFS=";"
}
# Remember the start-of-transaction timestamp, keyed by backend PID ($2).
/StartTransactionCommand/ {
    start[$2]=$1;
}
# Mangle the query so that identical query shapes compare equal,
# then remember it for this backend.
/query:/ {
    record=$0;
    gsub(/.*query: /,"",record);
    gsub(/[m-]?([0-9]+\.?|\.[0-9])[0-9]*/,"",record); # Numbers like 123, -123.4 or m123
    gsub(/'[^']*'/,"''",record);
    gsub(/ *;.*$/,"",record); # Hope the query was not like query1;query2; -- we would delete query2 :-(
    query[$2]=record;
}
# On commit, print the elapsed milliseconds and the mangled query.
/CommitTransactionCommand/ {
    if($2 in start) {
      print datediff(start[$2], $1), "\""query[$2]"\"";
      fflush();
    }
    delete start[$2];
    delete query[$2];
}

Re: Analyzer for postgresql.log

From
Tielman J de Villiers
Date:
Thank you,

I will be using this in a cronjob to act kind of as the "analog" web log
analyser for my postgres logs ...

Just one question -- the "list of times" (e.g. 13, 15): what do they refer
to? (Mine all show "0".)

Regards

Tielman J de Villiers
BondNet Pty Ltd

-----Original Message-----
From: Antonio Fiol Bonnín [mailto:fiol@w3ping.com]
Sent: Monday, December 10, 2001 2:37 PM
To: PostgreSQL General Mailing list
Subject: [GENERAL] Analyzer for postgresql.log



Re: Analyzer for postgresql.log

From
Antonio Fiol Bonnín
Date:
Hi Tielman,

The times I refer to are the execution times of the queries.

In my log file there is a line containing the start-of-statement
timestamp, another containing a timestamp and the query, and finally one
containing a timestamp when the statement finished.

The timestamp is formatted like 011210.18:04:44.123 for the date and
time I was writing this e-mail.

I subtract the start timestamp from the end timestamp. The resulting
value, in milliseconds (see the script for details), is printed along
with the query.
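The arithmetic can be sketched in Python with the same field weights as the script's datediff() (the second timestamp below is made up; month changes are ignored, exactly as in the script):

```python
import re

def datediff(start, end):
    # Timestamps look like YYMMDD.HH:MM:SS.mmm; split on "." and ":" and
    # weight the fields in milliseconds, as datediff() in the script does.
    s = [int(x) for x in re.split(r"[.:]", start)]
    e = [int(x) for x in re.split(r"[.:]", end)]
    weights = [86400000, 3600000, 60000, 1000, 1]  # day, hour, min, sec, ms
    return sum((ei - si) * w for si, ei, w in zip(s, e, weights))

print(datediff("011210.18:04:44.123", "011210.18:04:44.136"))  # → 13
```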

If your queries take 0 ms then you will get

0;begin

for example.


As I said, "It works for me.", which means that I took my very own log,
read it, and built a dumb reader for that specific log.

The script is not very well commented. Hmmm... it is not commented at all
(sorry), but hopefully it should not be too hard to understand.

If you are not using PostgreSQL 7.0.3, the log file format may have
changed, and you will need to adapt the script to your needs. In
particular, if the timestamp format has changed, my script may well
understand absolutely nothing.

BTW, if a number in a query has the form "m1234" (yes, that is an "m"),
it will be treated as a number, as if it were "-1234". I needed that.
Taking that behaviour out of the script is just a matter of removing the
"m" from the regex.

I use the PID (which is shown right after the timestamp, in square
brackets, on my log file) to track which backend executes which query.
That allows me to know which actual query a given beginning and end match.
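The per-PID bookkeeping can be sketched in Python on made-up log lines (the exact line layout is an assumption; all that matters, as in the script, is that the timestamp comes first and the bracketed PID second):

```python
# Pair StartTransactionCommand/CommitTransactionCommand per backend PID,
# as the AWK script does. The log lines below are made up; the assumed
# layout is "timestamp [pid] message".
lines = [
    "011210.18:04:44.123 [1234] StartTransactionCommand",
    "011210.18:04:44.125 [1234] query: select 1",
    "011210.18:04:44.136 [1234] CommitTransactionCommand",
]
start, query, pairs = {}, {}, []
for line in lines:
    ts, pid, rest = line.split(None, 2)
    if rest.startswith("StartTransactionCommand"):
        start[pid] = ts
    elif rest.startswith("query:"):
        query[pid] = rest[len("query: "):]
    elif rest.startswith("CommitTransactionCommand") and pid in start:
        pairs.append((pid, start.pop(pid), ts, query.pop(pid, "")))

print(pairs)  # one (pid, start, end, query) tuple per committed transaction
```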

cat complaints > /dev/null, but I promise to try to help you get it
working if you send me a piece of your log file (a few statements should
be enough).

Yours,

Antonio Fiol

P.S. Here are three lines I find particularly useful, once the script works:

# Sort by request (needed for the other lines)
sort -t ';' -k 2 <post.analysis > post.analysis.sorted
# Count every request (requires a sorted input)
cut -f 2 -d ';' post.analysis.sorted | uniq -c | sort -r -n > post.analysis.bycount
# Add up the times used by every request type (also requires a sorted input)
awk 'BEGIN { FS=";" }
     { if (q == $2) s += $1;
       else { if (NR > 1) print s, q; s = $1; q = $2 } }
     END { if (NR > 0) print s, q }' < post.analysis.sorted | sort -r -n > post.analysis.bytime

Have fun!







Re: Analyzer for postgresql.log

From
Justin Clift
Date:
Hi,

You might also like to check out "PostgreSQL Log Analyzer v1.1" by
Gilles Darold :

http://www.samse.fr/GPL/pg_analyzer/sample/

:)

Regards and best wishes,

Justin Clift



Re: Analyzer for postgresql.log

From
Antonio Fiol Bonnín
Date:
A quick comparison follows; sorry about my /partial/ point of view.
Gilles Darold is on BCC so that he can also give his opinion.

His is very nice, but the main point of mine is completely different: I
focus on TIMING the different types of queries, so that I can add the
times up and know which ones are most important to optimize.

I think his parsing of requests is far better than mine, however. I just
did not do any better because I did not need any better ;-)

He is generating nice graphs. I just output some bare numbers.

He cares about when queries are executed; I do not care about that at all.

Mine is completely focused on later optimization; his focuses on reporting.

He is targeting a wider audience; I wrote mine for myself. (Yes, I am
probably more selfish ;-)

Note that I am not saying that mine is better. It is simply different.

Gilles: You will be able to find the AWK script on PostgreSQL archives.
If you wish me to send it to you directly, please e-mail me.

Gilles (again): If you want to respond to the whole PostgreSQL-general
list, you may need to subscribe. And sorry for abruptly including you in
this mailing-list thread. Your address is now on BCC, so you can keep it
protected from others if you wish.

Gilles (once again): If you feel that the script I wrote fits into your
package, it is under the GPL. Just include it ;-)

Justin: Thank you for showing other possibilities. For web log analysis,
Gilles' software is definitely the choice.


Have a nice day!

Antonio Fiol


