Thread: system log audit/reporting and psql
Hello group,

I need to design and develop a web reporting system that lets users query and view syslog files on a Unix host. For now I am concentrating on the authentication log, which records user logon (success/failure) and logoff events, e.g.:

  2005-01-01 | 00:27:55 | firewood | ssh | Login Successful | None | local | user9819 | 192.168.1.31

The log file is rotated by logrotate every week or so. My reporting system parses the log entries and puts the results into a PostgreSQL database (I am proposing PostgreSQL as the backend). Since this deals with a multi-year archive, I believe partitioning is an ideal feature for this problem. Here is the design scheme:

  CREATE TABLE logon_success(
      name  varchar(32) not null,
      srcip inet not null,
      date  date not null,
      time  time not null,
      ...
  );

  CREATE TABLE logon_success_yy${year}mm${month}(
      CHECK (date >= DATE '$year-$month-01'
         AND date <  DATE '$next_year-$next_month-01')
  ) INHERITS ($tname);

As you can see from the sample code, I use Perl to generate the child tables dynamically as I parse log files in a daily cron job. Once a log file is analyzed and archived in the database, a simple web UI lets the sysadmin select and view user logon events. I have built a sample framework and it works so far. Keep in mind that this reporting system is not limited to user logons; it should also handle system events such as service failures/startups, hardware failures, etc.

My initial testing has not shown any significant difference between the partitioning approach and a plain (all entries in the master table) approach. My test was based on two artificial tables with 1700 records per day from 2004-02-01 to 2007-04-27, around 2 million entries, identical in both tables.
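Not part of the original post: the Perl templating described above could be sketched like this (shown here in Python for brevity; the table and column names are taken from the scheme above, everything else is made up). It emits the DDL for one monthly child table, taking care of the December-to-January rollover in the CHECK constraint:

```python
from datetime import date

def partition_ddl(parent: str, year: int, month: int) -> str:
    """Emit DDL for one monthly child of `parent`, mirroring the
    inheritance + CHECK-constraint scheme from the post."""
    start = date(year, month, 1)
    # First day of the following month (handles the December rollover).
    if month == 12:
        end = date(year + 1, 1, 1)
    else:
        end = date(year, month + 1, 1)
    child = f"{parent}_yy{year:04d}mm{month:02d}"
    return (
        f"CREATE TABLE {child}(\n"
        f"    CHECK (date >= DATE '{start}' AND date < DATE '{end}')\n"
        f") INHERITS ({parent});"
    )

print(partition_ddl("logon_success", 2005, 12))
```

Note that with this scheme the planner only skips non-matching children when constraint exclusion is enabled (constraint_exclusion in postgresql.conf on the 8.x series); without it every child is scanned regardless of the CHECK ranges.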
My test script:

  echo Testing database $t1 time based
  time psql -p 5583 netilla postgres << EOF
  select count(date) from $t1
  where date > '2005-03-01' and date < '2006-12-11';
  \q
  EOF

  echo Testing database $t2 time based
  time psql -p 5583 netilla postgres << EOF
  select count(date) from $t2
  where date > '2005-03-01' and date < '2006-12-11';
  \q
  EOF

Result:

  ./timing_test.sh
  Testing database logon_test time based
    count
  ---------
  1121472
  (1 row)
  0.00user 0.00system 0:02.92elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (0major+456minor)pagefaults 0swaps

  Testing database logon_test2 time based
    count
  ---------
  1121472
  (1 row)
  0.00user 0.00system 0:02.52elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (0major+456minor)pagefaults 0swaps

But the numbers are not stable, and logon_test2 (with partitioning) sometimes behaves worse than logon_test.

Now here are my questions:

1) Should I use a database to implement such a reporting system? Are there any alternative architectures or designs?

2) Is partitioning a good approach to speed up log queries? The user comments on partitioning in the PostgreSQL manual seem to indicate that a partitioned table may be slower than a non-partitioned one under certain circumstances.

3) How do I avoid scanning log entries repeatedly, given that my cron job runs daily but logrotate runs weekly? As it stands, my script will parse duplicate entries on every run.

4) When parsing log files, it is quite possible to encounter identical entries, for example when a user logs in twice in rapid succession, producing two or more identical records. In this case can I still use a primary key or index at all? If so, how should I design the primary key or index to speed up queries?

5) What are the most glaring limitations and flaws in my design?

6) What are the best approaches to analyzing PostgreSQL query performance, and how do I improve it?

Thank you for taking the time to review and answer my questions! Let me know if I am not clear on any specific detail.

Fei
Hi!

Fei Liu schrieb:
[...]
> Now here are my questions:

These are a lot of questions, and some of them are not related to pgsql-performance or even PostgreSQL. I'll try to answer some of them, because I'm currently experimenting with partitioned tables, too.

> 2) Is partitioning a good approach to speed up log queries? The user
> comments on partitioning in the PostgreSQL manual seem to indicate that
> a partitioned table may be slower than a non-partitioned one under
> certain circumstances.

You can look at table partitioning from several points of view. My experience with table partitioning under the aspect of "performance" is:

*) It is a benefit for large tables only, and the definition of "large" depends on your data. I did some testing for an application we are developing, and there it showed that table partitioning does not seem to make sense for tables with fewer than 10 million rows (perhaps even more).

*) The performance benefit depends on your queries. Some queries get a big improvement, but some queries might even run significantly slower.

*) Depending on how you set up your system, inserts can be much slower with partitioned tables (e.g. if you are using triggers to create your partitions automatically on demand).

> 3) How do I avoid scanning log entries repeatedly, given that my cron
> job runs daily but logrotate runs weekly? As it stands, my script will
> parse duplicate entries on every run.

This has nothing to do with Postgres, but I wrote something similar years ago. Here's what I did and what you could do: remember the last line of your logfile in some external file (or even the database). On the next run, read the logfile again line by line and skip all lines until you have found the line you saved on the last run.

* If you find the line that way, just start parsing the logfile beginning at the next line.

* If you cannot find your line and you reach EOF, start parsing again at the beginning of the logfile.
* If this is your first run and you don't have a line stored yet, start parsing at the beginning of the logfile.

When you are finished, you have to remember the last line of the logfile again at some place.

> 6) What are the best approaches to analyzing PostgreSQL query
> performance, and how do I improve it?

Here are some general recommendations for performance testing from my experience:

*) Test with real data. For table partitioning this means you have to create really large datasets for your tests to be useful. You should write a small program to generate your test data if possible, or use some other means to create your test database. You also need time: creating a test database and importing 100 million rows of test data will take several hours or even days!

*) Test with the real queries from your application! Testing with just a few simple standard queries will almost certainly not be sufficient to get the right numbers for the performance you will see in your application later on. Look at the thread "Query performance problems with partitioned tables" I started on pgsql-performance just yesterday to see what I mean!

*) Use EXPLAIN ANALYZE and look at the "cost" and "actual time" numbers it gives you. It will also show you the query plan PostgreSQL uses when executing your query. You sometimes might be surprised what is going on behind the scenes...

*) If you are just using a stopwatch to time your queries, be aware of other factors which might significantly influence your results: caching, other jobs running on the machine in parallel, cosmic rays, ...

*) Before running your tests you should always try to get to some well-defined starting point (this might even mean rebooting your server before each test), and you should always repeat each test several times and then calculate a mean value (and standard deviation, to see how "good" your results are...).
*) Document your test setup and procedure as well as your results (otherwise two days later you won't remember which test produced which result).

HTH

- andreas

-- 
Andreas Haumer         | mailto:andreas@xss.co.at
*x Software + Systeme  | http://www.xss.co.at/
Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0
A-1100 Vienna, Austria | Fax: +43-1-6060114-71