Re: database design question, new to postgresql - Mailing list pgsql-novice
From: Fei Liu
Subject: Re: database design question, new to postgresql
Date:
Msg-id: 46325C1C.7060108@aepnetworks.com
In response to: database design question, new to postgresql (Fei Liu <fei.liu@aepnetworks.com>)
Responses: Re: database design question, new to postgresql
List: pgsql-novice
Fei Liu wrote:
> Hello group, I need to design and develop a web reporting system to
> let users query/view syslog files on a unix host. For now, I am
> concentrating on the authentication file, which has user logon
> (success/failure) and logoff records. The log file is logrotated every
> week or so. My reporting system parses the log entries and puts the
> results into a postgresql database (I am proposing to use postgresql as
> the backend). Since this deals with a multi-year archive, I believe
> 'partitioning' is an ideal feature to handle this problem. So here is
> the design scheme:
>
> CREATE TABLE logon_success(
>     name varchar(32) not null,
>     srcip inet not null,
>     date date not null,
>     time time not null,
>     ...
> );
>
> CREATE TABLE logon_success_yy${year}mm${month}(
>     CHECK (date >= DATE '$year-$month-01' AND date < DATE '$next_year-$next_month-1')
> )
> INHERITS ($tname)
> ;
>
> As you can see from the sample code, I am using perl to dynamically
> generate child tables as I parse log files in a daily cron job
> script. Once the log file is analyzed and archived in the database, I
> have a simple web UI from which a sysadmin can select and view user logon
> events. I have built a sample framework and it works so far. Keep in
> mind, this reporting system is not limited to just user logons; it
> should also work with system events such as service failures/startups,
> hardware failures, etc.
>
> Now here are my questions:
> 1) Should I use a database to implement such a reporting system? Are
> there any alternative architectures or designs?
> 2) Is partitioning a good approach to speed up log query/view? The
> user comments on partitioning in the pgsql manual seem to indicate
> partitioning may be slower than a non-partitioned table under certain
> circumstances.
> 3) How do I avoid repetitive log entry scanning, given that my cron job
> script runs daily but logrotate runs weekly? This means that every time
> my script runs it will be parsing duplicate entries.
> 4) When parsing log files, it's quite possible that there are
> identical entries, for example when a user logs in really fast,
> resulting in 2 or more identical entries. In this case can I still use
> a primary key/index at all? If I can, how do I design the primary key
> or index to speed up queries?
> 5) What are the most glaring limitations and flaws in my design?
>
> Thank you for taking time to review and answer my questions! Let me
> know if I am not clear on any specific detail.
>
> Fei
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

Let me add one more question: what are the best approaches to analyze postgresql query performance, and how do I improve postgresql query performance?

Fei
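For reference, the monthly child-table generation that the quoted Perl template describes can be sketched in a few lines. This is only an illustration of the template's logic (the function name and the use of Python rather than Perl are my own; the table and column names come from the schema above), not the poster's actual script:

```python
def make_partition_ddl(parent: str, year: int, month: int) -> str:
    """Build the CREATE TABLE statement for one monthly child partition.

    Mirrors the interpolated template in the quoted mail: a CHECK
    constraint spanning [first of this month, first of next month),
    inheriting from the parent table.
    """
    # Roll the upper bound over into January of the next year when needed.
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    child = f"{parent}_yy{year:04d}mm{month:02d}"
    return (
        f"CREATE TABLE {child} (\n"
        f"    CHECK (date >= DATE '{year:04d}-{month:02d}-01'\n"
        f"       AND date <  DATE '{next_year:04d}-{next_month:02d}-01')\n"
        f") INHERITS ({parent});"
    )

print(make_partition_ddl("logon_success", 2007, 12))
```

The December case is the one worth testing explicitly, since the upper CHECK bound must land on January 1 of the following year for constraint exclusion to work across the year boundary.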
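Questions 3 and 4 interact: identical log lines can legitimately repeat, so a content hash alone cannot serve as a primary key, but a hash paired with an occurrence counter can, and the same composite key lets the daily cron skip entries already loaded from an earlier pass over the same not-yet-rotated file. A minimal sketch of that idea, assuming the `seen` set is persisted between runs (everything here, including the function name, is my own illustration, not code from the thread):

```python
import hashlib
from collections import Counter

def new_entries(lines, seen):
    """Yield (line, key) pairs for entries not loaded by an earlier pass.

    key is (sha1-of-line, occurrence-number), so two genuinely identical
    log lines still get distinct keys, and (digest, occurrence) can serve
    as a composite primary key in the database.  `seen` holds the keys of
    already-loaded entries and must be persisted between cron runs, e.g.
    pickled to disk or re-derived from the database.
    """
    counts = Counter()  # occurrences of each digest within *this* scan
    for line in lines:
        digest = hashlib.sha1(line.encode("utf-8")).hexdigest()
        counts[digest] += 1
        key = (digest, counts[digest])
        if key in seen:
            continue  # same entry at the same position: already loaded
        seen.add(key)
        yield line, key
```

Once logrotate starts a fresh file, `seen` can simply be reset, since the weekly rotation is what causes the daily job to re-read old entries in the first place.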