RFC: Industrial-strength logging (long message) - Mailing list pgsql-hackers
From | Tim Holloway |
---|---|
Subject | RFC: Industrial-strength logging (long message) |
Date | |
Msg-id | 3810FBDA.482F904A@southeast.net Whole thread Raw |
Responses |
Re: [HACKERS] RFC: Industrial-strength logging (long message)
Re: [HACKERS] RFC: Industrial-strength logging (long message) |
List | pgsql-hackers |
Request For Comments: Towards an industrial-strength logging facility 1999-10-19, Tim Holloway mtsinc@leading.net Introduction. PostgreSQL is a commercial-quality DBMS. However, one item generally found in commercial DBMS's that PostgreSQL has so far lacked has been a logging facility. Yes, it has a debugging facility that can spit out reams of useful information, but debugging is not logging - it has different goals and constraints. This, then, is an attempt to provide that missg item. What should a log look like? This depends. I like a console-style listing, as my needs are simple. Others would prefer that the log be itself a database. Happily, I think that what I have developed so far can be used for both. While it's perilous to attempt to be all things to all people, my experience in working with the Amiga was that simplicity doesn't have to mean rigidity or lack of ability. Preliminary design efforts have resulted in a plan that I think will satisfy the majority of DBA's. Time will tell. Design goals 1. Robustness. Adding logging should not cause the system to become unstable. 2. Performance. Unless you're IBM at least, logging is a means, not an end. The performance of the system must not be degraded. 3. Security. Since logging is often part of a security effort, it's only reasonable that the log itself be secure. As of this writing, security is that of the PostgreSQL backend and/or syslog facilities. 4. Routability. Preliminary design permits routing any or all events to multiple destinations, each of which is individually controllable as to format. Abuse of this power may impact 2), above, however. 5. Locale support. Not everyone's preferred language is English. Because each and every log message is fully configurable, and because care is given to formatting based on locale, the DBA can customize logging to the convenience of his or her own culture. I hope that those who benfit from this will keep my on the proper path. Implementation "Simple things should be simple and complex things should be possible" Alan Kay I've seen far too many systems where simple things were complex and complex things were simple and other variantions on that theme. I HOPE that's not what I'm producing. If I am, PLEASE LET ME KNOW!!! Although the same mechanism is at work at all times, the defaults are set to display just enough information to let you know that more is possible: Postgres [123] 900 - Logging configuration file "/usr/local/pgsql/data/postgresql.conf" was not found or denied access. Using default logging. Postgres [123] 101 - Server started Postgres [123] 102 - Server shutdown These messages are routed to stderr (if available) for the backend process AND to syslog (if available). If there are other worthy default channels, I'd like to know them. The Next Level The logging configuration file allows customizing of logging. At one extreme, it can be used to suppress ALL logging - even the default items listed above. At the other - suffice it to say that you can get VERY elaborate. There are 3 types of information in the logging configuration file (which may, but likely won't, be part of pg_hba.conf) Logging info is read at startup. There may exist signals that cause it to be reread, but not just yet. They are: 1. General log control. For example, suppression of high-demand activities BEFORE they get run, formatted and sent to the log subsystem. 2. Message format. Allows definition/override of message formats on a class (see below) and individual basis. This is both how formatting for database load and locale are done. Multiple message formats are supported! 3. Message routing. Allows definition of the destination(s) of log messages. Each destination (channel) can select which messages it will format and report. To avoid potential loss of critical info, any message not explicitly routed at least once gets reported on the default channel - stderr/syslog, unless otherwise configured. Message classes Implicit in the desire for logging into a database is the understanding that some types of messages may have identical formats but different content. To facilitate this (and to aid in locale support) each possible message has a unique identifier, and related messages (those which would route to the same table) are grouped into classes, identified by century, as in the http and other familiar protocols. Classes for PostgreSQL logging are not grouped by severity, however, but by their affinity for a given statistical table. Tentatively: 1xz - The PostgreSQL server 2xx - User-related information 3xx - Transaction information 4xx - EXPLAIN results (???) 9xx - General system alerts Right now, the following are considered likely candidates, subject to user feedback: server infoServer name, signal ID 101 - Server started 102 - Server shutdown 103 - Signal xxx received 104 - Server ABEND user sessionuserid, port or terminal ID, authentication scheme name (e.g. md5). session ID 201 - User xxxx connected via port/terminal xxxxxxxx authenticated by aaaaa 202 - User xxxx disconnected 203 - FORBIDDEN - connection denied for user xxxx via port/terminal xxxxxxxxxx rejected by aaaaaaa show commandsSession ID, command text 301 - SELECT text 302 - INSERT text 303 - UPDATE text 304 - DELETE text show resultssession ID, count or OID. primary/first/only table ID affected 401 - SUCCESS - nnn records retrieved 402 - SUCCESS - record inserted at OID 403 - SUCCESS - nnn records updated 404 - SUCCESS - nnn records deleted 405 - FORBIDDEN - action xxxxxx denied to user xxxx on table xxxxxxxx explainas below: 500 EXPLAIN transaction ID sequence cost rows bytes miscellaneousexplanatory text 900 - Logging configuration file "ffff" was not found or denied read access. Using default logging. 901 - Logging configuration file "ffff" could not be processed - invalid text at line nnn. 902 - User overrides non-existent message ID nnn 903 - Channel requests non-existent message ID nnn 904 - end of section starting on line nnn was not found 905 - start of section ending on line nnn was not found 906 - (message from logging configuration file) Multiple message format tables may be defined. Each of these overrides or disables one or more of of the messages listed above - or its "final" equivalent. Messages which aren't overridden display in their default (English text) mode. Because this could be VERY rude to a table loader, each channel must explicitly request which messages are acceptable (this also facilitates routing of message classes). The default channel catches unsolicited messages as a safeguard. To make it easier, both common message formats in the message format tables and their correspoonding solicitations in the channel definitions allow ranges to be defined. E.g., you can define a logfile format for messages 301-304 rather than having to replicate the same format for each. A brief example --- SUBJECT TO RADICAL CHANGE! --- ; One possible implementation of logging configuration: ; an SQL style - verb attribute(value[s]) might be better? ; <logging> <options> level = warning ; of INFO, NOTICE, DEBUG, WARNING, ERROR, or FATAL log directory = /var/log/postgresql startmessage = " This is a sample log configuration" ; output via message 906 endmessage = "Have a Nice Day" ; output via message 906 </options> <format name=info> 101 INFO "%n [%p] fing an" 102 INFO "%n [%p] ist zu Ende" </format> <format name=database> 201-203 INFO "%u %p" 301-304 INFO "'%2s'" ; quote with sql escapes </format> <channel> format name : info output : syslog( localost ) level : INFO solicit : 101-104, default </channel> <channel> format name : database-user timestamp: local file : user.log solicit : 201-203 </channel> <channel> format name : database-session file : session.log solicit : 301-304 </channel> ; *** The default message channel *** <channel> output : syslog( dba.mousetech.com ) solicit : 1**, 9**, default </channel> </logging> Apology Although this scheme may appear elaborate, the internal realization is fairly simple. I have far more concern that it may overwhelm someone who is new to the entire PostgreSQL system and is FAR more interested at that time in learning POSTGRES! The plus side is that it's possible to amass a library of mix-and-match blocks so that the more sordid details need not be recreated endlessly by every DBA in the world. Credit where it's due Asture observers may have noticed that the user-definable message format is a blatant ripoff from Apache. The concept of logging channels I lifted from bind, the DNS utility. Some folding, spindling, stapling and/or mutilation may have occurred in the process. Feedback Needed The details are still very much fluid, so your opinion counts!!!! What's good? What's bad? What can be improved, and what should be immediately hauled off to the nearest toxic waste disposal center? Especially of interest is what the shape of the config file should be. Is the the pseudo-HTML format shown good? Would an SQL statement form be preferable? Maybe something like LISP or C? Or something entirely different? Please tell me! All I ask is that it be YACC-parseable. Email your thoughts to me at mtsinc@leading.net, subject: PostgreSQL logging. Results will be posted to pgsql-admin and pgsql-hackers mailing lists. Thank You. Tim Holloway MTS Associates, Inc.
pgsql-hackers by date: