RFC: Industrial-strength logging (long message) - Mailing list pgsql-hackers

From Tim Holloway
Subject RFC: Industrial-strength logging (long message)
Date
Msg-id 3810FBDA.482F904A@southeast.net
Whole thread Raw
Responses Re: [HACKERS] RFC: Industrial-strength logging (long message)  ("D'Arcy" "J.M." Cain <darcy@druid.net>)
Re: [HACKERS] RFC: Industrial-strength logging (long message)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Request For Comments:  Towards an industrial-strength
logging facility

1999-10-19, Tim Holloway mtsinc@leading.net

Introduction.

PostgreSQL is a commercial-quality DBMS. However, one item
generally found in commercial DBMS's that
PostgreSQL has so far lacked has been a logging facility.
Yes, it has a debugging facility that can
spit out reams of useful information, but debugging is not
logging - it has different goals and constraints. This,
then, is an attempt to provide that missg item.

What should a log look like?

This depends. I like a console-style listing, as my needs
are simple. Others would prefer that the
log be itself a database. Happily, I think that what I have
developed so far can be used for both.
While it's perilous to attempt to be all things to all
people, my experience in working with the Amiga was that
simplicity doesn't have to mean rigidity or lack of ability.
Preliminary design efforts have resulted in a plan that I
think will satisfy the majority of DBA's. Time will tell.

Design goals

1. Robustness. Adding logging should not cause the system to
become unstable.
2. Performance. Unless you're IBM at least, logging is a
means, not an end. The performance of the system
must not be degraded.
3. Security. Since logging is often part of a security
effort, it's only reasonable that the log itself
be secure. As of this writing, security is that of the
PostgreSQL backend and/or syslog facilities.
4. Routability. Preliminary design permits routing any or
all events to multiple destinations, each of which is
individually controllable as to format. Abuse of this power
may impact 2), above, however.
5. Locale support.  Not everyone's preferred language is
English. Because each and every log message is fully
configurable, and because care is given to formatting based
on locale, the DBA can customize logging to the convenience
of his or her own culture. I hope that those who benfit from
this will keep my on the proper path.

Implementation

"Simple things should be simple and complex things should be
possible"
              Alan Kay

I've seen far too many systems where simple things were
complex and complex things were simple and other
variantions on that theme. I HOPE that's not what I'm
producing. If I am, PLEASE LET ME KNOW!!!

Although the same mechanism is at work at all times, the
defaults are set to display just enough information to let
you know that more is possible:

Postgres [123] 900 - Logging configuration file
"/usr/local/pgsql/data/postgresql.conf" was not found or
denied access. Using default logging.
Postgres [123] 101 - Server started
Postgres [123] 102 - Server shutdown

These messages are routed to stderr (if available) for the
backend process AND to syslog (if available).
If there are other worthy default channels, I'd like to know
them.

The Next Level

The logging configuration file allows customizing of
logging. At one extreme, it can be used to suppress ALL
logging - even the default items listed above. At the other
- suffice it to say that you can get VERY elaborate.

There are 3 types of information in the logging
configuration file (which may, but likely won't, be part of
pg_hba.conf) Logging info is read at startup. There may
exist signals that cause it to be reread, but not just yet.

They are:

1. General log control. For example, suppression of
high-demand activities BEFORE they get run, formatted and
sent to the log subsystem.

2. Message format. Allows definition/override of message
formats on a class (see below) and individual basis. This is
both how formatting for database load and locale are done.
Multiple message formats are supported!

3. Message routing. Allows definition of the destination(s)
of log messages. Each destination (channel) can select which
messages it will format and report. To avoid potential loss
of critical info, any message not explicitly routed at least
once gets reported on the default channel - stderr/syslog,
unless otherwise configured.

Message classes

Implicit in the desire for logging into a database is the
understanding that some types of messages may have identical
formats but different content. To facilitate this (and to
aid in locale support) each possible message has a unique
identifier, and related messages (those which would route to
the same table)  are grouped into classes, identified by
century, as in the http and other familiar protocols.

Classes for PostgreSQL logging are not grouped by severity,
however, but by their affinity for a given
statistical table. Tentatively:

1xz - The PostgreSQL server
2xx - User-related information
3xx - Transaction information
4xx - EXPLAIN results (???)
9xx - General system alerts

Right now, the following are considered likely candidates,
subject to user feedback:

server infoServer name, signal ID
101 - Server started
102 - Server shutdown
103 - Signal xxx received
104 - Server ABEND

user sessionuserid, port or terminal ID, authentication scheme name
(e.g. md5). session ID
201 - User xxxx connected via port/terminal xxxxxxxx
authenticated by aaaaa
202 - User xxxx disconnected
203 - FORBIDDEN - connection denied for user xxxx via
port/terminal xxxxxxxxxx rejected by aaaaaaa

show commandsSession ID, command text
301 - SELECT text
302 - INSERT text
303 - UPDATE text
304 - DELETE text

show resultssession ID, count or OID. primary/first/only table ID
affected
401 - SUCCESS - nnn records retrieved
402 - SUCCESS - record inserted at OID
403 - SUCCESS - nnn records updated
404 - SUCCESS - nnn records deleted
405 - FORBIDDEN - action xxxxxx denied to user xxxx on table
xxxxxxxx

explainas below:
500 EXPLAIN transaction ID sequence cost rows bytes

miscellaneousexplanatory text
900 - Logging configuration file "ffff" was not found or
denied read access. Using default logging.
901 - Logging configuration file "ffff" could not be
processed - invalid text at line nnn.
902 - User overrides non-existent message ID nnn
903 - Channel requests non-existent message ID nnn
904 - end of section starting on line nnn was not found
905 - start of section ending on line nnn was not found
906 - (message from logging configuration file)

Multiple message format tables may be defined. Each of these
overrides or disables one or more of of the messages listed
above - or its "final" equivalent. Messages which aren't
overridden display in their default (English text) mode.
Because this could be VERY rude to a table loader, each
channel must explicitly request which messages are
acceptable (this also facilitates routing of message
classes). The default channel catches unsolicited messages
as a safeguard. To make it easier, both common message
formats in the message format tables and their
correspoonding solicitations in the channel definitions
allow ranges to be defined. E.g., you can define a logfile
format for messages 301-304 rather than having to replicate
the same format for each.

A brief example --- SUBJECT TO RADICAL CHANGE! ---

; One possible implementation of logging configuration:
; an SQL style - verb attribute(value[s]) might be better?
;
<logging>

<options>
level = warning ; of INFO, NOTICE, DEBUG, WARNING, ERROR, or
FATAL
log directory = /var/log/postgresql
startmessage = " This is a sample log configuration" ;
output via message 906
endmessage = "Have a Nice Day" ; output via message 906
</options>

<format name=info>
101 INFO "%n [%p] fing an" 
102 INFO "%n [%p] ist zu Ende" 
</format>

<format name=database>
201-203 INFO "%u %p"
301-304 INFO "'%2s'" ; quote with sql escapes
</format>

<channel>
format name : info
output : syslog( localost )
level : INFO
solicit : 101-104, default
</channel>

<channel>
format name : database-user
timestamp: local
file : user.log
solicit : 201-203
</channel>

<channel>
format name : database-session
file : session.log
solicit : 301-304
</channel>

; *** The default message channel ***
<channel>
output : syslog( dba.mousetech.com )
solicit : 1**, 9**, default
</channel>

</logging>

Apology

Although this scheme may appear elaborate, the internal
realization is fairly simple. I have far more concern that
it may overwhelm someone who is new to the entire PostgreSQL
system and is FAR more interested at that time in learning
POSTGRES! The plus side is that it's possible to amass a
library of mix-and-match blocks so that the more sordid
details need not be recreated endlessly by every DBA in the
world.

Credit where it's due

Asture observers may have noticed that the user-definable
message format is a blatant ripoff from Apache. The concept
of logging channels I lifted from bind, the DNS utility.
Some folding, spindling, stapling and/or mutilation may have
occurred in the process.

Feedback Needed

The details are still very much fluid, so your opinion
counts!!!! What's good? What's bad?
What can be improved, and what should be immediately hauled
off to the nearest toxic waste disposal center? Especially
of interest is what the shape of the config file should be.
Is the the pseudo-HTML format shown good? Would an SQL
statement form be preferable? Maybe something like LISP or
C? Or something entirely different? Please tell me! All I
ask is that it be YACC-parseable. Email your thoughts to me
at mtsinc@leading.net, subject: PostgreSQL logging. Results
will be posted to pgsql-admin and pgsql-hackers mailing
lists. Thank You.
  Tim Holloway    MTS Associates, Inc.


pgsql-hackers by date:

Previous
From: Bryan Ingram
Date:
Subject: Re: [HACKERS] postgres inode q's
Next
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] New psql startup banner