Thread: logging hook for database audit

logging hook for database audit

From
Itagaki Takahiro
Date:
Hello,

SQL logging is useful for database audit, but there is serveral problems:
 * A large amount of logs degrade performance, especially syslog is used. * Text or CSV logs instead of syslog could
improveperformance,   but DBAs favor syslog over text logs for integrated monitoring. * Error messages are buried in
SQLlogs. SQL logs will be only needed   at auditing, but errors should be detected as soon as possible. * Filtering
logsin syslog is *too late* because we cannot avoid   inter-process communication in syslog(), that is performance
bottleneck.

I think we'd better to support log filtering and splitting in PostgreSQL.
For the purpose, I'll submit to add "logging hook" in core, rather than
fixed features like log_min_messages-per-destination.
A logging hook and exporting logging functions
------------------------------------------------
The logging hook is very simple; just adding send_message_to_server_log_hook
in elog.c. In addition, Logging functions are exported so that logging
hook can reuse them in the hook module. For example, pg_write_textlog(),
pg_write_csvlog(), pg_write_stderr(), pg_write_syslog() etc.

A new field "message_format" is added to ErrorData structure.
It references unlocalized message formats passed to errmsg().
It is useful if we want to filter logs by log type.
i.e, we can write LOG-level messages to syslog but avoid
only SQL logs if the format matchs to a specific pattern.
A sample contrib module using logging hook
--------------------------------------------
A module using logging hook is provided as a contrib module.
Useful "fixed features" are provided by the module:
   per-destination filtering level:       <module>.textlog_min_messages = log       <module>.csvlog_min_messages = log
    <module>.syslog_min_messages = warning
 
   per-destination line prefix:       <module>.textlog_line_prefix = '[%t][%p] '       <module>.syslog_line_prefix =
''

I also have a plan to provide a feature like "log-type specific filter".
It can be used to exclude SQL logs from syslog, but I have not completed
the detailed design.


Do you think the feature is useful and acceptable?
Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



Re: logging hook for database audit

From
Tom Lane
Date:
Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> For the purpose, I'll submit to add "logging hook" in core, rather than
> fixed features like log_min_messages-per-destination.

The more functionality you put inside elog.c, the higher the probability
that it will fail when the going gets tough (out of memory, for instance),
leaving you with no idea of what went wrong.  I also dislike the idea
of having such critical behavior dependent on a loadable module, as that
introduces a whole set of failure modes of its own, particularly with
regard to early-startup behavior.

I think the proposal to put a hook here is an end-run to try to avoid
getting a detailed feature proposal past the criticism of pgsql-hackers,
and I don't like that either.

I can see the potential value of per-destination minimum message levels,
so maybe you should just propose that.  But this is a bad place for a
hook.
        regards, tom lane


Re: logging hook for database audit

From
Itagaki Takahiro
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I think the proposal to put a hook here is an end-run to try to avoid
> getting a detailed feature proposal past the criticism of pgsql-hackers,
> and I don't like that either.

I see. If there are bugs in logging module, no one can know research the cause.

> I can see the potential value of per-destination minimum message levels,
> so maybe you should just propose that.  But this is a bad place for a
> hook.

Features I tried to implement are following:
 * per-destination minimum message levels * per-destination log line prefix       (syslog has their own prefix, but
stderrhas no default prefix) * per-category minimum message levels       (don't write sql logs in syslog, but can write
otherLOGs)
 

I'll take a direction to make them into some pieces of core patches.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: logging hook for database audit

From
Magnus Hagander
Date:
>> I can see the potential value of per-destination minimum message  
>> levels,
>> so maybe you should just propose that.  But this is a bad place for a
>> hook.
>
> Features I tried to implement are following:
>
>  * per-destination minimum message levels
>  * per-destination log line prefix
>        (syslog has their own prefix, but stderr has no default prefix)
>  * per-category minimum message levels
>        (don't write sql logs in syslog, but can write other LOGs)
>
> I'll take a direction to make them into some pieces of core patches.

I have been working with josh tolley that changes some infrastructure  
around this, with one of the future goals to be able to implement just  
this kind of features. I need to merge some changes from josh, and  
will then try to post a wip patch as soon as possible. Please look at  
this one before you start working on this - probably theoverlap will  
be very large.

/Magnus



Re: logging hook for database audit

From
Itagaki Takahiro
Date:
Magnus Hagander <magnus@hagander.net> wrote:

> I have been working with josh tolley that changes some infrastructure  
> around this, with one of the future goals to be able to implement just  
> this kind of features. I need to merge some changes from josh, and  
> will then try to post a wip patch as soon as possible. Please look at  
> this one before you start working on this - probably theoverlap will  
> be very large.

Thanks, it will be helpful.

At the moment, my idea is adding special sqlcodes for some LOG messages.
   ERRCODE_SUCCESSFUL_AUTOVACUUM  = "00001"   ERRCODE_SUCCESSFUL_CHECKPOINTS = "00002"   ...   ereport(LOG,
errcode(ERRCODE_SUCCESSFUL_AUTOVACUUM),               errmsg("autovacuum messages")));
 

New GUC parameters xxx_category is a list of categories.
Only logs listed here is written for each log. The default
values are '*' (all categories). Messages without sqlcodes
are put into "others" category.
   {stderr|csvlog|syslog}_category = [ * | autovacuum | ... | others ]

This approach is less impact to the codes and just utilizes unused
sqlcodes of LOG or lower level messages. However, it has some limitations:
 * We cannot categorize ERROR or higher level messages   because they've used sqlcodes as original purposes. * The
numberof categories are restricted to 30 or so, because category   filtera are managed with bits32. Categorized
messagesshould use   sqlcodes between "00001" and "0000N". * We need to add sqlcodes for each LOGs. It would be hard
work,but   needed anyway if we will support any kinds of per-category log filters.
 

Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center