Re: [JDBC] New driver logging proposal - Mailing list pgsql-jdbc

From Dave Cramer
Subject Re: [JDBC] New driver logging proposal
Date
Msg-id CADK3HHLLAKBPt6-hTBnN0vNO693eD+cP3LTLnh8coXsfpMF4PA@mail.gmail.com
Whole thread Raw
In response to [JDBC] New driver logging proposal  (Daniel Migowski <dmigowski@ikoffice.de>)
List pgsql-jdbc
Hi Daniel,

First thanks for your well thought out proposal. I'm not happy with the performance degradation of the new logging implementation and would welcome anything that would fix this. 
I would welcome a PR to fix the current situation.
  


On 8 April 2017 at 01:13, Daniel Migowski <dmigowski@ikoffice.de> wrote:

Hello,

 

this is my proposal for a new PostgreSQL JDBC Driver logging. The aim for this is to merge the efforts done in integrating common logging frameworks while keeping the performance we had before logging frameworks were introduced. Also this proposal should add the functionality to identify connections in the logging output, a feature that was lost when a switch to logging frameworks was made.

 

Current (and previous) state:

 

I assume that people who do logging already use a logging framework of their choice, either because they configured it themselves or by using the one provided by the container they use. In either way they know how to configure logging on the platform they chose.

 

Before logging framework support was introduced we had a single logging related Driver parameter called “loglevel”. This could be set to “0” for zero logging, to “1” for a log level corresponding to “DEBUG” in modern logging frameworks, and to “2” for “TRACE” level logging. There was (and seemingly isn’t) a need for higher severity log levels like “INFO”, “WARN”, “ERROR” or “FATAL” because events that would cause these are thrown as exceptions in the error case while warnings and infos are not generated at all from the driver. Logout was written to a LogWriter when this was set on the driver, which would in most cases be used to redirect logout to the logging framework one had chosen.

 

The newer versions of the driver allow the parameters “loggerLevel” and “loggerFile”. These choices are intended to interfere with the logging framework one uses because change the loglevel programmatically on the logging category “org.postgresql”. Also with “loggerFile” one is able to configure a special logfile for the logging category. The default value for

 

The current implementation has the side effect that creating a single connection will change the log level of ALL connections (loaded by the same ClassLoader to be specific) because the loggers are statically defined in the classes which is how loggers are used today. Also, although can be given as connection parameters, they are not connection specific at all. This might not be a problem when one uses only a single connection pool where all connections are created equally, but interferes when one has an application that occasionally has to fetch data from multiple data sources.

 

Also, the current implementation makes have use of TRACE level logging in all getXXX functions of PgResultSet. A typical line is this:

 

      LOGGER.log(Level.FINEST, "  getString columnIndex: {0}", columnIndex);

 

This innocent looking line should do no harm at a first glance, because if tracing is disabled the function should leave immediately and  no performance is lost. One minor problem is the usage of the columnIndex as parameter, which is autoboxed into an Integer while passed to the log function (involves a hidden call to Integer.valueOf(int), which even creates new Objects depending on how aggressive the VM is tuned and on the number of columns in the result set. More important is the implementation of the log function itself, which is relative fast when a good Logging Framework is used, and disastrously slow in many other cases like when Logback is used. In that case an expensive log event is created and passed to the really used logging framework just to be discarded there, but this can IMHO be considered an artefact of a badly configured framework (still throws a bad light on PostgreSQL JDBC performance).

 

Proposed state:

 

I believe the common case is that logging for the driver is disabled. For this I propose a Boolean configuration parameter “logging” that can be truly configured per connection. Except for a few places where no reference to a connection exists, this will silent at least all connection specific logging.

 

The log level and log destination should be configured by the normal means of the logging framework, exactly how it is done in every other library that provides some means of logging.

 

I saw from the discussion while introducing logging frameworks that there was demand for programmatic configuration of log levels and log destinations, but since the current implementation does that on the Driver level anyway, this should be done by calling static functions on it, not by giving connection parameters (this can be discussed). The reason that connection pools can be configured easier when these parameters are available to the connection is not valid, because either the connection pool is created in a container, and then logging is configurable in the same textual way anyway and the container user is used to it, or the connection pool is created programmatically, and then one is also able to configure the driver programmatically on a driver level. Therefore we can drop the parameters “loggerLevel” and “loggerFile”, replacing them with helper methods on Driver that just configure the java.util.Logging framework.

 

Finally it might be nice to be able to identify the connection that logs. I like the idea of ??? to encode it in the logging category. Also usually when logging is used the category is the package that logs. It is common to have different logging categories to be able to filter the log messages. To be able to combine both I propose a fixed set of logging categories (<id> is the connection id):

 

    org.postgresql.connect.<id>  ß Logs connecting and disconnecting events. Can log more or less depending on Level. Level INFO logs simple connection/disconnection. Level DEBUG also logs parameters given to connection. Level TRACE logs the whole connection setup including default parameter settings. This also logs WARN messages when connections are finalized without having been closed explicitly.

    org.postgresql.sql.<id>  ß Logs the SQL queries executed on the backend. Also logs Bind, Prepare and Execute for PreparedStatements. Level INFO logs all SQL here. Level DEBUG could also log all the parameters given to statements. Level TRACE could even log the returned results.

    org.postgresql.api.<id>  ß Logs all interaction with the driver through externally callable functions.

    org.postgresql.protocol.<id>  ß Logs on the protocol level. Should always use TRACE level.

    org.postgresql.pool.<id>  ß Logs connection pool related events. On Level INFO we could print simple pool usage like retrieving or releasing connections. Maybe there is useful logging on DEBUG level but I don’t know for now.  

   org.postgresql.XXX  ß Some more internal logging categories for places where we don’t have access to the connection itself. These will always log at TRACE level and are not called very often. I will remove potentially expensive logging in place called often like in static helper functions of BooleanTypeUtil.

 

The connection id could , depending on your use case, simply be an Integer used to identify the connection. This could be made optional and even configurable because maybe one wants to encode more information in it. I propose a Driver-level ConnectionIdFactory that by default just returns uses an incrementing id for the connection, but personally I’d like to have the database name in it also. Not only a ConnectionLevel Id might be used but also a DataSource id. Is also required to intentify for datasource specific logging.

 

At least in my application I found it useful to set the log level for all third party libraries to WARN, while would also effectively discard most logging from the JDBC driver even in the case where logging is disabled. With the above categories I now could specifically describe what I am interested in, and I believe I covered most use cases one would use logging for in it. It got SQL logging for the end user, protocol logging for the driver developers, API logging for the end users and application profilers, stuff to detect connection leaks and parameters for an administrator.    

 

Implementation details:

 

To achive all this I propose to replace most of the loggers currently used with a PgLogger implementation that wraps the usual. We could have a Null Logger here that does nothing, and I hope for the VM to simply drop all calls to it in compiled code completely when we won’t ever instantiate any other implementations of PgLogger, so this will allow for peak performance in the default case. This class will have log functions for each level so we spare the Level-argument in calls to the functions.

 

Other implementations of that class really do logging. Depending on if we want connection level logging we could have individual instances of the Loggers per connection and category or just by category.

 

Depending on the comments to this mail I might do the work.

 

Regards,

Daniel Migowski

 


pgsql-jdbc by date:

Previous
From: Daniel Migowski
Date:
Subject: [JDBC] New driver logging proposal
Next
From: Jorge Solórzano
Date:
Subject: Re: [JDBC] New driver logging proposal