Thread: [JDBC] New driver logging proposal

[JDBC] New driver logging proposal

From: Daniel Migowski

Hello,

 

this is my proposal for a new PostgreSQL JDBC Driver logging. The aim is to keep the benefits of the work done in integrating common logging frameworks while regaining the performance we had before logging frameworks were introduced. This proposal should also add the ability to identify connections in the logging output, a feature that was lost when the switch to logging frameworks was made.

 

Current (and previous) state:

 

I assume that people who do logging already use a logging framework of their choice, either because they configured it themselves or because they use the one provided by their container. Either way, they know how to configure logging on the platform they chose.

 

Before logging framework support was introduced, we had a single logging-related Driver parameter called “loglevel”. It could be set to “0” for no logging, to “1” for a log level corresponding to “DEBUG” in modern logging frameworks, and to “2” for “TRACE”-level logging. There was (and seemingly still is) no need for higher-severity log levels like “INFO”, “WARN”, “ERROR” or “FATAL”, because the events that would produce them are thrown as exceptions in the error case, while warnings and infos are not generated by the driver at all. Log output was written to a LogWriter when one was set on the driver, which in most cases would be used to redirect the output to the logging framework one had chosen.

 

The newer versions of the driver accept the parameters “loggerLevel” and “loggerFile”. These options interfere with the logging framework one uses, because they change the log level programmatically on the logging category “org.postgresql”. With “loggerFile” one can also configure a dedicated log file for that category.

 

The current implementation has the side effect that creating a single connection changes the log level of ALL connections (loaded by the same ClassLoader, to be specific), because the loggers are statically defined in the classes, which is how loggers are used today. Also, although these parameters can be given per connection, they are not connection specific at all. This might not be a problem when one uses only a single connection pool in which all connections are created equally, but it interferes when an application occasionally has to fetch data from multiple data sources.

 

Also, the current implementation makes heavy use of TRACE-level logging in all the getXXX functions of PgResultSet. A typical line looks like this:

 

      LOGGER.log(Level.FINEST, "  getString columnIndex: {0}", columnIndex);

 

This innocent-looking line should do no harm at first glance, because if tracing is disabled the function should return immediately and no performance is lost. One minor problem is the use of columnIndex as a parameter: it is autoboxed into an Integer when passed to the log function (a hidden call to Integer.valueOf(int)), which may even create new objects depending on how aggressively the VM is tuned and on the number of columns in the result set. More important is the implementation of the log function itself, which is relatively fast when a good logging framework is used and disastrously slow in many other cases, for example when Logback is used. In that case an expensive log event is created and passed to the actual logging framework just to be discarded there. This can IMHO be considered an artefact of a badly configured framework, but it still throws a bad light on PostgreSQL JDBC performance.
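
For illustration only (this is not driver code, just the standard java.util.logging idiom): guarding the call keeps the disabled case down to a single level check, with no boxing of columnIndex and no log record handed to the underlying framework.

      import java.util.logging.Level;
      import java.util.logging.Logger;

      class GuardedLoggingSketch {
        private static final Logger LOGGER = Logger.getLogger("org.postgresql.example");

        static void logColumnAccess(int columnIndex) {
          // Integer.valueOf(columnIndex) and the log record are only produced
          // when FINEST is actually enabled for this logger.
          if (LOGGER.isLoggable(Level.FINEST)) {
            LOGGER.log(Level.FINEST, "  getString columnIndex: {0}", columnIndex);
          }
        }
      }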

 

Proposed state:

 

I believe the common case is that logging for the driver is disabled. For this I propose a Boolean configuration parameter “logging” that can be truly configured per connection. Except for a few places where no reference to a connection exists, this will silence at least all connection-specific logging.

 

The log level and log destination should be configured by the normal means of the logging framework, exactly how it is done in every other library that provides some means of logging.

 

I saw from the discussion around introducing logging frameworks that there was demand for programmatic configuration of log levels and log destinations, but since the current implementation does that at the Driver level anyway, this should be done by calling static functions on the Driver, not by giving connection parameters (this can be discussed). The argument that connection pools are easier to configure when these parameters are available on the connection is not valid: either the connection pool is created in a container, in which case logging is configurable in the same textual way and the container user is used to that, or the connection pool is created programmatically, in which case one is also able to configure the driver programmatically at the driver level. Therefore we can drop the parameters “loggerLevel” and “loggerFile”, replacing them with helper methods on Driver that just configure the java.util.logging framework.
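
A minimal sketch of what such driver-level helpers could look like (the class and method names are hypothetical, not an existing pgjdbc API); they only touch the standard java.util.logging configuration for the “org.postgresql” category:

      import java.io.IOException;
      import java.util.logging.FileHandler;
      import java.util.logging.Level;
      import java.util.logging.Logger;

      public final class PgDriverLoggingHelper {
        // Held statically so java.util.logging cannot garbage-collect the configuration.
        private static final Logger PARENT_LOGGER = Logger.getLogger("org.postgresql");

        private PgDriverLoggingHelper() {
        }

        // Programmatic replacement for the "loggerLevel" connection parameter.
        public static void setLogLevel(Level level) {
          PARENT_LOGGER.setLevel(level);
        }

        // Programmatic replacement for the "loggerFile" connection parameter.
        public static void setLogFile(String fileName) throws IOException {
          PARENT_LOGGER.addHandler(new FileHandler(fileName, true));
        }
      }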

 

Finally, it would be nice to be able to identify the connection that logs. I like the idea of ??? to encode it in the logging category. Usually the logging category is the package that logs, and it is common to have different logging categories in order to filter the log messages. To combine both, I propose a fixed set of logging categories (<id> is the connection id):

 

    org.postgresql.connect.<id> – Logs connect and disconnect events. Can log more or less depending on the level: INFO logs simple connection/disconnection, DEBUG also logs the parameters given to the connection, and TRACE logs the whole connection setup including default parameter settings. This category also logs WARN messages when connections are finalized without having been closed explicitly.

    org.postgresql.sql.<id> – Logs the SQL queries executed on the backend, as well as Bind, Prepare and Execute for PreparedStatements. INFO logs all SQL here, DEBUG could also log the parameters given to statements, and TRACE could even log the returned results.

    org.postgresql.api.<id> – Logs all interaction with the driver through externally callable functions.

    org.postgresql.protocol.<id> – Logs at the protocol level. Should always use the TRACE level.

    org.postgresql.pool.<id> – Logs connection-pool-related events. At INFO we could log simple pool usage like retrieving or releasing connections. Maybe there is useful logging at DEBUG level, but I don’t know yet.

    org.postgresql.XXX – Some more internal logging categories for places where we don’t have access to the connection itself. These always log at TRACE level and are not called very often. I will remove potentially expensive logging in places that are called often, such as the static helper functions of BooleanTypeUtil.
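
To make the intent concrete, here is a minimal sketch of how a user could then enable just two of these categories with plain java.util.logging (the category names follow the proposal above; the connection id “42” is of course made up):

      import java.util.logging.ConsoleHandler;
      import java.util.logging.Level;
      import java.util.logging.Logger;

      public class CategorySelectionSketch {
        // Strong references keep the configured loggers from being garbage collected.
        private static final Logger DRIVER = Logger.getLogger("org.postgresql");
        private static final Logger CONNECT = Logger.getLogger("org.postgresql.connect.42");
        private static final Logger SQL = Logger.getLogger("org.postgresql.sql.42");

        public static void main(String[] args) {
          DRIVER.setLevel(Level.OFF);    // everything else stays silent
          CONNECT.setLevel(Level.INFO);  // simple connect/disconnect events
          SQL.setLevel(Level.FINE);      // SQL plus statement parameters (DEBUG-like)

          ConsoleHandler handler = new ConsoleHandler();
          handler.setLevel(Level.ALL);
          CONNECT.addHandler(handler);
          SQL.addHandler(handler);
        }
      }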

 

The connection id could, depending on your use case, simply be an Integer used to identify the connection. This could be made optional and even configurable, because one may want to encode more information in it. I propose a Driver-level ConnectionIdFactory that by default just uses an incrementing id per connection, but personally I’d like to have the database name in it as well. Not only a connection-level id might be used, but also a DataSource id, which is also required to identify data-source-specific logging.
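
As a rough sketch (nothing of this exists yet, so every name here is hypothetical), the factory could be as small as this, with the default implementation just counting up and an alternative including the database name:

      import java.util.concurrent.atomic.AtomicInteger;

      public interface ConnectionIdFactory {
        String nextConnectionId(String databaseName);

        // Default: a plain incrementing number per connection.
        ConnectionIdFactory DEFAULT = new ConnectionIdFactory() {
          private final AtomicInteger counter = new AtomicInteger();

          @Override
          public String nextConnectionId(String databaseName) {
            return Integer.toString(counter.incrementAndGet());
          }
        };

        // Alternative: include the database name, e.g. "orders-17".
        ConnectionIdFactory WITH_DATABASE = new ConnectionIdFactory() {
          private final AtomicInteger counter = new AtomicInteger();

          @Override
          public String nextConnectionId(String databaseName) {
            return databaseName + "-" + counter.incrementAndGet();
          }
        };
      }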

 

At least in my application I found it useful to set the log level for all third-party libraries to WARN, which would also effectively discard most logging from the JDBC driver even in the case where logging is enabled. With the above categories I can now specify exactly what I am interested in, and I believe they cover most use cases one would use logging for: SQL logging for the end user, protocol logging for the driver developers, API logging for end users and application profilers, means to detect connection leaks, and connection parameters for an administrator.

 

Implementation details:

 

To achieve all this I propose to replace most of the loggers currently used with a PgLogger implementation that wraps the usual logger. We could have a null logger here that does nothing, and I hope the VM will drop all calls to it completely in compiled code as long as we never instantiate any other implementation of PgLogger; this would allow for peak performance in the default case. This class would have log functions for each level, so we can spare the Level argument in calls to those functions.
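
A minimal sketch of the idea (PgLogger and its method names are made up here, nothing of this exists yet): one shared no-op instance for the disabled default case, and a delegating implementation backed by java.util.logging. Only two levels are shown; single-parameter overloads avoid the hidden varargs array allocation.

      import java.util.logging.Level;
      import java.util.logging.Logger;

      public abstract class PgLogger {
        public abstract void debug(String msg);
        public abstract void debug(String msg, Object param);
        public abstract void trace(String msg);
        public abstract void trace(String msg, Object param);

        // Shared no-op instance used when the "logging" connection parameter is false.
        // If no other implementation is ever instantiated, the JIT can devirtualize
        // these calls and drop them entirely.
        public static final PgLogger NONE = new PgLogger() {
          @Override public void debug(String msg) {}
          @Override public void debug(String msg, Object param) {}
          @Override public void trace(String msg) {}
          @Override public void trace(String msg, Object param) {}
        };

        // A logger that really logs, delegating to java.util.logging.
        public static PgLogger forCategory(String category) {
          final Logger jul = Logger.getLogger(category);
          return new PgLogger() {
            @Override public void debug(String msg) { jul.log(Level.FINE, msg); }
            @Override public void debug(String msg, Object param) { jul.log(Level.FINE, msg, param); }
            @Override public void trace(String msg) { jul.log(Level.FINEST, msg); }
            @Override public void trace(String msg, Object param) { jul.log(Level.FINEST, msg, param); }
          };
        }
      }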

 

Other implementations of that class really do the logging. Depending on whether we want connection-level logging, we could have individual logger instances per connection and category, or just per category.

 

Depending on the comments to this mail I might do the work.

 

Regards,

Daniel Migowski

 

Re: New driver logging proposal

From: Dave Cramer

Hi Daniel,

First, thanks for your well-thought-out proposal. I'm not happy with the performance degradation of the new logging implementation and would welcome anything that fixes it.
I would welcome a PR to fix the current situation.
  


Re: [JDBC] New driver logging proposal

From: Jorge Solórzano

Hi Daniel, first of all, thank you for offering to improve the Driver logging. My comments are inline:


On Fri, Apr 7, 2017 at 11:13 PM, Daniel Migowski <dmigowski@ikoffice.de> wrote:

Hello,

 

this is my proposal for a new PostgreSQL JDBC Driver logging. The aim for this is to merge the efforts done in integrating common logging frameworks while keeping the performance we had before logging frameworks were introduced. Also this proposal should add the functionality to identify connections in the logging output, a feature that was lost when a switch to logging frameworks was made.


​+1 for the functionality to identify connections, it was on my TODO list.​

 

 

Current (and previous) state:

 

I assume that people who do logging already use a logging framework of their choice, either because they configured it themselves or by using the one provided by the container they use. In either way they know how to configure logging on the platform they chose.


Sadly, you should never "assume" anything about what people use; I still know people who use System.out.println() to debug.
 

 

Before logging framework support was introduced, we had a single logging-related Driver parameter called “loglevel”. It could be set to “0” for no logging, to “1” for a log level corresponding to “DEBUG” in modern logging frameworks, and to “2” for “TRACE”-level logging. There was (and seemingly still is) no need for higher-severity log levels like “INFO”, “WARN”, “ERROR” or “FATAL”, because the events that would produce them are thrown as exceptions in the error case, while warnings and infos are not generated by the driver at all. Log output was written to a LogWriter when one was set on the driver, which in most cases would be used to redirect the output to the logging framework one had chosen.

 

The newer versions of the driver accept the parameters “loggerLevel” and “loggerFile”. These options interfere with the logging framework one uses, because they change the log level programmatically on the logging category “org.postgresql”. With “loggerFile” one can also configure a dedicated log file for that category.


These options are there to facilitate the configuration of the logger for those people who don't know how, or who just want an easy way to enable logging.
 

 

The current implementation has the side effect that creating a single connection changes the log level of ALL connections (loaded by the same ClassLoader, to be specific), because the loggers are statically defined in the classes, which is how loggers are used today. Also, although these parameters can be given per connection, they are not connection specific at all. This might not be a problem when one uses only a single connection pool in which all connections are created equally, but it interferes when an application occasionally has to fetch data from multiple data sources.


Do you intend to initialize a logger for each connection?
You don't need to use connection parameters to use the logger; you can also use a properties file.

 

 

Also, the current implementation makes heavy use of TRACE-level logging in all the getXXX functions of PgResultSet. A typical line looks like this:

 

      LOGGER.log(Level.FINEST, "  getString columnIndex: {0}", columnIndex);

 

This innocent-looking line should do no harm at first glance, because if tracing is disabled the function should return immediately and no performance is lost. One minor problem is the use of columnIndex as a parameter: it is autoboxed into an Integer when passed to the log function (a hidden call to Integer.valueOf(int)), which may even create new objects depending on how aggressively the VM is tuned and on the number of columns in the result set.


This has to be backed up by a benchmark; the autoboxing should be cheap, and Integer.valueOf(int) will always cache values in the range -128 to 127, inclusive.

 

More important is the implementation of the log function itself, which is relatively fast when a good logging framework is used and disastrously slow in many other cases, for example when Logback is used. In that case an expensive log event is created and passed to the actual logging framework just to be discarded there. This can IMHO be considered an artefact of a badly configured framework, but it still throws a bad light on PostgreSQL JDBC performance.


The logging framework currently used is java.util.logging, so if you use the Simple Logging Facade (SLF4J) and redirect to another implementation through a bridge, there will be some overhead; with Logback, for instance, you need to configure the LevelChangePropagator.
 

 

Proposed state:

 

I believe the common case is that logging for the driver is disabled. For this I propose a Boolean configuration parameter “logging” that can be truly configured per connection. Except for a few places where no reference to a connection exists, this will silence at least all connection-specific logging.


I'm not sure why we should introduce another parameter when logLevel = OFF can be used for the same purpose. Do you have an idea of how "truly configured per connection" can be achieved? If I simply don't use the connection parameters and instead use the configuration method provided by my app server, how will that work?
 

 

The log level and log destination should be configured by the normal means of the logging framework, exactly how it is done in every other library that provides some means of logging.


What is the normal means of the logging framework? And exactly how is it done in every other library that provides some means of logging?
 

 

I saw from the discussion around introducing logging frameworks that there was demand for programmatic configuration of log levels and log destinations, but since the current implementation does that at the Driver level anyway, this should be done by calling static functions on the Driver, not by giving connection parameters (this can be discussed). The argument that connection pools are easier to configure when these parameters are available on the connection is not valid: either the connection pool is created in a container, in which case logging is configurable in the same textual way and the container user is used to that, or the connection pool is created programmatically, in which case one is also able to configure the driver programmatically at the driver level. Therefore we can drop the parameters “loggerLevel” and “loggerFile”, replacing them with helper methods on Driver that just configure the java.util.logging framework.


The connection parameters are there to facilitate the configuration of the logger, and just for those rare cases when there is a need to quickly enable and then discard logging; it's not the recommended approach and not intended to be used in production.
 

 

Finally it might be nice to be able to identify the connection that logs. I like the idea of ??? to encode it in the logging category.


Yes, it might be nice to identify the connection. What do you mean by ???

 

Also usually when logging is used the category is the package that logs. It is common to have different logging categories to be able to filter the log messages. To be able to combine both I propose a fixed set of logging categories (<id> is the connection id):

 

    org.postgresql.connect.<id> – Logs connect and disconnect events. Can log more or less depending on the level: INFO logs simple connection/disconnection, DEBUG also logs the parameters given to the connection, and TRACE logs the whole connection setup including default parameter settings. This category also logs WARN messages when connections are finalized without having been closed explicitly.

    org.postgresql.sql.<id> – Logs the SQL queries executed on the backend, as well as Bind, Prepare and Execute for PreparedStatements. INFO logs all SQL here, DEBUG could also log the parameters given to statements, and TRACE could even log the returned results.

    org.postgresql.api.<id> – Logs all interaction with the driver through externally callable functions.

    org.postgresql.protocol.<id> – Logs at the protocol level. Should always use the TRACE level.

    org.postgresql.pool.<id> – Logs connection-pool-related events. At INFO we could log simple pool usage like retrieving or releasing connections. Maybe there is useful logging at DEBUG level, but I don’t know yet.

    org.postgresql.XXX – Some more internal logging categories for places where we don’t have access to the connection itself. These always log at TRACE level and are not called very often. I will remove potentially expensive logging in places that are called often, such as the static helper functions of BooleanTypeUtil.


I really like the idea of having a fixed set of logging categories. +1
 

 

The connection id could, depending on your use case, simply be an Integer used to identify the connection. This could be made optional and even configurable, because one may want to encode more information in it. I propose a Driver-level ConnectionIdFactory that by default just uses an incrementing id per connection, but personally I’d like to have the database name in it as well. Not only a connection-level id might be used, but also a DataSource id, which is also required to identify data-source-specific logging.


I'm not sure if introducing a connection id is a good idea; just for the record, the driver's logs should mostly be used to debug the driver itself, and this seems to fix a problem that is not related to the driver.

 

 

At least in my application I found it useful to set the log level for all third-party libraries to WARN, which would also effectively discard most logging from the JDBC driver even in the case where logging is enabled. With the above categories I can now specify exactly what I am interested in, and I believe they cover most use cases one would use logging for: SQL logging for the end user, protocol logging for the driver developers, API logging for end users and application profilers, means to detect connection leaks, and connection parameters for an administrator.

 

 

Implementation details:

 

To achieve all this I propose to replace most of the loggers currently used with a PgLogger implementation that wraps the usual logger. We could have a null logger here that does nothing, and I hope the VM will drop all calls to it completely in compiled code as long as we never instantiate any other implementation of PgLogger; this would allow for peak performance in the default case. This class would have log functions for each level, so we can spare the Level argument in calls to those functions.


​This smells like the "custom logging abstraction" anti-pattern.​

 

 

Other implementations of that class really do logging. Depending on if we want connection level logging we could have individual instances of the Loggers per connection and category or just by category.


Do you intend to change to SLF4J?

 

 

Depending on the comments to this mail I might do the work.

 

Regards,

Daniel Migowski

 


Re: New driver logging proposal

From: Daniel Migowski

Hello Jorge,

 

thanks for your critical comments. My answers are also inline.

 

Summary:

 

· Leave the current logger configuration via properties as it is, for users without a configured logging system.

· Show connection IDs in the log messages to identify connections, not in the logging categories.

· Still use a custom logging wrapper for fast deactivation of logging on a per-connection basis and to inject the connection ID into log messages.

· Enable logging via a “logging=true” configuration parameter; by default it is disabled. Non-connection-specific logging will still be enabled by default, but disabling it is not required because performance won’t be hurt by it.

 

Regards,

Daniel Migowski

 

From: Jorge Solórzano [mailto:jorsol@gmail.com]
Sent: Saturday, 8 April 2017 20:15

On Fri, Apr 7, 2017 at 11:13 PM, Daniel Migowski <dmigowski@ikoffice.de> wrote:

this is my proposal for a new PostgreSQL JDBC Driver logging. The aim for this is to merge the efforts done in integrating common logging frameworks while keeping the performance we had before logging frameworks were introduced. Also this proposal should add the functionality to identify connections in the logging output, a feature that was lost when a switch to logging frameworks was made.

 

+1 for the functionality to identify connections, it was on my TODO list.

 

OK, then this will be done.

Current (and previous) state: 

I assume that people who do logging already use a logging framework of their choice, either because they configured it themselves or by using the one provided by the container they use. In either way they know how to configure logging on the platform they chose.

 

Sadly, you should never "assume" anything about what people use; I still know people who use System.out.println() to debug.

 

Alright, so we still need something to help those people out.

 

Before logging framework support was introduced, we had a single logging-related Driver parameter called “loglevel”. It could be set to “0” for no logging, to “1” for a log level corresponding to “DEBUG” in modern logging frameworks, and to “2” for “TRACE”-level logging. There was (and seemingly still is) no need for higher-severity log levels like “INFO”, “WARN”, “ERROR” or “FATAL”, because the events that would produce them are thrown as exceptions in the error case, while warnings and infos are not generated by the driver at all. Log output was written to a LogWriter when one was set on the driver, which in most cases would be used to redirect the output to the logging framework one had chosen.

The newer versions of the driver accept the parameters “loggerLevel” and “loggerFile”. These options interfere with the logging framework one uses, because they change the log level programmatically on the logging category “org.postgresql”. With “loggerFile” one can also configure a dedicated log file for that category.

 

These options are there to facilitate the configuration of the logger for those people who don't know how, or who just want an easy way to enable logging.

 

OK, I will leave these options in and don’t care for people that want to use the driver that way.

 

The current implementation has the side effect that creating a single connection changes the log level of ALL connections (loaded by the same ClassLoader, to be specific), because the loggers are statically defined in the classes, which is how loggers are used today. Also, although these parameters can be given per connection, they are not connection specific at all. This might not be a problem when one uses only a single connection pool in which all connections are created equally, but it interferes when an application occasionally has to fetch data from multiple data sources.

 

Do you intend to initialize a logger for each connection?
You don't need to use connection parameters to use the logger; you can also use a properties file.

 

Yes, indeed. But that might depend on how the user configures the connectionIdentifier. If connections don’t have to be identified in the output, we can reuse the loggers. If connections are to be identified, each connection will get its own logger, because I’d like to add connection information to the logging category. But now, while I think about it, we could also just prepend a connection-identifying string to the log messages. It would be

 

    2017-04-09 09:59:24,763 [main] DEBUG org.postgresql.sql - CON-123: My logging text

 

vs.

 

    2017-04-09 09:59:24,763 [main] DEBUG org.postgresql.sql.123 - My logging text

 

in the logfiles. Personally, I even prefer prepending it to the message, so you are right, that should be the way to go. I didn’t know that one could configure the driver with a properties file; maybe that is useful in this case.

 

Also, the current implementation makes heavy use of TRACE-level logging in all the getXXX functions of PgResultSet. A typical line looks like this:

      LOGGER.log(Level.FINEST, "  getString columnIndex: {0}", columnIndex);

This innocent-looking line should do no harm at first glance, because if tracing is disabled the function should return immediately and no performance is lost. One minor problem is the use of columnIndex as a parameter: it is autoboxed into an Integer when passed to the log function (a hidden call to Integer.valueOf(int)), which may even create new objects depending on how aggressively the VM is tuned and on the number of columns in the result set.

 

This has to be backed up by a benchmark; the autoboxing should be cheap, and Integer.valueOf(int) will always cache values in the range -128 to 127, inclusive.

 

That depends on the VM settings; the cached range might even be much higher. But in my test case I had an absurdly large table with more than 400 columns, and there I noticed the impact. It is nonetheless not free.

 

More important is the implementation of the log function itself, which is relatively fast when a good logging framework is used and disastrously slow in many other cases, for example when Logback is used. In that case an expensive log event is created and passed to the actual logging framework just to be discarded there. This can IMHO be considered an artefact of a badly configured framework, but it still throws a bad light on PostgreSQL JDBC performance.

The logging framework currently used is java.util.logging, so if you use the Simple Logging Facade (SLF4J) and redirect to another implementation through a bridge, there will be some overhead; with Logback, for instance, you need to configure the LevelChangePropagator.

 

Yes, I know, and personally I am still using log4j. Even with log4j configured (which seems to do something like that level change automatically, otherwise the performance hit would be even worse), I saw a 20% performance improvement in my benchmarks just by commenting out the log statements. The time I measured even included executing the actual query and waiting for my local PostgreSQL server to respond.
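
For reference, a minimal sketch of the bridge-plus-LevelChangePropagator setup Jorge refers to (assuming the jul-to-slf4j bridge and Logback are on the classpath); without the propagator, every disabled java.util.logging call still crosses the bridge before being discarded:

      import ch.qos.logback.classic.LoggerContext;
      import ch.qos.logback.classic.jul.LevelChangePropagator;
      import org.slf4j.LoggerFactory;
      import org.slf4j.bridge.SLF4JBridgeHandler;

      public class JulToLogbackSetup {
        public static void main(String[] args) {
          // Route java.util.logging records through SLF4J/Logback.
          SLF4JBridgeHandler.removeHandlersForRootLogger();
          SLF4JBridgeHandler.install();

          // Propagate Logback level changes back to java.util.logging so disabled
          // levels are filtered cheaply on the JUL side instead of being bridged
          // and then discarded.
          LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();
          LevelChangePropagator propagator = new LevelChangePropagator();
          propagator.setContext(context);
          propagator.setResetJUL(true);
          context.addListener(propagator);
          propagator.start();
        }
      }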

 

 Proposed state:

I believe the common case is that logging for the driver is disabled. For this I propose a Boolean configuration parameter “logging” that can be truly configured per connection. Except for a few places where no reference to a connection exists, this will silence at least all connection-specific logging.

I'm not sure why we should introduce another parameter when loggerLevel = OFF can be used for the same purpose. Do you have an idea of how "truly configured per connection" can be achieved? If I simply don't use the connection parameters and instead use the method of configuration provided by my app server, how will that work?

 

I wouldn’t rely solely on the logging configuration to disable logging, because I want to add an extra layer that turns off logging completely (on a per-connection basis). See below where I explain that. I would even make that the default, because it allows the JVM to nearly completely ignore the calls to the logging framework.

 

Regarding the “logging” parameter: currently, when loggerLevel is configured to something other than OFF, the driver automatically configures the logging handler for the “org.postgresql” category and does not allow you to use your own logging configuration at all!

 

But maybe I want logging at a given level yet still don’t want it for the whole driver, just for, say, the “org.postgresql.connect” category. Then I would enable logging with “logging=true” and let the user configure the log levels themselves. I would think a good default value for “logging” is “false” when loggerLevel is “OFF” or not configured, and “true” in all other cases.

 

So with my new parameter, if you want logging you can either

 

·         use logging=true and use your own logger configuration, or

·         use loggerLevel=XXX and let the driver configure logging for you, either to a file specified by “loggerFile”, or to Driver.getLogStream(), or to stderr.

 

As I said, the “current” implementation would assume logging=true, but the overhead for that is just too high, so I need a way to disable connection-specific logging completely. The numbers I can provide after implementing that will prove me right.
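
To make the two options above concrete, here is a small sketch of the connection properties involved. Note that the “logging” parameter is only the proposed one and does not exist yet; loggerLevel and loggerFile are the existing driver parameters:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.util.Properties;

    public class LoggingOptionsSketch {
      public static void main(String[] args) throws SQLException {
        Properties props = new Properties();
        props.setProperty("user", "test");

        // Option 1 (proposed, hypothetical): switch driver logging on and
        // configure categories and levels in your own logging configuration.
        props.setProperty("logging", "true");

        // Option 2 (existing parameters): let the driver configure java.util.logging.
        // props.setProperty("loggerLevel", "DEBUG");
        // props.setProperty("loggerFile", "pgjdbc.log");

        Connection con = DriverManager.getConnection("jdbc:postgresql://localhost/test", props);
        con.close();
      }
    }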

 

The log level and log destination should be configured by the normal means of the logging framework, exactly how it is done in every other library that provides some means of logging.

 

What are the normal means of the logging framework? And how exactly is it done in every other library that provides some means of logging?

 

I just wanted to say that when logging is used, you configure the log files yourself and don’t have the library do it for you. I don’t even feel comfortable with a library that fiddles with my carefully configured logging setup, but if this is what people want, accepted.

 

I saw from the discussion around introducing logging frameworks that there was demand for programmatic configuration of log levels and log destinations. But since the current implementation does that at the Driver level anyway, this should be done by calling static functions on it, not by giving connection parameters (this can be discussed). The argument that connection pools are easier to configure when these parameters are available on the connection is not valid: either the connection pool is created in a container, and then logging is configurable in the same textual way anyway and the container user is used to it, or the connection pool is created programmatically, and then one is also able to configure the driver programmatically at the driver level. Therefore we can drop the parameters “loggerLevel” and “loggerFile”, replacing them with helper methods on Driver that just configure the java.util.logging framework.
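
A rough sketch of what such driver-level helpers could look like; the class and method names below are hypothetical, not existing driver API, and only java.util.logging is touched:

    import java.io.IOException;
    import java.util.logging.FileHandler;
    import java.util.logging.Handler;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    import java.util.logging.SimpleFormatter;

    public final class DriverLoggingHelper {
      // Hypothetical helper: set the level of the "org.postgresql" JUL category.
      public static void setLogLevel(Level level) {
        Logger.getLogger("org.postgresql").setLevel(level);
      }

      // Hypothetical helper: send driver logging to a file.
      public static void setLogFile(String path) throws IOException {
        Handler handler = new FileHandler(path);
        handler.setFormatter(new SimpleFormatter());
        Logger.getLogger("org.postgresql").addHandler(handler);
      }
    }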

 

The connection parameters are there to facilitate the configuration of the logger, and only for those rare cases where one needs to quickly enable and then discard logging; it's not the recommended approach and not intended to be used in production.

 

Finally it might be nice to be able to identify the connection that logs. I like the idea of ??? to encode it in the logging category.

 

Yes, it might be nice to identify the connection, what do you mean by ???

 

Damn, I should have proofread this. I meant https://github.com/vlsi.
 

Also, usually when logging is used, the category is the package that logs. It is common to have different logging categories to be able to filter the log messages. To combine both, I propose a fixed set of logging categories (<id> is the connection id; a sample configuration is sketched after the list):

 

    org.postgresql.connect.<id> – Logs connect and disconnect events. Can log more or less depending on the level: INFO logs simple connection/disconnection, DEBUG also logs the parameters given to the connection, and TRACE logs the whole connection setup including default parameter settings. This category also logs WARN messages when connections are finalized without having been closed explicitly.

    org.postgresql.sql.<id> – Logs the SQL queries executed on the backend, including Bind, Prepare and Execute for PreparedStatements. INFO logs all SQL, DEBUG could also log all the parameters given to statements, and TRACE could even log the returned results.

    org.postgresql.api.<id> – Logs all interaction with the driver through externally callable functions.

    org.postgresql.protocol.<id> – Logs at the protocol level. Should always use the TRACE level.

    org.postgresql.pool.<id> – Logs connection-pool-related events. At INFO we could log simple pool usage like retrieving or releasing connections. Maybe there is useful logging at DEBUG level, but I don’t know yet.

    org.postgresql.XXX – Some additional internal logging categories for places where we don’t have access to the connection itself. These will always log at TRACE level and are not called very often. I will remove potentially expensive logging in places that are called often, such as the static helper functions of BooleanTypeUtil.
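
For illustration, and assuming the category names above without a connection id embedded in the category, a java.util.logging properties fragment that enables only the parts one cares about might look like this (FINE roughly corresponds to DEBUG, FINEST to TRACE):

    # Hypothetical logging.properties fragment for the proposed categories
    org.postgresql.connect.level = INFO
    org.postgresql.sql.level = FINE
    org.postgresql.api.level = OFF
    org.postgresql.protocol.level = OFF
    org.postgresql.pool.level = INFO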

 

I really like the idea of having a fixed set of logging categories. +1

The connection id could, depending on your use case, simply be an integer used to identify the connection. This could be made optional and even configurable, because maybe one wants to encode more information in it. I propose a Driver-level ConnectionIdFactory that by default just uses an incrementing id for the connection, but personally I’d like to have the database name in it as well. Not only a connection-level id might be used, but also a DataSource id, which is also required to identify data-source-specific logging.
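
A minimal sketch of what such a factory might look like; the interface and default implementation below are hypothetical, nothing like this exists in the driver today:

    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical factory for connection identifiers.
    interface ConnectionIdFactory {
      String nextConnectionId(String databaseName);
    }

    // Default: an incrementing counter, with the database name included
    // as suggested above, e.g. "mydb-42".
    class IncrementingConnectionIdFactory implements ConnectionIdFactory {
      private final AtomicLong counter = new AtomicLong();

      @Override
      public String nextConnectionId(String databaseName) {
        return databaseName + "-" + counter.incrementAndGet();
      }
    }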

 

I'm not sure introducing a connection id is a good idea; just for the record, the logs of the driver should mostly be used to debug the driver, and this seems to address a problem that is not related to the driver.


The connection ID was meant as a way to identify the connection. As said above, it does not need to be included in the category, but could also just be prepended to the log message texts.

 

At least in my application I found it useful to set the log level for all third-party libraries to WARN, which also effectively discards most logging from the JDBC driver even when driver logging is enabled. With the above categories I can now state specifically what I am interested in, and I believe they cover most of the use cases one would use logging for: SQL logging for the end user, protocol logging for the driver developers, API logging for end users and application profilers, and material to detect connection leaks and connection parameters for an administrator.

Implementation details:

To achieve all this I propose to replace most of the loggers currently used with a PgLogger implementation that wraps the usual one. We could have a Null Logger here that does nothing, and I hope the VM will simply drop all calls to it in compiled code when we never instantiate any other implementation of PgLogger, which allows for peak performance in the default case. This class will have a log function for each level, so we spare the Level argument in calls to those functions.


This smells like the "custom logging abstraction" anti-pattern.

 

Yes, and it smells good in this context :-). The reason for it is that the logger will contain the connection id and can use it in the log messages. AND, most importantly, the JVM will create extremely short code paths for the case where the Null logger is configured.
 

Other implementations of that class really do logging. Depending on whether we want connection-level logging, we could have individual logger instances per connection and category, or just per category.
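
A very rough sketch of the wrapper idea; all names below are hypothetical and this is not existing driver code:

    import java.util.logging.Level;
    import java.util.logging.Logger;

    // Hypothetical abstraction with one method per level (no Level argument).
    abstract class PgLogger {
      abstract void trace(String message);
      abstract void debug(String message);
    }

    // Does nothing; with no other implementation loaded, the JIT can inline
    // these empty bodies and effectively eliminate the calls.
    final class NullPgLogger extends PgLogger {
      void trace(String message) { }
      void debug(String message) { }
    }

    // Delegates to java.util.logging and prepends the connection id.
    final class JulPgLogger extends PgLogger {
      private final Logger delegate;
      private final String connectionId;

      JulPgLogger(String category, String connectionId) {
        this.delegate = Logger.getLogger(category);
        this.connectionId = connectionId;
      }

      void trace(String message) { delegate.log(Level.FINEST, connectionId + ": " + message); }
      void debug(String message) { delegate.log(Level.FINE, connectionId + ": " + message); }
    }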

 

Do you intend to change to SLF4J?

 

No, I don’t want PostgreSQL JDBC to have dependencies on other libraries.


Re: [JDBC] New driver logging proposal

From
Vladimir Sitnikov
Date:
Daniel,

I wonder if adding a "logging=true" property is required to meet the performance requirements at all.
I would like to refrain from adding new configuration options.

What bothers me with the "replace jul with abc" approach is that it will be yet another 100500+ line change that is hard to review.
I wonder how far we can go with keeping the jul interfaces. Is it possible/enough to just add relevant "log level caching at the resultset/etc levels"?

Daniel>Show connection IDs in the log messages to identify connections, and not in logging categories

ConnectionIDs are unpredictable, so I can hardly imagine a case where an end user would activate logging based on the connection id (e.g. enable debug for a specific ID only).
On the other hand, having something like "org.postgresql.sql.DB_NAME.USER_NAME.CONN_ID" as a logging category could make sense.

Vladimir


Re: New driver logging proposal

From
Jorge Solórzano
Date:
Yes, I agree with Vladimir: I don't like the idea of a wrapper (anti-pattern); JUL can be mostly fine if configured correctly.

Most performance degradation claims are related to using other logging frameworks and using a bridge to redirect from JUL to SLF4J, which by design cannot impersonate the JUL API.

The focus should be that when logging is disabled it does not introduce any degradation when using slf4j plus some other framework (IMO that is where a performance impact can be found); in other words, when it's disabled, it's truly disabled.

The second focus should be to make the logging useful for end users; most of the current logging is for developers and related to the protocol. Having fixed categories like the proposed ones can help a lot here, and as contradictory as it seems, a lot more log messages are needed. In mssql-jdbc, for instance, the messages are much more verbose and there are a lot of entering and exiting messages (not the approach I would use, but it's there).

In issue #788 I had to recompile the driver with more log messages simply because the ones that are there are not enough to find a problem in the code.

In mssql-jdbc the connection ids (and statement and result-set ids) are simply incremental ids, and that should be more than enough to identify calls related to a connection/statement/resultset. I mean, if you call a prepared statement and set parameters on it, in a highly concurrent environment a setString() related to one prepared statement can look in the log as if it was called by another prepared statement.

So far I have seen more benefits from using JUL than claims of performance degradation (and most of the claims involve the slf4j bridge). Of course there is more to do, and it might not be the best choice, but it was that, or introduce an optional dependency on SLF4J (which is a bad idea according to the SLF4J developers, https://www.slf4j.org/faq.html#optional_dependency), or simply introduce a direct dependency on SLF4J, forcing a dependency that not many want. This is like the chicken-and-egg problem.


On Sun, Apr 9, 2017 at 3:54 PM, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
Daniel,

I wonder if adding "logging=true" property is required to meed the performance requirements at all.
I would like to refrain from adding new configuration options.

What bothers me with "replace jul with abc" approach is it will be yet another 100500+lines change that is hard to review.
I wonder how far can we go with keeping jul interfaces. Is it possible/enough to just add relevant "log level caching at the resultset/etc levels"?

Daniel>Show connection IDs in the log messages to identify connections, and not in logging categories

ConnectionIDs are unpredictable, so I can hardly imagine a case when end-user would activate logging based on the connection id (e.g. enable debug for a specific ID only).
On the other hand, having something like "org.postgresql.sql.DB_NAME.USER_NAME.CONN_ID" as a logging category could make sense.

Vladimir




Re: New driver logging proposal

From
Dave Cramer
Date:

On 9 April 2017 at 20:42, Jorge Solórzano <jorsol@gmail.com> wrote:
Yes, I agree with Vladimir, I don't like the idea of a wrapper (anti-pattern), JUL can be mostly fine if configured correctly.

Most performance degradation claims are related to using others logging frameworks and using a bridge to redirect from JUL to SLF4J and that by design it can not impersonate JUL api.

Has anyone benchmarked the code before and after?
 
The focus should be that when the logging is disabled it don't introduce some degradation when using slf4j+some other framework (IMO there is where an impact in performance can be found), in other words, when it's disabled, it's truly disabled.

The second focus should be make the logging useful for end users, most of the logging is for developers and related to the protocol, having fixed categories like the proposed ones can help a lot here, and as contradictory as it seems there is needed a lot more log messages. In the mssql-jdbc for instance there is a lot more verbose messages and have a lot entering and exiting messages (not the approach I would use but it's there).

In the issue #788 I have to recompile the driver with more log messages simply because the ones that are there are simply not enough to found a problem in the code.


To be honest, in 15 years of developing the driver I've never used logging to debug it. It is almost always possible to run the driver in a debugger to see what is going on.

The only time this is useful is when a user can't do this but can turn on debugging to send protocol messages.
 
In the mssql-jdbc the connection id (and statements and resultsets) are simply incremental id, and should be more than enough to identify calls related to a connection/statement/resultset, I mean if you call a prepared statement for instance and you set parameters to it, in a highly concurrent enviroment a setString() related to a prepared statement can seems that was called by another prepared statement in the log.

So far I have seen more benefits of using JUL than claims of performance degradation (and most of the claims are when using slf4j bridge). Of course there is more to do, it might be not the best choice but was that or introduce a optional dependency of SLF4J (wich is a bad idea based on the SLF4J developers https://www.slf4j.org/faq.html#optional_dependency) or simply introduce a direct dependency of SLF4J making a forced used of a dependency that not many want. This is like the chicken-egg problem.

How would the bridge degrade performance if logging is off?



 


On Sun, Apr 9, 2017 at 3:54 PM, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
Daniel,

I wonder if adding "logging=true" property is required to meed the performance requirements at all.
I would like to refrain from adding new configuration options.

What bothers me with "replace jul with abc" approach is it will be yet another 100500+lines change that is hard to review.
I wonder how far can we go with keeping jul interfaces. Is it possible/enough to just add relevant "log level caching at the resultset/etc levels"?

Daniel>Show connection IDs in the log messages to identify connections, and not in logging categories

ConnectionIDs are unpredictable, so I can hardly imagine a case when end-user would activate logging based on the connection id (e.g. enable debug for a specific ID only).
On the other hand, having something like "org.postgresql.sql.DB_NAME.USER_NAME.CONN_ID" as a logging category could make sense.

Vladimir





Re: [JDBC] New driver logging proposal

From
Daniel Migowski
Date:

Hello Vladimir,

 

From: Vladimir Sitnikov [mailto:sitnikov.vladimir@gmail.com]
Sent: Sunday, 9 April 2017 23:55

I wonder if adding "logging=true" property is required to meed the performance requirements at all.

I would like to refrain from adding new configuration options.

 

What bothers me with "replace jul with abc" approach is it will be yet another 100500+lines change that is hard to review.

I wonder how far can we go with keeping jul interfaces. Is it possible/enough to just add relevant "log level caching at the resultset/etc levels"?

 

I didn’t want to use slf4j, also because it is yet another extra layer that does the logging. The layer I wanted to add would have provided a NullLogger, where all logging functions have empty bodies and are inlined by the JVM, so each call results in assembly code that only checks whether the logger instance is still the NullLogger and then does nothing more! I just thought this to be cleaner than log-level caching, because then we can spare the level-caching code. Think of the wrapper as exactly that: a log-level cache.
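
For comparison, the level-caching idea on its own might look roughly like this; a sketch only, assuming the flag is cached once when the result set is created (with the trade-off that later level changes are not picked up):

    import java.util.logging.Level;
    import java.util.logging.Logger;

    class ResultSetLevelCacheSketch {
      private static final Logger LOGGER = Logger.getLogger("org.postgresql");

      // Evaluated once per result set instead of on every getXXX call.
      private final boolean traceEnabled = LOGGER.isLoggable(Level.FINEST);

      String getString(int columnIndex) {
        if (traceEnabled) {
          LOGGER.log(Level.FINEST, "  getString columnIndex: {0}", columnIndex);
        }
        return null; // the actual column access is omitted in this sketch
      }
    }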

 

Daniel>Show connection IDs in the log messages to identify connections, and not in logging categories

 

ConnectionIDs are unpredictable, so I can hardly imagine a case when end-user would activate logging based on the connection id (e.g. enable debug for a specific ID only).

On the other hand, having something like "org.postgresql.sql.DB_NAME.USER_NAME.CONN_ID" as a logging category could make sense.

 

I wanted to use an incrementing ID as the connection ID; I didn’t know the term meant something else.

 


Re: New driver logging proposal

From
Jorge Solórzano
Date:
On Sun, Apr 9, 2017 at 8:29 PM, Dave Cramer <pg@fastcrypt.com> wrote:

So far I have seen more benefits of using JUL than claims of performance degradation (and most of the claims are when using slf4j bridge). Of course there is more to do, it might be not the best choice but was that or introduce a optional dependency of SLF4J (wich is a bad idea based on the SLF4J developers https://www.slf4j.org/faq.html#optional_dependency) or simply introduce a direct dependency of SLF4J making a forced used of a dependency that not many want. This is like the chicken-egg problem.

How would the bridge degrade performance if logging is off ?

 
The jul-to-slf4j module does not reimplement java.util.logging, because packages under the java.* namespace cannot be replaced. Instead, jul-to-slf4j translates LogRecord objects into their SLF4J equivalents. Basically this means that when using log4j, for example, a bridge can replace the org.log4j packages with an SLF4J implementation, so you use the slf4j bridge dependency instead of the log4j dependency; but with jul it cannot do that (it is part of the JDK), and it constructs the LogRecord object even when logging is off.
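
As mentioned earlier in the thread, Logback users can reduce that cost with the LevelChangePropagator, which propagates Logback level changes to the java.util.logging loggers so that disabled log statements are discarded by java.util.logging itself, before a LogRecord is ever created. A typical logback.xml fragment, shown here only as an illustration:

    <!-- Propagate Logback level changes to java.util.logging -->
    <configuration>
      <contextListener class="ch.qos.logback.classic.jul.LevelChangePropagator">
        <resetJUL>true</resetJUL>
      </contextListener>
      <!-- appenders and loggers omitted -->
    </configuration>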
 
