Thread: Change in Log Format and Prepared Statements

Change in Log Format and Prepared Statements

From

Alexander Stanier

Date:

13 September 2012, 15:47:31

Hi,

Over the past year we have performed upgrades to both our PostgreSQL
server our jdbc driver over the last few months:

 From PostgreSQL 8.2.10 with pg74.216.jdbc3 driver
To PostgreSQL 8.2.10 with pg8.2-511.jdbc4 driver
To PostgreSQL 9.1.4 with pg9.1-902.jdbc4 driver

(O/S is Centos 5)

When we moved from the pg74.216.jdbc3 driver to the pg8.2-511.jdbc4 we
saw a distinct change in the logging output by the database.

Previously with the pg74.216.jdbc3 driver we saw the statement followed
by a duration:

2012-09-13 13:08:48 BST [22116] [mp_demo] [pg] LOG:  statement: SELECT
users.USERID, users.LOGINID, users.ORGID, users.FIRSTNAME,
users.LASTNAME, users.TITLE, users.TELEPHONE, users.CELLPHONE,
users.PAGER, users.FAX, users.EMAIL, users.LASTLOGIN, users.ADDRESSID,
users.USERSTATUSID, users.SHARED, users.MODDATE, users.ISPROXY FROM
users WHERE users.LOGINID='alex-admin';
2012-09-13 13:08:48 BST [22116] [mp_demo] [pg] LOG:  duration: 0.805 ms

Now with the pg8.2-511.jdbc4 and pg9.1-902.jdbc4 drivers we see the
statement issued as an execute and three durations:

2012-09-13 13:07:33 BST [21664] [mp_demo] [pg] LOG:  execute <unnamed>:
SELECT users.USERID, users.LOGINID, users.ORGID, users.FIRSTNAME,
users.LASTNAME, users.TITLE, users.TELEPHONE, users.CELLPHONE,
users.PAGER, users.FAX, users.EMAIL, users.LASTLOGIN, users.ADDRESSID,
users.USERSTATUSID, users.SHARED, users.MODDATE, users.ISPROXY FROM
users WHERE users.LOGINID='alex-admin'
2012-09-13 13:07:33 BST [21664] [mp_demo] [pg] LOG:  duration: 0.208 ms
2012-09-13 13:07:33 BST [21664] [mp_demo] [pg] LOG:  duration: 1.136 ms
2012-09-13 13:07:33 BST [21664] [mp_demo] [pg] LOG:  duration: 0.156 ms

Indeed we still see the former if I run the same statement direct from
psql. So it looks to me as though the later drivers may be making use of
prepared statements (hence the "execute"?) but why do we now see 3
durations?

Also, as far as I can every single statement coming from the jdbc driver
is being issued as "execute <unnamed>: ......". Even BEGIN, COMMIT and
ROLLBACK statements are being issued in the same way and result in 2 or
3 duration statements. Is this normal?

If so, I am not convinced that we really want to use prepared statements
all the time, as it doesn't appear that statements are being repeated
(repetition being my understanding of why you would want to make use of
prepared statements). After looking at the documentation, I have tried
using the prepareThreshold parameter on the connection string in my
Tomcat server.xml (Tomcat v.6.0.29) to disable prepared statements, e.g.:

url="jdbc:postgresql://servername:5434/mp_demo?prepareThreshold=0"

.....but this doesn't seem to have any effect. Could anyone tell me what
is going on and how to effectively control this jdbc behaviour? I am
concerned the continual use of prepared statements for one-time-only
statements could be a performance drain on our system.

Alex Stanier.


This message has been scanned for malware by SurfControl plc. www.surfcontrol.com

Re: Change in Log Format and Prepared Statements

From

Dave Cramer

Date:

13 September 2012, 22:25:00

Yes, the new driver uses V3 protocol which uses prepared statements,
named and un-named. I think the 3 durations may be due to the prepare
and execute.

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Thu, Sep 13, 2012 at 8:47 AM, Alexander Stanier
<alexander.stanier@egsgroup.com> wrote:
> Hi,
>
> Over the past year we have performed upgrades to both our PostgreSQL server
> our jdbc driver over the last few months:
>
> From PostgreSQL 8.2.10 with pg74.216.jdbc3 driver
> To PostgreSQL 8.2.10 with pg8.2-511.jdbc4 driver
> To PostgreSQL 9.1.4 with pg9.1-902.jdbc4 driver
>
> (O/S is Centos 5)
>
> When we moved from the pg74.216.jdbc3 driver to the pg8.2-511.jdbc4 we saw a
> distinct change in the logging output by the database.
>
> Previously with the pg74.216.jdbc3 driver we saw the statement followed by a
> duration:
>
> 2012-09-13 13:08:48 BST [22116] [mp_demo] [pg] LOG:  statement: SELECT
> users.USERID, users.LOGINID, users.ORGID, users.FIRSTNAME, users.LASTNAME,
> users.TITLE, users.TELEPHONE, users.CELLPHONE, users.PAGER, users.FAX,
> users.EMAIL, users.LASTLOGIN, users.ADDRESSID, users.USERSTATUSID,
> users.SHARED, users.MODDATE, users.ISPROXY FROM users WHERE
> users.LOGINID='alex-admin';
> 2012-09-13 13:08:48 BST [22116] [mp_demo] [pg] LOG:  duration: 0.805 ms
>
> Now with the pg8.2-511.jdbc4 and pg9.1-902.jdbc4 drivers we see the
> statement issued as an execute and three durations:
>
> 2012-09-13 13:07:33 BST [21664] [mp_demo] [pg] LOG:  execute <unnamed>:
> SELECT users.USERID, users.LOGINID, users.ORGID, users.FIRSTNAME,
> users.LASTNAME, users.TITLE, users.TELEPHONE, users.CELLPHONE, users.PAGER,
> users.FAX, users.EMAIL, users.LASTLOGIN, users.ADDRESSID,
> users.USERSTATUSID, users.SHARED, users.MODDATE, users.ISPROXY FROM users
> WHERE users.LOGINID='alex-admin'
> 2012-09-13 13:07:33 BST [21664] [mp_demo] [pg] LOG:  duration: 0.208 ms
> 2012-09-13 13:07:33 BST [21664] [mp_demo] [pg] LOG:  duration: 1.136 ms
> 2012-09-13 13:07:33 BST [21664] [mp_demo] [pg] LOG:  duration: 0.156 ms
>
> Indeed we still see the former if I run the same statement direct from psql.
> So it looks to me as though the later drivers may be making use of prepared
> statements (hence the "execute"?) but why do we now see 3 durations?
>
> Also, as far as I can every single statement coming from the jdbc driver is
> being issued as "execute <unnamed>: ......". Even BEGIN, COMMIT and ROLLBACK
> statements are being issued in the same way and result in 2 or 3 duration
> statements. Is this normal?
>
> If so, I am not convinced that we really want to use prepared statements all
> the time, as it doesn't appear that statements are being repeated
> (repetition being my understanding of why you would want to make use of
> prepared statements). After looking at the documentation, I have tried using
> the prepareThreshold parameter on the connection string in my Tomcat
> server.xml (Tomcat v.6.0.29) to disable prepared statements, e.g.:
>
> url="jdbc:postgresql://servername:5434/mp_demo?prepareThreshold=0"
>
> .....but this doesn't seem to have any effect. Could anyone tell me what is
> going on and how to effectively control this jdbc behaviour? I am concerned
> the continual use of prepared statements for one-time-only statements could
> be a performance drain on our system.
>
> Alex Stanier.
>
>
> This message has been scanned for malware by SurfControl plc.
> www.surfcontrol.com
>
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Change in Log Format and Prepared Statements

From

Maciek Sakrejda

Date:

13 September 2012, 23:12:57

To clarify, when prepareThreshold is set to zero, the driver still
uses prepared statements, but it uses only the unnamed variant, which
should effectively have no overhead compared to inlining parameters.
The separate log messages are due to this change.

Safely passing in parameters is an important reason to use prepared
statements (perhaps more so than performance), so as to protect
against SQL injection. For what it's worth, based on the logs, it
looks like you're not doing that; you should seriously consider making
that change, especially if data like user names is coming externally.

Re: Change in Log Format and Prepared Statements

From

Craig Ringer

Date:

17 September 2012, 08:15:55

On 09/14/2012 03:24 AM, Dave Cramer wrote:
> Yes, the new driver uses V3 protocol which uses prepared statements,
> named and un-named. I think the 3 durations may be due to the prepare
> and execute.

I noticed this as well, and wasn't entirely thrilled with the effects on
logging.

I wonder if Pg can be convinced to suppress the non-execution times, or
merge them to report a single statement execution time. I'll poke -hackers.

--
Craig Ringer

Re: Change in Log Format and Prepared Statements

From

Alexander Stanier

Date:

17 September 2012, 09:46:09

<font size="-1"><font face="Helvetica, Arial, sans-serif">Thanks Dave and Craig for you input.<br /><br /> Does anyone
knowwhat the three durations specifically represent? Presumably two of them are the time to prepare and execute the
statement?What's the third? Is the overall execution time the sum of the three (or two in some cases)?<br /><br /> It
wouldbe useful to have one summary time for the overall execution of each statement as before. We wrote a little Java
programwhich analyses log file based on a statement followed by single duration (and outputs the most frequent, the
slowestand most costly queries). We could rewrite the program to cope with the new format if needs be but if we can get
asingle duration it would save us some work. I don't mind using prepared statements if there really is no overhead and
Ican fully understand what each part of the new log format is.<br /><br /> Alex Stanier.<br /><br /></font></font><div
class="moz-cite-prefix">On17/09/2012 06:15, Craig Ringer wrote:<br /></div><blockquote
cite="mid:5056B201.5020809@ringerc.id.au"type="cite">On 09/14/2012 03:24 AM, Dave Cramer wrote: <br /><blockquote
type="cite">Yes,the new driver uses V3 protocol which uses prepared statements, <br /> named and un-named. I think the
3durations may be due to the prepare <br /> and execute. <br /></blockquote><br /> I noticed this as well, and wasn't
entirelythrilled with the effects on logging. <br /><br /> I wonder if Pg can be convinced to suppress the
non-executiontimes, or merge them to report a single statement execution time. I'll poke -hackers. <br /><br /> -- <br
/>Craig Ringer <br /><br /><br /><br /> This email has been scanned for viruses by Blackspider's Mail Control System.
Formore information please visit <a class="moz-txt-link-freetext"
href="http://www.blackspider.com">http://www.blackspider.com</a> </blockquote><br /><br /><br /><p align="center"><font
style="BACKGROUND-COLOR:#ffffff">This message has been scanned for malware by SurfControl plc. </font><a
href="http://www.surfcontrol.com/"><fontcolor="#000000" style="BACKGROUND-COLOR:
#ffffff">www.surfcontrol.com</font></a>

Re: Change in Log Format and Prepared Statements

From

Alexander Stanier

Date:

17 September 2012, 09:58:31

<font size="-1"><font face="Helvetica, Arial, sans-serif">Thanks Maciek for your advice which I will pass back to our
Javadevelopers. We are using a combination of Torque and Hibernate to access the database via jdbc. I think Hibernate
passesparameters as you suggest but Torque doesn't. We are slowly moving away from Torque to Hibernate.<br /><br /> Do
youknow if there is any documentation on the "unnamed variant" of prepared statements? I've read the pages on PREPARE
andEXECUTE but they don't go into much depth about what happens in the background.<br /><br /> Alex Stanier.<br /><br
/></font></font><divclass="moz-cite-prefix">On 13/09/2012 21:12, Maciek Sakrejda wrote:<br /></div><blockquote
cite="mid:CAOtHd0Df2_L=k+OxpWgNXso2w298V=D1GY7RDfwczTta=WjaJg@mail.gmail.com"type="cite"><pre wrap="">To clarify, when
prepareThresholdis set to zero, the driver still 
uses prepared statements, but it uses only the unnamed variant, which
should effectively have no overhead compared to inlining parameters.
The separate log messages are due to this change.

Safely passing in parameters is an important reason to use prepared
statements (perhaps more so than performance), so as to protect
against SQL injection. For what it's worth, based on the logs, it
looks like you're not doing that; you should seriously consider making
that change, especially if data like user names is coming externally.


This email has been scanned for viruses by Blackspider's Mail Control System.
For more information please visit <a class="moz-txt-link-freetext"
href="http://www.blackspider.com">http://www.blackspider.com</a>  
</pre></blockquote><br /><br /><br /><p align="center"><font style="BACKGROUND-COLOR: #ffffff">This message has been
scannedfor malware by SurfControl plc. </font><a href="http://www.surfcontrol.com/"><font color="#000000"
style="BACKGROUND-COLOR:#ffffff">www.surfcontrol.com</font></a>

Re: Change in Log Format and Prepared Statements

From

Craig Ringer

Date:

17 September 2012, 14:54:35

On 09/17/2012 02:58 PM, Alexander Stanier wrote:
> Thanks Maciek for your advice which I will pass back to our Java
> developers. We are using a combination of Torque and Hibernate to access
> the database via jdbc. I think Hibernate passes parameters as you
> suggest but Torque doesn't. We are slowly moving away from Torque to
> Hibernate.
>
> Do you know if there is any documentation on the "unnamed variant" of
> prepared statements? I've read the pages on PREPARE and EXECUTE but they
> don't go into much depth about what happens in the background.

The v3 protocol uses something like PREPARE and EXECUTE, but not those
statements as such.

You want the documentation on the fe/be protocol:

   http://www.postgresql.org/docs/current/static/protocol.html


http://www.postgresql.org/docs/current/static/protocol-overview.html#PROTOCOL-QUERY-CONCEPTS


http://www.postgresql.org/docs/current/static/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

For the exact communications used by PgJDBC you're best off checking the
sources. I haven't read the wire protocol / backend part yet.

--
Craig Ringer

Re: Change in Log Format and Prepared Statements

From

Craig Ringer

Date:

19 September 2012, 08:26:05

On 09/17/2012 02:45 PM, Alexander Stanier wrote:
> Thanks Dave and Craig for you input.
>
> Does anyone know what the three durations specifically represent?
> Presumably two of them are the time to prepare and execute the
> statement? What's the third? Is the overall execution time the sum of
> the three (or two in some cases)?

I have no idea, honestly. Maybe ask on pgsql-general ?

> It would be useful to have one summary time for the overall execution of
> each statement as before.

I strongly agree. Three separate log lines for timing, with no
indication about which is which, sucks.

I wonder what'd happen if a more informative log_line_prefix were added...

--
Craig Ringer

Re: Change in Log Format and Prepared Statements

From

Tom Lane

Date:

19 September 2012, 09:32:23

Craig Ringer <ringerc@ringerc.id.au> writes:
> On 09/17/2012 02:45 PM, Alexander Stanier wrote:
>> Does anyone know what the three durations specifically represent?

> I have no idea, honestly. Maybe ask on pgsql-general ?

Parse, bind, execute phases of extended-query protocol?

http://www.postgresql.org/docs/9.2/static/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

> I strongly agree. Three separate log lines for timing, with no
> indication about which is which, sucks.

AFAICT from looking at the backend code, they are labeled ...

            regards, tom lane

Re: Change in Log Format and Prepared Statements

From

Craig Ringer

Date:

20 September 2012, 16:16:19

On 09/19/2012 02:32 PM, Tom Lane wrote:
> Craig Ringer <ringerc@ringerc.id.au> writes:
>> On 09/17/2012 02:45 PM, Alexander Stanier wrote:
>>> Does anyone know what the three durations specifically represent?
>
>> I have no idea, honestly. Maybe ask on pgsql-general ?
>
> Parse, bind, execute phases of extended-query protocol?
>
> http://www.postgresql.org/docs/9.2/static/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY
>
>> I strongly agree. Three separate log lines for timing, with no
>> indication about which is which, sucks.
>
> AFAICT from looking at the backend code, they are labeled ...

Yes, they have command tags, and if a suitable log_line_prefix like:

     log_line_prefix = 'cmd=%i pid=%p '

is set they are visible:

cmd=INSERT pid=4067 LOG:  execute <unnamed>: INSERT INTO hold VALUES (1)
cmd=INSERT pid=4067 LOG:  duration: 1.454 ms
cmd=PARSE pid=4067 LOG:  duration: 0.066 ms
cmd=PARSE pid=4067 LOG:  duration: 0.028 ms
cmd=BIND pid=4067 LOG:  duration: 0.007 ms

but that just makes things more confusing. What's with the two PARSEs?
and why is the INSERT logged *before* the PARSE and BIND?

Does all this verbosity really need to be displayed by `log_duration`
even when `log_min_messages` is at the default? It feels like
implementation detail / debug trace stuff, not something that should be
visible to day to day users and admins who just want to log statement
durations. They don't care if the client is using the extended query
protocol or not.

I think this is a bit of a usability wart.

How, at the default log level, does it convey any useful information
whatsoever?

LOG:  execute <unnamed>: INSERT INTO testrs VALUES (1)
LOG:  duration: 0.109 ms
LOG:  duration: 0.026 ms
LOG:  duration: 0.031 ms
LOG:  duration: 0.005 ms
LOG:  execute <unnamed>: INSERT INTO testrs VALUES (2)
LOG:  duration: 0.029 ms
LOG:  duration: 0.023 ms
LOG:  duration: 0.021 ms
LOG:  duration: 0.004 ms

There's no reliable way to associate the durations with the statements
in case of interleaving, and nothing that tells you what the durations
are. Here's what I'd want to see as the *default:

LOG: [4067]  execute <unnamed>: INSERT INTO testrs VALUES (1)
LOG: [4067]  duration: 0.161 ms

If durations are logged on a separate line, a reliable way to associate
them - be it backend pid, vtxid, or whatever - needs to be logged by
default, and IMO the durations need to be reported as a single overall
time with the command-tag of the associated command (if shown in user's
log_line_prefix).

Thoughts/opinions?

--
Craig Ringer

Re: Change in Log Format and Prepared Statements

From

Tom Lane

Date:

20 September 2012, 18:26:26

Craig Ringer <ringerc@ringerc.id.au> writes:
> On 09/19/2012 02:32 PM, Tom Lane wrote:
>> AFAICT from looking at the backend code, they are labeled ...

> Yes, they have command tags, and if a suitable log_line_prefix like:

>      log_line_prefix = 'cmd=%i pid=%p '

> is set they are visible:

> cmd=INSERT pid=4067 LOG:  execute <unnamed>: INSERT INTO hold VALUES (1)
> cmd=INSERT pid=4067 LOG:  duration: 1.454 ms
> cmd=PARSE pid=4067 LOG:  duration: 0.066 ms
> cmd=PARSE pid=4067 LOG:  duration: 0.028 ms
> cmd=BIND pid=4067 LOG:  duration: 0.007 ms

Oh.  You are talking about log_duration output, which is *meant* to be
terse.  It's designed that way for high-query-volume installations that
can't afford more log text than that.  If you want something that's more
easily readable, turn that off and instead set
log_min_duration_statement to zero, as indeed is recommended in the
documentation.

> but that just makes things more confusing. What's with the two PARSEs?
> and why is the INSERT logged *before* the PARSE and BIND?

It's hard to be sure without seeing the more verbose logging, but
I think you are looking at interleaved parsing/execution of multiple
prepared statements.  Because that's allowed, simplifying the logging as
you suggest doesn't seem very practical.

My recollection is that the current behavior of the duration-logging
options was hashed out in considerable detail some years ago, and is
meant to serve multiple sets of requirements.  It's not going to be
that easy to persuade people to change it.

            regards, tom lane