Re: [HACKERS] statement logging / extended query protocol issues - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] statement logging / extended query protocol issues |
Date | |
Msg-id | 200509230154.j8N1sVi17799@candle.pha.pa.us Whole thread Raw |
Responses |
Re: [HACKERS] statement logging / extended query protocol issues
(Simon Riggs <simon@2ndquadrant.com>)
|
List | pgsql-patches |
Here are comments on this posting and the patch (it is long): Simon Riggs wrote: > On Thu, 2005-09-08 at 13:14 +1200, Oliver Jowett wrote: > > Simon Riggs wrote: > > > Oliver, would it be possible to show a simplified call sequence and what > > > you would like to see logged for each call? > > These are good: Maybe it should even be in the docs for the driver? > It would be good if it could be written as a test within the driver, so > we can expand it and test the logging. > > > The JDBC driver generates one of these sequences: > > > > (1) > > Parse (unnamed statement) "SELECT 1" > > Bind (unnamed statement -> unnamed portal) > > Execute (unnamed portal, no row limit) > > > > (2) > > Parse (named statement S_1) "SELECT 1" > > repeatedly: > > Bind (named statement S_1 -> unnamed portal) > > Execute (unnamed portal, no row limit) > > > > (3) > > Parse (named statement S_2) "SELECT 1" > > repeatedly: > > Bind (named statement S_2 -> named portal C_2) > > repeatedly: > > Execute (named portal C_2, row limit 42) > > Are we sure there is just 3 cases? > > > Ideal output is: > > > > (1) LOG: statement: SELECT 1 > > > > (2) LOG: statement: SELECT 1 > > LOG: statement: SELECT 1 > > LOG: statement: SELECT 1 > > > > (3) LOG: statement: SELECT 1 > > LOG: statement: SELECT 1 > > LOG: statement: SELECT 1 > > > > In case (3), that's one log line per repeat of the outer loop, > > regardless of how many Executes are sent in the inner loop. > > > Note that case (1) is the most common case for application queries via > > the JDBC driver, and case (2) is the most common for internally > > generated statements like BEGIN. > > Even if case (3) is not that common, I still want to know it is > occurring, to see what effect or overhead it has. Agreed. > > As you can see from the output I'd like, I don't think that synthesizing > > FETCH / EXECUTE queries that don't actually exist [is a] > > useful thing to do, at least at the Joe Average User > > level. > > Your original point at the top of this thread was valid: a get-next-rows > shouldn't look like a re-execute. We can call it something else if you > like, as long as we can tell the difference. > > We'll only see the output for case (3) when someone has programmed > things that way by using setFetchSize. > > > Also note that the JDBC driver doesn't exercise all of the extended > > protocol -- for example it's possible to re-Bind the same unnamed > > statement repeatedly, or have multiple Executes on an unnamed portal > > with a row limit, but the JDBC driver never does that. > > I agree there's not much gained from displaying the BIND statement as it > is. I argued previously against including the BIND parameters. Now I > would say we should either include them or leave out BIND altogether. Agreed. We should output BIND parameters at some point. We do log client-side PREPARE parameters now. > Here's a new suggestion and patch that brings together > - Oliver and Simon's wish to remove BIND from normal logging > - Oliver's suggestion to remove the PREPARE logging for unnamed > statements, which would otherwise double-up logging for case(1) > - Bruce and Simon's view to keep some form of FETCH logging > - Tom's view to rationalise the way ROWS is mentioned > > (lines beginning jdbc don't show in the log, but are just there to show > clearly the time sequence of activities and what gets subsequently > logged) > > (1) > > Parse (unnamed statement) "SELECT * from pg_proc" > > Bind (unnamed statement -> unnamed portal) > > Execute (unnamed portal, no row limit) > > (1) > jdbc parse > jdbc bind > jdbc execute > LOG: statement: SELECT * from pg_proc > > jdbc parse > jdbc bind > jdbc execute > LOG: statement: SELECT * from pg_proc > > jdbc parse > jdbc bind > jdbc execute > LOG: statement: SELECT * from pg_proc > > > Notice that the parse of the unnamed statement does *not* now generate a > log record. What is the reason for this? I am not so worried about what JDBC uses but what the protocol supports. From my reading of the documentatation is seems to support a single prepare and multiple executes of an unnamed statement. > (2) > > Parse (named statement S_1) "SELECT * from pg_proc" > > repeatedly: > > Bind (named statement S_1 -> unnamed portal) > > Execute (unnamed portal, no row limit) > > (2) > jdbc parse S_1 > LOG: statement: PREPARE S_1 AS SELECT * from pg_proc > (perhaps this should be logged at BIND time, just like the > optimization?) > > jdbc bind S_1 > jdbc execute > LOG: statement: EXECUTE <unnamed> [PREPARE: SELECT * from pg_proc] > > jdbc bind S_1 > jdbc execute > LOG: statement: EXECUTE <unnamed> [PREPARE: SELECT * from pg_proc] > > jdbc bind S_1 > jdbc execute > LOG: statement: EXECUTE <unnamed> [PREPARE: SELECT * from pg_proc] > > > ...I wonder if <unnamed> just confuses what is going on here? I've left > it in for now, but suggest that we take that out again? What do you want to replace it with? A missing name seems confusing. > (3) > > Parse (named statement S_2) "SELECT * from pg_proc" > > repeatedly: > > Bind (named statement S_2 -> named portal C_2) > > repeatedly: > > Execute (named portal C_2, row limit 42) > > (3) > jdbc prepare S_2 > LOG: statement: PREPARE S_2 AS SELECT * from pg_proc > > jdbc bind S_2 to C_2 > jdbc execute C_2 > LOG: statement: EXECUTE C_2 ROWS 42 [PREPARE: SELECT * from pg_proc] > jdbc next (after cache has run out on 42nd row) > v3 protocol sends E for Execute, execution halts at 49 rows for this set > of bind parameters > LOG: statement: FETCH C_2 ROWS 7 > > jdbc bind S_2 to C_2 > jdbc execute C_2 > LOG: statement: EXECUTE C_2 ROWS 42 [PREPARE: SELECT * from pg_proc] > jdbc next (after cache has run out on 42nd row) > v3 protocol sends E for Execute > LOG: statement: FETCH C_2 ROWS 42 > jdbc next (after cache has run out on 84th row) > v3 protocol sends E for Execute, execution halts at 95 rows for this set > of bind parameters > LOG: statement: FETCH C_2 ROWS 11 > > Note: log_min_duration_statement logs after execution so can give > accurate row counts of what was retrieved for first execute and > subsequent fetches. In that case we log using the word ROWS. > log_statement=all logs before execution and so only knows what the > maximum number of rows requested is, not what the actual number of rows > retrieved will be. In that case we log using the word MAXROWS. > ROWS and MAXROWS are *not* mentioned unless we specifically set max_rows > in the execute request using the v3 protocol. > > If we agree, I'd suggest this goes into the docs... > > I've not written a comprehensive test program that covers all of the > different settings of v2/v3, named/unnamed, allrows/restricted rows, > log_statement=all/log_min_duration_statement. I'm not sure that will fit > within the existing test framework. So this patch is still prototype. Here are my comments on the patch: > Index: src/backend/tcop/postgres.c > =================================================================== > RCS file: /projects/cvsroot/pgsql/src/backend/tcop/postgres.c,v > retrieving revision 1.458 > diff -c -c -r1.458 postgres.c > *** src/backend/tcop/postgres.c 2 Sep 2005 21:50:54 -0000 1.458 > --- src/backend/tcop/postgres.c 9 Sep 2005 10:06:46 -0000 > *************** > *** 1162,1173 **** > if (save_log_statement_stats) > ResetUsage(); > > - if (log_statement == LOGSTMT_ALL) > - ereport(LOG, > - (errmsg("statement: PREPARE %s AS %s", > - (*stmt_name != '\0') ? stmt_name : "<unnamed>", > - query_string))); > - > /* > * Start up a transaction command so we can run parse analysis etc. > * (Note that this will normally change current memory context.) > --- 1162,1167 ---- > *************** > *** 1217,1222 **** > --- 1211,1222 ---- > > QueryContext = CurrentMemoryContext; > > + > + /* Log parse only when using named statements, unless in debug1 mode */ > + if (log_statement == LOGSTMT_ALL) > + ereport((is_named ? DEBUG1 : LOG), > + (errmsg("statement: PREPARE %s AS %s", stmt_name, query_string))); > + This seems backwards. The comments says log only named statements, but the '?' test makes named statements DEBUG. Is this intended? Why have you dropped the "<unnamed>"? Whether it is DEBUG or not we should display meaningful output. > /* > * Do basic parsing of the query or queries (this should be safe even > * if we are in aborted transaction state!) > *************** > *** 1467,1473 **** > portal = CreatePortal(portal_name, false, false); > > if (log_statement == LOGSTMT_ALL) > ! ereport(LOG, > (errmsg("statement: <BIND> %s", portal_name))); > > /* > --- 1467,1473 ---- > portal = CreatePortal(portal_name, false, false); > > if (log_statement == LOGSTMT_ALL) > ! ereport(DEBUG1, > (errmsg("statement: <BIND> %s", portal_name))); Can we get the BIND parameters here? They seem significant for debugging. > /* > *************** > *** 1683,1688 **** > --- 1683,1689 ---- > bool save_log_duration = log_duration; > int save_log_min_duration_statement = log_min_duration_statement; > bool save_log_statement_stats = log_statement_stats; > + bool subsequent_fetch = false; Can we call this "execute_is_fetch". That seems clearer. > > /* Adjust destination to tell printtup.c what to do */ > dest = whereToSendOutput; > *************** > *** 1695,1700 **** > --- 1696,1710 ---- > (errcode(ERRCODE_UNDEFINED_CURSOR), > errmsg("portal \"%s\" does not exist", portal_name))); > > + /* > + * If we re-issue an Execute protocol request against an existing > + * portal, then we are only fetching more rows rather than > + * completely re-executing the query from the start. atStart is never > + * reset for a v3 portal, so we are safe to use this check. > + */ > + if (!portal->atStart) > + subsequent_fetch = true; > + > /* > * If the original query was a null string, just return > * EmptyQueryResponse. > *************** > *** 1706,1712 **** > return; > } > > ! if (portal->sourceText) > { > debug_query_string = portal->sourceText; > pgstat_report_activity(portal->sourceText); > --- 1716,1727 ---- > return; > } > > ! if (subsequent_fetch) > ! { > ! debug_query_string = "fetch message"; > ! pgstat_report_activity("<FETCH>"); > ! } > ! else if (portal->sourceText) Shouldn't we supply the name of the prepare here, or don't we have one? If not, we should call it "unnamed". > { > debug_query_string = portal->sourceText; > pgstat_report_activity(portal->sourceText); > *************** > *** 1731,1742 **** > if (save_log_statement_stats) > ResetUsage(); > > if (log_statement == LOGSTMT_ALL) > ! /* We have the portal, so output the source query. */ > ! ereport(LOG, > ! (errmsg("statement: EXECUTE %s [PREPARE: %s]", > ! (*portal_name != '\0') ? portal_name : "<unnamed>", > portal->sourceText ? portal->sourceText : ""))); > > BeginCommand(portal->commandTag, dest); > > --- 1746,1774 ---- > if (save_log_statement_stats) > ResetUsage(); > > + /* > + * log statement details - when using log_statement=all we log the > + * phrase MAXROWS because this occurs prior to execution > + */ > if (log_statement == LOGSTMT_ALL) > ! { > ! if (subsequent_fetch) > ! ereport(LOG, > ! (errmsg("statement: FETCH %s MAXROWS %ld", portal_name, > ! max_rows))); > ! else > ! /* We have the portal, so output the source query. */ > ! if (max_rows <= 0) > ! ereport(LOG, > ! (errmsg("statement: EXECUTE %s [PREPARE: %s]", > ! portal_name, > portal->sourceText ? portal->sourceText : ""))); > + else > + ereport(LOG, > + (errmsg("statement: EXECUTE %s MAXROWS %ld [PREPARE: %s]", > + portal_name, max_rows, > + portal->sourceText ? portal->sourceText : ""))); > + } MAXROWS seems useful. Good. > BeginCommand(portal->commandTag, dest); > > *************** > *** 1865,1877 **** > if (save_log_min_duration_statement == 0 || > (save_log_min_duration_statement > 0 && > usecs >= save_log_min_duration_statement * 1000)) > ! ereport(LOG, > ! (errmsg("duration: %ld.%03ld ms statement: EXECUTE %s [PREPARE: %s]", > (long) ((stop_t.tv_sec - start_t.tv_sec) * 1000 + > (stop_t.tv_usec - start_t.tv_usec) / 1000), > (long) (stop_t.tv_usec - start_t.tv_usec) % 1000, > ! (*portal_name != '\0') ? portal_name : "<unnamed>", > ! portal->sourceText ? portal->sourceText : ""))); > } > > if (save_log_statement_stats) > --- 1897,1929 ---- > if (save_log_min_duration_statement == 0 || > (save_log_min_duration_statement > 0 && > usecs >= save_log_min_duration_statement * 1000)) > ! { > ! if (subsequent_fetch) Should the duration be snprintf into a local buffer and then used in each ereport, rather than duplicating the code three times? > ! ereport(LOG, > ! (errmsg("duration: %ld.%03ld ms statement: FETCH %s ROWS %ld ", > (long) ((stop_t.tv_sec - start_t.tv_sec) * 1000 + > (stop_t.tv_usec - start_t.tv_usec) / 1000), > (long) (stop_t.tv_usec - start_t.tv_usec) % 1000, > ! portal_name, > ! max_rows))); > ! else > ! if (completed) > ! ereport(LOG, > ! (errmsg("duration: %ld.%03ld ms statement: EXECUTE %s [PREPARE: %s]", > ! (long) ((stop_t.tv_sec - start_t.tv_sec) * 1000 + > ! (stop_t.tv_usec - start_t.tv_usec) / 1000), > ! (long) (stop_t.tv_usec - start_t.tv_usec) % 1000, > ! portal_name, > ! portal->sourceText ? portal->sourceText : ""))); > ! else > ! ereport(LOG, > ! (errmsg("duration: %ld.%03ld ms statement: EXECUTE %s ROWS %ld [PREPARE: %s]", > ! (long) ((stop_t.tv_sec - start_t.tv_sec) * 1000 + > ! (stop_t.tv_usec - start_t.tv_usec) / 1000), > ! (long) (stop_t.tv_usec - start_t.tv_usec) % 1000, > ! portal_name, max_rows, > ! portal->sourceText ? portal->sourceText : ""))); > ! } > } > > if (save_log_statement_stats) -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgsql-patches by date: