Re: control max length of parameter values logged - Mailing list pgsql-hackers

From Tom Lane
Subject Re: control max length of parameter values logged
Date
Msg-id 27534.1583877782@sss.pgh.pa.us
Whole thread Raw
In response to control max length of parameter values logged  (Alexey Bashtanov <bashtanov@imap.cc>)
Responses Re: control max length of parameter values logged  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
Alexey Bashtanov <bashtanov@imap.cc> writes:
> I personally don't think that we necessarily need to cut the values, we 
> can rely on the users
> being careful when using this feature -- in the same way we trusted them 
> use similarly dangerous
> log_min_duration_statement and especially log_statement for ages. At 
> least it's better than having
> no option to disable it. Alvaro's opinion was different [3]. What do you 
> think
> of adding a parameter to limit max logged parameter length? See patch 
> attached.

This patch is failing to build docs (per the cfbot) and it also fails
check-world because you changed behavior tested by ba79cb5dc's test case.
Attached is an update that hopefully will make the cfbot happy.

I agree that something ought to be done here, but I'm not sure that
this is exactly what.  It appears to me that there are three related
but distinct behaviors under discussion:

1. Truncation of bind parameters returned to clients in error message
   detail fields.
2. Truncation of bind parameters written to the server log in logged
   error messages.
3. Truncation of bind parameters written to the server log in non-error
   statement logging actions (log_statement and variants).

Historically we haven't truncated any of these, I believe.  As of
ba79cb5dc we forcibly truncate #1 and #2 at 64 bytes, but not #3.
Your patch proposes to provide a SUSET GUC that controls the
truncation length for all 3.

I think that the status quo as of ba79cb5dc is entirely unacceptable,
because there is no recourse if you want to find out why a statement
is failing and the reason is buried in some long parameter string.
However, this patch doesn't really fix it, because it still seems
pretty annoying that you need to be superuser to adjust what gets
sent back to the client.  Maybe that isn't a problem in practice
(since the client presumably has the original parameter value laying
about), but it seems conceptually wrong.

On the other hand, that line of reasoning leads to wanting two
separate GUCs (one for #1 and one for the other two), which seems
like overkill, plus it's going to be hard/expensive to have the
outputs for #1 and #2 not be the same.

I do agree that it seems weird and random (not to say 100% backward)
that error cases provide only truncated values but routine query
logging insists on emitting full untruncated values.  I should think
that the most common use-cases would want it the other way round.

So I feel like we'd better resolve these definitional questions about
what behavior we actually want.  I agree that ba79cb5dc is not
terribly well thought out as it stands.

            regards, tom lane

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 371d783..b002df5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6632,6 +6632,24 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
       </listitem>
      </varlistentry>

+     <varlistentry id="guc-log-parameter-max-length" xreflabel="log_parameter_max_length">
+      <term><varname>log_parameter_max_length</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>log_parameter_max_length</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        If greater than zero, bind parameter values reported in error
+        messages and statement-logging messages are trimmed to no more
+        than this many bytes.
+        If this value is specified without units, it is taken as bytes.
+        Zero (the default) disables trimming.
+        Only superusers can change this setting.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-log-statement" xreflabel="log_statement">
       <term><varname>log_statement</varname> (<type>enum</type>)
       <indexterm>
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 9dba3b0..70ce5bb 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1843,13 +1843,24 @@ exec_bind_message(StringInfo input_message)
                         if (knownTextValues == NULL)
                             knownTextValues =
                                 palloc0(numParams * sizeof(char *));
-                        /*
-                         * Note: must copy at least two more full characters
-                         * than BuildParamLogString wants to see; otherwise it
-                         * might fail to include the ellipsis.
-                         */
-                        knownTextValues[paramno] =
-                            pnstrdup(pstring, 64 + 2 * MAX_MULTIBYTE_CHAR_LEN);
+
+                        if (log_parameter_max_length > 0)
+                        {
+                            /*
+                             * We can trim the saved string, knowing that we
+                             * won't print all of it.  But we must copy at
+                             * least two more full characters than
+                             * BuildParamLogString wants to use; otherwise it
+                             * might fail to include the trailing ellipsis.
+                             */
+                            knownTextValues[paramno] =
+                                pnstrdup(pstring,
+                                         log_parameter_max_length
+                                         + 2 * MAX_MULTIBYTE_CHAR_LEN);
+                        }
+                        else
+                            knownTextValues[paramno] = pstrdup(pstring);
+
                         MemoryContextSwitchTo(oldcxt);
                     }
                     if (pstring != pbuf.data)
@@ -1912,7 +1923,9 @@ exec_bind_message(StringInfo input_message)
          */
         if (log_parameters_on_error)
             params->paramValuesStr =
-                BuildParamLogString(params, knownTextValues, 64);
+                BuildParamLogString(params,
+                                    knownTextValues,
+                                    log_parameter_max_length);
     }
     else
         params = NULL;
@@ -2401,7 +2414,7 @@ errdetail_params(ParamListInfo params)
     {
         char   *str;

-        str = BuildParamLogString(params, NULL, 0);
+        str = BuildParamLogString(params, NULL, log_parameter_max_length);
         if (str && str[0] != '\0')
             errdetail("parameters: %s", str);
     }
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 4c6d648..5749aed 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -544,6 +544,7 @@ int            log_min_messages = WARNING;
 int            client_min_messages = NOTICE;
 int            log_min_duration_sample = -1;
 int            log_min_duration_statement = -1;
+int            log_parameter_max_length = 0;
 int            log_temp_files = -1;
 double        log_statement_sample_rate = 1.0;
 double        log_xact_sample_rate = 0;
@@ -2836,6 +2837,17 @@ static struct config_int ConfigureNamesInt[] =
     },

     {
+        {"log_parameter_max_length", PGC_SUSET, LOGGING_WHAT,
+            gettext_noop("Limit logged parameter values to first N bytes."),
+            gettext_noop("Zero to print values in full."),
+            GUC_UNIT_BYTE
+        },
+        &log_parameter_max_length,
+        0, 0, INT_MAX / 2,
+        NULL, NULL, NULL
+    },
+
+    {
         {"bgwriter_delay", PGC_SIGHUP, RESOURCES_BGWRITER,
             gettext_noop("Background writer sleep time between rounds."),
             NULL,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e58e478..e16d63d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -546,6 +546,8 @@
 #log_lock_waits = off            # log lock waits >= deadlock_timeout
 #log_statement = 'none'            # none, ddl, mod, all
 #log_parameters_on_error = off    # on error log statements with bind parameters
+#log_parameter_max_length = 0           # limit logged parameter values to
+                    # N bytes; 0 means print in full
 #log_replication_commands = off
 #log_temp_files = -1            # log temporary files equal or larger
                     # than the specified size in kilobytes;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 25ea17f..d146e28 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -271,7 +271,9 @@ COMMIT;
     });

 # Verify server logging of parameters.
-$node->append_conf('postgresql.conf', "log_parameters_on_error = true");
+$node->append_conf('postgresql.conf',
+                   "log_parameters_on_error = true\n" .
+                   "log_parameter_max_length = 64");
 $node->reload;
 pgbench(
         '-n -t1 -c1 -M prepared',
@@ -306,8 +308,8 @@ select column1::jsonb from (values (:value), (:long)) as q;
 ]
     });
 my $log = TestLib::slurp_file($node->logfile);
-like($log, qr[DETAIL:  parameters: \$1 = '\{ invalid ', \$2 = '''Valame Dios!'' dijo Sancho; ''no le dije yo a vuestra
mercedque mirase bien lo que hacia\?'''], 
-     "parameter report does not truncate");
+like($log, qr[DETAIL:  parameters: \$1 = '\{ invalid ', \$2 = '''Valame Dios!'' dijo Sancho; ''no le dije yo a vuestra
mercedque \.\.\.'], 
+     "parameter report truncates");
 $log = undef;

 $node->append_conf('postgresql.conf', "log_min_duration_statement = -1");
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index ce93ace..b4186c6 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -235,6 +235,7 @@ typedef enum
 /* GUC vars that are actually declared in guc.c, rather than elsewhere */
 extern bool log_duration;
 extern bool log_parameters_on_error;
+extern int    log_parameter_max_length;
 extern bool Debug_print_plan;
 extern bool Debug_print_parse;
 extern bool Debug_print_rewritten;

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: shared-memory based stats collector
Next
From: David Rowley
Date:
Subject: Re: Index Skip Scan