Thread: control max length of parameter values logged

control max length of parameter values logged

From
Alexey Bashtanov
Date:
Hello

Patch ba79cb5 (for full discussion see [1]) introduces a feature to log 
bind parameter values on error,
which greatly helps to reproduce errors artificially having only server 
log -- thanks everyone who
reviewed and improved it!

However, it cuts the values, as some of the reviewers were worried log 
could fill up too quickly.
This applies both to the new case of logging parameter values and to the 
existing ones due to
log_min_duration_statement or log_statement.

This is a backwards-incompatible change, and also ruins the idea of 
reproducible errors -- sorry Tom
I failed to second this idea [2] in time, before the change was pushed.

I personally don't think that we necessarily need to cut the values, we 
can rely on the users
being careful when using this feature -- in the same way we trusted them 
use similarly dangerous
log_min_duration_statement and especially log_statement for ages. At 
least it's better than having
no option to disable it. Alvaro's opinion was different [3]. What do you 
think
of adding a parameter to limit max logged parameter length? See patch 
attached.

Best, Alex

[1] https://postgr.es/m/0146a67b-a22a-0519-9082-bc29756b93a2@imap.cc
[2] https://postgr.es/m/11425.1575927321%40sss.pgh.pa.us
[3] https://postgr.es/m/20191209200531.GA19848@alvherre.pgsql

Attachment

Re: control max length of parameter values logged

From
Tom Lane
Date:
Alexey Bashtanov <bashtanov@imap.cc> writes:
> I personally don't think that we necessarily need to cut the values, we 
> can rely on the users
> being careful when using this feature -- in the same way we trusted them 
> use similarly dangerous
> log_min_duration_statement and especially log_statement for ages. At 
> least it's better than having
> no option to disable it. Alvaro's opinion was different [3]. What do you 
> think
> of adding a parameter to limit max logged parameter length? See patch 
> attached.

This patch is failing to build docs (per the cfbot) and it also fails
check-world because you changed behavior tested by ba79cb5dc's test case.
Attached is an update that hopefully will make the cfbot happy.

I agree that something ought to be done here, but I'm not sure that
this is exactly what.  It appears to me that there are three related
but distinct behaviors under discussion:

1. Truncation of bind parameters returned to clients in error message
   detail fields.
2. Truncation of bind parameters written to the server log in logged
   error messages.
3. Truncation of bind parameters written to the server log in non-error
   statement logging actions (log_statement and variants).

Historically we haven't truncated any of these, I believe.  As of
ba79cb5dc we forcibly truncate #1 and #2 at 64 bytes, but not #3.
Your patch proposes to provide a SUSET GUC that controls the
truncation length for all 3.

I think that the status quo as of ba79cb5dc is entirely unacceptable,
because there is no recourse if you want to find out why a statement
is failing and the reason is buried in some long parameter string.
However, this patch doesn't really fix it, because it still seems
pretty annoying that you need to be superuser to adjust what gets
sent back to the client.  Maybe that isn't a problem in practice
(since the client presumably has the original parameter value laying
about), but it seems conceptually wrong.

On the other hand, that line of reasoning leads to wanting two
separate GUCs (one for #1 and one for the other two), which seems
like overkill, plus it's going to be hard/expensive to have the
outputs for #1 and #2 not be the same.

I do agree that it seems weird and random (not to say 100% backward)
that error cases provide only truncated values but routine query
logging insists on emitting full untruncated values.  I should think
that the most common use-cases would want it the other way round.

So I feel like we'd better resolve these definitional questions about
what behavior we actually want.  I agree that ba79cb5dc is not
terribly well thought out as it stands.

            regards, tom lane

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 371d783..b002df5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6632,6 +6632,24 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
       </listitem>
      </varlistentry>

+     <varlistentry id="guc-log-parameter-max-length" xreflabel="log_parameter_max_length">
+      <term><varname>log_parameter_max_length</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>log_parameter_max_length</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        If greater than zero, bind parameter values reported in error
+        messages and statement-logging messages are trimmed to no more
+        than this many bytes.
+        If this value is specified without units, it is taken as bytes.
+        Zero (the default) disables trimming.
+        Only superusers can change this setting.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-log-statement" xreflabel="log_statement">
       <term><varname>log_statement</varname> (<type>enum</type>)
       <indexterm>
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 9dba3b0..70ce5bb 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1843,13 +1843,24 @@ exec_bind_message(StringInfo input_message)
                         if (knownTextValues == NULL)
                             knownTextValues =
                                 palloc0(numParams * sizeof(char *));
-                        /*
-                         * Note: must copy at least two more full characters
-                         * than BuildParamLogString wants to see; otherwise it
-                         * might fail to include the ellipsis.
-                         */
-                        knownTextValues[paramno] =
-                            pnstrdup(pstring, 64 + 2 * MAX_MULTIBYTE_CHAR_LEN);
+
+                        if (log_parameter_max_length > 0)
+                        {
+                            /*
+                             * We can trim the saved string, knowing that we
+                             * won't print all of it.  But we must copy at
+                             * least two more full characters than
+                             * BuildParamLogString wants to use; otherwise it
+                             * might fail to include the trailing ellipsis.
+                             */
+                            knownTextValues[paramno] =
+                                pnstrdup(pstring,
+                                         log_parameter_max_length
+                                         + 2 * MAX_MULTIBYTE_CHAR_LEN);
+                        }
+                        else
+                            knownTextValues[paramno] = pstrdup(pstring);
+
                         MemoryContextSwitchTo(oldcxt);
                     }
                     if (pstring != pbuf.data)
@@ -1912,7 +1923,9 @@ exec_bind_message(StringInfo input_message)
          */
         if (log_parameters_on_error)
             params->paramValuesStr =
-                BuildParamLogString(params, knownTextValues, 64);
+                BuildParamLogString(params,
+                                    knownTextValues,
+                                    log_parameter_max_length);
     }
     else
         params = NULL;
@@ -2401,7 +2414,7 @@ errdetail_params(ParamListInfo params)
     {
         char   *str;

-        str = BuildParamLogString(params, NULL, 0);
+        str = BuildParamLogString(params, NULL, log_parameter_max_length);
         if (str && str[0] != '\0')
             errdetail("parameters: %s", str);
     }
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 4c6d648..5749aed 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -544,6 +544,7 @@ int            log_min_messages = WARNING;
 int            client_min_messages = NOTICE;
 int            log_min_duration_sample = -1;
 int            log_min_duration_statement = -1;
+int            log_parameter_max_length = 0;
 int            log_temp_files = -1;
 double        log_statement_sample_rate = 1.0;
 double        log_xact_sample_rate = 0;
@@ -2836,6 +2837,17 @@ static struct config_int ConfigureNamesInt[] =
     },

     {
+        {"log_parameter_max_length", PGC_SUSET, LOGGING_WHAT,
+            gettext_noop("Limit logged parameter values to first N bytes."),
+            gettext_noop("Zero to print values in full."),
+            GUC_UNIT_BYTE
+        },
+        &log_parameter_max_length,
+        0, 0, INT_MAX / 2,
+        NULL, NULL, NULL
+    },
+
+    {
         {"bgwriter_delay", PGC_SIGHUP, RESOURCES_BGWRITER,
             gettext_noop("Background writer sleep time between rounds."),
             NULL,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e58e478..e16d63d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -546,6 +546,8 @@
 #log_lock_waits = off            # log lock waits >= deadlock_timeout
 #log_statement = 'none'            # none, ddl, mod, all
 #log_parameters_on_error = off    # on error log statements with bind parameters
+#log_parameter_max_length = 0           # limit logged parameter values to
+                    # N bytes; 0 means print in full
 #log_replication_commands = off
 #log_temp_files = -1            # log temporary files equal or larger
                     # than the specified size in kilobytes;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 25ea17f..d146e28 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -271,7 +271,9 @@ COMMIT;
     });

 # Verify server logging of parameters.
-$node->append_conf('postgresql.conf', "log_parameters_on_error = true");
+$node->append_conf('postgresql.conf',
+                   "log_parameters_on_error = true\n" .
+                   "log_parameter_max_length = 64");
 $node->reload;
 pgbench(
         '-n -t1 -c1 -M prepared',
@@ -306,8 +308,8 @@ select column1::jsonb from (values (:value), (:long)) as q;
 ]
     });
 my $log = TestLib::slurp_file($node->logfile);
-like($log, qr[DETAIL:  parameters: \$1 = '\{ invalid ', \$2 = '''Valame Dios!'' dijo Sancho; ''no le dije yo a vuestra
mercedque mirase bien lo que hacia\?'''], 
-     "parameter report does not truncate");
+like($log, qr[DETAIL:  parameters: \$1 = '\{ invalid ', \$2 = '''Valame Dios!'' dijo Sancho; ''no le dije yo a vuestra
mercedque \.\.\.'], 
+     "parameter report truncates");
 $log = undef;

 $node->append_conf('postgresql.conf', "log_min_duration_statement = -1");
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index ce93ace..b4186c6 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -235,6 +235,7 @@ typedef enum
 /* GUC vars that are actually declared in guc.c, rather than elsewhere */
 extern bool log_duration;
 extern bool log_parameters_on_error;
+extern int    log_parameter_max_length;
 extern bool Debug_print_plan;
 extern bool Debug_print_parse;
 extern bool Debug_print_rewritten;

Re: control max length of parameter values logged

From
Alvaro Herrera
Date:
On 2020-Mar-10, Tom Lane wrote:

> I agree that something ought to be done here, but I'm not sure that
> this is exactly what.  It appears to me that there are three related
> but distinct behaviors under discussion:
> 
> 1. Truncation of bind parameters returned to clients in error message
>    detail fields.
> 2. Truncation of bind parameters written to the server log in logged
>    error messages.
> 3. Truncation of bind parameters written to the server log in non-error
>    statement logging actions (log_statement and variants).
> 
> Historically we haven't truncated any of these, I believe.  As of
> ba79cb5dc we forcibly truncate #1 and #2 at 64 bytes, but not #3.
> Your patch proposes to provide a SUSET GUC that controls the
> truncation length for all 3.

The reason I didn't change the other uses was precisely that it was
established behavior, but my opinion was that truncating them would
be better, now that we have code to handle doing so.

Maybe it would make sense to always log complete parameters for error
cases when that feature is enabled, and have the GUC only control the
lengths logged for non-error cases?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: control max length of parameter values logged

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Maybe it would make sense to always log complete parameters for error
> cases when that feature is enabled, and have the GUC only control the
> lengths logged for non-error cases?

I could get behind that.  It's a bit different from the original
idea here, but I think it's closer to being real-world-useful.

Another way to slice this up would be to have a USERSET GUC that
controls truncation of parameter values in errors, and a separate
SUSET GUC that controls it for the non-error statement logging
cases.  I'm not sure how much that's actually worth, but if we
feel that truncation in error cases can be useful, that's how
I'd vote to expose it.

            regards, tom lane



Re: control max length of parameter values logged

From
Alvaro Herrera
Date:
On 2020-Mar-11, Tom Lane wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Maybe it would make sense to always log complete parameters for error
> > cases when that feature is enabled, and have the GUC only control the
> > lengths logged for non-error cases?
> 
> I could get behind that.  It's a bit different from the original
> idea here, but I think it's closer to being real-world-useful.
> 
> Another way to slice this up would be to have a USERSET GUC that
> controls truncation of parameter values in errors, and a separate
> SUSET GUC that controls it for the non-error statement logging
> cases.  I'm not sure how much that's actually worth, but if we
> feel that truncation in error cases can be useful, that's how
> I'd vote to expose it.

Either of these ideas work for me.  I think I like the latter more,
since it allows to configure truncation in all cases.  (I'm not really
sure I understand why one of them must be SUSET.)

The reason I'm so hot about parameter truncation is that we've seen
cases where customers' log files contain log lines many megabytes long
because of gigantic parameters; UUID arrays with tens of thousands of
entries, and such.  Sometimes we see those in the normal "statement"
line because $customer interpolates into the query literal; normally the
"solution" is to move the params from interpolated into a parameter.
But if we log all parameters whole, that workaround no longer works, so
a way to clip is necessary.

I agree that truncating the value that can be disabled while not
truncating the values that cannot be disabled, is a bit silly.

I'm okay with the default being not to clip anything.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: control max length of parameter values logged

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Mar-11, Tom Lane wrote:
>> Another way to slice this up would be to have a USERSET GUC that
>> controls truncation of parameter values in errors, and a separate
>> SUSET GUC that controls it for the non-error statement logging
>> cases.  I'm not sure how much that's actually worth, but if we
>> feel that truncation in error cases can be useful, that's how
>> I'd vote to expose it.

> Either of these ideas work for me.  I think I like the latter more,
> since it allows to configure truncation in all cases.  (I'm not really
> sure I understand why one of them must be SUSET.)

We generally suppose that GUCs that control statement logging should be
SUSET, so that unprivileged users don't get to hide their activity from
the log.  On the other hand, I think it's okay for error logging (as
opposed to statement tracing) to be under user control, because the user
can simply avoid or trap an error if he doesn't want it to be logged.

> The reason I'm so hot about parameter truncation is that we've seen
> cases where customers' log files contain log lines many megabytes long
> because of gigantic parameters; UUID arrays with tens of thousands of
> entries, and such.  Sometimes we see those in the normal "statement"
> line because $customer interpolates into the query literal; normally the
> "solution" is to move the params from interpolated into a parameter.
> But if we log all parameters whole, that workaround no longer works, so
> a way to clip is necessary.

Agreed, it seems like there's a fairly compelling case for being
able to clip.

> I'm okay with the default being not to clip anything.

Also agreed.  It's been like it is for a long time with not that
many complaints, so the case for changing the default behavior
seems a bit weak.

Barring other opinions, I think we have consensus here on what
to do.  Alexey, will you update your patch?

            regards, tom lane



Re: control max length of parameter values logged

From
Bruce Momjian
Date:
On Thu, Mar 12, 2020 at 12:01:24PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > The reason I'm so hot about parameter truncation is that we've seen
> > cases where customers' log files contain log lines many megabytes long
> > because of gigantic parameters; UUID arrays with tens of thousands of
> > entries, and such.  Sometimes we see those in the normal "statement"
> > line because $customer interpolates into the query literal; normally the
> > "solution" is to move the params from interpolated into a parameter.
> > But if we log all parameters whole, that workaround no longer works, so
> > a way to clip is necessary.
> 
> Agreed, it seems like there's a fairly compelling case for being
> able to clip.
> 
> > I'm okay with the default being not to clip anything.
> 
> Also agreed.  It's been like it is for a long time with not that
> many complaints, so the case for changing the default behavior
> seems a bit weak.
> 
> Barring other opinions, I think we have consensus here on what
> to do.  Alexey, will you update your patch?

I am sorry --- I am confused.  Why are we truncating or allowing control
of truncation of BIND parameter values, but have no such facility for
queries.  Do we assume queries are shorter than BIND parameters, or is
it just that it is easier to trim BIND parameters than values embedded
in non-EXECUTE queries.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



Re: control max length of parameter values logged

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> I am sorry --- I am confused.  Why are we truncating or allowing control
> of truncation of BIND parameter values, but have no such facility for
> queries.  Do we assume queries are shorter than BIND parameters, or is
> it just that it is easier to trim BIND parameters than values embedded
> in non-EXECUTE queries.

The cases that Alvaro was worried about were enormous values supplied
via bind parameters.  We haven't heard comparable complaints about
the statement text.  Also, from a security standpoint, the contents
of the statement text are way more critical than the contents of
an out-of-line parameter; you can't do SQL injection from the latter.
So I think the audience for trimming would be a lot smaller for
statement-text trimming.

            regards, tom lane



Re: control max length of parameter values logged

From
Alvaro Herrera
Date:
On 2020-Mar-14, Tom Lane wrote:

> Bruce Momjian <bruce@momjian.us> writes:
> > I am sorry --- I am confused.  Why are we truncating or allowing control
> > of truncation of BIND parameter values, but have no such facility for
> > queries.  Do we assume queries are shorter than BIND parameters, or is
> > it just that it is easier to trim BIND parameters than values embedded
> > in non-EXECUTE queries.
> 
> The cases that Alvaro was worried about were enormous values supplied
> via bind parameters.  We haven't heard comparable complaints about
> the statement text.

To be more precise, I have seen cases of enormous statement text, but
those are fixed precisely by moving the bulk to parameters.  So the
ability to trim the parameter is important.  I've never seen a very
large query without the bulk being parameterizable.

> Also, from a security standpoint, the contents
> of the statement text are way more critical than the contents of
> an out-of-line parameter; you can't do SQL injection from the latter.

That's a good point too.

> So I think the audience for trimming would be a lot smaller for
> statement-text trimming.

Nod.  (I think if we really wanted to trim queries, it would have to be
something semantically sensible, not just trim whatever is at the end of
the statement literal.  Say, only trim parts of the where clause that
are of the form "something op constant", and rules like that, plus put
placeholders to show that they were there.  This sounds a lot of work to
figure out usefully ...)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: control max length of parameter values logged

From
keisuke kuroda
Date:
Hi,

BIND parameter truncation is good to me.
Logs could be very hard to read due to
the very long parameters recorded.

Now, parameter is extracuted from left.
e.g. "AAAA-BBBB-CCCC-DDDD-EEEE" to "AAAA-BBBB-CCCC..."

Is not necessary from right?
e.g. "AAAA-BBBB-CCCC-DDDD-EEEE" to "...CCCC-DDDD-EEEE"

It is sometimes nice to be able to check
the last of strings. For example, if there
are difference only in the last of strings
of many parameters.

Best Regards,
Keisuke Kuroda



Re: control max length of parameter values logged

From
Justin Pryzby
Date:
On Sun, Mar 15, 2020 at 08:48:33PM -0300, Alvaro Herrera wrote:
> On 2020-Mar-14, Tom Lane wrote:
> 
> > Bruce Momjian <bruce@momjian.us> writes:
> > > I am sorry --- I am confused.  Why are we truncating or allowing control
> > > of truncation of BIND parameter values, but have no such facility for
> > > queries.  Do we assume queries are shorter than BIND parameters, or is
> > > it just that it is easier to trim BIND parameters than values embedded
> > > in non-EXECUTE queries.
> > 
> > The cases that Alvaro was worried about were enormous values supplied
> > via bind parameters.  We haven't heard comparable complaints about
> > the statement text.
> 
> To be more precise, I have seen cases of enormous statement text, but
> those are fixed precisely by moving the bulk to parameters.  So the
> ability to trim the parameter is important.  I've never seen a very
> large query without the bulk being parameterizable.

I don't claim our use is a common case or a good example but I'm going to offer
a data point.

We have very long query strings, even while using bind parameters.
Our loader process uses upsert and prepared statements.
So we might run: INSERT INTO t (k1,k2,a,b,...) VALUES($1,$2,$3,$4)
ON CONFLICT(k1,k2) DO UPDATE SET a=excluded.a,b=excluded.b
..which is fine, but we also have large number of columns - historically up to
1600.  If a query fails, the error might be a query string 2+ pages long.

Looks like we have common cases (and executed many times) with:
24k long message and 86k long param string
70k long message and 10k long param string

Having full log on error is important, more to the client but also in the
server log.  But it would be nice if we could reduce the server logs.  Most of
the prepare string is of little value if there's no error:  VALUES ($1,$2,$3,)
(but prepared query is at least better than repeating the query string).

Related: a year ago, I wrote about the repetition of the "PREPARE" statement in
query logs.
https://www.postgresql.org/message-id/20190208132953.GF29720@telsasoft.com

Ultimately I withdrew that patch and switched to log_statement_min_duration.

> Nod.  (I think if we really wanted to trim queries, it would have to be
> something semantically sensible, not just trim whatever is at the end of
> the statement literal.

If it were easy, I would truncate query strings to a few hundred bytes.

-- 
Justin



Re: control max length of parameter values logged

From
Alexey Bashtanov
Date:
> Also agreed.  It's been like it is for a long time with not that
> many complaints, so the case for changing the default behavior
> seems a bit weak.
>
> Barring other opinions, I think we have consensus here on what
> to do.  Alexey, will you update your patch?
>
Sorry for the delay, please could you have a look?

Best, Alex

Attachment

Re: control max length of parameter values logged

From
Tom Lane
Date:
Alexey Bashtanov <bashtanov@imap.cc> writes:
> Sorry for the delay, please could you have a look?

Got it, will look tomorrow.  (I think it's important to get this
done for v13, before we ship this behavior.)

            regards, tom lane



Re: control max length of parameter values logged

From
Justin Pryzby
Date:
Hi,

On Wed, Apr 01, 2020 at 01:52:48AM +0100, Alexey Bashtanov wrote:
> +++ b/doc/src/sgml/config.sgml
> +     <varlistentry id="guc-log-parameter-max-length" xreflabel="log_parameter_max_length">
> +      <term><varname>log_parameter_max_length</varname> (<type>integer</type>)
> +      <indexterm>
> +       <primary><varname>log_parameter_max_length</varname> configuration parameter</primary>
> +      </indexterm>
> +      </term>
> +      <listitem>
> +       <para>
> +        If greater than zero, bind parameter values reported in non-error
> +        statement-logging messages are trimmed to no more than this many bytes.

Can I suggest to say:

"Limit bind parameter values reported by non-error statement-logging messages
to this many bytes". Or,

"The maximum length of bind parameter values to log with non-error
statement-logging messages".

> --- a/src/backend/utils/misc/guc.c
> +++ b/src/backend/utils/misc/guc.c
> @@ -2855,6 +2857,28 @@ static struct config_int ConfigureNamesInt[] =
>          NULL, NULL, NULL
>      },
>  
> +    {
> +        {"log_parameter_max_length", PGC_SUSET, LOGGING_WHAT,
> +            gettext_noop("When logging statements, limit logged parameter values to first N bytes."),
> +            gettext_noop("Zero to print values in full."),

Could you make zero a normal value and -1 the "special" value to disable
trimming ?

Setting to zero will avoid displaying parameters at all, setting to -1 wil
display values in full.

Cheers,
-- 
Justin



Re: control max length of parameter values logged

From
Alexey Bashtanov
Date:
Hi,
>> +        If greater than zero, bind parameter values reported in non-error
>> +        statement-logging messages are trimmed to no more than this many bytes.
> Can I suggest to say:
>
> "Limit bind parameter values reported by non-error statement-logging messages
> to this many bytes". Or,
>
> "The maximum length of bind parameter values to log with non-error
> statement-logging messages".
Okay I'll rephrase.
> Could you make zero a normal value and -1 the "special" value to disable
> trimming ?
>
> Setting to zero will avoid displaying parameters at all, setting to -1 wil
> display values in full.
I can, but then for the sake of consistency I'll have to do the same for 
log_parameter_max_length.
Then we'll end up with two ways to enable/disable parameter logging on 
error:
using the primary boolean setting and setting length to 0.
One of them will require superuser privileges, the other one won't.
Do you think it's okay? I have no objections but I'm a bit worried 
someone may find it clumsy.

Best, Alex



Re: control max length of parameter values logged

From
Justin Pryzby
Date:
On Wed, Apr 01, 2020 at 10:10:55AM +0100, Alexey Bashtanov wrote:
> Hi,
> > > +        If greater than zero, bind parameter values reported in non-error
> > > +        statement-logging messages are trimmed to no more than this many bytes.
> > Can I suggest to say:
> > 
> > "Limit bind parameter values reported by non-error statement-logging messages
> > to this many bytes". Or,
> > 
> > "The maximum length of bind parameter values to log with non-error
> > statement-logging messages".
> Okay I'll rephrase.
> > Could you make zero a normal value and -1 the "special" value to disable
> > trimming ?
> > 
> > Setting to zero will avoid displaying parameters at all, setting to -1 wil
> > display values in full.
> I can, but then for the sake of consistency I'll have to do the same for
> log_parameter_max_length.
> Then we'll end up with two ways to enable/disable parameter logging on
> error:
> using the primary boolean setting and setting length to 0.
> One of them will require superuser privileges, the other one won't.

I guess you're referring to log_parameters_on_error.
Does it have to be SUSET ?

Or maybe log_parameters_on_error doesn't need to exist at all, and setting
log_parameter_max_length=0 can be used to disable parameter logging.

I showed up late to this thread, so let's see what others think.

-- 
Justin



Re: control max length of parameter values logged

From
Tom Lane
Date:
Justin Pryzby <pryzby@telsasoft.com> writes:
> On Wed, Apr 01, 2020 at 10:10:55AM +0100, Alexey Bashtanov wrote:
>>> Could you make zero a normal value and -1 the "special" value to disable
>>> trimming ?

>> I can, but then for the sake of consistency I'll have to do the same for
>> log_parameter_max_length.
>> Then we'll end up with two ways to enable/disable parameter logging on
>> error:
>> using the primary boolean setting and setting length to 0.
>> One of them will require superuser privileges, the other one won't.

> I guess you're referring to log_parameters_on_error.
> Does it have to be SUSET ?
> Or maybe log_parameters_on_error doesn't need to exist at all, and setting
> log_parameter_max_length=0 can be used to disable parameter logging.
> I showed up late to this thread, so let's see what others think.

I think Justin's got a point: defining zero this way is weirdly
inconsistent.  -1, being clearly outside the domain of possible
length limits, makes more sense as a marker for "don't trim".

Alexey's right that having a separate boolean flag is pointless, but
I think we could just drop the boolean; we haven't shipped that yet.
The privilege argument seems irrelevant to me.  We already decided
that the plan is (a) SUSET for non-error statement logging purposes and
(b) USERSET for logging caused by errors, and that would have to apply
to length limits as well as enable/disable ability.  Otherwise a user
could pretty effectively disable logging by setting the length to 1.

            regards, tom lane



Re: control max length of parameter values logged

From
Alexey Bashtanov
Date:
Hi,
> The privilege argument seems irrelevant to me.  We already decided
> that the plan is (a) SUSET for non-error statement logging purposes and
> (b) USERSET for logging caused by errors, and that would have to apply
> to length limits as well as enable/disable ability.  Otherwise a user
> could pretty effectively disable logging by setting the length to 1.
The only privilege that user can gain if we drop the boolean is to 
*enable* logging parameters on error.
That gives user a little bit easier way to fill up the disk with logs, 
but they anyway can do that if they want to.
If that's okay with everyone, please see the new version attached.

Best, Alex

Attachment

Re: control max length of parameter values logged

From
Justin Pryzby
Date:
Thanks for updating the patch.

On Thu, Apr 02, 2020 at 01:29:04AM +0100, Alexey Bashtanov wrote:
> +        If greater than zero, bind parameter values reported in non-error
> +        statement-logging messages are trimmed to no more than this many bytes.
> +        If this value is specified without units, it is taken as bytes.
> +        Zero disables logging parameters with statements.
> +        <literal>-1</literal> (the default) makes parameters logged in full.

say: "..causes parameters to be logged in full".

Also..I just realized that this truncates *each* parameter, rather than
truncating the parameter list.

So say: "
|If greater than zero, each bind parameter value reported in a non-error
|statement-logging messages is trimmed to no more than this number of bytes.

> +        Non-zero values add some overhead, as
> +        <productname>PostgreSQL</productname> will compute and store textual
> +        representations of parameter values in memory for all statements,
> +        even if they eventually do not get logged.        

say: "even if they are ultimately not logged"

> +++ b/src/backend/nodes/params.c
> @@ -280,6 +280,7 @@ BuildParamLogString(ParamListInfo params, char **knownTextValues, int maxlen)
>                  oldCxt;
>      StringInfoData buf;
>  
> +    Assert(maxlen == -1 || maxlen > 0);

You can write: >= -1

> -                    if (log_parameters_on_error)
> +                    if (log_parameter_max_length_on_error != 0)

I would write this as >= 0

> +                        if (log_parameter_max_length_on_error > 0)
> +                        {
> +                                                       /*
> +                                                        * We can trim the saved string, knowing that we
> +                                                        * won't print all of it.  But we must copy at
> +                                                        * least two more full characters than
> +                                                        * BuildParamLogString wants to use; otherwise it
> +                                                        * might fail to include the trailing ellipsis.
> +                                                        */
> +                                                       knownTextValues[paramno] =
> +                                                               pnstrdup(pstring,
> +                                                                                log_parameter_max_length_on_error
> +                                                                                + 2 * MAX_MULTIBYTE_CHAR_LEN);

The comment says we need at least 2 chars, but
log_parameter_max_length_on_error might be 1, so I think you can either add 64
byte fudge factor, like before, or do Max(log_parameter_max_length_on_error, 2).

> +                        }
> +                        else
> +                            knownTextValues[paramno] = pstrdup(pstring);

I suggest to handle the "== -1" case first and the > 0 case as "else".

Thanks,
-- 
Justin



Re: control max length of parameter values logged

From
Alexey Bashtanov
Date:
Hi,

Please see the new version attached.
>> +        If greater than zero, bind parameter values reported in non-error
>> +        statement-logging messages are trimmed to no more than this many bytes.
>> +        If this value is specified without units, it is taken as bytes.
>> +        Zero disables logging parameters with statements.
>> +        <literal>-1</literal> (the default) makes parameters logged in full.
> say: "..causes parameters to be logged in full".
>
> Also..I just realized that this truncates *each* parameter, rather than
> truncating the parameter list.
>
> So say: "
> |If greater than zero, each bind parameter value reported in a non-error
> |statement-logging messages is trimmed to no more than this number of bytes.
okay

I also changed "trimmed to no more than this many bytes" to "trimmed to 
this many bytes".
It's not that pedantic any more but hopefully less awkward.

>> +        Non-zero values add some overhead, as
>> +        <productname>PostgreSQL</productname> will compute and store textual
>> +        representations of parameter values in memory for all statements,
>> +        even if they eventually do not get logged.
> say: "even if they are ultimately not logged"
okay

>> +++ b/src/backend/nodes/params.c
>> @@ -280,6 +280,7 @@ BuildParamLogString(ParamListInfo params, char **knownTextValues, int maxlen)
>>                   oldCxt;
>>       StringInfoData buf;
>>   
>> +    Assert(maxlen == -1 || maxlen > 0);
> You can write: >= -1
no, I'm specifically checking they don't pass 0


>> -                    if (log_parameters_on_error)
>> +                    if (log_parameter_max_length_on_error != 0)
> I would write this as >= 0
no, I'm specifically checking for both positives and -1

>> +                        if (log_parameter_max_length_on_error > 0)
>> +                        {
>> +                                                       /*
>> +                                                        * We can trim the saved string, knowing that we
>> +                                                        * won't print all of it.  But we must copy at
>> +                                                        * least two more full characters than
>> +                                                        * BuildParamLogString wants to use; otherwise it
>> +                                                        * might fail to include the trailing ellipsis.
>> +                                                        */
>> +                                                       knownTextValues[paramno] =
>> +                                                               pnstrdup(pstring,
>> +                                                                                log_parameter_max_length_on_error
>> +                                                                                + 2 * MAX_MULTIBYTE_CHAR_LEN);
> The comment says we need at least 2 chars, but
> log_parameter_max_length_on_error might be 1, so I think you can either add 64
> byte fudge factor, like before, or do Max(log_parameter_max_length_on_error, 2).
That's the code I reused without deep analysis TBH.
I think it's mostly for to allocate the space for the ellipsis in case 
it needs to be added,
not to copy any actual characters, that's why we add.

>> +                        }
>> +                        else
>> +                            knownTextValues[paramno] = pstrdup(pstring);
> I suggest to handle the "== -1" case first and the > 0 case as "else".
Good, as long as I thought of this too, I'm changing that.

Best, Alex

Attachment

Re: control max length of parameter values logged

From
Alvaro Herrera
Date:
On 2020-Apr-02, Alexey Bashtanov wrote:


> > > +                        if (log_parameter_max_length_on_error > 0)
> > > +                        {
> > > +                                                       /*
> > > +                                                        * We can trim the saved string, knowing that we
> > > +                                                        * won't print all of it.  But we must copy at
> > > +                                                        * least two more full characters than
> > > +                                                        * BuildParamLogString wants to use; otherwise it
> > > +                                                        * might fail to include the trailing ellipsis.
> > > +                                                        */
> > > +                                                       knownTextValues[paramno] =
> > > +                                                               pnstrdup(pstring,
> > > +
log_parameter_max_length_on_error
> > > +                                                                                + 2 * MAX_MULTIBYTE_CHAR_LEN);
> > The comment says we need at least 2 chars, but
> > log_parameter_max_length_on_error might be 1, so I think you can either add 64
> > byte fudge factor, like before, or do Max(log_parameter_max_length_on_error, 2).
> That's the code I reused without deep analysis TBH.
> I think it's mostly for to allocate the space for the ellipsis in case it
> needs to be added,
> not to copy any actual characters, that's why we add.

More or less.  If you don't add these chars, mbcliplen doesn't think
there's character there, so it ends up not adding the ellipsis.  (I
don't remember why it has to be two chars rather than just one.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: control max length of parameter values logged

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> More or less.  If you don't add these chars, mbcliplen doesn't think
> there's character there, so it ends up not adding the ellipsis.  (I
> don't remember why it has to be two chars rather than just one.)

I think the idea is to be sure that there's a full multibyte character
after the truncation point; if the truncation point is within a multibyte
character, then you might have only a partial multibyte character after
that, which could cause problems.  Doing it this way, mbcliplen will
never look at the last possibly-truncated character.

            regards, tom lane



Re: control max length of parameter values logged

From
Tom Lane
Date:
Alexey Bashtanov <bashtanov@imap.cc> writes:
> Please see the new version attached.

Pushed with a bit of editorialization.

            regards, tom lane



Re: control max length of parameter values logged

From
Alexey Bashtanov
Date:
> Pushed with a bit of editorialization.
Great, and thanks for the fixes!

Best, Alex