Thread: Add value to error message when size extends

Add value to error message when size extends

From
Maor Lipchuk
Date:
Hi all,

We have encountered an issue when executing an insert command,
when one of the values' length was bigger than the column size limitation.

the error message which been displayed was "value too long for type char..."
but there was no indication which value has exceeded the limited size.
(See bug #8880)

Attached is a WIP patch which attend to make the message clearer.

Regards,
Maor and Daniel

Attachment

Re: Add value to error message when size extends

From
Tom Lane
Date:
Maor Lipchuk <mlipchuk@redhat.com> writes:
> We have encountered an issue when executing an insert command,
> when one of the values' length was bigger than the column size limitation.

> the error message which been displayed was "value too long for type char..."
> but there was no indication which value has exceeded the limited size.
> (See bug #8880)

> Attached is a WIP patch which attend to make the message clearer.

This sort of thing has been proposed before ...

We have a message style guideline that says that primary error messages
should fit on "one line", which is generally understood to mean maybe
80 characters.  Complaining about a too-long varchar string in this style
seems practically guaranteed to violate that.  More, putting the string
contents before the meat of the message is the worst possible choice if
a client user interface decides to truncate the message.  And if the
string were *really* long, say in the millions of characters, it's not
unlikely that this would end up causing the message to get replaced by
"out of memory", which is totally counterproductive.

You could avoid the first two of those objections by putting the string
contents into a DETAIL message; but not the third one.

Aside from that, it's less than clear whether this approach would actually
help much: maybe the string is readily identifiable as to which column
it's meant for, and maybe not.  People have speculated about ways to
name the target column in the error report, which would probably be
far more useful; but it's not real clear how to do that in our system
structure.

One thing that occurs to me just now is that perhaps we could report
the failure as if it were a syntax error, that is with an error cursor
pointing to where in the triggering SQL query the coercion is being done.
(Years ago this would not have been possible because we didn't always
keep around the query text until runtime, but I think we do now.)
It would take some work to make that happen, and I'm not sure it would
really resolve the usability problem, but it seems worth proposing.
An advantage is that over time such a facility could be extended to
run-time errors happening in any function not just this particular one.
        regards, tom lane



Re: Add value to error message when size extends

From
Marti Raudsepp
Date:
On Sun, Jan 19, 2014 at 8:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Complaining about a too-long varchar string in this style
> seems practically guaranteed to violate that.

Agreed. But I think it would be useful to add the length of the string
being inserted; we already have it in the len variable.

> One thing that occurs to me just now is that perhaps we could report
> the failure as if it were a syntax error

That would be cool, if it can be made to work.

Regards,
Marti



Re: Add value to error message when size extends

From
Tom Lane
Date:
Marti Raudsepp <marti@juffo.org> writes:
> Agreed. But I think it would be useful to add the length of the string
> being inserted; we already have it in the len variable.

Hm, maybe, but I'm having a hard time visualizing cases in which it
helps much.
        regards, tom lane



Re: Add value to error message when size extends

From
Tom Lane
Date:
Marti Raudsepp <marti@juffo.org> writes:
> On Sun, Jan 19, 2014 at 8:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> One thing that occurs to me just now is that perhaps we could report
>> the failure as if it were a syntax error

> That would be cool, if it can be made to work.

Just as a five-minute proof-of-concept hack, attached is a patch that
makes varchar() report an error position if it can get one.  This is
*very* far from being production quality --- debug_query_string is the
wrong thing to rely on in general, and we'd certainly want to encapsulate
the logic rather than have individual functions know about how to do it.
But here's some testing that shows that the idea seems to have promise
from a usability standpoint:

regression=# create table test (f1 varchar(10), f2 varchar(5), f3 varchar(10));
CREATE TABLE

regression=# insert into test values ('a', 'b', 'foobar');
INSERT 0 1

regression=# insert into test values ('foobar', 'foobar', 'foobar');
ERROR:  value too long for type character varying(5)
LINE 1: insert into test values ('foobar', 'foobar', 'foobar');
                                           ^

regression=# update test set f2 = f3 where f1 = 'a';
ERROR:  value too long for type character varying(5)
LINE 1: update test set f2 = f3 where f1 = 'a';
                             ^

The first error case points out a limitation of relying on the contents of
the string to figure out where your problem is.  The error-cursor approach
has its own limitations, of course; for instance the second case might not
be thought to be all that helpful.

            regards, tom lane

diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c
index 502ca44..4438ed8 100644
*** a/src/backend/utils/adt/varchar.c
--- b/src/backend/utils/adt/varchar.c
***************
*** 19,24 ****
--- 19,25 ----
  #include "access/tuptoaster.h"
  #include "libpq/pqformat.h"
  #include "nodes/nodeFuncs.h"
+ #include "tcop/tcopprot.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "mb/pg_wchar.h"
*************** varchar(PG_FUNCTION_ARGS)
*** 617,626 ****
      {
          for (i = maxmblen; i < len; i++)
              if (s_data[i] != ' ')
                  ereport(ERROR,
                          (errcode(ERRCODE_STRING_DATA_RIGHT_TRUNCATION),
                        errmsg("value too long for type character varying(%d)",
!                              maxlen)));
      }

      PG_RETURN_VARCHAR_P((VarChar *) cstring_to_text_with_len(s_data,
--- 618,645 ----
      {
          for (i = maxmblen; i < len; i++)
              if (s_data[i] != ' ')
+             {
+                 int            pos = 0;
+
+                 if (debug_query_string &&
+                     fcinfo->flinfo->fn_expr)
+                 {
+                     int location = exprLocation(fcinfo->flinfo->fn_expr);
+
+                     if (location >= 0)
+                     {
+                         /* Convert offset to character number */
+                         pos = pg_mbstrlen_with_len(debug_query_string,
+                                                    location) + 1;
+                     }
+                 }
+
                  ereport(ERROR,
                          (errcode(ERRCODE_STRING_DATA_RIGHT_TRUNCATION),
                        errmsg("value too long for type character varying(%d)",
!                              maxlen),
!                          errposition(pos)));
!             }
      }

      PG_RETURN_VARCHAR_P((VarChar *) cstring_to_text_with_len(s_data,

Re: Add value to error message when size extends

From
Maor Lipchuk
Date:
Hi Tom and Marti
Thank you so much for your respond.

The solution Tom proposed is much more better, and it will be a great
solution for clarifying many issues regarding this error.

Regards,
Maor


On 01/19/2014 10:00 PM, Tom Lane wrote:
> Marti Raudsepp <marti@juffo.org> writes:
>> On Sun, Jan 19, 2014 at 8:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> One thing that occurs to me just now is that perhaps we could report
>>> the failure as if it were a syntax error
> 
>> That would be cool, if it can be made to work.
> 
> Just as a five-minute proof-of-concept hack, attached is a patch that
> makes varchar() report an error position if it can get one.  This is
> *very* far from being production quality --- debug_query_string is the
> wrong thing to rely on in general, and we'd certainly want to encapsulate
> the logic rather than have individual functions know about how to do it.
> But here's some testing that shows that the idea seems to have promise
> from a usability standpoint:
> 
> regression=# create table test (f1 varchar(10), f2 varchar(5), f3 varchar(10));
> CREATE TABLE
> 
> regression=# insert into test values ('a', 'b', 'foobar');
> INSERT 0 1
> 
> regression=# insert into test values ('foobar', 'foobar', 'foobar');
> ERROR:  value too long for type character varying(5)
> LINE 1: insert into test values ('foobar', 'foobar', 'foobar');
>                                            ^
> 
> regression=# update test set f2 = f3 where f1 = 'a';
> ERROR:  value too long for type character varying(5)
> LINE 1: update test set f2 = f3 where f1 = 'a';
>                              ^
> 
> The first error case points out a limitation of relying on the contents of
> the string to figure out where your problem is.  The error-cursor approach
> has its own limitations, of course; for instance the second case might not
> be thought to be all that helpful.
Yes, but in this case you will know specifically which column is the
problematic one.
The highlight of your approach gains much more benefit when
updating/inserting multiple values in one update/insert command.
> 
>             regards, tom lane
> 




Re: Add value to error message when size extends

From
Daniel Erez
Date:
Hi,

Many thanks for the prompt response and the suggestions!

So, regarding the issue of "production quality" you've mentioned,
we understand there are two remaining matters to address:
1. debug_query_string:  As we can't rely on this flag, is there any alternative we can rely on?
2. encapsulation:  Is there any "utility file" we could extract the logic to?  Or, any other functions that are used
forencapsulation mechanism?
 

Thanks!
Daniel 

----- Original Message -----
> From: "Maor Lipchuk" <mlipchuk@redhat.com>
> To: "Tom Lane" <tgl@sss.pgh.pa.us>, "Marti Raudsepp" <marti@juffo.org>
> Cc: "pgsql-hackers" <pgsql-hackers@postgresql.org>, "Daniel Erez" <derez@redhat.com>
> Sent: Monday, January 20, 2014 2:32:57 AM
> Subject: Re: [HACKERS] Add value to error message when size extends
> 
> Hi Tom and Marti
> Thank you so much for your respond.
> 
> The solution Tom proposed is much more better, and it will be a great
> solution for clarifying many issues regarding this error.
> 
> Regards,
> Maor
> 
> 
> On 01/19/2014 10:00 PM, Tom Lane wrote:
> > Marti Raudsepp <marti@juffo.org> writes:
> >> On Sun, Jan 19, 2014 at 8:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> One thing that occurs to me just now is that perhaps we could report
> >>> the failure as if it were a syntax error
> > 
> >> That would be cool, if it can be made to work.
> > 
> > Just as a five-minute proof-of-concept hack, attached is a patch that
> > makes varchar() report an error position if it can get one.  This is
> > *very* far from being production quality --- debug_query_string is the
> > wrong thing to rely on in general, and we'd certainly want to encapsulate
> > the logic rather than have individual functions know about how to do it.
> > But here's some testing that shows that the idea seems to have promise
> > from a usability standpoint:
> > 
> > regression=# create table test (f1 varchar(10), f2 varchar(5), f3
> > varchar(10));
> > CREATE TABLE
> > 
> > regression=# insert into test values ('a', 'b', 'foobar');
> > INSERT 0 1
> > 
> > regression=# insert into test values ('foobar', 'foobar', 'foobar');
> > ERROR:  value too long for type character varying(5)
> > LINE 1: insert into test values ('foobar', 'foobar', 'foobar');
> >                                            ^
> > 
> > regression=# update test set f2 = f3 where f1 = 'a';
> > ERROR:  value too long for type character varying(5)
> > LINE 1: update test set f2 = f3 where f1 = 'a';
> >                              ^
> > 
> > The first error case points out a limitation of relying on the contents of
> > the string to figure out where your problem is.  The error-cursor approach
> > has its own limitations, of course; for instance the second case might not
> > be thought to be all that helpful.
> Yes, but in this case you will know specifically which column is the
> problematic one.
> The highlight of your approach gains much more benefit when
> updating/inserting multiple values in one update/insert command.
> > 
> >             regards, tom lane
> > 
> 
> 



Re: Add value to error message when size extends

From
Tom Lane
Date:
Daniel Erez <derez@redhat.com> writes:
> Many thanks for the prompt response and the suggestions!

> So, regarding the issue of "production quality" you've mentioned,
> we understand there are two remaining matters to address:
> 1. debug_query_string:
>    As we can't rely on this flag, is there any alternative we can rely on?
> 2. encapsulation:
>    Is there any "utility file" we could extract the logic to?
>    Or, any other functions that are used for encapsulation mechanism?

The big picture here is that there are two ways we could go:

* Encapsulate some improved version of the logic I posted into an error
reporting auxiliary function, along the lines of  function_errposition(fcinfo)
and then run around and add calls to this function in the ereports
where it seems useful.  This would be the right answer if we think
only a few such ereports need the extra info --- but if we want it
to apply to many/most function-level error reports, it sounds pretty
darn tedious.  Also it'd require that everyplace that is going to
throw such reports have access to the appropriate fcinfo, which would
require some refactoring in places where SQL functions have been
divided into multiple C routines.

* Add logic in execQual.c to automatically add the information whenever
an error is thrown while executing an identifiable expression node.
This solves the problem for all functions at one stroke.  The problem
is that it would add some overhead to the non-error code path.
It's hard to see how to do this without at least half a dozen added
instructions per function or operator node (it'd likely involve
saving, setting, and restoring a global variable ...).  I'm not sure
if that would be significant but it would very possibly be measurable
on function-call-heavy workloads.  We might think that such overhead
is worth paying for better error reporting; but given how few people
have commented on this thread, I'm not sure many people are excited
about it.

I'd like to hear some more comments before undertaking either approach.

As for getting rid of the use of debug_query_string, it'd take some
work, though it seems do-able.  I think ActivePortal->sourceText is
the right thing to reference in the "normal" case, but it may not
be the right thing when executing queries from SQL-language functions
for instance.  Also, the reason I didn't use that in the quick-hack
patch was that the test cases I wanted to post involved calls that
will get executed during the planner's constant-simplification pass
(since varchar() is marked IMMUTABLE).  It turns out there isn't
an ActivePortal at that point, so we'd need some thought about how
to make the right query string accessible during planning.
        regards, tom lane