Fwd: Problems with Error Messages wrt Domains, Checks - Mailing list pgsql-hackers

From john frazer
Subject Fwd: Problems with Error Messages wrt Domains, Checks
Date
Msg-id CAC-4GVbKeex=zdsHYvkcec7nr3Q4E6qDS=HFGthEsY1t9QLLAA@mail.gmail.com
Whole thread Raw
In response to Problems with Error Messages wrt Domains, Checks  (john frazer <johnfrazer783@gmail.com>)
Responses Re: Problems with Error Messages wrt Domains, Checks  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers

---------- Forwarded message ----------
From: john frazer <johnfrazer783@gmail.com>
Date: Sat, Mar 17, 2018 at 6:28 PM
Subject: Re: Problems with Error Messages wrt Domains, Checks
To: "David G. Johnston" <david.g.johnston@gmail.com>




Thanks for your consideration,

I'll try to be brief.


>     As such, it could be anywhere in my many, many kLOCs big DB definition. I cannot even search the RegEx with a RegEx because all I know is some parenthesis is missing, somewhere:
>
> ​Well, the error does point to the first statement in the chain of issues - working backward a couple of steps is possible.

In this particular case that is possible, and I did manage to do it. The point is that in the
general case the faulty regular expression could be anywhere, and there's no clue given at all.

>     RegExes cannot match parentheses,
> ​Sure they can​.
>     and PG RegExes do not have a unique syntactic marker to them.
> ​True​

I meant to say they can't detect matching parentheses or lack thereof.

>     FAILURE: before the insert statement, everything runs dandy. We could have built an entire data warehouse application on top of a table definition that can never be syntactically processed but which will only fail when someone accidentally tries to insert a line.
>
> ​Since this is going to fail every single time you add a record I'm lacking sympathy here.  "Accidentally tries to insert a line" - if the table wasn't meant to be used why does it exist in the first place?​  And if it is intended to be used then functional testing should quickly point out something like this.

But there clearly can be tables that are used only now and then and might get checked
for absence of rows. But regardless, I think the point stands that ideally you shouldn't
be able to succesfully declare nonsensical objects and only be told so some kinds of
usage patterns by runtime errors (with defective contexts), and in most cases, pgSQL
does keep that promise.

>     FAILURE: I can select from a table with a syntactically invalid definition.
> ​You're stretching here if you think this is an important failure point.  Since the table cannot not have valid data there would be nothing to select.  Checking constraints during selection is undesireable - they should be an are only checked during insertion or when the constraint itself changes.

To clarify, I do not suggest checking constraints during `select`, I suggest
tighter checks at table creation time. I should not be able to construct
any kind of object that says (in SQL or maybe even plpgsql, too) something
to the effect `x ~ '('` because that string in that syntactic context which must
be a RegexLiteral is syntactically bogus. (Yes operators can be redefined but the
only thing that counts here is the definition of the `~` operator at the point in time
that the table gets created, and that expression is constant with a literal, so it's
not much difference between this and checking against, say, `x > 0` which could and
would fail for illegal literals and non-matching types.)

> ​I suppose the best practice when dealing with a lack of information in the error handle code flows is to limit the amount of context that is in scope by unit testing.  And while they are absolutely short-comings overcoming them has cost in terms of both developer effort and, more importantly, runtime performance.

I'm afraid no amount of unit testing of the DDL code can do this for me. Yes,
in the first reported cases (the invalid RegExp), I can make sure I use each
expression at least once so unsyntactic ones will make themselves shown. But
in the other two cases, well, the production environment in which this came up
has an insert statement that takes data from a largish source into the target table
(20k rows of altogether >2m rows), and I *can't* unit test that data.

FWIW the workaround that IÄve found is this:

create table X.table_with_constraints (
  my_column text,
  constraint "my_column must start with 'x'"  check ( Q.starts_with_x( my_column ) ),
  constraint "my_column must have 3 chrs"     check ( Q.has_3_characters( my_column ) ) );

In other words, I dispense with domains and use (small, boolean) functions (defined as `select` one-liners)
because only then do I get told what piece of data comes doen the wrong way and where.
It's a shame because this is essentially what I expect to do in a language like
JavaScript.







On Sat, Mar 17, 2018 at 4:20 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Sat, Mar 17, 2018 at 6:14 AM, john frazer <johnfrazer783@gmail.com> wrote:

Today I realized a number of points where PostgreSQL (v10.3) is rather lackluster and sparse in its error messages.

​You may find the following thread and its predecessors enlightening.


​Basically, the fundamental problem is type input is performed in a relatively isolated fashion since there is no requirement that a table or column of said type even exist.
psql:db/experiments/pg-error-fail-illegal-regex.sql:17: ERROR:
invalid regular expression: parentheses () not balanced

There are several problems with this error message:

FAILURE: the error is really in line 5 where a syntactically invalid RegEx is created; the fact that it is a RegEx and not a general string is obvious from the semantics of the ~ (tilde) operator at that point in time.

​Yeah, the fact that we don't "compile" expressions is unfortunate.  Not sure if there are any plans to do so or what limitations there are​
 

FAILURE: the offending RegEx is not referred to and not quoted in the error message.

​This seems like an easy enough oversight to correct.​  In all the discussion about being challenging to identify location I don't recall seeing anything about why we aren't at least showing the offending input value.

As such, it could be anywhere in my many, many kLOCs big DB definition. I cannot even search the RegEx with a RegEx because all I know is some parenthesis is missing, somewhere:

​Well, the error does point to the first statement in the chain of issues - working backward a couple of steps is possible.

RegExes cannot match parentheses,

​Sure they can​.

and PG RegExes do not have a unique syntactic marker to them.

​True​

FAILURE: before the insert statement, everything runs dandy. We could have built an entire data warehouse application on top of a table definition that can never be syntactically processed but which will only fail when someone accidentally tries to insert a line.

​Since this is going to fail every single time you add a record I'm lacking sympathy here.  "Accidentally tries to insert a line" - if the table wasn't meant to be used why does it exist in the first place?​  And if it is intended to be used then functional testing should quickly point out something like this.

FAILURE: I can select from a table with a syntactically invalid definition.

​You're stretching here if you think this is an important failure point.  Since the table cannot not have valid data there would be nothing to select.  Checking constraints during selection is undesireable - they should be an are only checked during insertion or when the constraint itself changes.

With only line B active, this gives:

psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
ERROR:  new row for relation "table_with_constraints" violates
check constraint "field b must have 3 characters"
DETAIL:  Failing row contains (xxxx).

SUCCESS: we get the name of the relation and the name of the violated rule.

SUCCESS: the offending piece of data is quoted.

FAILURE: we don't get the full name of the relation, which is "X"."table_with_constraints". Neither do we get the name of the column that received the offending value.

​No, you get "check constraint field b must have 3 characters" with the owning table.  You've defined a table constraint so there is no directly attached column to report - the expression as a whole fails and we don't report which boolean aspects of the expression where true and false.  You do get the input value which makes manually resolving the expression possible.  The lack of schema-qualification on the table identifier seems like an oversight but at the moment I'd not willing to go find evidence in support to opposition to that thought.​

Lastly, with only line A (not line B) active:

psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
ERROR:  value for domain x.a_legal_regex violates check constraint "a_legal_regex_check"

FAILURE: no reference to the affected table, column is made. 

FAILURE: no reference to the offending piece of data is made

​Repeat of the first example, same explanations apply.  Hopefully this gets improved eventually.​
 

FAILURE: no reference to the offended constraint is made ("column a must start with x").

​It never got that far into the validation process.  It couldn't even form a value of correct type that the constraint expression could evaluate.  I suppose this is just another aspect of the first problem - the isolation of type conversion and the absence of keeping and reporting a stack-trace.  Someone more knowledgeable than I would need to expound on the similarities and differences.

What are the best practices or workarounds for the above shortcomings? I've been trying for several hours to figure out what causes an error message a la value for domain xxx violates check constraint "xxx_check" by rewriting table definitions, inserting data row by row and so on, to no avail. What I need is a full chain of the objects (column -> table -> constraint -> domain -> check) that are involved in the error.

​If the error starts with "value for domain xxx violates" then the only place to look is at your "create domain" statement for "xxx".  Well, at least once you know what the actually offending value is...which right now might be a bit of a challenge depending on the situation (if your first example applies the data doesn't matter and critically evaluating the create domain statement might be sufficient).
 
I'm writing this to the developers' list because I see the above observations as serious shortcomings in an otherwise great piece of software that can probably not be fixed by using client-side code only.

​I suppose the best practice when dealing with a lack of information in the error handle code flows is to limit the amount of context that is in scope by unit testing.  And while they are absolutely short-comings overcoming them has cost in terms of both developer effort and, more importantly, runtime performance.

David J.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)
Next
From: Projat Banerjee
Date:
Subject: HELP