Thread: Better error message for select_common_type()

Better error message for select_common_type()

From

Peter Eisentraut

Date:

23 April 2007, 17:55:32

So I was informed today that UNION types integer and text cannot be 
matched.  Alright, but it failed to tell which particular expressions 
in this 3-branch, 30-columns-each UNION clause in a 100-line statement 
it was talking about.  So I made the attached patch to give some better 
pointers.  Example:

peter=# values(0,1), (1::bigint,2), ('text'::text,3);
ERROR:  42804: VALUES types bigint at position 2 and text at position 3 
cannot be matched in instance 1

I'm not sure about the terminology "position" and "instance"; they're 
just two coordinates to get at the problem.

None of this will help if you have multiple unrelated clauses that 
invoke select_common_type(), but that might be better handled using the 
parser location mechanism.

Comments?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Better error message for select_common_type()

From

Gregory Stark

Date:

23 April 2007, 18:11:36

"Peter Eisentraut" <peter_e@gmx.net> writes:

> peter=# values(0,1), (1::bigint,2), ('text'::text,3);
> ERROR:  42804: VALUES types bigint at position 2 and text at position 3 
> cannot be matched in instance 1
>
> I'm not sure about the terminology "position" and "instance"; they're 
> just two coordinates to get at the problem.

Wouldn't that just be column 1 in rows 2 and 3?

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com

Re: Better error message for select_common_type()

From

Tom Lane

Date:

23 April 2007, 18:53:09

Peter Eisentraut <peter_e@gmx.net> writes:
> None of this will help if you have multiple unrelated clauses that 
> invoke select_common_type(), but that might be better handled using the 
> parser location mechanism.

+1 on using the parser location mechanism and avoiding the terminology
problem altogether.  I fear though that we're not set up to have
multiple locations in one error report.  Will it be sufficient if we
point at one of the two offending expressions?  (I'd guess pointing at
the second makes the most sense, if feasible.)
        regards, tom lane

Re: Better error message for select_common_type()

From

Peter Eisentraut

Date:

23 April 2007, 19:11:35

Gregory Stark wrote:
> "Peter Eisentraut" <peter_e@gmx.net> writes:
> > peter=# values(0,1), (1::bigint,2), ('text'::text,3);
> > ERROR:  42804: VALUES types bigint at position 2 and text at
> > position 3 cannot be matched in instance 1
> >
> > I'm not sure about the terminology "position" and "instance";
> > they're just two coordinates to get at the problem.
>
> Wouldn't that just be column 1 in rows 2 and 3?

In this case yes, but the routine is also used for UNION, IN, GREATEST, 
JOIN/USING and others.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Better error message for select_common_type()

From

Peter Eisentraut

Date:

23 April 2007, 19:18:26

Tom Lane wrote:
> +1 on using the parser location mechanism and avoiding the
> terminology problem altogether.

I figured we would let the parser only point to the UNION or VALUES or 
whatever word.  It would be fairly cumbersome to drag the individual 
expression positions down into select_common_value() for full 
precision.

> I fear though that we're not set up 
> to have multiple locations in one error report.  Will it be
> sufficient if we point at one of the two offending expressions?  (I'd
> guess pointing at the second makes the most sense, if feasible.)

I don't think that would help.  In the example I was looking at 90 
expression and I had no idea in most cases what their results types 
are, so if it tells me that the 15th expression somewhere doesn't 
match, I would need to know which is the other mismatching one.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Better error message for select_common_type()

From

Tom Lane

Date:

23 April 2007, 20:02:43

Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane wrote:
>> +1 on using the parser location mechanism and avoiding the
>> terminology problem altogether.

> I figured we would let the parser only point to the UNION or VALUES or 
> whatever word.  It would be fairly cumbersome to drag the individual 
> expression positions down into select_common_value() for full 
> precision.

Possibly.  I was thinking of demanding that callers pass an additional
list containing positions for the expressions, but hadn't looked to see
how easy that might be.  In any case, if we need to point at both
expressions then it's not gonna work.

For the VALUES case, the suggestion of "row" and "column" terminology
seems the right thing, but for UNION it would be better to use "branch"
perhaps ("row" certainly seems misleading).  How can we make that work
without indulging in untranslatable keyword-insertion?

Another possibility is "alternative" and "column", which seems like it
applies more or less equally poorly to both cases.
        regards, tom lane

Re: Better error message for select_common_type()

From

Alvaro Herrera

Date:

23 April 2007, 20:33:35

Tom Lane wrote:

> For the VALUES case, the suggestion of "row" and "column" terminology
> seems the right thing, but for UNION it would be better to use "branch"
> perhaps ("row" certainly seems misleading).  How can we make that work
> without indulging in untranslatable keyword-insertion?
> 
> Another possibility is "alternative" and "column", which seems like it
> applies more or less equally poorly to both cases.

Maybe it would be good to have distinctive terminology if at all
possible, as it will be clearer for both cases.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Better error message for select_common_type()

From

Gregory Stark

Date:

23 April 2007, 21:11:48

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> For the VALUES case, the suggestion of "row" and "column" terminology
> seems the right thing, but for UNION it would be better to use "branch"
> perhaps ("row" certainly seems misleading).  How can we make that work
> without indulging in untranslatable keyword-insertion?

Hm, I guess the SQL spec terminology in both cases would be "table
expression".

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com

Re: Better error message for select_common_type()

From

Tom Lane

Date:

17 March 2008, 13:55:24

Peter Eisentraut <peter_e@gmx.net> writes:
> ... I'm not sure about the terminology "position" and "instance"; they're 
> just two coordinates to get at the problem.

> None of this will help if you have multiple unrelated clauses that 
> invoke select_common_type(), but that might be better handled using the 
> parser location mechanism.

Were there any objections to changing this patch so that it reports
the second expression's parser location, instead of some arbitrary
numbers?  The way I'm envisioning doing it is:

1. Invent an exprLocation() function (comparable to, say, exprType)
that knows how to get the parser location from any subtype of Node that
has one.

2. Make a variant of select_common_type() that takes a list of Exprs
instead of just type OIDs.  It can get the type IDs from these
using exprType(), and it can get their locations using exprLocation()
if needed.

We could almost just replace the current form of select_common_type()
with the version envisioned in #2.  In a quick grep, there is only
one usage of select_common_type() that isn't invoking it on a list
of exprType() results that could be trivially changed over to the
underlying expressions instead --- and that is in
transformSetOperationTree, which may be dealing with inputs that
are previously-resolved output types for child set operations.
I'm not sure about a better way to complain about type mismatches in
nested set operations, anyway.  We could possibly keep track of one of
the sub-expressions that had determined the resolved output type, and
point to that, but it would likely seem a bit arbitrary to the user.
Thoughts anyone?
        regards, tom lane

Re: Better error message for select_common_type()

From

Tom Lane

Date:

17 March 2008, 20:19:32

I wrote:
> Were there any objections to changing this patch so that it reports
> the second expression's parser location, instead of some arbitrary
> numbers?  The way I'm envisioning doing it is:

> 1. Invent an exprLocation() function (comparable to, say, exprType)
> that knows how to get the parser location from any subtype of Node that
> has one.

> 2. Make a variant of select_common_type() that takes a list of Exprs
> instead of just type OIDs.  It can get the type IDs from these
> using exprType(), and it can get their locations using exprLocation()
> if needed.

I started to look at this and immediately found out that the above
blithe sketch has nothing to do with reality.  The problem is that
the current parser location mechanism stores locations only for
nodes that appear in raw grammar trees (gram.y output), *not* in
analyzed expressions (transformExpr output).  This was an intentional
choice based on a couple of factors:

* Once we no longer have the parser input string available, the location
information would be just so much wasted space.

* It would add a weird special case to the equalfuncs.c routines:
should location fields be compared?  (Probably not, but it seems a bit
unprincipled to ignore them.)  And other places might have comparable
uncertainties what to do with 'em.

We'd need to either go back on that decision or pass in location
information separately to select_common_type.  I think I prefer the
latter, but it's messier.

(On the third hand, you could make a case that including location
info in analyzed expressions makes it feasible to point at problems
detected at higher levels of analyze.c than just the first-level
transformExpr() call.  select_common_type's problem could be seen
as just one aspect of what might be a widespread need.)

Another problem is that only a rather small subset of raw-grammar
expression node types actually carry locations at all.  I had always
intended to go back and extend that, but it's not done yet.  One reason
it's not done is that currently a lot of expression node types are used
for both raw-grammar output and analyzed expressions, which brings us
right back up against the issue above.  I'd be inclined to fix that by
extending AExpr even more, and/or inventing an analogous raw-grammar
node type for things that take variable numbers of arguments, but
still it's more work.

So this is all eminently do-able but it seems too much to be tackling
during commit fest.  I'd like to throw this item back on the TODO list.

Or we could apply Peter's patch more or less as-is, but I don't like
that.  I don't think it solves the stated problem: if you know that CASE
branches 3 and 5 don't match, that still doesn't help you in a monster
query with lots of CASEs.  I think we can and must do better.
        regards, tom lane

Re: Better error message for select_common_type()

From

Gregory Stark

Date:

17 March 2008, 21:31:50

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Or we could apply Peter's patch more or less as-is, but I don't like
> that.  I don't think it solves the stated problem: if you know that CASE
> branches 3 and 5 don't match, that still doesn't help you in a monster
> query with lots of CASEs.  I think we can and must do better.

Do we have something more helpful than "branches 3 and 5"? Perhaps printing
the actual transformed expressions?


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about
EnterpriseDB'sPostgreSQL training!

Re: Better error message for select_common_type()

From

Tom Lane

Date:

17 March 2008, 22:01:22

Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> Or we could apply Peter's patch more or less as-is, but I don't like
>> that.  I don't think it solves the stated problem: if you know that CASE
>> branches 3 and 5 don't match, that still doesn't help you in a monster
>> query with lots of CASEs.  I think we can and must do better.

> Do we have something more helpful than "branches 3 and 5"?

That's exactly the point of discussion.  A parser location is what we
need, the problem is that this patch doesn't provide it.

> Perhaps printing the actual transformed expressions?

Don't think it solves the problem either.  For instance, if there are
a hundred references to variable X in your query, printing "X" isn't
going to get you far.
        regards, tom lane

Re: Better error message for select_common_type()

From

Peter Eisentraut

Date:

18 March 2008, 15:45:23

Am Dienstag, 18. März 2008 schrieb Tom Lane:
> Or we could apply Peter's patch more or less as-is, but I don't like
> that.  I don't think it solves the stated problem: if you know that CASE
> branches 3 and 5 don't match, that still doesn't help you in a monster
> query with lots of CASEs.  I think we can and must do better.

Yeah, that and the other reason I sort of gave up on this approach is that it 
is nearly impossible to find some good terminology that works for all callers 
of select_common_type() (VALUES, UNION, JOIN, IN, CASE, ARRAY, COALESCE, 
GREATEST, according to my notes).  A pointer into the statement would 
certainly be much nicer.

Re: Better error message for select_common_type()

From

Bruce Momjian

Date:

24 March 2008, 13:06:06

Added to TODO:

* Improve reporting of UNION type mismatches
 http://archives.postgresql.org/pgsql-hackers/2007-04/msg00944.php
http://archives.postgresql.org/pgsql-hackers/2008-03/msg00597.php


---------------------------------------------------------------------------

Peter Eisentraut wrote:
> So I was informed today that UNION types integer and text cannot be 
> matched.  Alright, but it failed to tell which particular expressions 
> in this 3-branch, 30-columns-each UNION clause in a 100-line statement 
> it was talking about.  So I made the attached patch to give some better 
> pointers.  Example:
> 
> peter=# values(0,1), (1::bigint,2), ('text'::text,3);
> ERROR:  42804: VALUES types bigint at position 2 and text at position 3 
> cannot be matched in instance 1
> 
> I'm not sure about the terminology "position" and "instance"; they're 
> just two coordinates to get at the problem.
> 
> None of this will help if you have multiple unrelated clauses that 
> invoke select_common_type(), but that might be better handled using the 
> parser location mechanism.
> 
> Comments?
> 
> -- 
> Peter Eisentraut
> http://developer.postgresql.org/~petere/

[ Attachment, skipping... ]

> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
> 
>                 http://www.postgresql.org/about/donate

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +