Thread: Better error message for select_common_type()
So I was informed today that UNION types integer and text cannot be matched. Alright, but it failed to tell which particular expressions in this 3-branch, 30-columns-each UNION clause in a 100-line statement it was talking about. So I made the attached patch to give some better pointers. Example: peter=# values(0,1), (1::bigint,2), ('text'::text,3); ERROR: 42804: VALUES types bigint at position 2 and text at position 3 cannot be matched in instance 1 I'm not sure about the terminology "position" and "instance"; they're just two coordinates to get at the problem. None of this will help if you have multiple unrelated clauses that invoke select_common_type(), but that might be better handled using the parser location mechanism. Comments? -- Peter Eisentraut http://developer.postgresql.org/~petere/
"Peter Eisentraut" <peter_e@gmx.net> writes: > peter=# values(0,1), (1::bigint,2), ('text'::text,3); > ERROR: 42804: VALUES types bigint at position 2 and text at position 3 > cannot be matched in instance 1 > > I'm not sure about the terminology "position" and "instance"; they're > just two coordinates to get at the problem. Wouldn't that just be column 1 in rows 2 and 3? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Peter Eisentraut <peter_e@gmx.net> writes: > None of this will help if you have multiple unrelated clauses that > invoke select_common_type(), but that might be better handled using the > parser location mechanism. +1 on using the parser location mechanism and avoiding the terminology problem altogether. I fear though that we're not set up to have multiple locations in one error report. Will it be sufficient if we point at one of the two offending expressions? (I'd guess pointing at the second makes the most sense, if feasible.) regards, tom lane
Gregory Stark wrote: > "Peter Eisentraut" <peter_e@gmx.net> writes: > > peter=# values(0,1), (1::bigint,2), ('text'::text,3); > > ERROR: 42804: VALUES types bigint at position 2 and text at > > position 3 cannot be matched in instance 1 > > > > I'm not sure about the terminology "position" and "instance"; > > they're just two coordinates to get at the problem. > > Wouldn't that just be column 1 in rows 2 and 3? In this case yes, but the routine is also used for UNION, IN, GREATEST, JOIN/USING and others. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Tom Lane wrote: > +1 on using the parser location mechanism and avoiding the > terminology problem altogether. I figured we would let the parser only point to the UNION or VALUES or whatever word. It would be fairly cumbersome to drag the individual expression positions down into select_common_value() for full precision. > I fear though that we're not set up > to have multiple locations in one error report. Will it be > sufficient if we point at one of the two offending expressions? (I'd > guess pointing at the second makes the most sense, if feasible.) I don't think that would help. In the example I was looking at 90 expression and I had no idea in most cases what their results types are, so if it tells me that the 15th expression somewhere doesn't match, I would need to know which is the other mismatching one. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane wrote: >> +1 on using the parser location mechanism and avoiding the >> terminology problem altogether. > I figured we would let the parser only point to the UNION or VALUES or > whatever word. It would be fairly cumbersome to drag the individual > expression positions down into select_common_value() for full > precision. Possibly. I was thinking of demanding that callers pass an additional list containing positions for the expressions, but hadn't looked to see how easy that might be. In any case, if we need to point at both expressions then it's not gonna work. For the VALUES case, the suggestion of "row" and "column" terminology seems the right thing, but for UNION it would be better to use "branch" perhaps ("row" certainly seems misleading). How can we make that work without indulging in untranslatable keyword-insertion? Another possibility is "alternative" and "column", which seems like it applies more or less equally poorly to both cases. regards, tom lane
Tom Lane wrote: > For the VALUES case, the suggestion of "row" and "column" terminology > seems the right thing, but for UNION it would be better to use "branch" > perhaps ("row" certainly seems misleading). How can we make that work > without indulging in untranslatable keyword-insertion? > > Another possibility is "alternative" and "column", which seems like it > applies more or less equally poorly to both cases. Maybe it would be good to have distinctive terminology if at all possible, as it will be clearer for both cases. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > For the VALUES case, the suggestion of "row" and "column" terminology > seems the right thing, but for UNION it would be better to use "branch" > perhaps ("row" certainly seems misleading). How can we make that work > without indulging in untranslatable keyword-insertion? Hm, I guess the SQL spec terminology in both cases would be "table expression". -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Peter Eisentraut <peter_e@gmx.net> writes: > ... I'm not sure about the terminology "position" and "instance"; they're > just two coordinates to get at the problem. > None of this will help if you have multiple unrelated clauses that > invoke select_common_type(), but that might be better handled using the > parser location mechanism. Were there any objections to changing this patch so that it reports the second expression's parser location, instead of some arbitrary numbers? The way I'm envisioning doing it is: 1. Invent an exprLocation() function (comparable to, say, exprType) that knows how to get the parser location from any subtype of Node that has one. 2. Make a variant of select_common_type() that takes a list of Exprs instead of just type OIDs. It can get the type IDs from these using exprType(), and it can get their locations using exprLocation() if needed. We could almost just replace the current form of select_common_type() with the version envisioned in #2. In a quick grep, there is only one usage of select_common_type() that isn't invoking it on a list of exprType() results that could be trivially changed over to the underlying expressions instead --- and that is in transformSetOperationTree, which may be dealing with inputs that are previously-resolved output types for child set operations. I'm not sure about a better way to complain about type mismatches in nested set operations, anyway. We could possibly keep track of one of the sub-expressions that had determined the resolved output type, and point to that, but it would likely seem a bit arbitrary to the user. Thoughts anyone? regards, tom lane
I wrote: > Were there any objections to changing this patch so that it reports > the second expression's parser location, instead of some arbitrary > numbers? The way I'm envisioning doing it is: > 1. Invent an exprLocation() function (comparable to, say, exprType) > that knows how to get the parser location from any subtype of Node that > has one. > 2. Make a variant of select_common_type() that takes a list of Exprs > instead of just type OIDs. It can get the type IDs from these > using exprType(), and it can get their locations using exprLocation() > if needed. I started to look at this and immediately found out that the above blithe sketch has nothing to do with reality. The problem is that the current parser location mechanism stores locations only for nodes that appear in raw grammar trees (gram.y output), *not* in analyzed expressions (transformExpr output). This was an intentional choice based on a couple of factors: * Once we no longer have the parser input string available, the location information would be just so much wasted space. * It would add a weird special case to the equalfuncs.c routines: should location fields be compared? (Probably not, but it seems a bit unprincipled to ignore them.) And other places might have comparable uncertainties what to do with 'em. We'd need to either go back on that decision or pass in location information separately to select_common_type. I think I prefer the latter, but it's messier. (On the third hand, you could make a case that including location info in analyzed expressions makes it feasible to point at problems detected at higher levels of analyze.c than just the first-level transformExpr() call. select_common_type's problem could be seen as just one aspect of what might be a widespread need.) Another problem is that only a rather small subset of raw-grammar expression node types actually carry locations at all. I had always intended to go back and extend that, but it's not done yet. One reason it's not done is that currently a lot of expression node types are used for both raw-grammar output and analyzed expressions, which brings us right back up against the issue above. I'd be inclined to fix that by extending AExpr even more, and/or inventing an analogous raw-grammar node type for things that take variable numbers of arguments, but still it's more work. So this is all eminently do-able but it seems too much to be tackling during commit fest. I'd like to throw this item back on the TODO list. Or we could apply Peter's patch more or less as-is, but I don't like that. I don't think it solves the stated problem: if you know that CASE branches 3 and 5 don't match, that still doesn't help you in a monster query with lots of CASEs. I think we can and must do better. regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Or we could apply Peter's patch more or less as-is, but I don't like > that. I don't think it solves the stated problem: if you know that CASE > branches 3 and 5 don't match, that still doesn't help you in a monster > query with lots of CASEs. I think we can and must do better. Do we have something more helpful than "branches 3 and 5"? Perhaps printing the actual transformed expressions? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
Gregory Stark <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> Or we could apply Peter's patch more or less as-is, but I don't like >> that. I don't think it solves the stated problem: if you know that CASE >> branches 3 and 5 don't match, that still doesn't help you in a monster >> query with lots of CASEs. I think we can and must do better. > Do we have something more helpful than "branches 3 and 5"? That's exactly the point of discussion. A parser location is what we need, the problem is that this patch doesn't provide it. > Perhaps printing the actual transformed expressions? Don't think it solves the problem either. For instance, if there are a hundred references to variable X in your query, printing "X" isn't going to get you far. regards, tom lane
Am Dienstag, 18. März 2008 schrieb Tom Lane: > Or we could apply Peter's patch more or less as-is, but I don't like > that. I don't think it solves the stated problem: if you know that CASE > branches 3 and 5 don't match, that still doesn't help you in a monster > query with lots of CASEs. I think we can and must do better. Yeah, that and the other reason I sort of gave up on this approach is that it is nearly impossible to find some good terminology that works for all callers of select_common_type() (VALUES, UNION, JOIN, IN, CASE, ARRAY, COALESCE, GREATEST, according to my notes). A pointer into the statement would certainly be much nicer.
Added to TODO: * Improve reporting of UNION type mismatches http://archives.postgresql.org/pgsql-hackers/2007-04/msg00944.php http://archives.postgresql.org/pgsql-hackers/2008-03/msg00597.php --------------------------------------------------------------------------- Peter Eisentraut wrote: > So I was informed today that UNION types integer and text cannot be > matched. Alright, but it failed to tell which particular expressions > in this 3-branch, 30-columns-each UNION clause in a 100-line statement > it was talking about. So I made the attached patch to give some better > pointers. Example: > > peter=# values(0,1), (1::bigint,2), ('text'::text,3); > ERROR: 42804: VALUES types bigint at position 2 and text at position 3 > cannot be matched in instance 1 > > I'm not sure about the terminology "position" and "instance"; they're > just two coordinates to get at the problem. > > None of this will help if you have multiple unrelated clauses that > invoke select_common_type(), but that might be better handled using the > parser location mechanism. > > Comments? > > -- > Peter Eisentraut > http://developer.postgresql.org/~petere/ [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +