Thread: [HACKERS] New CORRESPONDING clause design
On Tue, Jan 17, 2017 at 12:37 AM, Surafel Temsgen <surafel3000@gmail.com> wrote: > I am new here and I really want to contribute, I have read same resource > that help understanding database system and postgresql. I would like to > start implementing sql syntax corresponding by clause because I believe > implementing sql syntax gives an opportunity to familiarize many part of > postgresql source code. Previous implementation is here and have an issue on > explain query and break cases on unlabeled NULLs > To repeat what a corresponding by clause means > Corresponding clause either contains a BY(...) clause or not. If it > doesn't have a BY(...) clause the usage is as follows. This is great stuff. Does the syntax only apply to UNION? I would imagine it would also apply to INTERSECT/EXCEPT? What about UNION ALL? merlin
On Tue, Jan 17, 2017 at 08:20:25AM -0600, Merlin Moncure wrote: > On Tue, Jan 17, 2017 at 12:37 AM, Surafel Temsgen <surafel3000@gmail.com> wrote: > > I am new here and I really want to contribute, I have read same resource > > that help understanding database system and postgresql. I would like to > > start implementing sql syntax corresponding by clause because I believe > > implementing sql syntax gives an opportunity to familiarize many part of > > postgresql source code. Previous implementation is here and have an issue on > > explain query and break cases on unlabeled NULLs > > To repeat what a corresponding by clause means > > Corresponding clause either contains a BY(...) clause or not. If it > > doesn't have a BY(...) clause the usage is as follows. > > This is great stuff. Does the syntax only apply to UNION? I would > imagine it would also apply to INTERSECT/EXCEPT? What about UNION > ALL? My draft working standard from 2011 says in 7IWD-02-Foundation section 7.13 <query expression>: a) If CORRESPONDING is specified, then: i) Within the columns of T1, equivalent <column name>s shall not be specified more than once and within the columns of T2, equivalent <column name>s shall not be specified more than once. ii) At least one column of T1 shall have a <column name> that is the <column name> of some column of T2. iii) Case: 1) If <corresponding column list> is not specified, then let SL be a <select list> of those <column name>s that are <column name>s of both T1 and T2 in the order that those <column name>s appear in T1. 2) If <corresponding column list> is specified, then let SL be a <select list> of those <column name>s explicitly appearing in the <corresponding column list> in the order that these <column name>s appear in the <corresponding column list>. Every <column name> in the <corresponding column list> shall be a <column name> of both T1 and T2. iv) The <query term> or <query expression body> is equivalent to: ( SELECT SL FROM TN1 ) OP ( SELECT SL FROM TN2 ) Earlier, it defines ( UNION | EXCEPT ) [ ALL | DISTINCT ] to have a meaning in this context, to wit: <query expression body> ::= <query term> | <query expression body> UNION [ ALL | DISTINCT ] [ <corresponding spec> ] <query term> | <query expression body> EXCEPT [ ALL | DISTINCT ] [ <corresponding spec> ] <query term> Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Surafel Temsgen <surafel3000@gmail.com> writes: > My design is > *In parsing stage* > 1. Check at least one common column name appear in queries > 2. If corresponding column list is not specified, then make corresponding > list from common column name in queries target lists in the order > that those column names appear in first query > 3. If corresponding column list is specified, then check that every column > name > in The corresponding column list appears in column names of both queries > *In planner* > 1. Take projection columen name from corresponding list > 2. Reorder left and right queries target lists according to corresponding > list > 3. For each query, project columens as many as number of corresponding > columen. FWIW, I think you need to do most of that work in the parser, not the planner. The parser certainly has to determine the actual output column names and types of the setop construct, and we usually consider that the parsing stage is expected to fully determine the semantics of the query. So I'd envision the parser as emitting some explicit representation of the column mapping for each side, rather than expecting the planner to determine for itself what maps to what. It probably does make sense for the actual implementation to involve inserting projection nodes below the setop. I'm pretty sure the executor code for setops expects to just pass input tuples through without projection. You could imagine changing that, but it would add a penalty for non-CORRESPONDING setops, which IMO would be the wrong tradeoff. regards, tom lane
This patch compiles and tests successfully with master branch on ubuntu-15.10-desktop-amd64.It also includes documentation and new regression tests in union.sql.
Regards
Surafel Temesgen
Surafel Temsgen <surafel3000@gmail.com> writes:
> My design is
> *In parsing stage*
> 1. Check at least one common column name appear in queries
> 2. If corresponding column list is not specified, then make corresponding
> list from common column name in queries target lists in the order
> that those column names appear in first query
> 3. If corresponding column list is specified, then check that every column
> name
> in The corresponding column list appears in column names of both queries
> *In planner*
> 1. Take projection columen name from corresponding list
> 2. Reorder left and right queries target lists according to corresponding
> list
> 3. For each query, project columens as many as number of corresponding
> columen.
FWIW, I think you need to do most of that work in the parser, not the
planner. The parser certainly has to determine the actual output column
names and types of the setop construct, and we usually consider that the
parsing stage is expected to fully determine the semantics of the query.
So I'd envision the parser as emitting some explicit representation of
the column mapping for each side, rather than expecting the planner to
determine for itself what maps to what.
It probably does make sense for the actual implementation to involve
inserting projection nodes below the setop. I'm pretty sure the executor
code for setops expects to just pass input tuples through without
projection. You could imagine changing that, but it would add a penalty
for non-CORRESPONDING setops, which IMO would be the wrong tradeoff.
regards, tom lane
Attachment
On Thu, Feb 16, 2017 at 8:28 PM, Surafel Temsgen <surafel3000@gmail.com> wrote: > Here is the implementation of the clause with the slight change, instead of > doing column mapping for each side of leaf Queries in planner I make the > projection nodes output to corresponding column lists only. > > This patch compiles and tests successfully with master branch on > ubuntu-15.10-desktop-amd64.It also includes documentation and new regression > tests in union.sql. You should probably add this to https://commitfest.postgresql.org/ so that it doesn't get forgotten about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d; ERROR: each UNION query must have the same number of columns LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...
postgres=# create table t1(a int, b int, c int); CREATE TABLE Time: 63,260 ms postgres=# create table t2(a int, b int); CREATE TABLE Time: 57,120 ms postgres=# select * from t1 union corresponding select * from t2; ERROR: each UNION query must have the same number of columns LINE 1: select * from t1 union corresponding select * from t2;
Hi ,
Here is a patch corrected as your feedback except missed tests case because corresponding by clause is implemented on the top of set operation and you can’t do that to set operation without corresponding by clause too
Eg
postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
^
postgres=# create table t1(a int, b int, c int);
CREATE TABLE
postgres=# create table t2(a int, b int);
CREATE TABLE
postgres=# select * from t1 union select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union select * from t2;
Thanks
Surafel
HiI am sending a review of this interesting feature.I found following issues, questions:1. unclosed tags <optional> in documentation2. bad name "changeTargetEntry" - should be makeTargetEntry?3. Why you removed lot of asserts in prepunion.c? These asserts should be valid still4. make_coresponding_target has wrong formatting5. error "%s queries with a CORRESPONDING clause must have at least one column with the same name" has wrong formatting, you can show position6. previous issue is repeated - look on formatting ereport function, please, you can use DETAIL and HINT fields7. corresponding clause should to contain column list (I am looking to ANSI/SQL 99) - you are using expr_list, what has not sense and probably it has impact on all implementation.8. typo orderCorrespondingLsit(List *targetlist)9. I miss more tests for CORRESPONDING BY10. if I understand to this feature, this query should to workpostgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d; ERROR: each UNION query must have the same number of columns LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...postgres=# create table t1(a int, b int, c int); CREATE TABLE Time: 63,260 ms postgres=# create table t2(a int, b int); CREATE TABLE Time: 57,120 ms postgres=# select * from t1 union corresponding select * from t2; ERROR: each UNION query must have the same number of columns LINE 1: select * from t1 union corresponding select * from t2;If it is your first patch to Postgres, then it is perfect work!The @7 is probably most significant - I dislike a expression list there. name_list should be better there.RegardsPavel
Attachment
Hi ,
Here is a patch corrected as your feedback except missed tests case because corresponding by clause is implemented on the top of set operation and you can’t do that to set operation without corresponding by clause too
ERROR: each UNION query must have the same number of columns LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3...Corresponding clause should to work like projection filter.
Regards
Pavel
Eg
postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
^ postgres=# create table t1(a int, b int, c int);
CREATE TABLE
postgres=# create table t2(a int, b int);
CREATE TABLE
postgres=# select * from t1 union select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union select * from t2;
Thanks
Surafel
On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:HiI am sending a review of this interesting feature.I found following issues, questions:1. unclosed tags <optional> in documentation2. bad name "changeTargetEntry" - should be makeTargetEntry?3. Why you removed lot of asserts in prepunion.c? These asserts should be valid still4. make_coresponding_target has wrong formatting5. error "%s queries with a CORRESPONDING clause must have at least one column with the same name" has wrong formatting, you can show position6. previous issue is repeated - look on formatting ereport function, please, you can use DETAIL and HINT fields7. corresponding clause should to contain column list (I am looking to ANSI/SQL 99) - you are using expr_list, what has not sense and probably it has impact on all implementation.8. typo orderCorrespondingLsit(List *targetlist)9. I miss more tests for CORRESPONDING BY10. if I understand to this feature, this query should to workpostgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d; ERROR: each UNION query must have the same number of columns LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...postgres=# create table t1(a int, b int, c int); CREATE TABLE Time: 63,260 ms postgres=# create table t2(a int, b int); CREATE TABLE Time: 57,120 ms postgres=# select * from t1 union corresponding select * from t2; ERROR: each UNION query must have the same number of columns LINE 1: select * from t1 union corresponding select * from t2;If it is your first patch to Postgres, then it is perfect work!The @7 is probably most significant - I dislike a expression list there. name_list should be better there.RegardsPavel
2017-03-09 13:18 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:Hi ,
Here is a patch corrected as your feedback except missed tests case because corresponding by clause is implemented on the top of set operation and you can’t do that to set operation without corresponding by clause too
I don't understand.The following statement should to workpostgres=# select 10 as a, 20 as b union corresponding select 20 as a, 30 as b, 40 as c;ERROR: each UNION query must have the same number of columns LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3...Corresponding clause should to work like projection filter.
RegardsPavel
Eg
postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
^ postgres=# create table t1(a int, b int, c int);
CREATE TABLE
postgres=# create table t2(a int, b int);
CREATE TABLE
postgres=# select * from t1 union select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union select * from t2;
Thanks
Surafel
On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:HiI am sending a review of this interesting feature.I found following issues, questions:1. unclosed tags <optional> in documentation2. bad name "changeTargetEntry" - should be makeTargetEntry?3. Why you removed lot of asserts in prepunion.c? These asserts should be valid still4. make_coresponding_target has wrong formatting5. error "%s queries with a CORRESPONDING clause must have at least one column with the same name" has wrong formatting, you can show position6. previous issue is repeated - look on formatting ereport function, please, you can use DETAIL and HINT fields7. corresponding clause should to contain column list (I am looking to ANSI/SQL 99) - you are using expr_list, what has not sense and probably it has impact on all implementation.8. typo orderCorrespondingLsit(List *targetlist)9. I miss more tests for CORRESPONDING BY10. if I understand to this feature, this query should to workpostgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d; ERROR: each UNION query must have the same number of columns LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...postgres=# create table t1(a int, b int, c int); CREATE TABLE Time: 63,260 ms postgres=# create table t2(a int, b int); CREATE TABLE Time: 57,120 ms postgres=# select * from t1 union corresponding select * from t2; ERROR: each UNION query must have the same number of columns LINE 1: select * from t1 union corresponding select * from t2;If it is your first patch to Postgres, then it is perfect work!The @7 is probably most significant - I dislike a expression list there. name_list should be better there.RegardsPavel
Yes, you are correct it should to work on CORRESPONDING clause case. SQL 20nn standard draft only said each query to be of the same degree in a case of set operation without corresponding clause. The attached patch is corrected as such .I add those new test case to regression test too
Regards
Surafel
hi2017-03-09 17:19 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:2017-03-09 13:18 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:Hi ,
Here is a patch corrected as your feedback except missed tests case because corresponding by clause is implemented on the top of set operation and you can’t do that to set operation without corresponding by clause too
I don't understand.The following statement should to workpostgres=# select 10 as a, 20 as b union corresponding select 20 as a, 30 as b, 40 as c;ERROR: each UNION query must have the same number of columns LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3...Corresponding clause should to work like projection filter.I found a link to postgresql mailing list related to some older try to this feature implementationRegardsPavel
Eg
postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
^ postgres=# create table t1(a int, b int, c int);
CREATE TABLE
postgres=# create table t2(a int, b int);
CREATE TABLE
postgres=# select * from t1 union select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union select * from t2;
Thanks
Surafel
On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:HiI am sending a review of this interesting feature.I found following issues, questions:1. unclosed tags <optional> in documentation2. bad name "changeTargetEntry" - should be makeTargetEntry?3. Why you removed lot of asserts in prepunion.c? These asserts should be valid still4. make_coresponding_target has wrong formatting5. error "%s queries with a CORRESPONDING clause must have at least one column with the same name" has wrong formatting, you can show position6. previous issue is repeated - look on formatting ereport function, please, you can use DETAIL and HINT fields7. corresponding clause should to contain column list (I am looking to ANSI/SQL 99) - you are using expr_list, what has not sense and probably it has impact on all implementation.8. typo orderCorrespondingLsit(List *targetlist)9. I miss more tests for CORRESPONDING BY10. if I understand to this feature, this query should to workpostgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d; ERROR: each UNION query must have the same number of columns LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...postgres=# create table t1(a int, b int, c int); CREATE TABLE Time: 63,260 ms postgres=# create table t2(a int, b int); CREATE TABLE Time: 57,120 ms postgres=# select * from t1 union corresponding select * from t2; ERROR: each UNION query must have the same number of columns LINE 1: select * from t1 union corresponding select * from t2;If it is your first patch to Postgres, then it is perfect work!The @7 is probably most significant - I dislike a expression list there. name_list should be better there.RegardsPavel
Attachment
Yes, you are correct it should to work on CORRESPONDING clause case. SQL 20nn standard draft only said each query to be of the same degree in a case of set operation without corresponding clause. The attached patch is corrected as such .I add those new test case to regression test too
Regards
Surafel
On Thu, Mar 9, 2017 at 9:49 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:hi2017-03-09 17:19 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:2017-03-09 13:18 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:Hi ,
Here is a patch corrected as your feedback except missed tests case because corresponding by clause is implemented on the top of set operation and you can’t do that to set operation without corresponding by clause too
I don't understand.The following statement should to workpostgres=# select 10 as a, 20 as b union corresponding select 20 as a, 30 as b, 40 as c;ERROR: each UNION query must have the same number of columns LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3...Corresponding clause should to work like projection filter.I found a link to postgresql mailing list related to some older try to this feature implementationRegardsPavel
Eg
postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
^ postgres=# create table t1(a int, b int, c int);
CREATE TABLE
postgres=# create table t2(a int, b int);
CREATE TABLE
postgres=# select * from t1 union select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union select * from t2;
Thanks
Surafel
On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:HiI am sending a review of this interesting feature.I found following issues, questions:1. unclosed tags <optional> in documentation2. bad name "changeTargetEntry" - should be makeTargetEntry?3. Why you removed lot of asserts in prepunion.c? These asserts should be valid still4. make_coresponding_target has wrong formatting5. error "%s queries with a CORRESPONDING clause must have at least one column with the same name" has wrong formatting, you can show position6. previous issue is repeated - look on formatting ereport function, please, you can use DETAIL and HINT fields7. corresponding clause should to contain column list (I am looking to ANSI/SQL 99) - you are using expr_list, what has not sense and probably it has impact on all implementation.8. typo orderCorrespondingLsit(List *targetlist)9. I miss more tests for CORRESPONDING BY10. if I understand to this feature, this query should to workpostgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d; ERROR: each UNION query must have the same number of columns LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...postgres=# create table t1(a int, b int, c int); CREATE TABLE Time: 63,260 ms postgres=# create table t2(a int, b int); CREATE TABLE Time: 57,120 ms postgres=# select * from t1 union corresponding select * from t2; ERROR: each UNION query must have the same number of columns LINE 1: select * from t1 union corresponding select * from t2;If it is your first patch to Postgres, then it is perfect work!The @7 is probably most significant - I dislike a expression list there. name_list should be better there.RegardsPavel
2017-03-10 10:13 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:Yes, you are correct it should to work on CORRESPONDING clause case. SQL 20nn standard draft only said each query to be of the same degree in a case of set operation without corresponding clause. The attached patch is corrected as such .I add those new test case to regression test too
Thank you - I will recheck it.
Hi2017-03-10 12:55 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:2017-03-10 10:13 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:Yes, you are correct it should to work on CORRESPONDING clause case. SQL 20nn standard draft only said each query to be of the same degree in a case of set operation without corresponding clause. The attached patch is corrected as such .I add those new test case to regression test too
Thank you - I will recheck it.Fast check - it looks well
RegardsPavel
Attachment
I am sending minor update - cleaning formatting and white spaces, error messages + few more tests
Maybe correspondingClause needs own node type with attached location. Then context can be much better positioned.
I think we can solve it by using your option or using expr_list for corresponding column and check the syntax manually.
In my opinion, the last option eliminate the introduction of new node for only the sake of error position.
What did you think about the second option?
Regards
Surafel
On Sat, Mar 11, 2017 at 9:01 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:I am sending minor update - cleaning formatting and white spaces, error messages + few more testsThank you very much for your helpMaybe correspondingClause needs own node type with attached location. Then context can be much better positioned.I think we can solve it by using your option or using expr_list for corresponding column and check the syntax manually.
In my opinion, the last option eliminate the introduction of new node for only the sake of error position.
What did you think about the second option?
Regards
Surafel
Some errors are related to just CORRESPONDING without any columns. So using expr doesn't help here. So parse node CORRESPONDING can solve both issues.
Attachment
hiSome errors are related to just CORRESPONDING without any columns. So using expr doesn't help here. So parse node CORRESPONDING can solve both issues.In current implementation pointing to a node means pointing to a node’s first element so I don’t think we can be able to point to CORRESPONDING without any columnsI find out that there is already a node prepare for the case called A_Const.The attached patch use that node
RegardsSurafel
Attachment
Pavel Stehule <pavel.stehule@gmail.com> writes: > I have not any objection - I'll mark this patch as ready for commiter I took a quick look through this and noted that it fails to touch ruleutils.c, which means that dumping of views containing CORRESPONDING certainly doesn't work. Also, the changes in parser/analyze.c seem rather massive and correspondingly hard to review. Is it possible to rearrange the patch to reduce the size of that diff? If you can avoid moving or reindenting existing code, that'd help. The code in that area seems rather confused, too. For instance, I'm not sure exactly what orderCorrespondingList() is good for, but it certainly doesn't look to be doing anything that either its name or its header comment (or the comments at the call sites) would suggest. Its input and output tlists are always in the same order. I'm a little disturbed by the fact that determineMatchingColumns() is called twice, and more disturbed by the fact that it looks to be O(N^2). This could be really slow with a lot of columns, couldn't it? I also think there should be some comments about exactly what matching semantics we're enforcing. The SQL standard says a) If CORRESPONDING is specified, then: i) Within the columns of T1, equivalent <column name>s shall not be specified more than once and within the columns of T2, equivalent <column name>sshall not be specified more than once. That seems unreasonably stringent to me; it ought to be sufficient to forbid duplicates of the names listed in CORRESPONDING, or the common column names if there's no BY list. But whichever restriction you prefer, this code seems to be failing to check it --- I certainly don't see any new error message about "column name "foo" appears more than once". I don't think you want to be leaving behind debugging cruft like this: + elog(DEBUG4, "%s", ltle->resname); I'm not impressed by using A_Const for the members of the CORRESPONDING name list. That's not a clever solution, that's a confusing kluge, because it's a complete violation of the meaning of A_Const. Elsewhere we just use lists of String for name lists, and that seems sufficient here. Personally I'd just use the existing columnList production rather than rolling your own. The comments in parsenodes.h about the new fields seem rather confused, in fact I think they're probably backwards. Also, I thought the test cases were excessively uninspired and repetitive. This, for example, seems like about two tests too many: +SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(a) SELECT 4 a, 5 b, 6 c; + a +--- + 1 + 4 +(2 rows) + +SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(b) SELECT 4 a, 5 b, 6 c; + b +--- + 2 + 5 +(2 rows) + +SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(c) SELECT 4 a, 5 b, 6 c; + c +--- + 3 + 6 +(2 rows) without even considering the fact that these are pretty duplicative of tests around them. And some of the added tests seem to be just testing basic UNION functionality, which we already have tests for. I fail to see the point of adding any of these: +SELECT 1 AS two UNION CORRESPONDING SELECT 2 two; + two +----- + 1 + 2 +(2 rows) + +SELECT 1 AS one UNION CORRESPONDING SELECT 1 one; + one +----- + 1 +(1 row) + +SELECT 1 AS two UNION ALL CORRESPONDING SELECT 2 two; + two +----- + 1 + 2 +(2 rows) + +SELECT 1 AS two UNION ALL CORRESPONDING SELECT 1 two; + two +----- + 1 + 1 +(2 rows) What I think actually is needed is some targeted testing showing non-interference with nearby features. As an example, CORRESPONDING can't safely be implemented by just replacing the tlists of the input sub-selects with shortened versions, because that would break situations such as DISTINCT in the sub-selects. I think it'd be a good idea to have a test along the lines of SELECT DISTINCT x, y FROM ...UNION ALL CORRESPONDINGSELECT DISTINCT x, z FROM ... with values chosen to show that the DISTINCT operators did operate on x/y and x/z, not just x alone. regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I have not any objection - I'll mark this patch as ready for commiter
I took a quick look through this and noted that it fails to touch
ruleutils.c, which means that dumping of views containing CORRESPONDING
certainly doesn't work.
Also, the changes in parser/analyze.c seem rather massive and
correspondingly hard to review. Is it possible to rearrange the
patch to reduce the size of that diff? If you can avoid moving
or reindenting existing code, that'd help.
The code in that area seems rather confused, too. For instance,
I'm not sure exactly what orderCorrespondingList() is good for,
but it certainly doesn't look to be doing anything that either its
name or its header comment (or the comments at the call sites) would
suggest. Its input and output tlists are always in the same order.
I'm a little disturbed by the fact that determineMatchingColumns()
is called twice, and more disturbed by the fact that it looks to be
O(N^2). This could be really slow with a lot of columns, couldn't it?
I also think there should be some comments about exactly what matching
semantics we're enforcing. The SQL standard says
a) If CORRESPONDING is specified, then:
i) Within the columns of T1, equivalent <column name>s shall
not be specified more than once and within the columns of
T2, equivalent <column name>s shall not be specified more
than once.
That seems unreasonably stringent to me; it ought to be sufficient to
forbid duplicates of the names listed in CORRESPONDING, or the common
column names if there's no BY list. But whichever restriction you prefer,
this code seems to be failing to check it --- I certainly don't see any
new error message about "column name "foo" appears more than once".
I don't think you want to be leaving behind debugging cruft like this:
+ elog(DEBUG4, "%s", ltle->resname);
I'm not impressed by using A_Const for the members of the CORRESPONDING
name list. That's not a clever solution, that's a confusing kluge,
because it's a complete violation of the meaning of A_Const. Elsewhere
we just use lists of String for name lists, and that seems sufficient
here. Personally I'd just use the existing columnList production rather
than rolling your own.
The comments in parsenodes.h about the new fields seem rather confused,
in fact I think they're probably backwards.
Also, I thought the test cases were excessively uninspired and repetitive.
This, for example, seems like about two tests too many:
+SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(a) SELECT 4 a, 5 b, 6 c;
+ a
+---
+ 1
+ 4
+(2 rows)
+
+SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(b) SELECT 4 a, 5 b, 6 c;
+ b
+---
+ 2
+ 5
+(2 rows)
+
+SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(c) SELECT 4 a, 5 b, 6 c;
+ c
+---
+ 3
+ 6
+(2 rows)
without even considering the fact that these are pretty duplicative of
tests around them. And some of the added tests seem to be just testing
basic UNION functionality, which we already have tests for. I fail to
see the point of adding any of these:
+SELECT 1 AS two UNION CORRESPONDING SELECT 2 two;
+ two
+-----
+ 1
+ 2
+(2 rows)
+
+SELECT 1 AS one UNION CORRESPONDING SELECT 1 one;
+ one
+-----
+ 1
+(1 row)
+
+SELECT 1 AS two UNION ALL CORRESPONDING SELECT 2 two;
+ two
+-----
+ 1
+ 2
+(2 rows)
+
+SELECT 1 AS two UNION ALL CORRESPONDING SELECT 1 two;
+ two
+-----
+ 1
+ 1
+(2 rows)
What I think actually is needed is some targeted testing showing
non-interference with nearby features. As an example, CORRESPONDING
can't safely be implemented by just replacing the tlists of the
input sub-selects with shortened versions, because that would break
situations such as DISTINCT in the sub-selects. I think it'd be a
good idea to have a test along the lines of
SELECT DISTINCT x, y FROM ...
UNION ALL CORRESPONDING
SELECT DISTINCT x, z FROM ...
with values chosen to show that the DISTINCT operators did operate
on x/y and x/z, not just x alone.
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes: > 2017-03-18 17:50 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>: >> I'm not impressed by using A_Const for the members of the CORRESPONDING >> name list. That's not a clever solution, that's a confusing kluge, >> because it's a complete violation of the meaning of A_Const. Elsewhere >> we just use lists of String for name lists, and that seems sufficient >> here. Personally I'd just use the existing columnList production rather >> than rolling your own. > The reason was attach a location to name for more descriptive error > message. [ shrug... ] The patch fails to actually use the location anywhere. If it had, you might have noticed that it's attaching the wrong location to all elements except the first :-(. So I'm not very excited about that. I definitely don't see a reason for CORRESPONDING to track locations of name list elements when no other name list productions do. It might be worth proposing a followon patch to change all of them (perhaps by adding a location field to struct "Value") and then make use of the locations in error messages more widely. regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2017-03-18 17:50 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> I'm not impressed by using A_Const for the members of the CORRESPONDING
>> name list. That's not a clever solution, that's a confusing kluge,
>> because it's a complete violation of the meaning of A_Const. Elsewhere
>> we just use lists of String for name lists, and that seems sufficient
>> here. Personally I'd just use the existing columnList production rather
>> than rolling your own.
> The reason was attach a location to name for more descriptive error
> message.
[ shrug... ] The patch fails to actually use the location anywhere.
If it had, you might have noticed that it's attaching the wrong location
to all elements except the first :-(. So I'm not very excited about that.
I definitely don't see a reason for CORRESPONDING to track locations of
name list elements when no other name list productions do. It might be
worth proposing a followon patch to change all of them (perhaps by adding
a location field to struct "Value") and then make use of the locations in
error messages more widely.
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes: > 2017-03-18 18:32 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>: >> I definitely don't see a reason for CORRESPONDING to track locations of >> name list elements when no other name list productions do. It might be >> worth proposing a followon patch to change all of them (perhaps by adding >> a location field to struct "Value") and then make use of the locations in >> error messages more widely. > I had a idea use own node for CORRESPONDING with location - and using this > location in related error messages. I think using a private node type for CORRESPONDING is exactly the wrong thing. It's a columnList and it should be like other columnLists. If there's an argument for providing a location for "no such column" errors for CORRESPONDING, then surely there's also an argument for providing a location for "no such column" errors for FOREIGN KEY and the other places where we have lists of column names. regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2017-03-18 18:32 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> I definitely don't see a reason for CORRESPONDING to track locations of
>> name list elements when no other name list productions do. It might be
>> worth proposing a followon patch to change all of them (perhaps by adding
>> a location field to struct "Value") and then make use of the locations in
>> error messages more widely.
> I had a idea use own node for CORRESPONDING with location - and using this
> location in related error messages.
I think using a private node type for CORRESPONDING is exactly the wrong
thing. It's a columnList and it should be like other columnLists. If
there's an argument for providing a location for "no such column" errors
for CORRESPONDING, then surely there's also an argument for providing
a location for "no such column" errors for FOREIGN KEY and the other
places where we have lists of column names.
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I have not any objection - I'll mark this patch as ready for commiter
I'm a little disturbed by the fact that determineMatchingColumns()
is called twice, and more disturbed by the fact that it looks to be
O(N^2). This could be really slow with a lot of columns, couldn't it?
I also think there should be some comments about exactly what matching
semantics we're enforcing. The SQL standard says
a) If CORRESPONDING is specified, then:
i) Within the columns of T1, equivalent <column name>s shall
not be specified more than once and within the columns of
T2, equivalent <column name>s shall not be specified more
than once.
That seems unreasonably stringent to me; it ought to be sufficient to
forbid duplicates of the names listed in CORRESPONDING, or the common
column names if there's no BY list. But whichever restriction you prefer,
this code seems to be failing to check it --- I certainly don't see any
new error message about "column name "foo" appears more than once".
What I think actually is needed is some targeted testing showing
non-interference with nearby features. As an example, CORRESPONDING
can't safely be implemented by just replacing the tlists of the
input sub-selects with shortened versions, because that would break
situations such as DISTINCT in the sub-selects. I think it'd be a
good idea to have a test along the lines of
SELECT DISTINCT x, y FROM ...
UNION ALL CORRESPONDING
SELECT DISTINCT x, z FROM ...
with values chosen to show that the DISTINCT operators did operate
on x/y and x/z, not just x alone.
I took a quick look through this and noted that it fails to touch
ruleutils.c, which means that dumping of views containing CORRESPONDING
certainly doesn't work.
Also, the changes in parser/analyze.c seem rather massive and
correspondingly hard to review. Is it possible to rearrange the
patch to reduce the size of that diff? If you can avoid moving
or reindenting existing code, that'd help.
The code in that area seems rather confused, too. For instance,
I'm not sure exactly what orderCorrespondingList() is good for,
but it certainly doesn't look to be doing anything that either its
name or its header comment (or the comments at the call sites) would
suggest. Its input and output tlists are always in the same order.
I also think there should be some comments about exactly what matching
semantics we're enforcing. The SQL standard says
a) If CORRESPONDING is specified, then:
i) Within the columns of T1, equivalent <column name>s shall
not be specified more than once and within the columns of
T2, equivalent <column name>s shall not be specified more
than once.
That seems unreasonably stringent to me; it ought to be sufficient to
forbid duplicates of the names listed in CORRESPONDING, or the common
column names if there's no BY list. But whichever restriction you prefer,
this code seems to be failing to check it --- I certainly don't see any
new error message about "column name "foo" appears more than once".
I'm not impressed by using A_Const for the members of the CORRESPONDING
name list. That's not a clever solution, that's a confusing kluge,
because it's a complete violation of the meaning of A_Const. Elsewhere
we just use lists of String for name lists, and that seems sufficient
here. Personally I'd just use the existing columnList production rather
than rolling your own.
Attachment
I took a quick look through this and noted that it fails to touch
ruleutils.c, which means that dumping of views containing CORRESPONDING
certainly doesn't work.fixedAlso, the changes in parser/analyze.c seem rather massive and
correspondingly hard to review. Is it possible to rearrange the
patch to reduce the size of that diff? If you can avoid moving
or reindenting existing code, that'd help.Part of transformSetOperationTree that make union data type ofset operation target list became makeUnionDatatype inorder toeasy using it multiple time and avoid very long transformSetOperationTreefunctionThe code in that area seems rather confused, too. For instance,
I'm not sure exactly what orderCorrespondingList() is good for,
but it certainly doesn't look to be doing anything that either its
name or its header comment (or the comments at the call sites) would
suggest. Its input and output tlists are always in the same order.It give corresponding target list a sequential resnosInorder to avoid touching generate_append_tlist I changethe comment and function name as suchI also think there should be some comments about exactly what matching
semantics we're enforcing. The SQL standard says
a) If CORRESPONDING is specified, then:
i) Within the columns of T1, equivalent <column name>s shall
not be specified more than once and within the columns of
T2, equivalent <column name>s shall not be specified more
than once.
That seems unreasonably stringent to me; it ought to be sufficient to
forbid duplicates of the names listed in CORRESPONDING, or the common
column names if there's no BY list. But whichever restriction you prefer,
this code seems to be failing to check it --- I certainly don't see any
new error message about "column name "foo" appears more than once".fixedI'm not impressed by using A_Const for the members of the CORRESPONDING
name list. That's not a clever solution, that's a confusing kluge,
because it's a complete violation of the meaning of A_Const. Elsewhere
we just use lists of String for name lists, and that seems sufficient
here. Personally I'd just use the existing columnList production rather
than rolling your own.fixed
Attachment
Attachment
Hifresh update - I enhanced Value node by location field as Tom proposal.Few more regress tests.But I found significant issue, that needs bigger fix - Surafel, please, can you fix it.It crash onSELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3UNION ALL CORRESPONDING SELECT 4 AS b, 0 AS x4, 3 AS a, 0 AS x6, -1 AS x6UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS b, -100 AS x9;I'll mark this patch as waiting on authorRegardsPavel
can you help with fixing it Pavel?
On Mon, Mar 27, 2017 at 11:48 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:Hifresh update - I enhanced Value node by location field as Tom proposal.Few more regress tests.But I found significant issue, that needs bigger fix - Surafel, please, can you fix it.It crash onSELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3UNION ALL CORRESPONDING SELECT 4 AS b, 0 AS x4, 3 AS a, 0 AS x6, -1 AS x6UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS b, -100 AS x9;I'll mark this patch as waiting on authorRegardsPavel
2017-03-28 13:58 GMT+02:00 Surafel Temesgen <surafel3000@gmail.com>:can you help with fixing it Pavel?There must be some new preanalyze stage - you have to know result columns before you are starting a analyze
RegardsPavelOn Mon, Mar 27, 2017 at 11:48 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:Hifresh update - I enhanced Value node by location field as Tom proposal.Few more regress tests.But I found significant issue, that needs bigger fix - Surafel, please, can you fix it.It crash onSELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3UNION ALL CORRESPONDING SELECT 4 AS b, 0 AS x4, 3 AS a, 0 AS x6, -1 AS x6UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS b, -100 AS x9;I'll mark this patch as waiting on authorRegardsPavel
2017-03-28 14:18 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:2017-03-28 13:58 GMT+02:00 Surafel Temesgen <surafel3000@gmail.com>:can you help with fixing it Pavel?There must be some new preanalyze stage - you have to know result columns before you are starting a analyzemaybe some recheck after analyze stage to remove invalid columns can be good enough.RegardsPavelRegardsPavelOn Mon, Mar 27, 2017 at 11:48 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:Hifresh update - I enhanced Value node by location field as Tom proposal.Few more regress tests.But I found significant issue, that needs bigger fix - Surafel, please, can you fix it.It crash onSELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3UNION ALL CORRESPONDING SELECT 4 AS b, 0 AS x4, 3 AS a, 0 AS x6, -1 AS x6UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS b, -100 AS x9;I'll mark this patch as waiting on authorRegardsPavel
Attachment
hiThank you very much for your help .here is the patch fix that issue as you suggest
RegardsSurafelOn Tue, Mar 28, 2017 at 5:44 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:2017-03-28 14:18 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:2017-03-28 13:58 GMT+02:00 Surafel Temesgen <surafel3000@gmail.com>:can you help with fixing it Pavel?There must be some new preanalyze stage - you have to know result columns before you are starting a analyzemaybe some recheck after analyze stage to remove invalid columns can be good enough.RegardsPavelRegardsPavelOn Mon, Mar 27, 2017 at 11:48 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:Hifresh update - I enhanced Value node by location field as Tom proposal.Few more regress tests.But I found significant issue, that needs bigger fix - Surafel, please, can you fix it.It crash onSELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3UNION ALL CORRESPONDING SELECT 4 AS b, 0 AS x4, 3 AS a, 0 AS x6, -1 AS x6UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS b, -100 AS x9;I'll mark this patch as waiting on authorRegardsPavel
Attachment
Pavel Stehule <pavel.stehule@gmail.com> writes: > Is following use case defined in standard? > postgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3 > UNION ALL CORRESPONDING BY(a,b) SELECT 4 AS b, 0 AS x4, 3 AS a, > 0 AS x6, -1 AS x6 > UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa; > ┌───┐ > │ a │ > ╞═══╡ > │ 1 │ > │ 3 │ > │ 6 │ > └───┘ > (3 rows) > It depends on order of implementation > if we do (T1 U T2) U T3 ---> then result is correct, > but if we do T1 U (T2 U T3) ---> than it should to fail UNION ALL should associate left-to-right, just like most other binary operators, so this looks fine to me. Did you check that you get an error if you put in parens to force the other order? regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Is following use case defined in standard?
> postgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3
> UNION ALL CORRESPONDING BY(a,b) SELECT 4 AS b, 0 AS x4, 3 AS a,
> 0 AS x6, -1 AS x6
> UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa;
> ┌───┐
> │ a │
> ╞═══╡
> │ 1 │
> │ 3 │
> │ 6 │
> └───┘
> (3 rows)
> It depends on order of implementation
> if we do (T1 U T2) U T3 ---> then result is correct,
> but if we do T1 U (T2 U T3) ---> than it should to fail
UNION ALL should associate left-to-right, just like most other binary
operators, so this looks fine to me. Did you check that you get an
error if you put in parens to force the other order?
regards, tom lane
2017-03-30 21:43 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:Pavel Stehule <pavel.stehule@gmail.com> writes:
> Is following use case defined in standard?
> postgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3
> UNION ALL CORRESPONDING BY(a,b) SELECT 4 AS b, 0 AS x4, 3 AS a,
> 0 AS x6, -1 AS x6
> UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa;
> ┌───┐
> │ a │
> ╞═══╡
> │ 1 │
> │ 3 │
> │ 6 │
> └───┘
> (3 rows)
> It depends on order of implementation
> if we do (T1 U T2) U T3 ---> then result is correct,
> but if we do T1 U (T2 U T3) ---> than it should to fail
UNION ALL should associate left-to-right, just like most other binary
operators, so this looks fine to me. Did you check that you get an
error if you put in parens to force the other order?yes - it failspostgres=# SELECT 0 AS x1, 1 AS a, 0 AS x2, 2 AS b, 0 AS x3, -1 AS x3 UNION ALL CORRESPONDING BY(a,b) (SELECT 4 AS b, 0 AS x4, 3 AS a, 0 AS x6, -1 AS x6 UNION ALL CORRESPONDING SELECT 0 AS x8, 6 AS a, -100 AS aa);ERROR: column name "b" can not be used in CORRESPONDING BY listLINE 1: ...b, 0 AS x3, -1 AS x3 UNION ALL CORRESPONDING BY(a,b) (SELECT...^HINT: UNION queries with a CORRESPONDING BY clause must contain column names from both tables.Time: 1,135 ms
RegardsPavel
regards, tom lane
Attachment
Pavel Stehule <pavel.stehule@gmail.com> writes: > [ corresponding_clause_v12.patch ] I worked on this for awhile but eventually decided that it's not very close to being committable. The main thing that's scaring me off is a realization that there are a *lot* of places that assume that the output columns of a set operation are one-for-one with the columns of the inputs. One such place is the subquery qual pushdown logic in allpaths.c, which this patch hasn't touched, resulting in regression=# create table t1 (a int, b int, c int); CREATE TABLE regression=# create table t2 (a int, b int, c int); CREATE TABLE regression=# create view vv2 as regression-# select * from t1 union all corresponding by (b,c) select * from t2; CREATE VIEW regression=# select * from vv2 where c = 44; ERROR: wrong number of tlist entries Another is the code that converts a UNION ALL subquery into an appendrel. The best thing there might be to just reject CORRESPONDING in is_simple_union_all, but this patch doesn't, which means we get the wrong results from regression=# create table t3 (b int, a int, c int); CREATE TABLE regression=# explain verbose select * from t1 union all corresponding select * from t3; QUERYPLAN --------------------------------------------------------------------Append (cost=0.00..60.80 rows=4080 width=12) -> SeqScan on public.t1 (cost=0.00..30.40 rows=2040 width=12) Output: t1.a, t1.b, t1.c -> Seq Scan on public.t3 (cost=0.00..30.40rows=2040 width=12) Output: t3.b, t3.a, t3.c (5 rows) Notice it's failed to rearrange the columns to match. There are also such assumptions in ruleutils.c. Trying to reverse-list the above view gives regression=# \d+ vv2 View "public.vv2"Column | Type | Collation | Nullable | Default | Storage| Description --------+---------+-----------+----------+---------+---------+-------------b | integer | | | | plain | c | integer | | | | plain | View definition:SELECT t1.a AS b, t1.b AS c, t1.c FROM t1 UNION ALL CORRESPONDING BY (b, c)SELECT t2.a AS b, t2.b AS c, t2.c FROM t2; which is obviously wrong. The reason for that is that the code is trying to ensure that the SELECT's output column names match the view's column names, so it sticks AS clauses on the select-list expressions. If we go a little further: regression=# alter table vv2 rename column c to ceetwo; ALTER TABLE regression=# \d+ vv2 View "public.vv2"Column | Type | Collation | Nullable | Default | Storage| Description --------+---------+-----------+----------+---------+---------+-------------b | integer | | | | plain | ceetwo | integer | | | | plain | View definition:SELECT t1.a AS b, t1.b AS ceetwo, t1.c FROM t1 UNION ALL CORRESPONDING BY (b, c)SELECT t2.a AS b, t2.b AS ceetwo, t2.c FROM t2; Now things are a *complete* mess, because this view definition would successfully parse during a dump-and-reload, but it means something else than it did before; the CORRESPONDING BY list has failed to track the change in names of the columns. In general, there's a lot of subtle logic in ruleutils.c about what we need to do to ensure that reverse-listed views keep the same semantic meaning in the face of column renamings or additions in their input tables. CORRESPONDING is really scary in this situation because its semantics depend on column names. I tried to break it by adding/renaming columns of the input tables to see if I could get ruleutils to print something that didn't mean what it meant before. I didn't succeed immediately, but I am thinking it would be smart for us always to reverse-list the construct as CORRESPONDING BY with an explicit list of column names. We don't ever reverse-list a NATURAL JOIN as such, preferring to use an explicit JOIN USING list, for closely related reasons. (You might care to study the logic in ruleutils.c around the usingNames list, which exists to ensure that JOIN USING doesn't get broken by column renames. It's entirely likely that we're going to need something equivalently complicated for CORRESPONDING BY. Or if we don't, I'd want to have some comments in there that clearly explain why we don't, because otherwise somebody will break it in future.) I also found that WITH RECURSIVE fails immediately if you stick CORRESPONDING into the recursive union step; eg in this lightly modified version of a query from with.sql, CREATE TEMPORARY TABLE tree( id INTEGER PRIMARY KEY, parent_id INTEGER REFERENCES tree(id) ); WITH RECURSIVE t(id, path) AS ( select 1 as id, ARRAY[]::integer[] as path UNION ALL CORRESPONDING SELECT tree.id, t.path || tree.id as path FROM tree JOIN t ON (tree.parent_id = t.id) ) SELECT t1.*, t2.* FROM t AS t1 JOIN t AS t2 ON(t1.path[1] = t2.path[1] ANDarray_upper(t1.path,1) = 1 ANDarray_upper(t2.path,1)> 1)ORDER BY t1.id, t2.id; ERROR: column t.id does not exist LINE 5: FROM tree JOIN t ON (tree.parent_id = t.id) ^ HINT: Perhaps you meant to reference the column "tree.id". I was expecting to find trouble if the CORRESPONDING had to rearrange columns (breaking the 1-to-1 assumption), but it fails even without that. There may well be other places in the system that are making similar assumptions about 1-to-1 mapping between setop inputs and outputs. This comment in plan_set_operations is pretty scary, for instance: * Find the leftmost component Query. We need to useits column names for * all generated tlists (else SELECT INTO won't work right). because the column names in the component query certainly won't be 1-to-1 with upper tlists if CORRESPONDING is active. I wonder whether it wouldn't be a smarter plan to abandon this implementation approach and rely on inserting an explicit level of sub-selects during the parse analysis transformation. That is, SELECT * FROM foo UNION CORRESPONDING BY (x, y) SELECT * FROM bar would get converted into SELECT x, y FROM (SELECT * FROM foo) ss1 UNION SELECT x, y FROM (SELECT * FROM bar) ss2 This isn't very pretty, but I would have a lot fewer worries about how long we'd be finding bugs induced by the feature. If we do stick with trying to represent CORRESPONDING BY explicitly in the parse tree, I'm not very satisfied with the representation you've chosen for SetOperationStmt.correspondingColumns. The patch says + List *correspondingColumns; /* list of corresponding column names */ which is a flat out lie. debug_print_parse and some study of the code showed me that it's actually a list of TargetEntry nodes containing copies of the left-hand input's target expression trees. This seems bulky, bizarre, and asymmetric; among other things it requires the planner to re-derive the column matching that the parser had already figured out. There certainly doesn't seem to be any value in carrying around an extra copy of the left-hand expressions. I'd be inclined to suggest that the best representation in SetOperationStmt is two integer lists of the left-hand and right-hand column numbers of the CORRESPONDING columns. This would probably simplify matters greatly for the planner. It would mean that ruleutils.c would have to do a bit more work to find out the column names to print in CORRESPONDING BY; but as I showed above, just printing the names that appeared at parse time is a doomed strategy anyway. Also, just in passing --- while it may well be a good idea to add a location field to struct Value, I'm not on board with doing so and then using it in only one place. As I said earlier, I think that CORRESPONDING's column name list should act like every other column name list in the grammar, and that includes making use of error cursor locations where appropriate. And changing struct Value has repercussions way beyond that, for instance it's not clear why we'd keep A_Const as a separate node type if Value has a location field. I think if you want to push that forward, it would be a good idea to submit a separate patch that just focuses on adding error cursor location reports everywhere that adding locations to Values would enable. I'll set this back to Waiting on Author, but I think the chances of getting to a committable patch before the end of the commitfest are about nil. regards, tom lane
On 4/1/17 1:54 PM, Tom Lane wrote: > > I'll set this back to Waiting on Author, but I think the chances of > getting to a committable patch before the end of the commitfest are > about nil. I think this is especially true now that another three days have passed. This submission has been marked "Returned with Feedback". Please feel free to resubmit to a future commitfest. Regards, -- -David david@pgmasters.net