Thread: Re: Add system column support to the USING clause

Re: Add system column support to the USING clause

From
Tom Lane
Date:
Denis Garsh <d.garsh@arenadata.io> writes:
> The patch adds support for system columns in JOIN USING clause.

I think this is an actively bad idea, and it was likely intentional
that it's not supported today.  A few reasons why:

* There are, fundamentally, no use-cases for joining on system
columns.  The only one that is stable enough to even consider
using for the purpose is tableoid, and I'm not detecting a reason
why that'd be useful.  If there are any edge cases where people
would actually wish to do that, it can be done easily enough with
a standard JOIN ON clause.

* Non-heap table AMs may not provide the same system columns that
heap does, further reducing the scope for plausible use-cases.
(Yeah, I know we don't support that too well today.)

* This breaks ruleutils.c's mechanism for dealing with name
conflicts across multiple USING clauses.  That relies on being
able to assign aliases to the USING inputs at the table level
(that is, "FROM realtable AS aliastable(aliascolumn,...)").
There's no way to alias a system column in the FROM syntax.

            regards, tom lane



Re: Add system column support to the USING clause

From
Denis Garsh
Date:


On 13.09.2024 17:56, Tom Lane wrote:
I think this is an actively bad idea, and it was likely intentional
that it's not supported today.  A few reasons why:

Thank you, Tom and David, for your feedback.

I admit my mistake. I should have asked if this problem was worth
solving before diving in. However, since I’ve already spent a lot of
time into the patch, so I'll try to fight a little ;-)

It looks like this feature hasn't been added because it's not obvious how to do it. And it is difficult to assess the consequences of adding a system column in RTE. Personally, I had to sweat to do it. >* There are, fundamentally, no use-cases for joining on system  
>columns. The only one that is stable enough to even consider  
>using for the purpose is tableoid, and I'm not detecting a reason  
>why that'd be useful. If there are any edge cases where people  
>would actually wish to do that, it can be done easily enough with  
>a standard JOIN ON clause.

But after all, it's implemented in `JOIN ON`. Accordingly, it seems like it should also be supported in `JOIN USING`. And is there any guarantee that new system columns won't be added in the future that may be more useful?

> * This breaks ruleutils.c's mechanism for dealing with name  
> conflicts across multiple USING clauses. That relies on being  
> able to assign aliases to the USING inputs at the table level  
> (that is, "FROM realtable AS aliastable(aliascolumn,...)").  
> There's no way to alias a system column in the FROM syntax.

Could you please provide an example of such a query? I've tried creating multi-join queries with aliases, but I couldn't break it. For example:
```sql
CREATE TABLE t    (id1 int);
CREATE TABLE tt   (id2 int);
CREATE TABLE ttt  (id3 int);
CREATE TABLE tttt (id4 int);

BEGIN;
INSERT INTO t    VALUES (1);
INSERT INTO tt   VALUES (101);
INSERT INTO ttt  VALUES (201);
INSERT INTO tttt VALUES (301);
COMMIT;

BEGIN;
INSERT INTO t    VALUES (2);
INSERT INTO tt   VALUES (102);
INSERT INTO ttt  VALUES (202);
INSERT INTO tttt VALUES (302);
COMMIT;

INSERT INTO t    VALUES (3);
INSERT INTO tt   VALUES (103);
INSERT INTO ttt  VALUES (203);
INSERT INTO tttt VALUES (303);

SELECT *FROM t FULL JOIN tt USING (xmin);
-- xmin | id1 | id2
--------+-----+-----
-- 1057 |   1 | 101
-- 1058 |   2 | 102
-- 1059 |   3 |    
-- 1060 |     | 103
--(4 rows)

SELECT *FROM ttt FULL JOIN tttt USING (xmin);
-- xmin | id3 | id4
--------+-----+-----
-- 1057 | 201 | 301
-- 1058 | 202 | 302
-- 1061 | 203 |    
-- 1062 |     | 303
--(4 rows)

SELECT * FROM t FULL JOIN tt USING (xmin) FULL JOIN ttt USING (xmin);
-- xmin | id1 | id2 | id3
--------+-----+-----+-----
-- 1057 |   1 | 101 | 201
-- 1058 |   2 | 102 | 202
-- 1059 |   3 |     |    
-- 1060 |     | 103 |    
-- 1061 |     |     | 203
--(5 rows)

SELECT *FROM
    (t FULL JOIN tt USING (xmin)) AS alias1(col1, col21, col31)
    JOIN
    (ttt FULL JOIN tttt USING (xmin)) AS alias2(col1, col22, col32)
    USING (col1);
-- col1 | col21 | col31 | col22 | col32
--------+-------+-------+-------+-------
-- 1057 |     1 |   101 |   201 |   301
-- 1058 |     2 |   102 |   202 |   302
--(2 rows)
```

I noticed that after adding it to the RTE, the negative system column attributes will be used in `ruleutils.c` (see [here](https://github.com/postgres/postgres/blob/52c707483ce4d0161127e4958d981d1b5655865e/src/backend/utils/adt/ruleutils.c#L5055)), and then in the `colinfo` structure. However, I didn't find any issues with `colinfo`. For example:

```sql
create table tt2 (a int, b int, c int);
create table tt3 (ax int8, b int2, c numeric);
create table tt4 (ay int, b int, q int);
create view v2 as select * from
tt2 join tt3 using (b,c,xmin) join tt4 using (b, xmin);
select pg_get_viewdef('v2', true);
-- SELECT tt2.b, tt2.xmin, tt3.c, tt2.a, tt3.ax, tt4.ay, tt4.q
--    FROM tt2 JOIN tt3 USING (b, c, xmin) JOIN tt4 USING (b, xmin);
alter table tt2 add column d int;
alter table tt2 add column e int;
select pg_get_viewdef('v2', true);
-- SELECT tt2.b, tt2.xmin, tt3.c, tt2.a, tt3.ax, tt4.ay, tt4.q
--    FROM tt2 JOIN tt3 USING (b, c, xmin) JOIN tt4 USING (b, xmin);
--       alter table tt3 rename c to d;

select pg_get_viewdef('v2', true);
-- SELECT tt2.b, tt2.xmin, tt3.c, tt2.a, tt3.ax, tt4.ay, tt4.q
--    FROM tt2 JOIN tt3 tt3(ax, b, c) USING (b, c, xmin) JOIN tt4 USING (b, xmin);
alter table tt3 add column c int;
alter table tt3 add column e int;
select pg_get_viewdef('v2', true);
-- SELECT tt2.b, tt2.xmin, tt3.c, tt2.a, tt3.ax, tt4.ay, tt4.q
--    FROM tt2 JOIN tt3 tt3(ax, b, c, c_1, e) USING (b, c, xmin)
--       JOIN tt4 USING (b, xmin);

alter table tt2 drop column d;
select pg_get_viewdef('v2', true);
-- SELECT tt2.b, tt2.xmin, tt3.c, tt2.a, tt3.ax, tt4.ay, tt4.q
--    FROM tt2 JOIN tt3 tt3(ax, b, c, c_1, e) USING (b, c, xmin)
--       JOIN tt4 USING (b, xmin);
```

-- 
Best regards,  
Denis Garsh