Thread: combination join against multiple tables

combination join against multiple tables

From
Geoff Winkless
Date:
Hi

I have a query involving multiple tables that I would like to return in a single query. That means returning multiple sets of the data from the first base table, but that's acceptable for the simplicity in grabbing all the data in one hit.

An example set:

CREATE TABLE t1 (a int, b int, c int);
CREATE TABLE t2 (a int, b int, c int);
CREATE TABLE base (a int);
INSERT INTO t1 (a, b, c) VALUES (1, 1, 111), (1,2,112), (2,1,121), (4,1,141);
INSERT INTO t2 (a, b, c) VALUES (1, 1, 211), (2, 2, 222), (5,3,253);
INSERT INTO base(a) SELECT * FROM GENERATE_SERIES(1,10);

Now the problem is that I would like to return all the rows from a, but with a single row where t2.b and t1.b match.

So the results I would like:


 a  |  c  |  c
----+-----+-----
  1 | 111 | 211
  1 | 112 |
  2 | 121 |
  2 |     | 222
  3 |     |
  4 | 141 |
  5 |     | 253
  6 |     |
  7 |     |
  8 |     |
  9 |     |
 10 |     |

At the moment I'm doing

SELECT base.a, t1.c, t2.c
FROM base
CROSS JOIN (SELECT b FROM t1 UNION SELECT b FROM t2 UNION SELECT -1) tmpset
LEFT JOIN t1 ON t1.a=base.a AND t1.b=tmpset.b
LEFT JOIN t2 ON t2.a=base.a AND t2.b=tmpset.b
WHERE t1.a IS NOT NULL 
OR t2.a IS NOT NULL 
OR (tmpset.b=-1 
  AND NOT EXISTS (SELECT FROM t1 WHERE t1.a=base.a) 
  AND NOT EXISTS (SELECT FROM t2 WHERE t2.a=base.a)
);


but this seems like a really convoluted way to do it.

Is there a join style that will return the set I want without the pain?

I should be clear that the real world data is much more complex than this, but it describes the basic problem.

Thanks

Geoff

Re: combination join against multiple tables

From
"David G. Johnston"
Date:
On Friday, January 31, 2020, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
Now the problem is that I would like to return all the rows from a, but with a single row where t2.b and t1.b match.

So, the final,number of rows for each “a” is the larger row count of “b” and “c” having the same “a”.  Furthermore for the first “n” rows “b” and “c” should be paired together by position.  The smaller count column just gets nulls for the extra rows.

Probably the easiest way is to combine the matches for “b” and “c” into arrays the jointly unnest those arrays in the final result - with in the select list or maybe as part,of a lateral join, not sure without experimentation.

Otherwise you can add “row_number” to “b” and “c” and then left join on (a, row_number).

David J.

Re: combination join against multiple tables

From
Geoff Winkless
Date:
On Fri, 31 Jan 2020 at 15:25, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Friday, January 31, 2020, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
Now the problem is that I would like to return all the rows from a, but with a single row where t2.b and t1.b match.

So, the final,number of rows for each “a” is the larger row count of “b” and “c” having the same “a”.  Furthermore for the first “n” rows “b” and “c” should be paired together by position.  The smaller count column just gets nulls for the extra rows.

Probably the easiest way is to combine the matches for “b” and “c” into arrays the jointly unnest those arrays in the final result - with in the select list or maybe as part,of a lateral join, not sure without experimentation.

Otherwise you can add “row_number” to “b” and “c” and then left join on (a, row_number).


Thanks for the reply. Using array() hadn't occurred to me, I'll look at that.

I actually came up with this:

SELECT base.a, t1.c, t2.c 
FROM base 
LEFT JOIN (t1 FULL OUTER JOIN t2 ON t1.b=t2.b AND t1.a=t2.a)
ON COALESCE(t1.a, base.a)=base.a AND COALESCE(t2.a, base.a)=base.a;

which does solve the described problem; sadly I realise that I'd oversimplified my question: I haven't fully described the problem because in reality "t2" is joined to "base" with a different field, and I can't seem to get the join to do what I want without joining them together like this.

Geoff

Re: combination join against multiple tables

From
Steven Lembark
Date:
On Fri, 31 Jan 2020 14:01:17 +0000
Geoff Winkless <pgsqladmin@geoff.dj> wrote:

> Hi
> 
> I have a query involving multiple tables that I would like to return
> in a single query. That means returning multiple sets of the data
> from the first base table, but that's acceptable for the simplicity
> in grabbing all the data in one hit.
> 
> An example set:
> 
> CREATE TABLE t1 (a int, b int, c int);
> CREATE TABLE t2 (a int, b int, c int);
> CREATE TABLE base (a int);
> INSERT INTO t1 (a, b, c) VALUES (1, 1, 111), (1,2,112), (2,1,121),
> (4,1,141);
> INSERT INTO t2 (a, b, c) VALUES (1, 1, 211), (2, 2, 222), (5,3,253);
> INSERT INTO base(a) SELECT * FROM GENERATE_SERIES(1,10);
> 
> Now the problem is that I would like to return all the rows from a,
> but with a single row where t2.b and t1.b match.
> 
> So the results I would like:
> 
>  a  |  c  |  c
> ----+-----+-----
>   1 | 111 | 211
>   1 | 112 |
>   2 | 121 |
>   2 |     | 222
>   3 |     |
>   4 | 141 |
>   5 |     | 253
>   6 |     |
>   7 |     |
>   8 |     |
>   9 |     |
>  10 |     |
> 
> At the moment I'm doing
> 
> SELECT base.a, t1.c, t2.c
> FROM base
> CROSS JOIN (SELECT b FROM t1 UNION SELECT b FROM t2 UNION SELECT -1)
> tmpset LEFT JOIN t1 ON t1.a=base.a AND t1.b=tmpset.b
> LEFT JOIN t2 ON t2.a=base.a AND t2.b=tmpset.b
> WHERE t1.a IS NOT NULL
> OR t2.a IS NOT NULL
> OR (tmpset.b=-1
>   AND NOT EXISTS (SELECT FROM t1 WHERE t1.a=base.a)
>   AND NOT EXISTS (SELECT FROM t2 WHERE t2.a=base.a)
> );
> 
> 
> but this seems like a really convoluted way to do it.
> 
> Is there a join style that will return the set I want without the
> pain?
> 
> I should be clear that the real world data is much more complex than
> this, but it describes the basic problem.
> 
> Thanks
> 
> Geoff

The join on T1 & t2 seems to just be a full outer join of 
t1 & t2 on a & b.

Note that you cannot have two columns in the output with the
same name (e.g., a+c+c, above, is not a valid table). Call them
"c1" & "c2":

A full outer join of t1 & t2 on a & b seems to give you all of
the necessary combinations of c necessary; at which point an
outer join on a associates base values with anything that 
mathes on a:

    select
      distinct
        base.a
      , z.c1
      , z.c2
    from
      base
      left join
      (
        select
          distinct
            t1.a
          , t1.c  "c1"
          , t2.c  "c2"
        from
          t1
          full outer join
          t2
          on
            t1.a    = t2.a
            and
            t1.b    = t2.b
      ) z
      on
      base.a  = z.a
    ;

No idea what the real data looks like, but distinct likely to be
helpful if real t's have more than three cols.

-- 
Steven Lembark                                        3646 Flora Place
Workhorse Computing                                St. Louis, MO 63110
lembark@wrkhors.com                                    +1 888 359 3508



Re: combination join against multiple tables

From
Steven Lembark
Date:
On Fri, 31 Jan 2020 14:01:17 +0000
Geoff Winkless <pgsqladmin@geoff.dj> wrote:

>  a  |  c  |  c
> ----+-----+-----
>   1 | 111 | 211
>   1 | 112 |
>   2 | 121 |
>   2 |     | 222
>   3 |     |
>   4 | 141 |
>   5 |     | 253
>   6 |     |
>   7 |     |
>   8 |     |
>   9 |     |
>  10 |     |

The c's look just like a full outer join of t1 & t2 on a & b.

Giving them saner names to avoid duplicate output col's, let's
call them "c1" & "c2".

At that point a left outer join on a gives you all of the base.a
values with any t{1,2} rows that have a matching a:

No idea what your data really looks like but if t1 or t2 has more
than three col's distinct can save some annoying cross-products:

    select
      distinct
        base.a
      , z.c1
      , z.c2
    from
      base
      left join
      (
        select
            t1.a
          , t1.c  "c1"
          , t2.c  "c2"
        from
          t1
          full outer join
          t2
          on
            t1.a    = t2.a
            and
            t1.b    = t2.b
      ) z
      on
      base.a  = z.a
    ;

-- 
Steven Lembark                                        3646 Flora Place
Workhorse Computing                                St. Louis, MO 63110
lembark@wrkhors.com                                    +1 888 359 3508