Limitting full join to one match - Mailing list pgsql-general

From Phil Endecott
Subject Limitting full join to one match
Date
Msg-id 1544056443924@dmwebmail.dmwebmail.chezphil.org
Whole thread Raw
Responses Re: Limitting full join to one match
Re: Limitting full join to one match
List pgsql-general
Dear Experts,

I have a couple of tables that I want to reconcile, finding rows 
that match and places where rows are missing from one table or the 
other:

db=> select * from a;
+------------+--------+
|    date    | amount |
+------------+--------+
| 2018-01-01 |  10.00 |
| 2018-02-01 |   5.00 |  <-- missing from b
| 2018-04-01 |   5.00 |
+------------+--------+

db=> select * from b;
+------------+--------+
|    date    | amount |
+------------+--------+
| 2018-01-01 |  10.00 |
| 2018-03-01 |   8.00 |  <-- missing from a
| 2018-04-01 |   5.00 |
+------------+--------+

db=> select a.date, a.amount, b.date, b.amount from a full join b using (date,amount);
+------------+--------+------------+--------+
|    date    | amount |    date    | amount |
+------------+--------+------------+--------+
| 2018-01-01 |  10.00 | 2018-01-01 |  10.00 |
| 2018-02-01 |   5.00 |            |        |
|            |        | 2018-03-01 |   8.00 |
| 2018-04-01 |   5.00 | 2018-04-01 |   5.00 |
+------------+--------+------------+--------+

This works fine until I have multiple items with the same date 
and amount:

db=> select * from a;
+------------+--------+
|    date    | amount |
+------------+--------+
| 2018-01-01 |  10.00 |
| 2018-02-01 |   5.00 |
| 2018-04-01 |   5.00 |
| 2018-05-01 |  20.00 |  <--
| 2018-05-01 |  20.00 |  <--
+------------+--------+


db=> select * from b;
+------------+--------+
|    date    | amount |
+------------+--------+
| 2018-01-01 |  10.00 |
| 2018-03-01 |   8.00 |
| 2018-04-01 |   5.00 |
| 2018-05-01 |  20.00 |  <--
| 2018-05-01 |  20.00 |  <--
+------------+--------+

db=> select a.date, a.amount, b.date, b.amount from a full join b using (date,amount);
+------------+--------+------------+--------+
|    date    | amount |    date    | amount |
+------------+--------+------------+--------+
| 2018-01-01 |  10.00 | 2018-01-01 |  10.00 |
| 2018-02-01 |   5.00 |            |        |
|            |        | 2018-03-01 |   8.00 |
| 2018-04-01 |   5.00 | 2018-04-01 |   5.00 |
| 2018-05-01 |  20.00 | 2018-05-01 |  20.00 |  1
| 2018-05-01 |  20.00 | 2018-05-01 |  20.00 |  2
| 2018-05-01 |  20.00 | 2018-05-01 |  20.00 |  3
| 2018-05-01 |  20.00 | 2018-05-01 |  20.00 |  4
+------------+--------+------------+--------+

It has, of course, put four rows in the output for the new items.

So my question is: how can I modify my query to output only two rows, 
like this:?

+------------+--------+------------+--------+
|    date    | amount |    date    | amount |
+------------+--------+------------+--------+
| 2018-01-01 |  10.00 | 2018-01-01 |  10.00 |
| 2018-02-01 |   5.00 |            |        |
|            |        | 2018-03-01 |   8.00 |
| 2018-04-01 |   5.00 | 2018-04-01 |   5.00 |
| 2018-05-01 |  20.00 | 2018-05-01 |  20.00 |  1
| 2018-05-01 |  20.00 | 2018-05-01 |  20.00 |  2
+------------+--------+------------+--------+


Any suggestions anyone?


The best I have found so far is something involving EXCEPT ALL:

db=> select * from a except all select * from b;
db=> select * from b except all select * from a;

That's not ideal, though, as what I ultimately want is something 
that lists everything with its status:

+------------+--------+--------+
|    date    | amount | status |
+------------+--------+--------+
| 2018-01-01 |  10.00 |   OK   |
| 2018-02-01 |   5.00 | a_only |
| 2018-03-01 |   8.00 | b_only |
| 2018-04-01 |   5.00 |   OK   |
| 2018-05-01 |  20.00 |   OK   |
| 2018-05-01 |  20.00 |   OK   |
+------------+--------+--------+

That would be easy enough to achieve from the JOIN.


Thanks, Phil.




pgsql-general by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: simple division
Next
From: John W Higgins
Date:
Subject: Re: Limitting full join to one match