Re: [HACKERS] Idea on how to simplify comparing two sets - Mailing list pgsql-hackers

From Nico Williams
Subject Re: [HACKERS] Idea on how to simplify comparing two sets
Date
Msg-id 20170223215457.GE30233@localhost
Whole thread Raw
In response to Re: [HACKERS] Idea on how to simplify comparing two sets  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Feb 07, 2017 at 01:03:14PM -0500, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> 
> Actually ... now that you mention full join, I believe this works:
> 
> select * from (select ...) s1 full join (select ...) s2
>   on ((s1.*)=(s2.*)) where s1.* is distinct from s2.*;

You can drop the .*s:
 select * from (select ...) s1 full join (select ...) s2   on s1 = s2 where s1 is distinct from s2;

And even:
 select s1, s2 from (select ...) s1 natural full outer join (select ...) s2;

This makes it possible to write very generic (schema-wise) code for
comparing table sources.

As I've mentioned elsewhere, there is an issue with NULLs in columns...

I really, really would like either a full equijoin where equality treats
NULL = NULL -> true for this purpose, or a natural join where only
primary key or not-nullable columns are used, or a USING clause form
where I can specify such behavior without having to list all the columns
that should be used.

I use NATURAL FULL OUTER JOIN for computing materialized view diffs in
my alternate view materialization system.  NULLs are poison for this
purpose, yielding false positive differences.  But my code also uses the
table row value form above in order to avoid having to generate column
lists for a USING clause or expressions for ON.

These requests are not for syntactic sugar, not really.  But I realize
they may be non-trivial -- I may be looking for unobtanium.

> > That said I'm not sure how much we want to go down this road on our own.
> > It'd be nice to have when its needed but its not something that gets much
> > visibility on these lists to suggest a large pent-up demand.
> 
> Yeah, if this isn't in the standard and not in other databases either,
> that would seem to suggest that it's not a big requirement.

SQLite3 famously lacks FULL joins.  It kills me because the alternative
constructions become O(N log M) instead of O(N) for a properly
implemented FULL join (assuming suitable indices anyways).

I wouldn't suggest that that's a reason not to support FULL joins in any
other RDBMS, rather, I'd suggest that SQLite3 is missing an important
feature.

Pardon the tangent.  It may not really be applicable here, as here I
think OP is looking for syntactic sugar rather than an important
optimization.  But the point is that sometimes you have to lead the
standards-setting and/or the competition.

Nico
-- 



pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands:\quit_if, \quit_unless)
Next
From: Jim Nasby
Date:
Subject: Re: [HACKERS] Faster methods for getting SPI results (460%improvement)