Re: [HACKERS] Re: Improve OR conditions on joined columns (commonstar schema problem) - Mailing list pgsql-hackers

From Noah Misch
Subject Re: [HACKERS] Re: Improve OR conditions on joined columns (commonstar schema problem)
Date
Msg-id 20190318000942.GA2804728@rfd.leadboat.com
Whole thread Raw
In response to Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)
List pgsql-hackers
On Tue, Oct 02, 2018 at 10:53:40AM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Mon, Oct 01, 2018 at 09:32:10PM -0400, Tom Lane wrote:
> >> FWIW, my problem with this patch is that I remain unconvinced of the basic
> >> correctness of the transform (specifically the unique-ification approach).
> >> Noah's points would be important to address if we were moving the patch
> >> towards commit, but I don't see much reason to put effort into it until
> >> we can think of a way to prove whether that works.
> 
> > Not even effort to fix the assertion failures I reported?
> 
> If it seemed relevant to the proof-of-correctness problem, I would look
> into it, but ...

I put some hours into theoretical study of the proof, and I didn't find any
holes.  When the planner removes "l0 LEFT JOIN r1", it does that because
there's one output row per l0.ctid regardless of the rows of r1.  Hence,
deduplicating on (l0.ctid) is equivalent to deduplicating on (l0.ctid,
r1.ctid).  In the bad FULL JOIN case, (l0.ctid, r1.ctid) would be sufficient
as a key, but we're out of luck because removing the join makes us have only
l0.ctid for some tuples and only r1.ctid for other tuples.

If PostgreSQL ever gets inner join removal, it would be harder to preserve
this optimization in this form.  At that point, perhaps we'd cost the path
that retains the join for the benefit of $SUBJECT.  Given the performance test
results, $SUBJECT may already need a cost-based decision on whether to use it.


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Rare SSL failures on eelpout
Next
From: Michael Paquier
Date:
Subject: Re: Make pg_checksums complain if compiled BLCKSZ and data folder'sblock size differ